Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
INFORMATION DISCLOSURE STATEMENT
The information disclosure statement (IDS) submitted on 5/14/2024 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.
FOREIGN PRIORITY
A claim for foreign priority under 35 U.S.C. § 119(a)-(d), which was contained in the Declaration and Power of Attorney filed on 5/14/2024, has been acknowledged. Acknowledgment of the claimed foreign priority and receipt of the priority documents is reflected on form PTO-326, Office Action Summary.
CLAIM REJECTIONS - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-2, 4-9, 11-13, 16-18 & 20 are rejected under 35 U.S.C. 103 as being unpatentable over Rong (U.S. Publication 2021/0383616) in view of REN et al. (U.S. Publication 2022/0300681).
As to claims 1 & 13, Rong discloses a method of augmenting data performed by a computing device ([0024] discloses augmenting existing images with dynamic objects.), the method comprising: based on information about objects comprised in target data, extracting a region for object synthesis from a point cloud of the target data (202, Fig. 2 & [0137] discloses environment/target data including point clouds. 204, Fig. 2 & [0138] discloses selecting a synthesis region (insertion location) based on the scene while avoiding conflicts with existing objects.); determining a target object based on location information about the extracted region (206, Fig. 2 & [0139] discloses object selection driven by insertion location.), based on a point cloud of the target object and the point cloud of the target data ([0035] discloses object point clouds.); and generating a synthetic image by synthesizing an image of the target object with an image of the target data based on the location information about the extracted region and the point cloud of the target object (208, Fig. 2 & [0140] discloses geometry/depth processing from LiDAR. 210, Fig. 2 & [0141] discloses image synthesis via insertion.).
Rong is silent as to synthesizing the point cloud of the target object with the extracted region to generate a synthetic point cloud, wherein the synthetic point cloud and the synthetic image form an augmented training item.
However, REN discloses synthesizing the point cloud of the target object with the extracted region to generate a synthetic point cloud ([0010] discloses injecting object point cloud instances into other point cloud frames and validating the insertion location/region via a collision test.), wherein the synthetic point cloud and the synthetic image form an augmented training item ([0009] discloses generating training samples via augmentation.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Rong's disclosure to include the above limitations in order to produce consistent multimodal (image and point cloud) augmented training items for training/testing perception models that consume LiDAR and camera data, while leveraging the insertion location/scene context and collision/occlusion reasoning taught by the combination of Rong and REN.
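For illustration only, the following is a minimal sketch (in Python with NumPy; the scene data, region parameters, and placeholder image compositing are hypothetical and not drawn from either reference) of the combined multimodal augmentation flow the rejection describes: carving out a free region of the target point cloud, merging in the object's points, and compositing a corresponding image patch so that the point cloud/image pair forms one augmented training item.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the target data (scene) and for an object
# retrieved from a database: Nx3 LiDAR point clouds plus an RGB image.
scene_points = rng.uniform(-20.0, 20.0, size=(1000, 3))
scene_image = np.zeros((64, 64, 3), dtype=np.uint8)
object_points = rng.normal(0.0, 0.5, size=(200, 3))

# 1. Extract a region for object synthesis: a patch of the scene chosen
#    as the insertion location; existing points inside it are removed,
#    mimicking insertion into free space (cf. Rong 204, Fig. 2).
region_center = np.array([5.0, 2.0, 0.0])
region_radius = 1.5
dists = np.linalg.norm(scene_points[:, :2] - region_center[:2], axis=1)
free_scene = scene_points[dists > region_radius]

# 2. Synthesize the point cloud: translate the object's points into the
#    extracted region and merge them with the scene cloud, the step for
#    which REN is relied on ([0010]).
synthetic_points = np.vstack([free_scene, object_points + region_center])

# 3. Synthesize the image: paste a placeholder rendering of the object
#    at the region's (assumed) pixel location; a real pipeline would
#    project the region into the camera and blend with occlusion
#    handling (cf. Rong 208-210, Fig. 2).
synthetic_image = scene_image.copy()
synthetic_image[20:30, 20:30] = 255

# The pair (synthetic_points, synthetic_image) is one multimodal
# augmented training item.
```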
As to claim 2, Rong in view of REN discloses everything as disclosed in claim 1. In addition, Rong discloses wherein the region is extracted from the point cloud of the target data based on segmentation information about the point cloud of the target data ([0159]-[0160] discloses receiving camera image data and LiDAR point cloud data and outputting the object region of interest and a silhouette; see the gathering of the LiDAR points inside the bounding box.).
As to claim 4, Rong in view of REN discloses everything as disclosed in claim 1 but is silent as to wherein the location information about the extracted region comprises coordinate information, rotation information, and size information about the extracted region.
However, REN discloses wherein the location information about the extracted region comprises coordinate information, rotation information, and size information about the extracted region ([0072] discloses an anchor location specified based on a set of coordinates and discloses the anchor location can include a bounding box that includes size. [0103] discloses the location may be rotated about a vertical axis.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Rong in view of REN's disclosure to include the above limitations in order to place and orient the synthesized object accurately and maintain geometric realism.
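A minimal sketch of the three pieces of location information the claim recites, as a hypothetical container (the field names and example values are illustrative, not taken from REN):

```python
from dataclasses import dataclass

import numpy as np

@dataclass
class RegionLocation:
    """Hypothetical container for the extracted region's location
    information as the claim recites it: coordinates, rotation, and
    size (cf. REN's anchor location / bounding box, [0072], [0103])."""
    center: np.ndarray  # (x, y, z) coordinate information
    yaw: float          # rotation about the vertical axis, in radians
    size: np.ndarray    # (length, width, height) size information

region = RegionLocation(center=np.array([5.0, 2.0, 0.0]),
                        yaw=np.deg2rad(30.0),
                        size=np.array([4.5, 2.0, 1.6]))
```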
As to claims 5 & 16, Rong in view of REN discloses everything as disclosed in claims 1 & 13. In addition, Rong discloses wherein the determining of the target object comprises, among objects of which a point cloud and location information are stored in a database, selecting, to be the target object, an object based on the object having location information that corresponds to the location information about the extracted region ([0166] discloses a segment retrieval process that determines which set of object data is used as a new object, retrieving segments "based on one or more criteria, such as viewpoint, distance and occlusion.").
As to claim 6, Rong in view of REN discloses everything as disclosed in claim 5 but is silent as to wherein the location information about the object stored in the database comprises distance information and angle information from an ego.
However, REN discloses wherein the location information about the object stored in the database comprises distance information and angle information from an ego ([0099] discloses the anchor location may be represented as a distance and an angle from an ego vehicle.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Rong in view of REN's disclosure to include the above limitations in order to match and place objects consistently in vehicle-centric perception frames.
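A short worked example of the distance-and-angle-from-ego representation (the ego and anchor coordinates are hypothetical; only the polar conversion itself is being illustrated):

```python
import numpy as np

# Hypothetical conversion between a Cartesian anchor location and the
# distance-and-angle-from-ego representation REN describes ([0099]).
ego_xy = np.array([0.0, 0.0])
anchor_xy = np.array([5.0, 2.0])

offset = anchor_xy - ego_xy
distance = float(np.linalg.norm(offset))         # range from the ego
angle = float(np.arctan2(offset[1], offset[0]))  # bearing from the ego

# Round trip back to Cartesian coordinates:
recovered = ego_xy + distance * np.array([np.cos(angle), np.sin(angle)])
assert np.allclose(recovered, anchor_xy)
```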
As to claims 7 & 17, Rong in view of REN discloses everything as disclosed in claims 1 & 13 but is silent as to wherein the determining of the target object comprises correcting location information about the target object by rotationally transforming the location information about the target object based on the location information about the extracted region, wherein an angle between a first vector and a progress vector of the target object and distance information between the target object and the ego correspond to the location information about the extracted region, and wherein the first vector is defined by a reference location of an ego and a reference location of the target object.
However, REN discloses wherein the determining of the target object comprises correcting location information about the target object by rotationally transforming the location information about the target object based on the location information about the extracted region, wherein an angle between a first vector and a progress vector of the target object and distance information between the target object and the ego correspond to the location information about the extracted region, and wherein the first vector is defined by a reference location of an ego and a reference location of the target object ([0100] discloses an "[a]ngle between ego -> object vector and progress vector; distance to ego," i.e., the angle between a vector from the ego vehicle location to the anchor location and the progress vector. [0103] discloses the location information may be rotated about the vertical axis.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Rong in view of REN's disclosure to include the above limitations in order to align the inserted object's orientation with the scene's ego-centric geometry and motion direction.
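To make the claimed rotational correction concrete, here is a minimal sketch under assumed values (the poses, the desired angle, and the helper name `yaw_of` are all hypothetical): the object's progress vector is rotated so that its angle to the new ego-to-object (first) vector matches the angle associated with the extracted region.

```python
import numpy as np

def yaw_of(v: np.ndarray) -> float:
    """Heading angle of a 2D vector, in radians."""
    return float(np.arctan2(v[1], v[0]))

ego = np.array([0.0, 0.0])

# Stored object pose: reference location and progress (heading) vector.
obj_loc = np.array([10.0, 0.0])
obj_progress = np.array([0.0, 1.0])

# Target geometry of the extracted region (hypothetical values): the
# object should sit here with this angle between the ego->object
# (first) vector and its progress vector.
region_loc = np.array([5.0, 5.0])
desired_angle = np.deg2rad(90.0)

# Rotational correction: the relocated object's progress vector must
# make desired_angle with the new first vector.
correction = (yaw_of(region_loc - ego) + desired_angle) - yaw_of(obj_progress)
c, s = np.cos(correction), np.sin(correction)
R = np.array([[c, -s], [s, c]])
corrected_progress = R @ obj_progress

# Check: the corrected heading now makes desired_angle with the
# first vector at the new location.
achieved = yaw_of(corrected_progress) - yaw_of(region_loc - ego)
assert np.isclose((achieved - desired_angle + np.pi) % (2 * np.pi) - np.pi, 0.0)
```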
As to claim 8, Rong in view of REN discloses everything as disclosed in claim 1. In addition, Rong discloses wherein the synthesizing of the image of the target object with the image of the target data is based on the image of the target data (210, Fig. 2 & [0141] discloses augmenting one or more images of the environment to generate an initial augmented image with one or more objects inserted.) and the image of the target object ([0043] discloses environment data that can include camera-captured imagery. [0157] discloses reconstructed object data can include images of the object.).
Rong is silent as to the synthesis being based on the point cloud of the target object.
However, REN discloses the synthesis being based on the point cloud of the target object ([0028] discloses the system memory storing a point cloud object instance and a target point cloud frame. [0039] discloses generating a 2D range image wherein each pixel corresponds to a point of the target point cloud frame, and projecting the transformed surface model onto the range image.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Rong in view of REN's disclosure to include the above limitations in order to improve the geometric realism and sensor consistency of the synthesized object with the environment data by explicitly basing the synthesis on the object's point cloud geometry.
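For reference, a minimal sketch of a range-image projection in the spirit of REN [0039]; the resolution and the spherical sensor model here are assumptions, not taken from the reference. Each pixel corresponds to an (azimuth, elevation) cell of the point cloud, and keeping the nearest return per pixel is what makes occlusion reasoning against an inserted surface possible.

```python
import numpy as np

H, W = 32, 512  # hypothetical elevation x azimuth resolution
points = np.random.default_rng(1).normal([8.0, 0.0, 0.0], 0.5, (200, 3))

r = np.linalg.norm(points, axis=1)
azimuth = np.arctan2(points[:, 1], points[:, 0])   # in [-pi, pi]
elevation = np.arcsin(points[:, 2] / r)            # in [-pi/2, pi/2]

col = ((azimuth + np.pi) / (2 * np.pi) * (W - 1)).astype(int)
row = ((elevation + np.pi / 2) / np.pi * (H - 1)).astype(int)

# Keep the nearest return per pixel, so a projected object surface can
# be depth-tested against the existing scene.
range_image = np.full((H, W), np.inf)
for rr, cc, rng_m in zip(row, col, r):
    range_image[rr, cc] = min(range_image[rr, cc], rng_m)
```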
As to claims 9 & 18, Rong in view of REN discloses everything as disclosed in claims 1 & 13. In addition, Rong discloses wherein the synthesizing of the image of the target object with the image of the target data is based on the image of the target data, the image of the target object, and the point cloud of the target object ([0159] discloses receiving camera image data and LiDAR point cloud data and outputting the object region of interest and a silhouette.).
As to claims 11 & 20, Rong in view of REN discloses everything as disclosed in claims 1 & 13 respectively but is silent as to generating at least one of a synthetic point cloud generated by synthesizing the point cloud of the target object with the point cloud of the target data or a synthetic image generated by synthesizing the image of the target object with the image of the target data as training data of a neural network.
However, REN discloses generating at least one of a synthetic point cloud generated by synthesizing the point cloud of the target object with the point cloud of the target data or a synthetic image generated by synthesizing the image of the target object with the image of the target data as training data of a neural network ([0009] discloses that surface models are used to inject new point cloud object instances into a target point cloud frame, which can be used as training data of a neural network. [0101] discloses injecting into the target point cloud frame at an arbitrary anchor location.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Rong in view of REN's disclosure to include the above limitations in order to improve model robustness by increasing labeled training diversity.
As to claim 12, Rong in view of REN discloses everything as disclosed in claim 1. In addition, Rong discloses a non-transitory computer-readable storage medium storing instructions that, when executed by a processor, cause the processor to perform the method of claim 1 ([0062], [0065]).
Claims 3, 10, 15 & 19 are rejected under 35 U.S.C. 103 as being unpatentable over Rong (U.S. Publication 2021/0383616) in view of REN et al. (U.S. Publication 2022/0300681) as applied to claims 1 & 13 above, and further in view of Ning et al. (U.S. Publication 2022/0383041).
As to claims 3 & 15, Rong in view of REN discloses everything as disclosed in claims 1 & 13 but is silent as to wherein the extracting of the region comprises: determining a class of an object to be synthesized with the target data; and extracting a region for object synthesis from the target data based on locations of objects comprised in the target data that are associated with the determined class.
However, Ning discloses wherein the extracting of the region comprises: determining a class of an object to be synthesized with the target data ([0076] discloses the object label is the category/type (class). [0082] discloses the label is again described as the object's category or name.); and extracting a region for object synthesis from the target data based on locations of objects comprised in the target data that are associated with the determined class ([0076] discloses the location aspect is the bounding box enclosing the object. [0082] discloses the same pairing: bounding box location and object category/label. [0077] discloses bounding boxes are treated as object regions used in the pipeline.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Rong in view of REN's disclosure to include the above limitations in order to automate class-targeted synthetic augmentation (placing/synthesizing objects in the correct regions) and efficiently generate larger, properly localized training datasets for improved autonomous-perception model performance.
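A minimal sketch of the class-driven region selection (the object list, labels, and bounding-box convention are hypothetical examples, not values from Ning): labeled objects in the target data are filtered by the determined class, and their bounding-box locations become candidate synthesis regions.

```python
# Hypothetical labeled objects in the target data, each carrying a
# class label and a bounding-box location, cf. Ning [0076], [0082]
# (bbox given here as center x, center y, length, width).
objects = [
    {"label": "car",        "bbox": (10.0,  3.0, 4.5, 2.0)},
    {"label": "pedestrian", "bbox": ( 6.0, -1.0, 0.6, 0.6)},
    {"label": "car",        "bbox": (-8.0,  2.5, 4.5, 2.0)},
]

# Determine the class of the object to be synthesized, then extract
# candidate synthesis regions from the locations of the objects in the
# target data associated with that class.
target_class = "car"
candidate_regions = [o["bbox"] for o in objects if o["label"] == target_class]
```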
As to claims 10 & 19, Rong in view of REN discloses everything as disclosed in claims 9 & 18 but is silent as to wherein the determining of the in-painting region in the image of the target data comprises determining a region, corresponding to the extracted region, in the image of the target data to be the in-painting region based on a coordinate obtained by projecting a coordinate of a point cloud of the extracted region based on the image of the target data.
However, Ning discloses wherein the determining of the in-painting region in the image of the target data comprises determining a region, corresponding to the extracted region, in the image of the target data to be the in-painting region based on a coordinate obtained by projecting a coordinate of a point cloud of the extracted region based on the image of the target data ([0091] discloses new pixel locations from projected 3D points. [0092] discloses marked keypoints are re-projected and their new pixel locations are used for the target view region. [0093] discloses linking pixels to point cloud points, then to projected pixels, and using the projected pixels to define the target region.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Rong in view of REN's disclosure to include the above limitations in order to realistically fill/repair the region corresponding to the projected extracted object/occlusion in the synthesized target imagery and thereby improve the realism and training utility of the augmented data.
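To illustrate the projection step the claim recites, a minimal pinhole-camera sketch (the intrinsics matrix, the point values, and the LiDAR-to-camera axis convention are all assumed for illustration, not taken from any reference): the extracted region's LiDAR points are projected into the image, and the footprint of the projected pixels defines the in-painting region.

```python
import numpy as np

# Hypothetical camera intrinsics and region points.
K = np.array([[800.0,   0.0, 640.0],
              [  0.0, 800.0, 360.0],
              [  0.0,   0.0,   1.0]])
region_points = np.array([[5.0, 1.0, 0.2],   # LiDAR: x fwd, y left, z up
                          [5.5, 0.8, 0.4],
                          [5.2, 1.2, 0.1]])

# Map LiDAR axes to camera axes (x right, y down, z forward).
cam = np.stack([-region_points[:, 1],
                -region_points[:, 2],
                 region_points[:, 0]], axis=1)
uv = (K @ cam.T).T
uv = uv[:, :2] / uv[:, 2:3]  # perspective divide -> pixel coordinates

# The in-painting region is the pixel-space footprint of the projected
# points, e.g. their axis-aligned bounding box.
u_min, v_min = uv.min(axis=0)
u_max, v_max = uv.max(axis=0)
inpaint_box = (int(u_min), int(v_min), int(u_max), int(v_max))
```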
Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Rong (U.S. Publication 2021/0383616) in view of REN et al. (U.S. Publication 2022/0300681) as applied to claim 13 above, and further in view of Rajpal et al. (U.S. Publication 2023/0054440).
As to claim 14, Rong in view of REN discloses everything as disclosed in claim 13 but is silent as to wherein the instructions are further configured to cause the one or more processors to extract the region from the point cloud of the target data based on segmentation information about the point cloud of the target data.
However, Rajpal discloses wherein the instructions are further configured to cause the one or more processors to extract the region from the point cloud of the target data based on segmentation information about the point cloud of the target data ([0015] discloses segmenting the points in the at least one point cloud into the plurality of segments and states the segmentation can be performed using a conditional random field (CRF).).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Rong in view of REN's disclosure to include the above limitations in order to reliably isolate and extract only the relevant region of the target-data point cloud before performing the subsequent synthesis/processing steps.
CONCLUSION
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Stephen P Coleman whose telephone number is (571)270-5931. The examiner can normally be reached Monday-Thursday 8AM-5PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Moyer can be reached at (571) 272-9523. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
Stephen P. Coleman
Primary Examiner
Art Unit 2675
/STEPHEN P COLEMAN/Primary Examiner, Art Unit 2675