Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 01/20/2026 has been entered.
Response to Amendments
The submission dated 01/20/2026 amends claims 1, 5, 9, 13, and 17. Claims 1-20 are pending.
In view of the amendments, the rejections previously set forth under 35 U.S.C. 101 have been withdrawn.
Response to Arguments
Applicant's arguments filed with the submission have been fully considered but they are not persuasive.
On pages 13-14 of the submission, the applicant argues that US Patent Application Publication No. 2023/0099494 to Kocamaz et al. (hereinafter Kocamaz) does not teach the amended claim language of the independent claims. The examiner disagrees because Kocamaz teaches:
determine, using one or more neural networks, one or more candidate shapes or masks corresponding to the object (see, e.g., pars. 70-71 and 75-76 and FIGS. 6 and 7, which teach determining one or more output masks, i.e., sets of pixels of the image, corresponding to the object);
determine target portions of the one or more candidate shapes or masks based at least on excluding at least a non-target portion of the one or more candidate shapes or masks that are outside of a region identified by the bounding shape (see, e.g., pars. 64-67, 72-74 and 77-79 and FIGS. 5-7, which teach associating the pixel sets with the bounding shapes by determining portions of the first and second set of pixels that are within the bounding shape and portions that are outside of the bounding shape);
determine a segmentation mask corresponding to the object depicted in the image based at least on the one or more neural networks processing the image, bounding shape information corresponding to the bounding shape, and the target portions of the one or more candidate shapes or masks (see, e.g., pars. 35-37, 40-41, 53-57, 60, 65-67, and 69-79 and FIGS. 4-7, which teach determining the output mask to be assigned to the object based on the machine learning model, the bounding shape, and portions of the pixel sets that are within the bounding shape), the segmentation mask including less pixels of the image than the bounding shape (see, e.g., par. 64, which teaches removing pixels outside of a bounding shape, suggesting that the mask corresponding to the object would have fewer pixels of the image than the bounding shape).
In view of the aforementioned teachings of Kocamaz, the examiner finds the applicant's arguments unpersuasive.
Claim Objections
Claims 5 and 13 are objected to because of the following informalities:
Claims 5 and 13 each recite the phrase “corresponding to corresponding to first pixels” in lines 2-3. The examiner believes that the duplicated “corresponding to” should be deleted. Appropriate correction is required.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1, 2, 4-6, 8-10, 12-14, and 16-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Kocamaz.
For claims 1 and 9, Kocamaz as applied discloses at least one processor (see, e.g., pars. 190-203 and FIGS. 8-10 and claims 1 and 12) comprising:
one or more circuits to:
determine a bounding shape corresponding to an object depicted in an image (see, e.g., pars. 35-37, 52, 60, 65-67, 72, and 77 and FIGS. 2B-D, 3, 5A-B, 6, and 7, which teach determining a bounding shape corresponding to a vehicle in an input image);
determine, using one or more neural networks, one or more candidate shapes or masks corresponding to the object (see, e.g., pars. 70-71 and 75-76 and FIGS. 6 and 7, which teach determining one or more output masks, i.e., sets of pixels of the image, corresponding to the object);
determine target portions of the one or more candidate shapes or masks based at least on excluding at least a non-target portion of the one or more candidate shapes or masks that are outside of a region identified by the bounding shape (see, e.g., pars. 64-67, 72-74 and 77-79 and FIGS. 5-7, which teach associating the pixel sets with the bounding shapes by determining portions of the first and second set of pixels that are within the bounding shape and portions that are outside of the bounding shape);
determine a segmentation mask corresponding to the object depicted in the image based at least on the one or more neural networks processing the image, bounding shape information corresponding to the bounding shape, and the target portions of the one or more candidate shapes or masks (see, e.g., pars. 35-37, 40-41, 53-57, 60, 65-67, and 69-79 and FIGS. 4-7, which teach determining the output mask to be assigned to the object based on the machine learning model, the bounding shape, and portions of the pixel sets that are within the bounding shape), the segmentation mask including less pixels of the image than the bounding shape (see, e.g., par. 64, which teaches removing pixels outside of a bounding shape, suggesting that the mask corresponding to the object would have fewer pixels of the image than the bounding shape); and
cause a machine to perform one or more planning, navigation, or control operations based at least on the segmentation mask (see, e.g., pars. 61-64, 68 and 80 and FIGS. 4 and 7, which teach performing one or more operations, such as vehicle control and navigation based on the output masks).
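For illustration only, the following Python/NumPy sketch shows one generic way to implement the mask-filtering concept mapped above for claims 1 and 9: a candidate mask is restricted to the region identified by a bounding shape, so the resulting segmentation mask contains no more (and typically fewer) pixels than the bounding shape. The sketch, its function name, and its values are hypothetical; it is not asserted to be the applicant's claimed implementation or Kocamaz's disclosed implementation.

```python
import numpy as np

def filter_mask_by_box(candidate_mask: np.ndarray, box: tuple) -> np.ndarray:
    """Keep the target portion of a candidate mask that lies inside the bounding
    shape; the non-target portion outside that region is excluded."""
    y0, x0, y1, x1 = box
    region = np.zeros_like(candidate_mask, dtype=bool)
    region[y0:y1, x0:x1] = True
    return candidate_mask & region

# Toy example: a 10x10 image and a candidate mask that spills outside a 4x4 box.
candidate = np.zeros((10, 10), dtype=bool)
candidate[2:9, 2:6] = True                 # candidate shape/mask for the object
box = (3, 3, 7, 7)                         # (y0, x0, y1, x1) bounding shape
segmentation_mask = filter_mask_by_box(candidate, box)

box_pixels = (box[2] - box[0]) * (box[3] - box[1])
print(int(segmentation_mask.sum()), "mask pixels vs", box_pixels, "box pixels")
# -> 12 mask pixels vs 16 box pixels: the mask includes fewer pixels than the box.
```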
For claim 17, Kocamaz as applied discloses:
performing one or more operations by a machine based at least on a segmentation mask corresponding to an object (see, e.g., pars. 61-64, 68, and 80 and FIGS. 4 and 7, which teach performing one or more operations based on the output masks),
the segmentation mask generated using (i) an image of the object, (ii) a bounding shape corresponding to the object, and (iii) target portions of one or more candidate shapes or masks (see, e.g., pars. 35-37, 40-41, 53-57, 60, 65-67, and 69-79 and FIGS. 4-7, which teach determining the output mask to be assigned to the object based on the bounding shape and portions of the image pixel sets that are within the bounding shape),
the segmentation mask including less pixels of the image than the bounding shape (see, e.g., par. 64, which teaches removing pixels outside of a bounding shape, suggesting that the mask corresponding to the object would have fewer pixels of the image than the bounding shape),
the one or more candidate shapes or masks corresponding to the object determined using a neural network (see, e.g., pars. 70-71, 75-76 and FIGS. 6, 7, which teach determining one or more output masks, i.e., sets of pixels of the image, corresponding to the object),
the target portions of the one or more candidate shapes or masks determined based at least on excluding at least a non-target portion of the one or more candidate shapes or masks that are outside of a region identified by the bounding shape (see, e.g., pars. 64-67, 72-74 and 77-79 and FIGS. 5-7, which teach associating the pixel sets with the bounding shapes by determining portions of the first and second set of pixels that are within the bounding shape and portions that are outside of the bounding shape), and the neural network trained using ground truth data generated using bounding shape labels (see, e.g., pars. 30-40, 53-57, 60, 65-67, 70, and 75 and FIGS. 1, 2A-D, 4, 5A-B, 6, and 7, which teach a machine learning model trained using GT masks encoded with the bounding shape).
For claims 2 and 10, Kocamaz as applied discloses that the one or more planning, navigation, or control operations include at least one of:
operating a simulation using the segmentation mask;
operating a machine perception system using the segmentation mask (see, e.g., pars. 68, 175 and 179 and claim 11);
assigning the segmentation mask to a data structure comprising the image.
For claims 4 and 12, Kocamaz as applied discloses that the one or more neural networks are configured using training data comprising a plurality of training instances, at least one individual training instance of the plurality of training instances having a corresponding bounding shape and class indication (see, e.g., pars. 31-34 and FIGS. 1, 2A-C, which teach training a machine learning model using ground truth masks encoded with object class labels that include annotations corresponding to bounding shapes).
For claims 5 and 13, Kocamaz as applied teaches that the bounding shape surrounds the object in the image and a first additional portion of the image corresponding to corresponding to first pixels of the image within the bounding shape and outside an outline of the object (see, e.g., pars. 34-37 and FIGS. 2A-B and 5A-B, which teach that the bounding shapes include the vehicles and areas surrounding the vehicles),
the segmentation mask forms an outline of the object and a second additional portion of the image corresponding to second pixels of the image within the segmentation mask and outside the outline of the object (see, e.g., pars. 34-37 and FIGS. 2C-D, which teach that the segmentation masks include the vehicles and areas surrounding the vehicles), the second additional portion including less pixels than the first additional portion (see, e.g., par. 64, which teaches removing pixels outside of a bounding shape, suggesting that the surrounding areas in the mask would have fewer pixels than the surrounding areas in the bounding shape).
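For a purely numeric illustration of the relationship recited in claims 5 and 13 (the pixel counts are hypothetical and chosen by way of example, not taken from Kocamaz or from the application), the sketch below compares the first additional portion (pixels inside the bounding shape but outside the object's outline) with the second additional portion (pixels inside the segmentation mask but outside the object's outline).

```python
# Hypothetical pixel counts, for illustration only.
object_outline_pixels = 4200       # pixels inside the object's outline
bounding_shape_pixels = 6000       # pixels enclosed by the bounding shape
segmentation_mask_pixels = 4500    # pixels in the segmentation mask

first_additional = bounding_shape_pixels - object_outline_pixels      # 1800
second_additional = segmentation_mask_pixels - object_outline_pixels  # 300

# The second additional portion includes fewer pixels than the first,
# consistent with pixels outside the bounding shape being removed and the
# mask tracking the object more closely than the bounding shape does.
assert second_additional < first_additional
print(first_additional, second_additional)
```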
For claims 6 and 14, Kocamaz as applied discloses that one or more parameters of the one or more neural networks are updated using the bounding shape and the segmentation mask (see, e.g., pars. 40 and 50, which teach updating parameters of the machine learning models using the output masks encoded with the bounding shapes).
For claims 8, 16 and 20, Kocamaz as applied discloses that the processor is comprised in at least one of:
a control system for an autonomous or semi-autonomous machine (see, e.g., pars. 19-20, 25, 59, and 81, FIGS. 8A-D and claim 11);
a perception system for an autonomous or semi-autonomous machine;
a system for performing simulation operations;
a system for performing digital twin operations;
a system for performing light transport simulation;
a system for performing collaborative content creation for 3D assets;
a system for performing deep learning operations;
a system implemented using an edge device;
a system implemented using a robot;
a system for performing conversational AI operations;
a system implementing one or more large language models (LLMs);
a system for generating synthetic data;
a system incorporating one or more virtual machines (VMs);
a system implemented at least partially in a data center; or
a system implemented at least partially using cloud computing resources.
For claim 18, Kocamaz as applied discloses that the ground truth data includes segmentation masks that are automatically generated from the bounding shape labels (see, e.g., pars. 30-40, which do not mention human intervention and which teach that the annotations used for training may be machine-automated; the examiner interprets the absence of human intervention and the automated annotating to suggest that the GT masks are automatically generated from the annotations).
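For illustration only, the following sketch shows one generic way that ground-truth segmentation masks could be generated automatically (i.e., without human intervention) from bounding shape labels; the helper function, its name, and the class values are hypothetical and are not asserted to be Kocamaz's annotation pipeline.

```python
import numpy as np

def boxes_to_gt_mask(image_shape, box_labels):
    """Hypothetical helper: rasterize bounding-shape labels into a ground-truth
    mask image, each box filled with its class id (0 = background)."""
    gt = np.zeros(image_shape, dtype=np.uint8)
    for (y0, x0, y1, x1), class_id in box_labels:
        gt[y0:y1, x0:x1] = class_id
    return gt

# Machine-readable box annotations -> GT mask, with no human in the loop.
labels = [((10, 10, 40, 60), 1),   # e.g., class 1: vehicle
          ((50, 20, 70, 45), 2)]   # e.g., class 2: pedestrian
gt_mask = boxes_to_gt_mask((80, 80), labels)
print(np.unique(gt_mask))          # -> [0 1 2]
```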
For claim 19, Kocamaz as applied discloses that the one or more operations include at least one of a planning operation, a control operation, a navigation operation, or an actuation operation (see, e.g., pars. 4, 61 and 68, which teach a navigation operation).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 3, 7, 11, and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Kocamaz in view of US Patent Application Publication No. 2021/0063578 to Wekel et al. (hereinafter Wekel).
For claims 3 and 11, while Kocamaz as applied does not explicitly teach these limitations, Wekel in the analogous art teaches that the one or more circuits are to receive the bounding shape as a three-dimensional shape, and convert the bounding shape to a two-dimensional shape corresponding to a frame of reference of the image (see, e.g., pars. 35-36 and 44 of Wekel, which teach receiving a 3D LiDAR point cloud and projecting it to a 2D range image, which includes the instance segmentation mask corresponding to the bounding shapes of the image labels in the image).
It would have been obvious to modify Kocamaz to convert the bounding shapes as taught by Wekel because doing so would allow the location of the bounding shapes to be tracked during the conversion/transformation of the input data such that accurate ground truth data in the LiDAR range image domain may be generated (see, e.g., pars. 5, 24, 32, and 41 of Wekel).
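For illustration of the general 3D-to-2D conversion discussed for claims 3 and 11 (and not as a characterization of Wekel's specific LiDAR range-image transformation), the sketch below projects the corners of a 3D bounding box into a 2D image frame of reference using a generic pinhole camera model; the intrinsic matrix, box dimensions, and function name are hypothetical.

```python
import numpy as np

def project_box_to_image(corners_3d: np.ndarray, K: np.ndarray) -> np.ndarray:
    """Project the 8 corners of a 3D bounding box (camera coordinates, meters)
    into pixel coordinates with intrinsic matrix K, then return the axis-aligned
    2D extent (x0, y0, x1, y1) as the converted bounding shape."""
    uvw = (K @ corners_3d.T).T        # homogeneous image coordinates
    uv = uvw[:, :2] / uvw[:, 2:3]     # perspective divide
    x0, y0 = uv.min(axis=0)
    x1, y1 = uv.max(axis=0)
    return np.array([x0, y0, x1, y1])

# Hypothetical intrinsics and a 2 m cube centered 10 m in front of the camera.
K = np.array([[800.0, 0.0, 640.0],
              [0.0, 800.0, 360.0],
              [0.0, 0.0, 1.0]])
corners = np.array([[dx, dy, 10.0 + dz]
                    for dx in (-1.0, 1.0)
                    for dy in (-1.0, 1.0)
                    for dz in (-1.0, 1.0)])
print(project_box_to_image(corners, K))
```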
For claims 7 and 15, while Kocamaz does not explicitly teach these limitations, Wekel in the analogous art teaches that the bounding shape information comprises (i) a data structure indicating a position of at least one of a corner or an edge of the bounding shape (see, e.g., pars. 35, 48 and 82 of Wekel, which teach that the bounding shape information includes pixel locations of one or more vertices of a bounding shape or offsets from an edge of a bounding shape) and (ii) an identifier of the object (see, e.g., par. 36, which teaches that the bounding shape information includes the boundary contours encoded with unique instance information including an instance value for each separate object).
It would have been obvious to modify Kocamaz to use the bounding shape information as taught by Wekel because doing so would allow the location of the bounding shapes to be tracked during the conversion/transformation of the input data such that accurate ground truth data in the LiDAR range image domain may be generated (see, e.g., pars. 5, 24, 32, and 41 of Wekel).
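For illustration of the bounding shape information discussed for claims 7 and 15 only, the following hypothetical data structure pairs corner/edge positions with an object identifier; the field names are assumptions and are not asserted to be Wekel's or the applicant's actual format.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class BoundingShapeInfo:
    """Hypothetical container: (i) positions of corners (and thus edges) of the
    bounding shape and (ii) an identifier of the object instance."""
    top_left: Tuple[int, int]      # (row, col) of one corner of the bounding shape
    bottom_right: Tuple[int, int]  # opposite corner, defining the remaining edges
    instance_id: int               # unique identifier of the object

info = BoundingShapeInfo(top_left=(120, 340), bottom_right=(260, 512), instance_id=7)
print(info)
```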
Additional Citations
The following table lists several references that are relevant to the subject matter claimed and disclosed in this Application. The references are not relied on by the Examiner, but are provided to assist the Applicant in responding to this Office action.
Citation: Radhakrishnan et al. (US Pat. Pub. 2022/0292306)
Relevance: Describes training methods to generate a trained neural network that is robust to various environmental features. In an embodiment, training includes modifying images of a dataset and generating boundary boxes and/or other segmentation information for the modified images, which is used to train a neural network.

Citation: Chaubard (US Pat. Pub. 2020/0089997)
Relevance: Describes a method for generating training examples for a product recognition model. In one embodiment, the method includes capturing images of a product using an array of cameras. A product identifier for the product is associated with each of the images. A bounding box for the product is identified in each of the images. The bounding boxes are smoothed temporally. A segmentation mask for the product is identified in each bounding box. The segmentation masks are optimized to generate an optimized set of segmentation masks. A machine learning model is trained using the optimized set of segmentation masks to recognize an outline of the product. The machine learning model is run to generate a set of further-optimized segmentation masks. The bounding box and further-optimized segmentation masks from each image are stored in a master training set with its product identifier as a training example to be used to train a product recognition model.

Table 1
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WOO RHIM whose telephone number is (571)272-6560. The examiner can normally be reached Mon - Fri 9:30 am - 6:00 pm ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Henok Shiferaw, can be reached at 571-272-4637. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/WOO C RHIM/Examiner, Art Unit 2676