DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (“IDS”) filed on 03/12/2024 has been reviewed and the listed references have been considered.
Drawings
The drawings (10 pages) have been considered and placed on record in the file.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1, 3-5, 7, 8, 10, 11, 13, 14, 16-18 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al. (US 2021/0302992 A1), in view of Lee et al. (US 2021/0358296 A1).
Regarding claim 1, Chen teaches, A system for (Chen, ¶0019: “a control system for controlling a motion of a vehicle”) generating perception data, (Chen, ¶0003: “the perception of environment… through two-dimensional (2D) object detection based on camera data”) the system comprising: a processor (Chen, ¶0039: “the control system 100 includes an image processor 106”) operating in an offline processing environment; (Chen, ¶0037: “control system 100 for controlling a motion of a vehicle 116”) and a memory storing machine-readable instructions that, (Chen, ¶0042: “control system 100 includes a memory 108 that stores instructions executable by a controller 104”) when executed by the processor, cause the processor to: (Chen, ¶0042: “controller 104 may be configured to execute the stored instructions in order to control operations”) extract first features (Chen, ¶0094: “multi-head neural network 110 executes feature extraction operation”) from a time sequence of perceptual sensor data to (Chen, ¶0051: “a sequence of video frames including the one or more objects in the environment captured by the camera”) generate a first set of bird's-eye-view (BEV) feature images; (Chen, ¶0007: “environmental state is detected based on a bird's eye view (BEV) map”) extract second features from the first set of BEV feature images (Chen, ¶0055: “extract the spatio-temporal features from the BEV maps”) using a BEV feature extractor that performs feature-level temporal aggregation including (Chen, ¶0055: “multi-head neural network 110 is configured to extract the spatio-temporal features from the BEV maps”). However, Chen does not explicitly teach, both forward recurrence and backward recurrence to generate a second set of BEV feature images, wherein each BEV feature image in the second set of BEV feature images corresponds to a distinct time step in the time sequence of perceptual sensor data and incorporates information from all time steps in the time sequence of perceptual sensor data; and consume the second set of BEV feature images using one or more neural-network heads to perform one of: generating automatically labeled perception data to train one or more of an online perception model, an online prediction model, and an online planning model used to control an autonomous robot; and validating performance of an online autonomous stack used to control an autonomous robot.
In an analogous field of endeavor, Lee teaches, both forward recurrence and backward recurrence (Lee, ¶0112: “flows in both the forward and the backward directions from the two bird's-eye view image embeddings can be calculated”) to generate a second set of BEV feature images, wherein each BEV feature image in the second set of BEV feature images corresponds to (Lee, ¶0109: “aggregator 552 receives all of the pillar features from the various points of all BeV images”) a distinct time step in the time sequence of perceptual sensor data and incorporates information from all time steps in the time sequence of perceptual sensor data; (Lee, ¶0105: “two birds-eye view images 531, 532… one representing the first point cloud (e.g., the point cloud at time t−1) and one representing the second point cloud (e.g., the point cloud at time t”) and consume the second set of BEV feature images using (Lee, ¶0107: “two birds-eye view images 531, 532 are aggregated to train classifiers for the features”) one or more neural-network heads (Lee, ¶0108: “classifiers may include, for example, Classification And Regression Tree (CART), K-nearest neighbor, neural network and mixture models”) to perform one of: generating automatically labeled perception data to train one or more of an online perception model, (Lee, ¶0102: “training data for the model may be autonomously labelled”) an online prediction model, and an online planning model (Lee, ¶0008: “Online versions typically model this state…. with a recurrent network trained by self-supervised labeling to predict future states”) used to control an autonomous robot; (Lee, ¶0040: “The systems and methods disclosed herein may be implemented for use in scene flow estimation for robotics, autonomous vehicles and other automated technologies”) and validating performance of an online autonomous stack used to control an autonomous robot. (Lee, ¶0005: “develop a new evaluation by looking at the 3D reconstruction quality of dynamic models”).
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify Chen with the teachings of Lee to generate automatic labels from forward and backward recurrences for training a classification model. A person skilled in the art would have been motivated to combine the known elements as described above to achieve the predictable result of detecting the surroundings of an autonomous robot for automatic operation. Accordingly, it would have been obvious to combine the analogous arts Chen and Lee to obtain the invention of claim 1.
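For illustration of the claimed aggregation only (hypothetical code, not drawn from Chen or Lee and not part of the record), feature-level temporal aggregation with both forward and backward recurrence can be sketched as a bidirectional recurrent pass over the time sequence of BEV feature images, so that each output image incorporates information from all time steps. The sketch below assumes PyTorch; the module name, tensor shapes, and fusion layer are illustrative assumptions.

```python
import torch
import torch.nn as nn

class BiTemporalBEVAggregator(nn.Module):
    """Illustrative sketch (not from the cited references): aggregates a
    time sequence of BEV feature images with forward and backward
    recurrence, so each output time step incorporates information from
    all time steps, as recited in claim 1."""

    def __init__(self, channels: int, height: int, width: int):
        super().__init__()
        feat_dim = channels * height * width  # flatten each BEV image
        # A bidirectional GRU runs the forward and backward recurrences.
        self.gru = nn.GRU(feat_dim, feat_dim, batch_first=True,
                          bidirectional=True)
        # Fuse the forward and backward hidden states per time step.
        self.fuse = nn.Linear(2 * feat_dim, feat_dim)

    def forward(self, bev_seq: torch.Tensor) -> torch.Tensor:
        # bev_seq: (batch, time, channels, height, width)
        b, t, c, h, w = bev_seq.shape
        x = bev_seq.reshape(b, t, c * h * w)
        y, _ = self.gru(x)               # (b, t, 2 * feat_dim)
        y = self.fuse(y)                 # (b, t, feat_dim)
        return y.reshape(b, t, c, h, w)  # the "second set" of BEV images

# Toy usage: five time steps of small 8x16x16 BEV feature images.
agg = BiTemporalBEVAggregator(channels=8, height=16, width=16)
second_set = agg(torch.randn(2, 5, 8, 16, 16))  # shape (2, 5, 8, 16, 16)
```

Flattening whole BEV images into the recurrent state is only tractable at toy sizes; a per-cell recurrence, such as the LSTM sketch following the claim 2 rejection below, scales better.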
Regarding claim 3, Chen in view of Lee teaches, The system of claim 1, wherein the time sequence of perceptual sensor data includes one or more of camera images, Light Detection and Ranging (LIDAR) data, radar data, sonar data, map data, and audio data. (Chen, ¶0050: “a plurality of sensors on the vehicle 116 such as a light detection and ranging (LiDAR) sensor, a radio detection and ranging (RADAR) sensor, a camera, and the like”).
Regarding claim 4, Chen in view of Lee teaches, The system of claim 1, wherein the one or more neural-network heads include (Chen, ¶0040: “The multi-head neural network 110 includes”) one or more of a three-dimensional (3D) detection head, (Chen, ¶0062: “the classification head may execute… 3D object detection based on LiDAR data, or fusion-based detection”) a 3D semantic-occupancy head, an occupancy-flow head, (Chen, ¶0066: “motion state estimation head outputs occupancy information indicating a state of each of the one or more objects”) a map-elements head, (Chen, ¶0058: “cell classification head executes BEV map segmentation”) an instance-segmentation head, a panoptic-segmentation head, (Chen, ¶0059: “the semantic segmentation corresponds to classification of every pixel of the BEV maps into a corresponding class”) a drivable-surface-estimation head, (Chen, ¶0004: “OGM pipeline can be utilized to specify a future drivable space and thereby provide support for motion planning”) and an elevation-estimation head. (Chen, ¶0054: “converts the 3D voxel lattice into a 2D pseudo-image with a height dimension”).
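As a purely illustrative sketch of the multi-head arrangement cited above (hypothetical PyTorch modules, not taken from Chen), several task-specific heads can consume the same aggregated BEV feature image; the head names and channel counts below are assumptions:

```python
import torch
import torch.nn as nn

class BEVHeads(nn.Module):
    """Illustrative sketch: task heads sharing one BEV feature image,
    loosely mirroring the 3D-detection, segmentation, and
    elevation-estimation heads recited in claim 4."""

    def __init__(self, channels: int, num_classes: int):
        super().__init__()
        # Each head is a lightweight 1x1 convolution over the BEV grid.
        self.detection = nn.Conv2d(channels, 7, kernel_size=1)    # box parameters per cell
        self.segmentation = nn.Conv2d(channels, num_classes, kernel_size=1)
        self.elevation = nn.Conv2d(channels, 1, kernel_size=1)    # height estimate per cell

    def forward(self, bev: torch.Tensor) -> dict:
        # bev: (batch, channels, height, width)
        return {
            "detection": self.detection(bev),
            "segmentation": self.segmentation(bev),
            "elevation": self.elevation(bev),
        }
```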
Regarding claim 5, Chen in view of Lee teaches, The system of claim 1, wherein the autonomous robot is an autonomous vehicle. (Chen, ¶0038: “The vehicle 116 may be an autonomous vehicle or a semi-autonomous vehicle”).
Regarding claim 7, Chen in view of Lee teaches, The system of claim 1, wherein the online autonomous stack includes perception, prediction, and planning models. (Chen, ¶0002: “autonomous systems such as autonomous vehicles… facilitates motion planning… (1) perception, which identifies the foreground objects from the background; and (2) motion prediction”).
Regarding claim 8, it recites a computer-readable medium including instructions corresponding to the elements of the system recited in claim 1. Therefore, the recited instructions of the computer-readable medium of claim 8 are mapped to the proposed combination in the same manner as the corresponding elements of system claim 1. Additionally, the rationale and motivation to combine Chen and Lee presented in the rejection of claim 1 apply to this claim. In addition, Chen teaches, A non-transitory computer-readable medium for generating perception data and storing instructions that, when executed by a processor, cause the processor to: (Chen, ¶0109: “the program code or code segments to perform the necessary tasks may be stored in a machine readable medium. A processor(s) may perform the necessary tasks”).
Regarding claim 10, it recites a computer-readable medium including instructions corresponding to the elements of the system recited in claim 4. Therefore, the recited instructions of the computer-readable medium of claim 10 are mapped to the proposed combination in the same manner as the corresponding elements of system claim 4. Additionally, the rationale and motivation to combine Chen and Lee presented in the rejection of claim 1 apply to this claim.
Regarding claim 11, it recites a computer-readable medium including instructions corresponding to the elements of the system recited in claim 5. Therefore, the recited instructions of the computer-readable medium of claim 11 are mapped to the proposed combination in the same manner as the corresponding elements of system claim 5. Additionally, the rationale and motivation to combine Chen and Lee presented in the rejection of claim 1 apply to this claim.
Regarding claim 13, it recites a computer-readable medium including instructions corresponding to the elements of the system recited in claim 7. Therefore, the recited instructions of the computer-readable medium of claim 13 are mapped to the proposed combination in the same manner as the corresponding elements of system claim 7. Additionally, the rationale and motivation to combine Chen and Lee presented in the rejection of claim 1 apply to this claim.
Regarding claim 14, it recites a method with steps corresponding to the elements of the system recited in claim 1. Therefore, the recited steps of method claim 14 are mapped to the proposed combination in the same manner as the corresponding elements of system claim 1. Additionally, the rationale and motivation to combine Chen and Lee presented in the rejection of claim 1 apply to this claim. In addition, Chen teaches, A method (Chen, ¶0111: “Embodiments of the present disclosure may be embodied as a method”).
Regarding claim 16, it recites a method with steps corresponding to the elements of the system recited in claim 3. Therefore, the recited steps of method claim 16 are mapped to the proposed combination in the same manner as the corresponding elements of system claim 3. Additionally, the rationale and motivation to combine Chen and Lee presented in the rejection of claim 1 apply to this claim.
Regarding claim 17, it recites a method with steps corresponding to the elements of the system recited in claim 4. Therefore, the recited steps of method claim 17 are mapped to the proposed combination in the same manner as the corresponding elements of system claim 4. Additionally, the rationale and motivation to combine Chen and Lee presented in the rejection of claim 1 apply to this claim.
Regarding claim 18, it recites a method with steps corresponding to the elements of the system recited in claim 5. Therefore, the recited steps of method claim 18 are mapped to the proposed combination in the same manner as the corresponding elements of system claim 5. Additionally, the rationale and motivation to combine Chen and Lee presented in the rejection of claim 1 apply to this claim.
Regarding claim 20, it recites a method with steps corresponding to the elements of the system recited in claim 7. Therefore, the recited steps of method claim 20 are mapped to the proposed combination in the same manner as the corresponding elements of system claim 7. Additionally, the rationale and motivation to combine Chen and Lee presented in the rejection of claim 1 apply to this claim.
Claims 2, 6, 9, 12, 15 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al. (US 2021/0302992 A1), in view of Lee et al. (US 2021/0358296 A1) and in further view of Beaver et al. (US 2023/0274541 A1).
Regarding claim 2, Chen in view of Lee teaches, The system of claim 1. However, the combination of Chen and Lee does not explicitly teach, wherein the BEV feature extractor includes one of a plurality of Gated Recurrent Units (GRUs), a plurality of Long Short-Term Memory (LSTM) networks, and a plurality of transformer networks to perform the feature-level temporal aggregation including both forward recurrence and backward recurrence.
In an analogous field of endeavor, Beaver teaches, wherein the BEV feature extractor includes one of a plurality of Gated Recurrent Units (GRUs), a plurality of Long Short-Term Memory (LSTM) networks, and a plurality of transformer networks to perform the feature-level temporal aggregation including both forward recurrence and backward recurrence. (Beaver, ¶0019: “a sequence-to-sequence model such a recurrent neural network (RNN), long short-term memory (LSTM) network, gated recurrent unit (GRU) network, and/or a Bidirectional Encoder Representations from Transformers (BERT) transformer network, may be used to process individual pixels or groups of pixels as a sequence”).
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify Chen in view of Lee with the teachings of Beaver to introduce a plurality of extractor models. A person skilled in the art would have been motivated to combine the known elements as described above to achieve the predictable result of detecting the surroundings of an autonomous robot using the extracted features. Accordingly, it would have been obvious to combine the analogous arts Chen, Lee and Beaver to obtain the invention of claim 2.
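As a further illustrative sketch (hypothetical PyTorch code, not taken from Beaver), the LSTM alternative recited in claim 2 can be expressed as a per-cell bidirectional recurrence over time, which avoids flattening the whole BEV image into the recurrent state:

```python
import torch
import torch.nn as nn

class BiLSTMCellAggregator(nn.Module):
    """Illustrative sketch: LSTM variant of bidirectional feature-level
    temporal aggregation (one alternative recited in claim 2). Each BEV
    grid cell is treated as an independent sequence over time."""

    def __init__(self, channels: int):
        super().__init__()
        self.lstm = nn.LSTM(channels, channels, batch_first=True,
                            bidirectional=True)
        # 1x1 convolution fuses forward/backward channels per time step.
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, bev_seq: torch.Tensor) -> torch.Tensor:
        # bev_seq: (batch, time, channels, height, width)
        b, t, c, h, w = bev_seq.shape
        # Fold the spatial grid into the batch: one sequence per cell.
        x = bev_seq.permute(0, 3, 4, 1, 2).reshape(b * h * w, t, c)
        y, _ = self.lstm(x)                                  # (b*h*w, t, 2c)
        y = y.reshape(b, h, w, t, 2 * c).permute(0, 3, 4, 1, 2)
        # Fuse both directions at every time step.
        return torch.stack([self.fuse(y[:, i]) for i in range(t)], dim=1)
```

A transformer encoder over the time axis would serve the same purpose, since self-attention inherently mixes information from all time steps in both directions.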
Regarding claim 6, Chen in view of Lee teaches, The system of claim 1. However, the combination of Chen and Lee does not explicitly teach, wherein the autonomous robot is one of a search and rescue robot, a delivery robot, an aerial drone, and an indoor robot.
In an analogous field of endeavor, Beaver teaches, wherein the autonomous robot is one of a search and rescue robot, a delivery robot, an aerial drone, and an indoor robot. (Beaver, ¶0030: “An individual robot 108.sub.1-M may take various forms, such as an unmanned aerial vehicle 108-1”).
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify Chen in view of Lee with the teachings of Beaver to introduce an unmanned aerial vehicle. A person skilled in the art would have been motivated to combine the known elements as described above to achieve the predictable result of autonomously operating an aerial drone using the perception data. Accordingly, it would have been obvious to combine the analogous arts Chen, Lee and Beaver to obtain the invention of claim 6.
Regarding claim 9, it recites a computer-readable medium including instructions corresponding to the elements of the system recited in claim 2. Therefore, the recited instructions of the computer-readable medium of claim 9 are mapped to the proposed combination in the same manner as the corresponding elements of system claim 2. Additionally, the rationale and motivation to combine Chen, Lee and Beaver presented in the rejection of claim 2 apply to this claim.
Regarding claim 12, it recites a computer-readable medium including instructions corresponding to the elements of the system recited in claim 6. Therefore, the recited instructions of the computer-readable medium of claim 12 are mapped to the proposed combination in the same manner as the corresponding elements of system claim 6. Additionally, the rationale and motivation to combine Chen, Lee and Beaver presented in the rejection of claim 6 apply to this claim.
Regarding claim 15, it recites a method with steps corresponding to the elements of the system recited in claim 2. Therefore, the recited steps of method claim 15 are mapped to the proposed combination in the same manner as the corresponding elements of system claim 2. Additionally, the rationale and motivation to combine Chen, Lee and Beaver presented in the rejection of claim 2 apply to this claim.
Regarding claim 19, it recites a method with steps corresponding to the elements of the system recited in claim 6. Therefore, the recited steps of method claim 19 are mapped to the proposed combination in the same manner as the corresponding elements of system claim 6. Additionally, the rationale and motivation to combine Chen, Lee and Beaver presented in the rejection of claim 6 apply to this claim.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MEHRAZUL ISLAM whose telephone number is (571) 270-0489. The examiner can normally be reached Monday-Friday: 8am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Amandeep Saini, can be reached at (571) 272-3382. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MEHRAZUL ISLAM/Examiner, Art Unit 2662
/AMANDEEP SAINI/Supervisory Patent Examiner, Art Unit 2662