Prosecution Insights
Last updated: April 19, 2026
Application No. 17/524,185

MACHINE LEARNING METHODS AND APPARATUS RELATED TO PREDICTING MOTION(S) OF OBJECT(S) IN A ROBOT'S ENVIRONMENT BASED ON IMAGE(S) CAPTURING THE OBJECT(S) AND BASED ON PARAMETER(S) FOR FUTURE ROBOT MOVEMENT IN THE ENVIRONMENT

Status: Final Rejection (§103)
Filed: Nov 11, 2021
Examiner: NGUYEN, HENRY K
Art Unit: 2121
Tech Center: 2100 — Computer Architecture & Software
Assignee: Google LLC
OA Round: 7 (Final)
Grant Probability: 57% (Moderate)
Expected OA Rounds: 8-9
Time to Grant: 4y 7m
Grant Probability with Interview: 88%

Examiner Intelligence

Career Allow Rate: 57% (90 granted / 158 resolved; +2.0% vs. TC average)
Interview Lift: +31.4% (strong) for resolved cases with an interview vs. without
Avg Prosecution: 4y 7m typical timeline; 26 applications currently pending
Career History: 184 total applications across all art units

Statute-Specific Performance

§101: 21.6% (-18.4% vs. TC average)
§103: 51.4% (+11.4% vs. TC average)
§102: 7.7% (-32.3% vs. TC average)
§112: 14.0% (-26.0% vs. TC average)

Tech Center averages are estimates. Based on career data from 158 resolved cases.

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment

Acknowledgement is made of Applicant's claim amendments on 12/24/2025. The claim amendments are entered. Presently, claims 1-5, 8-16, 18, and 20 remain pending. Claim 18 has been amended.

Response to Arguments

Applicant's arguments with respect to claim 18 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over Laurent et al. (US-20150148953-A1) in view of Song et al. (US-20160184990-A1) and Morikawa et al. (US-20060155664-A1).

Regarding Claim 18, Laurent teaches a method implemented by one or more processors, comprising:

determining a goal state of an object in an environment of a robot (para [0146] In one or more implementations, the predicted sensory outcome may correspond to a position of the object in a subsequent frame (e.g., the frame 342 in FIG. 3A), rotation of target/objects due to the motion of the robotic platform, sound resulting from motion of the robotic platform or its interaction with the environment, or changes in illumination as a result of motion of the robotic platform, effectuated by the adaptive apparatus 220 of FIG. 2. para [0059] In one or more implementations, the input signal 102 may comprise a target motion trajectory. The motion trajectory may be used to predict a future state of the robot on the basis of a current state and the target state.),
generating candidate robot movement parameters, the candidate robot movement parameters defining at least a portion of a candidate movement performable by the robot in the environment (para [0059] In one or more implementations, the input signal 102 may comprise a target motion trajectory. The motion trajectory may be used to predict a future state of the robot on the basis of a current state and the target state. And para [0150] In one or more implementations, the prior context may comprise one or more of sensory prior input (e.g., 202 in FIG. 2, frames 330, 334 in FIG. 3A), characteristics identified in the sensory input (e.g., motion of the object 332 to location 336 in FIG. 3A, periphery portions 314, 416 in FIG. 3B), a motor command and/or action indication (e.g., 206 in FIG. 2), state of the robotic platform (e.g., as state determined using the feedback 112 in FIG. 1), and/or other prior context.);

generating at least one predicted image that predicts a predicted state of the object were the at least the portion of candidate movement performed in the environment by the robot (para [0076] An object may be present in the camera 1166 field of view. The (prior) frame 330 at time t-1 may comprise object representation 332 obtained prior to the commencement of the action 340. The current frame 334 may comprise object representation 336 obtained based on the action 340. The predictor 212 of the apparatus 220 in FIG. 2 may predict location of the object 338 in a subsequent frame 342 at time (t+1). And para [0081] By way of a non-limiting example, the apparatus 220 may be trained to predict appearance of the object 332 in one or more subsequent frames (e.g., 334, 342) based on the action 340.),

generating the predicted image comprising: applying the current image and the candidate robot movement parameters as input to a trained neural network (para [0150] At operation 802 of method 800, illustrated in FIG. 8, a predicted context may be determined based on a prior context. In one or more implementations, the prior context may comprise one or more of sensory prior input (e.g., 202 in FIG. 2, frames 330, 334 in FIG. 3A), characteristics identified in the sensory input (e.g., motion of the object 332 to location 336 in FIG. 3A, periphery portions 314, 416 in FIG. 3B), a motor command and/or action indication (e.g., 206 in FIG. 2), state of the robotic platform (e.g., as state determined using the feedback 112 in FIG. 1), and/or other prior context. The frames read on the claimed image, and the motor command reads on the claimed candidate action. para [0072] "During training, the controller may receive a copy of the planned and/or executed motor command and sensory information obtained based on the robot's response to the command." para [0164] In some implementations, control by the discrepancy-guided learning of sensory context input into the predictor may enable filtering out of irrelevant (e.g., not target) state indication from input into the predictor thereby enabling faster learning and/or generalization of predictor learning.);

providing the one or more control commands to one or more of the actuators of the robot to perform the candidate movement, wherein providing the one or more control commands to one or more of the actuators of the robot causes the robot to perform the candidate movement (para [0076] The frame 334 observed at time t may be associated with a motor command 340 issued at time t-1. In one or more implementations, the motor command 340 may comprise "pan right" instructions to the camera 1166 motor controller.).
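For orientation, the limitation at issue (feeding the current image plus candidate robot movement parameters into a trained neural network that outputs a predicted image) has the general shape sketched below. This is a minimal, hypothetical PyTorch illustration, not the claimed network or Laurent's apparatus; the class name ActionConditionedPredictor, the layer sizes, the 64x64 input, and the 7-dimensional action vector are all assumptions.

import torch
import torch.nn as nn

class ActionConditionedPredictor(nn.Module):
    """Hypothetical sketch: current frame + candidate movement -> predicted frame."""
    def __init__(self, action_dim: int = 7):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=4, stride=2, padding=1),   # 64x64 -> 32x32
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2, padding=1),  # 32x32 -> 16x16
            nn.ReLU(),
        )
        self.action_fc = nn.Linear(action_dim, 16 * 16)  # tile the action into one feature plane
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64 + 1, 32, kernel_size=4, stride=2, padding=1),  # 16x16 -> 32x32
            nn.ReLU(),
            nn.ConvTranspose2d(32, 3, kernel_size=4, stride=2, padding=1),       # 32x32 -> 64x64
            nn.Sigmoid(),  # predicted image with pixel values in [0, 1]
        )

    def forward(self, image, action):
        feat = self.encoder(image)                        # (B, 64, 16, 16)
        act = self.action_fc(action).view(-1, 1, 16, 16)  # (B, 1, 16, 16)
        return self.decoder(torch.cat([feat, act], dim=1))

model = ActionConditionedPredictor()
current_image = torch.rand(1, 3, 64, 64)   # stand-in for the camera frame
candidate_params = torch.rand(1, 7)        # stand-in for candidate movement parameters
predicted_image = model(current_image, candidate_params)  # (1, 3, 64, 64)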
Laurent does not explicitly disclose determining, based on user interface input, a goal state of an object in an environment of a robot, wherein the object is in addition to the robot, wherein the user interface input is through an interface that displays a current image capturing a portion of the environment and the object in a current state, and wherein the user interface input used in determining the goal state of the object is a manipulation of the current image that is capturing the portion of the environment and the object in the current state; nor, prior to providing one or more control commands to one or more actuators of the robot to perform the candidate movement, determining, based on comparing the goal state that is based on the user interface input and that is of the object that is in the environment of the robot, to the predicted state of the object, to perform the candidate movement.

However, Song (US 20160184990 A1) teaches determining, based on user interface input, a goal state of an object in an environment of a robot, wherein the object is in addition to the robot, wherein the user interface input is through an interface that displays a current image capturing a portion of the environment and the object in a current state (para [0078] "Referring to FIG. 1 again, the robot vision image 120 may include two objects (e.g., a bottle and a cup), and by clicking on an object, the user may determine an object to which an action is intended to be performed by controlling the right robotic arm 220 and/or the left robotic arm 230. Assuming that the user clicks on an object (e.g., the bottle) in the robot vision image 120, the remote control device 100 may convert the operation into a target selection command and transmit the target selection command to the robot 200. Then, the control module may spot the target in response to the target selection command."), and wherein the user interface input used in determining the goal state of the object is a manipulation of the current image that is capturing the portion of the environment and the object in the current state (para [0078]-[0082]).

Laurent and Song are analogous because they are both directed to the field of robotics used for controlling objects. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the robot system of Laurent with the user interface for controlling the robot of Song. In this way, the robot is able to combine the user's intention and the autonomous control of the robot to improve the movement of the robot, thereby improving a working efficiency of the robot (Song para [0026]).

Morikawa (US 20060155664 A1) teaches, prior to providing one or more control commands to one or more actuators of the robot to perform the candidate movement, determining, based on comparing the goal state that is of the object that is in the environment of the robot, to the predicted state of the object, to perform the candidate movement (para [0062] In this case, for example, if a number for the number of prediction steps with which the state value takes the maximum value and a prediction state are used for action determination, the ball location B2 is selected as the target state. This is the same meaning that a state when the ball B is in a closest location to the lower surface is set as the target state. In this manner, if action determination is performed by predicting a future state change which is not influenced by a self-action and setting as the target state one of predicted states which is suitable for action determination, a current state can be more accurately determined. This section discloses predicting several predicted states and choosing the highest valued predicted state as the target state. In order to determine which predicted state has the highest value, all predicted states must be compared to one another. The highest value becomes the target state. See also para [0064], para [0069]-[0070], figure 4, and figure 5. para [0076] As has been described, according to this embodiment, from a result of a prediction of a state change with respect to the environment 11, a future state suitable for action determination is determined as a target state with reference to a state value. Then, based on the target state, a self-action is determined. And para [0111]).

It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the teachings of Laurent with the predicted state and target state of Morikawa. Doing so would allow the accuracy of the action determination to be improved, compared to known action determination made by predicting a future state (Morikawa para [0062]).
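The comparison the examiner maps to Morikawa (predicting a state for each candidate movement, comparing the predictions against the goal, and committing to a movement only after that comparison) can be pictured as a short selection loop. The sketch below is hypothetical: select_candidate, the 3-element pose vectors, the random stub predictor, and the 0.05 threshold are illustrative placeholders, not anything disclosed in the claims or references.

import numpy as np

def select_candidate(goal_state, candidates, predict_state):
    # Score each candidate movement by how close its predicted object
    # state lands to the goal state; return the best candidate.
    best_params, best_dist = None, float("inf")
    for params in candidates:
        predicted = predict_state(params)
        dist = np.linalg.norm(predicted - goal_state)
        if dist < best_dist:
            best_params, best_dist = params, dist
    return best_params, best_dist

goal = np.array([0.40, 0.10, 0.20])  # hypothetical goal pose of the object
candidates = [np.random.uniform(-1.0, 1.0, size=7) for _ in range(16)]
stub_predictor = lambda p: np.random.uniform(0.0, 0.5, size=3)  # stands in for the trained model
best, dist = select_candidate(goal, candidates, stub_predictor)
if dist < 0.05:  # hypothetical acceptance threshold
    print("issue control commands for", best)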
Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Laurent/Song/Morikawa, as applied above, and further in view of Hoffman et al. ("Perception through visuomotor anticipation in a mobile robot") and Jaderberg et al. ("Spatial transformer networks").

Regarding Claim 20, Laurent, Song, and Morikawa teach the method of claim 18. Laurent, Song, and Morikawa do not explicitly disclose wherein generating the at least one predicted image further comprises: generating at least one predicted transformation of the current image, the predicted transformation being generated based on the application of the current image and the candidate robot movement parameters to the trained neural network; and transforming the current image based on the at least one predicted transformation to generate at least one predicted image.

However, Hoffman teaches generating at least one predicted transformation of the current image (fig. 5, pg. 23; Instead, here, given only the current image, a forward model predicts the next image.), the predicted transformation being generated based on the application of the current image and the candidate robot movement parameters to the trained neural network (pg. 24, col. 1, paragraph 2; Given this image and the motor commands provided by the movement plan, a forward model predicts the next image, which is denoised subsequently. And pg. 25; The forward model predicts an image given the current processed image and the wheel velocities. Each image pixel was predicted using an MLP.).

Laurent, Song, Morikawa, and Hoffman are analogous because they are all directed to the same field of endeavor of neural networks used for predicting images to control robot actions. It would have been obvious to a person having ordinary skill in the art before the effective filing date to combine Laurent et al.'s method of predicting an image to control a robot using a neural network with Hoffman's method of predicting an image to control a robot using a neural network. Doing so would allow for denoising of the predicted image (Hoffman pg. 32, col. 2, paragraph 6; An image was predicted by computing each pixel with a multilayer perceptron given as input the previous image and a movement command. This prediction, however, was noisy. Thus, a denoising method was introduced that greatly reduced the noise within an image (Fig. 7).).
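Hoffman's forward model, as characterized above, regresses each pixel of the next image with a multilayer perceptron given the current image and the wheel velocities. A minimal sketch of that per-pixel arrangement follows; the 5x5 patch, the two wheel-velocity inputs, and the layer widths are assumptions for illustration, not the paper's actual configuration.

import torch
import torch.nn as nn

class PerPixelForwardModel(nn.Module):
    # Hypothetical per-pixel predictor: one small MLP maps a local patch of
    # the current image plus the wheel velocities to one pixel at t+1.
    def __init__(self, patch: int = 5, n_actions: int = 2, hidden: int = 32):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(patch * patch + n_actions, hidden),
            nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, patches, wheel_vel):
        # patches: (N, patch*patch) neighborhoods; wheel_vel: (N, n_actions)
        return self.mlp(torch.cat([patches, wheel_vel], dim=1))

model = PerPixelForwardModel()
patches = torch.rand(100, 25)    # one flattened 5x5 neighborhood per predicted pixel
wheel_vel = torch.rand(100, 2)   # the same wheel-velocity command, repeated per pixel
next_pixels = model(patches, wheel_vel)  # (100, 1) predicted intensities at t+1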
Jaderberg teaches transforming the current image based on the at least one predicted transformation to generate at least one predicted image (figure 1; pg. 2; "(a) The input to the spatial transformer network is an image of an MNIST digit that is distorted with random translation, scale, rotation, and clutter. (b) The localisation network of the spatial transformer predicts a transformation to apply to the input image. (c) The output of the spatial transformer, after applying the transformation." And pg. 3, section 3.2; "To perform a warping of the input feature map, each output pixel is computed by applying a sampling kernel centered at a particular location in the input feature map (this is described fully in the next section)… For clarity of exposition, assume for the moment that T_θ is a 2D affine transformation A_θ. We will discuss other transformations below. In this affine case, the pointwise transformation is … where (x_i^t, y_i^t) are the target coordinates of the regular grid in the output feature map, (x_i^s, y_i^s) are the source coordinates in the input feature map that define the sample points, and A_θ is the affine transformation matrix.").

Laurent, Song, Morikawa, Hoffman, and Jaderberg are analogous because they are all directed towards neural networks used for image prediction. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the neural network of Hoffman with the spatial transformers of Jaderberg. The spatial transformer can crop out and scale-normalize appropriate regions of the image to simplify the subsequent classification task, and lead to superior classification performance (Jaderberg pg. 2, section 1).
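In Jaderberg's formulation, the regular grid of target coordinates over the output is mapped through the affine matrix A_θ to source coordinates in the input, which are then bilinearly sampled. A minimal sketch of that warping step using PyTorch's affine_grid and grid_sample (the localisation network that would predict theta is omitted, and the example theta is an arbitrary translation):

import torch
import torch.nn.functional as F

def warp(image, theta):
    # theta: (N, 2, 3) affine matrices A_theta mapping output (target)
    # grid coordinates to input (source) sampling coordinates.
    grid = F.affine_grid(theta, image.size(), align_corners=False)
    return F.grid_sample(image, grid, align_corners=False)

image = torch.rand(1, 3, 64, 64)  # stand-in for the current frame
theta = torch.tensor([[[1.0, 0.0, 0.2],
                       [0.0, 1.0, 0.0]]])  # arbitrary small translation in normalized coordinates
predicted = warp(image, theta)    # (1, 3, 64, 64) transformed image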
Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to HENRY K NGUYEN whose telephone number is (571) 272-0217. The examiner can normally be reached Mon-Fri, 7:00am-4:30pm. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Li B Zhen, can be reached at (571) 272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/HENRY NGUYEN/
Examiner, Art Unit 2121

Prosecution Timeline

Nov 11, 2021 — Application Filed
Feb 25, 2023 — Non-Final Rejection (§103)
Jun 23, 2023 — Examiner Interview Summary
Jun 23, 2023 — Applicant Interview (Telephonic)
Jun 30, 2023 — Response Filed
Oct 27, 2023 — Non-Final Rejection (§103)
Feb 05, 2024 — Applicant Interview (Telephonic)
Feb 05, 2024 — Response Filed
Feb 06, 2024 — Examiner Interview Summary
May 01, 2024 — Final Rejection (§103)
Jul 15, 2024 — Response after Non-Final Action
Jul 15, 2024 — Examiner Interview Summary
Jul 15, 2024 — Applicant Interview (Telephonic)
Jul 27, 2024 — Response after Non-Final Action
Aug 13, 2024 — Request for Continued Examination
Aug 19, 2024 — Response after Non-Final Action
Oct 25, 2024 — Non-Final Rejection (§103)
Feb 04, 2025 — Examiner Interview Summary
Feb 04, 2025 — Applicant Interview (Telephonic)
Feb 05, 2025 — Response Filed
May 22, 2025 — Final Rejection (§103)
Jul 21, 2025 — Applicant Interview (Telephonic)
Jul 21, 2025 — Examiner Interview Summary
Jul 28, 2025 — Response after Non-Final Action
Aug 14, 2025 — Request for Continued Examination
Aug 22, 2025 — Response after Non-Final Action
Sep 20, 2025 — Non-Final Rejection (§103)
Dec 23, 2025 — Examiner Interview Summary
Dec 23, 2025 — Applicant Interview (Telephonic)
Dec 24, 2025 — Response Filed
Mar 24, 2026 — Final Rejection (§103) (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12585933 — TRANSFER LEARNING WITH AUGMENTED NEURAL NETWORKS
Granted Mar 24, 2026 (2y 5m to grant)

Patent 12572776 — Method, System, and Computer Program Product for Universal Depth Graph Neural Networks
Granted Mar 10, 2026 (2y 5m to grant)

Patent 12547484 — Methods and Systems for Modifying Diagnostic Flowcharts Based on Flowchart Performances
Granted Feb 10, 2026 (2y 5m to grant)

Patent 12541676 — NEUROMETRIC AUTHENTICATION SYSTEM
Granted Feb 03, 2026 (2y 5m to grant)

Patent 12505470 — SYSTEMS, METHODS, AND STORAGE MEDIA FOR TRAINING A MACHINE LEARNING MODEL
Granted Dec 23, 2025 (2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 8-9
Grant Probability: 57%
Grant Probability with Interview: 88% (+31.4%)
Median Time to Grant: 4y 7m
PTA Risk: High
Based on 158 resolved cases by this examiner. Grant probability derived from career allow rate.
