Prosecution Insights
Last updated: April 19, 2026
Application No. 17/855,240

SIMULATED TRAINING FOR REINFORCEMENT LEARNING

Final Rejection §103
Filed: Jun 30, 2022
Examiner: PAULINO, LENIN
Art Unit: 2197
Tech Center: 2100 — Computer Architecture & Software
Assignee: Microsoft Technology Licensing, LLC
OA Round: 6 (Final)
Grant Probability: 57% (Moderate)
Expected OA Rounds: 7-8
Estimated Time to Grant: 4y 2m
Grant Probability with Interview: 82%

Examiner Intelligence

Career Allow Rate: 57% (186 granted / 327 resolved; +1.9% vs TC avg)
Interview Lift: +25.3% (strong; with vs. without interview, among resolved cases)
Avg Prosecution: 4y 2m (typical timeline; 34 applications currently pending)
Total Applications: 361 (career history, across all art units)

Statute-Specific Performance

§101: 21.1% (-18.9% vs TC avg)
§103: 57.5% (+17.5% vs TC avg)
§102: 8.4% (-31.6% vs TC avg)
§112: 7.2% (-32.8% vs TC avg)
Tech Center averages are estimates • Based on career data from 327 resolved cases

Office Action

§103
DETAILED ACTION

Claims 1-20 are pending. Claims 1, 8 and 16 have been amended. The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. This office action is in response to the applicant’s response received on 07/09/2025, for the final office action mailed on 04/09/2025.

Examiner’s Notes

Examiner has cited particular columns and line numbers, paragraph numbers, or figures in the references as applied to the claims below for the convenience of the applicant. Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claim, other passages and figures may apply as well. It is respectfully requested that the applicant, in preparing the responses, fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the examiner.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 07/09/2025 has been entered.

Response to Arguments

Applicant's arguments filed 11/13/2025 regarding the rejection made under 35 U.S.C. § 101 have been fully considered and are persuasive. Examiner respectfully withdraws the rejection made under 35 U.S.C. § 101. Applicant's arguments filed 11/13/2025 regarding the rejection made under 35 U.S.C. § 103 have been fully considered, but they are moot in view of the new ground(s) of rejection.
Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1, 3-5, 8, 16, 18 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Szabo et al. (US-PGPUB-NO: 2012/0047200 A1), hereinafter Szabo, in further view of Gene et al. (US-PGPUB-NO: 2020/0167687 A1), hereinafter Gene, Lin et al. (US-PAT-NO: 10,467,029 B1), hereinafter Lin, Shin et al. (US-PAT-NO: 11,983,928 B1), hereinafter Shin, and Baphna et al. (US-PGPUB-NO: 2018/0197428 A1), hereinafter Baphna.
As per claim 1, Szabo teaches a method of training a reinforcement-learning model in a simulation comprising: receiving user-interface interaction data for a software, the user-interface interaction data indicating user actions taken in a user interface of the software (see Szabo paragraph [0029], “The recording device 124 connects to the client devices 122, records the user interactions on the client devices 122, and stores the recorded user interactions in the mouse and keyboard events database 126 (step 204 in FIG. 2)”); building an action model of the software using the user-interface interaction data (see Szabo paragraph [0029], “The extracting device 128 connects to the user plane traffic database 116 and the cell level events measurement database 118, extracts usage scenarios from real network traffic in the real network 104, and stores the extracted usage scenarios in the user and traffic models database 130 (step 206 in FIG. 2)”); generating a simulated first user interface from the action model (see Szabo paragraph [0029], “The graphical user interface testing tool 132 emulates user actions on the client device(s) 122 according to a specific user scenario by utilizing the recorded user interactions obtained from the recording database 124 and the extracted usage scenarios obtained from the user and traffic models database 130 to generate additional real traffic 144 in the real network 104 (step 208 in FIG. 2)”).

Szabo does not explicitly teach providing the simulated first user interface to the reinforcement-learning model to train the reinforcement-learning model to perform a task in the software; selecting, by the reinforcement-learning model, an action to take in the simulated first user interface; generating an updated reinforcement-learning model by training the reinforcement-learning model using the action and the reward; and storing the updated reinforcement-learning model.
However, Gene teaches providing the simulated first user interface to the reinforcement-learning model (see Gene paragraph [0072], “In an embodiment, the simulation application container 308 can provide simulation data to multiple training application containers 306. For instance, each training application container 306 may utilize different hyperparameters and/or different machine learning techniques to train a reinforcement learning model using the simulation data from the simulation application container 308”) to train the reinforcement-learning model to perform a task in the software (see Gene paragraph [0077], “The model training application 412 may utilize the data as input to update the reinforcement learning model for the robotic device application being simulated, resulting in an updated reinforcement learning model 414. As the model training application 412 updates the reinforcement learning model 414, the training application container 410 may transmit the updated reinforcement learning model 414 to the simulation application container 402. This may cause the system simulation agent 404 to update its reinforcement learning model 406 and use the updated reinforcement learning model to perform another simulation of the robotic device application and generate more simulation data 416”); selecting, by the reinforcement-learning model, an action to take in the simulated first user interface to execute the task, where the action is performed without user intervention (see Gene paragraph [0075], “Similar to the value function described above, the system simulation agent 404 may select the action to be performed at random if it is the initial action to be selected based on the initial state of the simulation environment. The action may be selected at random since the reinforcement learning model 406 has not been updated to provide the sufficient guidance for selecting an action that would result in a higher reward value in accordance with the reinforcement function”); generating an updated reinforcement-learning model by training the reinforcement-learning model using the action and the reward (see Gene paragraph [0040], showing the training providing an updated reinforcement learning model to obtain new state-action-reward data that may be used to continue updating the reinforcement learning model); and storing the updated reinforcement-learning model (see Gene paragraph [0042], showing a code repository to store an updated application and the reinforcement learning model).

Szabo and Gene are analogous art because they are in the same field of endeavor of software development. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Szabo’s teaching of a graphical user interface testing tool configured to construct a validation trace using a real network without causing privacy concerns with Gene’s teaching of using a reinforcement learning model to operate within a simulation environment, to incorporate updating the reinforcement learning model in order to generate updated actions and rewards within the system.

Szabo modified with Gene does not explicitly teach wherein the action model is provided an image of a first user interface and state telemetry data for a simulated first user interface obtained from the user-interface interaction data.
However, Lin teaches wherein the action model is provided an image of a first user interface and state telemetry data for a simulated first user interface obtained from the user-interface interaction data (see Lin [column 2, lines 48-57], showing telemetry data along with GUI elements being used for generating predicted GUI elements, and also see Lin [column 7, lines 33-49], showing a predictive GUI being determined by the likelihood of subsequent actions (i.e., simulated user action) by analyzing interaction history data which includes GUI data and telemetry data; see also Lin [column 4, lines 11-17], showing action data includes interaction history data describing sequences of the type of interaction the user had with an application (some aspect of the GUI)).

Szabo, Gene and Lin are analogous art because they are in the same field of endeavor of software development. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Szabo’s teaching of a graphical user interface testing tool configured to construct a validation trace using a real network without causing privacy concerns, Gene’s teaching of using a reinforcement learning model to operate within a simulation environment, and Lin’s teaching of predictive graphical user interfaces, to incorporate the use of telemetry data and actions based on user-interface interactions and providing the data to a model as taught in Szabo, in order to build upon user actions and telemetry data.
Szabo modified with Gene and Lin does not explicitly teach determining a set of similar tasks based on a similarity analysis of the task and the simulated first user interface by at least: (i) comparing an image of the simulated first user interface to images of user interfaces associated with other tasks using a visual similarity metric or (ii) comparing a language encoding of a task description associated with the task and the simulated first user interface to language encodings of other task descriptions.

However, Shin teaches determining a set of similar tasks based on a similarity analysis of the task and the simulated first user interface by at least: (i) comparing an image of the simulated first user interface to images of user interfaces associated with other tasks using a visual similarity metric (see Shin [columns 4-5, lines 65-24], “The one or more modules may extract data associated with the region of the current image that includes the detected object and compare the extracted data with corresponding data associated with existing targets (e.g., extracted from prior images depicting the environment). The object localization module may calculate the visual similarity metric value by determining a visual similarity between the detected object and an existing target based on the data extracted by the data extraction module and may calculate a similarity metric value based on the determined visual similarity. The object localization module may provide an indication of the image region that includes the detected object, the image region that is expected to include an existing target, and the calculated visual similarity metric value to the data association module, as described above”) or (ii) comparing a language encoding of a task description associated with the task and the simulated first user interface to language encodings of other task descriptions.
Szabo, Gene, Lin and Shin are analogous art because they are in the same field of endeavor of software development. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Szabo’s teaching of a graphical user interface testing tool configured to construct a validation trace using a real network without causing privacy concerns, Gene’s teaching of using a reinforcement learning model to operate within a simulation environment, and Lin’s teaching of predictive graphical user interfaces with Shin’s teaching of configuring an object tracker for an intelligent video analytics system, in order to incorporate the use of a visual similarity metric used on images to determine similarity between images; see Shin [column 5, lines 13-24], “In another example, a visual feature-based tracker may also implement a data extraction application module, which may be configured to extract one or more visual features from an image depicting a detected object (i.e., using a feature extraction model) and calculate a feature similarity metric value based on the extracted features and features associated with an existing target extracted from prior image depicting the environment. The data extraction module may provide an indication of the image region that includes the detected object and the calculated feature similarity metric to the data association application module, as described above.”

Szabo modified with Gene, Lin and Shin does not explicitly teach where the action causes the action model to generate a simulated second user interface that results from taking the action through the simulated first user interface and determining a reward associated with the action based on a state associated with the simulated second user interface based on the at least one task of the set of similar tasks being achieved.
However, Baphna teaches where the action causes the action model to generate a simulated second user interface that results from taking the action through the simulated first user interface (see Baphna paragraph [0102], showing a hint being rendered based on the simulated actions taken) and determining a reward associated with the action based on a state associated with the simulated second user interface (see Baphna paragraph [0105], showing a scoring module configured to score a challenge based on the path and actions taken by the user across different dimensions) based on the at least one task of the set of similar tasks being achieved (see Baphna paragraph [0108], “The gaming module 222 is further configured to compare, in real-time, the user's scores during the user's performance of the one or more simulated tasks with score for other users who are performing the same or substantially similar simulated tasks”).

Szabo, Gene, Lin, Shin and Baphna are analogous art because they are in the same field of endeavor of software development. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Szabo’s teaching of a graphical user interface testing tool configured to construct a validation trace using a real network without causing privacy concerns, Gene’s teaching of using a reinforcement learning model to operate within a simulation environment, Lin’s teaching of predictive graphical user interfaces and Shin’s teaching of configuring an object tracker for an intelligent video analytics system with Baphna’s teaching of a machine learning system for processing and analyzing behavioral and factual data, and the progress of a user, to incorporate the use of actions conducted by the simulated learning platform, in order to render and display a second user interface as a hint that assists with use of the learning tool.
As per claim 3, Szabo modified with Gene, Lin, Shin and Baphna teaches generating, from the action model, a simulated second user interface that results from taking the action through the simulated first user interface; providing the simulated second user interface to the reinforcement-learning model; and selecting, by the reinforcement-learning model, an additional action to take through the simulated second user interface (see Gene paragraph [0040], showing the training providing an updated reinforcement learning model to obtain new state-action-reward data that may be used to continue updating the reinforcement learning model).

As per claim 4, Szabo modified with Gene, Lin, Shin and Baphna teaches wherein the user-interface interaction data is generated during training a different reinforcement-learning model to perform a task in the software (see Gene paragraph [0072], showing different hyperparameters and different machine learning techniques being used to train a reinforcement learning model using the simulation data, and also see Gene paragraph [0087], showing the use of different reinforcement learning models in the simulation environment).

As per claim 5, Szabo modified with Gene, Lin, Shin and Baphna teaches further comprising training the updated reinforcement-learning model to learn a task through interaction with an instance of the software (see Szabo paragraph [0030], “Thus, when a new application is added to the real network 104 or one of the GUI's applications 140a, 140b and 140c has changed significantly, a user 141 will interact with the client device 122 and use the new or changed application while the recording device 124 records the user interactions with the GUI 142 (step 204 in FIG. 2)”).
As per claim 8, this is the computer system claim corresponding to method claim 1, comprising: a processor; and memory configured to provide computer program instructions to the processor, the computer program instructions including a reinforcement-learning model simulator (see Gene paragraph [0034], “In an embodiment, the custom-designed reinforcement function is stored as computer-executable code 110 in a data object within an object-based data storage service 106. The object-based data storage service may be a service provided by a computing resource service provider. The object-based data storage service may be implemented on a computer system, or abstraction thereof (such as one or more virtual machines operating via a hypervisor), implemented using hardware and software, and may comprise one or more processors and memory that stores executable instructions whose execution by the one or more processors causes the computer system to perform operations described herein”). Therefore, it is rejected for the same reasons as above.

As per claim 16, this is the computer storage medium claim corresponding to method claim 1, with computer-useable instructions that, when used by a computing device, cause the computing device to perform operations (see Gene paragraph [0121], “In an embodiment, each server typically includes an operating system that provides executable program instructions for the general administration and operation of that server and includes a computer-readable storage medium (e.g., a hard disk, random access memory, read only memory, etc.) storing instructions that, if executed by a processor of the server, cause or otherwise allow the server to perform its intended functions (e.g., the functions are performed as a result of one or more processors of the server executing instructions stored on a computer-readable storage medium)”). Therefore, it is rejected for the same reasons as above.
As per claim 18, Szabo modified with Gene, Lin, Shin and Baphna teaches wherein the training is performed with a batch of actions and rewards produced by selecting actions in a simulated user interface (see Gene paragraph [0035], showing a simulation parameter may include a batch size for the simulation).

As per claim 20, Szabo modified with Gene, Lin, Shin and Baphna teaches wherein the user-interface interaction data is generated during training a different reinforcement-learning model to perform a task in the software (see Gene paragraph [0072], showing different hyperparameters and different machine learning techniques being used to train a reinforcement learning model using the simulation data, and also see Gene paragraph [0087], showing the use of different reinforcement learning models in the simulation environment).

Claim(s) 2, 12 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Szabo (US-PGPUB-NO: 2012/0047200 A1), Gene (US-PGPUB-NO: 2020/0167687 A1), Lin (US-PAT-NO: 10,467,029 B1), Shin (US-PAT-NO: 11,983,928 B1) and Baphna et al. (US-PGPUB-NO: 2018/0197428 A1), in further view of Gene et al. (US-PGPUB-NO: 2023/0419113 A1), hereinafter Gene II.

As per claim 2, Szabo modified with Gene, Lin, Shin and Baphna does not explicitly teach wherein the method further comprises inputting an image of the simulated first user interface to the reinforcement-learning model. However, Gene II teaches wherein the method further comprises inputting an image of the simulated first user interface to the reinforcement-learning model (see Gene II paragraph [0019], showing training data input for the model may be multi-modal and include visual representations of the environment).

Szabo, Gene, Lin, Shin, Baphna and Gene II are analogous art because they are in the same field of endeavor of software development.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Szabo’s teaching of a graphical user interface testing tool configured to construct a validation trace using a real network without causing privacy concerns, Gene’s teaching of using a reinforcement learning model to operate within a simulation environment, Lin’s teaching of predictive graphical user interfaces, Shin’s teaching of configuring an object tracker for an intelligent video analytics system and Baphna’s teaching of a machine learning system for processing and analyzing behavioral and factual data, and the progress of a user, with Gene II’s teaching of a neural network-based reinforcement learning model with one or more attention layers being trained, in order to incorporate visual representations of an environment as training data for a model.

As per claim 12, Szabo modified with Gene, Lin, Shin, Baphna and Gene II teaches wherein the reinforcement-learning model simulator is further configured to input an image of the simulated first user interface to the reinforcement-learning model (see Gene II paragraph [0019], showing training data input for the model may be multi-modal and include visual representations of the environment).

As per claim 15, Szabo modified with Gene, Lin, Shin, Baphna and Gene II teaches wherein the reinforcement-learning model is a proximal policy optimization model (see Gene II paragraph [0046], showing a clipped proximal policy optimization algorithm being used).

Claim(s) 6, 7, 9-11, 13, 17 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Szabo (US-PGPUB-NO: 2012/0047200 A1), Gene (US-PGPUB-NO: 2020/0167687 A1), Lin (US-PAT-NO: 10,467,029 B1), Shin (US-PAT-NO: 11,983,928 B1) and Baphna (US-PGPUB-NO: 2018/0197428 A1), in further view of Romdhana (“Deep Reinforcement Learning for Black-box Testing of Android Apps”, 2022), hereinafter Romdhana.
As per claim 6, Szabo modified with Gene, Lin, Shin and Baphna does not explicitly teach wherein the action model does not represent every possible action in the software because the user-interface interaction data does not include a recorded interaction with every possible action in the software.

However, Romdhana teaches wherein the action model does not represent every possible action in the software because the user-interface interaction data does not include a recorded interaction with every possible action in the software (see Romdhana [5.2 Representative Family of Models, Paragraph 2], showing the simplest scenario for the app model of Player, in which the app Player is one of four applications for which models were configured for a wide number of activities and reflect generalization that will not include every possible action within an application).

Szabo, Gene, Lin, Shin, Baphna and Romdhana are analogous art because they are in the same field of endeavor of software development. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Szabo’s teaching of a graphical user interface testing tool configured to construct a validation trace using a real network without causing privacy concerns, Gene’s teaching of using a reinforcement learning model to operate within a simulation environment, Lin’s teaching of predictive graphical user interfaces, Baphna’s teaching of a machine learning system for processing and analyzing behavioral and factual data, and the progress of a user, and Shin’s teaching of configuring an object tracker for an intelligent video analytics system with Romdhana’s teaching of using reinforcement learning for testing applications, in order to incorporate minimal data or a simple scenario for using an application which can be used to represent user interactions.
As per claim 7, Szabo modified with Gene, Lin, Shin, Baphna and Romdhana teaches wherein the reinforcement-learning model is a deep Q network model (see Romdhana [2.3 Deep Reinforcement Learning, Paragraph 1], showing Deep Q-Networks as a deep reinforcement learning algorithm).

As per claim 9, Szabo modified with Gene, Lin, Shin, Baphna and Romdhana teaches wherein the user-interface interaction data is generated by a random walk test of the software (see Romdhana [2.3 Deep Reinforcement Learning, Paragraph 3], showing taking random samples from memory in order to have more efficient learning).

As per claim 10, Szabo modified with Gene, Lin, Shin, Baphna and Romdhana teaches wherein the user-interface interaction data is generated through a replay test of the software when events in the replay test are predicted to perform a task with above a threshold confidence (see Romdhana [2.3 Deep Reinforcement Learning, Paragraph 2], showing experience replay storing an agent's experience).

As per claim 11, Szabo modified with Gene, Lin, Shin, Baphna and Romdhana teaches wherein the simulated first user interface is provided to the reinforcement-learning model in a form that will be used during operation of the reinforcement-learning model with an instance of the software (see Romdhana [1 Introduction, Paragraph 5], showing FATE is a simulation environment that supports fast assessment of Android testing algorithms by running synthetic Android apps).

As per claim 13, Szabo modified with Gene, Lin, Shin, Baphna and Romdhana teaches wherein the action model does not represent every possible action in the software because the user-interface interaction data does not include a recorded interaction with every possible action in the software (see Romdhana [5.2 Representative Family of Models, Paragraph 2], showing the simplest scenario for the app model of Player).
As per claim 17, Szabo modified with Gene, Lin, Shin, Baphna and Romdhana teaches further comprising training the updated reinforcement-learning model to learn a task through interaction with an instance of the software (see Romdhana [1 Introduction, Paragraphs 4 and 5], showing the simulation environment is used for hyperparameter tuning).

As per claim 19, Szabo modified with Gene, Lin, Shin, Baphna and Romdhana teaches wherein the reinforcement-learning model includes a convolutional layer (see Romdhana [2.3 Deep Reinforcement Learning, Paragraph 2], showing DQN using convolutional neural networks).

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Szabo (US-PGPUB-NO: 2012/0047200 A1), Gene (US-PGPUB-NO: 2020/0167687 A1), Lin (US-PAT-NO: 10,467,029 B1), Shin (US-PAT-NO: 11,983,928 B1) and Baphna et al. (US-PGPUB-NO: 2018/0197428 A1), in further view of Pang et al. (“A Simulator for Reinforcement Learning Training in the Recommendation Field”, 2020), hereinafter Pang.

As per claim 14, Szabo modified with Gene, Lin, Shin and Baphna does not explicitly teach wherein the reward is +5 if the action completes a task or -0.1 if the action does not complete the task. However, Pang teaches wherein the reward is +5 if the action completes a task or -0.1 if the action does not complete the task (see Pang [Reward Calculation Function], showing a reward calculation function used to calculate rewards based on factors).

Szabo, Gene, Lin, Shin, Baphna and Pang are analogous art because they are in the same field of endeavor of software development.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Szabo’s teaching of a graphical user interface testing tool configured to construct a validation trace using a real network without causing privacy concerns, Gene’s teaching of using a reinforcement learning model to operate within a simulation environment, Lin’s teaching of predictive graphical user interfaces, Shin’s teaching of configuring an object tracker for an intelligent video analytics system and Baphna’s teaching of a machine learning system for processing and analyzing behavioral and factual data, and the progress of a user, with Pang’s teaching of an attention-based rating model that supplies the simulator’s reward for the actions of the deep reinforcement learning model, in order to incorporate a simulator reward, computed by a formula, for the actions of an agent.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Fu et al. (US-PGPUB-NO: 2023/0394758 A1) teaches training a model for controlling an object by acquiring an interaction sequence generated by an interaction between a first virtual object and a second virtual object in a virtual environment. Lee et al. (US-PGPUB-NO: 2021/0187733 A1) teaches training a hierarchical reinforcement learning model for robotic control. Campos et al. (US-PGPUB-NO: 2018/0293498 A1) teaches a hierarchical decomposition reinforcement learning technique to train one or more AI objects as concept nodes composed in a hierarchical graph incorporated into an AI model. Eskonen et al. (“Automating GUI Testing with Image-Based Deep Reinforcement Learning”, 2020) teaches automating GUI testing with image-based deep reinforcement learning. Wellens et al. (US-PGPUB-NO: 2023/0297742 A1) teaches an artificial intelligence-based system implementing proxy models for physics-based simulators.
Kuffner et al. (US-PGPUB-NO: 2018/0056505 A1) teaches generating instructions for a robotic system to carry out a task. Harris et al. (US-PGPUB-NO: 2021/0264448 A1) teaches a privacy-preserving AI-derived simulated world.

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to LENIN PAULINO whose telephone number is (571)270-1734. The examiner can normally be reached Week 1: Mon-Thu 7:30am - 5:00pm; Week 2: Mon-Thu 7:30am - 5:00pm and Fri 7:30am - 4:00pm EST. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bradley Teets, can be reached on (571) 272-3338. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/LENIN PAULINO/
Examiner, Art Unit 2197

/BRADLEY A TEETS/
Supervisory Patent Examiner, Art Unit 2197

Prosecution Timeline

Jun 30, 2022
Application Filed
Feb 02, 2024
Non-Final Rejection — §103
Mar 27, 2024
Applicant Interview (Telephonic)
Apr 01, 2024
Examiner Interview Summary
May 08, 2024
Response Filed
Jun 12, 2024
Final Rejection — §103
Jun 24, 2024
Interview Requested
Sep 10, 2024
Response after Non-Final Action
Sep 18, 2024
Examiner Interview (Telephonic)
Sep 20, 2024
Response after Non-Final Action
Sep 30, 2024
Request for Continued Examination
Oct 09, 2024
Response after Non-Final Action
Oct 24, 2024
Non-Final Rejection — §103
Nov 01, 2024
Interview Requested
Nov 13, 2024
Applicant Interview (Telephonic)
Nov 15, 2024
Examiner Interview Summary
Jan 31, 2025
Response Filed
Apr 04, 2025
Final Rejection — §103
May 07, 2025
Applicant Interview (Telephonic)
May 14, 2025
Examiner Interview Summary
Jul 09, 2025
Request for Continued Examination
Jul 16, 2025
Response after Non-Final Action
Aug 11, 2025
Non-Final Rejection — §103
Aug 13, 2025
Interview Requested
Sep 16, 2025
Applicant Interview (Telephonic)
Sep 18, 2025
Examiner Interview Summary
Nov 13, 2025
Response Filed
Jan 15, 2026
Final Rejection — §103
Feb 04, 2026
Interview Requested
Feb 12, 2026
Applicant Interview (Telephonic)
Feb 24, 2026
Examiner Interview Summary

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12596635
BLACK-BOX FUZZING TESTING METHOD AND APPARATUS
2y 5m to grant Granted Apr 07, 2026
Patent 12541449
AUTOMATIC GENERATION OF ASSERT STATEMENTS FOR UNIT TEST CASES
2y 5m to grant Granted Feb 03, 2026
Patent 12524217
SYSTEMS AND METHODS FOR AUTOMATED RETROFITTING OF CUSTOMIZED CODE OBJECTS
2y 5m to grant Granted Jan 13, 2026
Patent 12517811
METHOD, SYSTEM AND DEVICE FOR GENERATING TEST CASE FOR AUTOMOTIVE CYBERSECURITY DETECTION
2y 5m to grant Granted Jan 06, 2026
Patent 12505029
SYSTEMS, METHODS, AND GRAPHICAL USER INTERFACES FOR GENERATING A COMPUTER-EXECUTABLE USABILITY STUDY APPLICATION
2y 5m to grant Granted Dec 23, 2025
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

7-8
Expected OA Rounds
57%
Grant Probability
82%
With Interview (+25.3%)
4y 2m
Median Time to Grant
High
PTA Risk
Based on 327 resolved cases by this examiner. Grant probability derived from career allow rate.
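The headline projection figures above are internally consistent, and the relationship can be sketched directly. The snippet below is a minimal illustration, assuming (as the note above suggests) that the grant probability is simply the career allow rate and that the with-interview figure adds the examiner's interview lift; the function names are illustrative, not the dashboard's actual implementation.

```python
def grant_probability(granted: int, resolved: int) -> int:
    """Career allow rate as a whole-number percentage."""
    return round(100 * granted / resolved)

def probability_with_interview(base_pct: float, lift_pct: float) -> int:
    """Base grant probability plus the interview lift, capped at 100%."""
    return min(round(base_pct + lift_pct), 100)

# This examiner's stated career data: 186 granted of 327 resolved cases.
base = grant_probability(186, 327)                 # rounds to 57
boosted = probability_with_interview(base, 25.3)   # 57 + 25.3 rounds to 82
print(base, boosted)
```

With the stated career data, the base rate rounds to the displayed 57%, and applying the +25.3% interview lift reproduces the 82% with-interview figure.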
