Last updated: April 19, 2026
Application No. 18/738,598
HUMAN-IN-LOOP ROBOT TRAINING AND TESTING SYSTEM WITH GENERATIVE ARTIFICIAL INTELLIGENCE (AI)

Non-Final OA §103
Filed: Jun 10, 2024
Examiner: KENIRY, HEATHER J
Art Unit: 3657
Tech Center: 3600 (Transportation & Electronic Commerce)
Assignee: Acumino
OA Round: 1 (Non-Final)
Interview: Optional (+22.1% interview lift). This examiner has a relatively high allow rate; a written response may suffice. Based on 102 resolved cases, 2023–2026.
Examiner Intelligence

KENIRY, HEATHER J
Career Allow Rate: 78% (80 granted / 102 resolved), above average, +26.4% vs TC avg
Interview Lift: +22.1% (strong; resolved cases with interview)
Avg Prosecution: 2y 7m (typical timeline); 32 currently pending
Total Applications: 134 across all art units (career history)
Statute-Specific Performance (based on career data from 102 resolved cases)

§101: 13.1% (-26.9% vs TC avg)
§103: 50.8% (+10.8% vs TC avg)
§102: 14.8% (-25.2% vs TC avg)
§112: 18.9% (-21.1% vs TC avg)
Office Action

§103
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION
This is the first Office action on the merits. Claims 1-20 are currently pending and addressed below.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 10/25/2024 has been received. The submission is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.
	
Claim Objections
Claims 14-20 are objected to because of the following informalities:
It appears that claims 14-20 have been erroneously claimed as dependent on the independent method claim 1 rather than the independent system claim 13. Clarification or correction on the record is earnestly solicited.
Claims 1 and 13 are objected to because of the following informalities:
“generative AI models” should be corrected to read “generative AI (artificial intelligence) models”
Appropriate correction is required.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.
Regarding claim 13, “computation device” will be interpreted under 112(f) because of the following three-prong analysis:
Prong 1: The claim uses the nonce term “device”.
Prong 2: The claim uses functional language to modify the nonce term.
Prong 3: Sufficient structure for performing the function is not recited within the claim.
This limitation is being interpreted according to the specification (paragraph 0046) as a processor.
Regarding claim 13, “data collection device” will be interpreted under 112(f) because of the following three-prong analysis:
Prong 1: The claim uses the nonce term “device”.
Prong 2: The claim uses functional language to modify the nonce term.
Prong 3: Sufficient structure for performing the function is not recited within the claim.
This limitation is being interpreted according to the specification (paragraphs 0014-0016) as cameras and sensors.
Regarding claim 13, “sensing devices” will be interpreted under 112(f) because of the following three-prong analysis:
Prong 1: The claim uses the nonce term “devices”.
Prong 2: The claim uses functional language to modify the nonce term.
Prong 3: Sufficient structure for performing the function is not recited within the claim.
This limitation is being interpreted according to the specification (paragraph 0012) as a camera.
Regarding claim 13, “mixed reality device” will be interpreted under 112(f) because of the following three-prong analysis:
Prong 1: The claim uses the nonce term “device”.
Prong 2: The claim uses functional language to modify the nonce term.
Prong 3: Sufficient structure for performing the function is not recited within the claim.
This limitation is being interpreted according to the specification (paragraph 0048) as “a virtual reality/augmented reality (VR/AR) device or other human-usable interface for providing a mixed reality view to a human, e.g., mixed reality glasses”.
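For illustration only, the three-prong test applied above can be read as a simple conjunction: a limitation is interpreted under 112(f) when prongs (A) and (B) are met and sufficient structure is not recited. The following Python sketch encodes that reading for the four claim 13 limitations identified by the examiner. The class name, field names, and function name are hypothetical, the boolean values merely restate the examiner's findings, and the sketch is not a substitute for the analysis in MPEP § 2181.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limitation:
    """A claim limitation with the findings for each prong (hypothetical structure)."""
    term: str
    uses_means_or_placeholder: bool      # prong (A): "means", "step", or a nonce term
    has_functional_language: bool        # prong (B): modified by functional language
    recites_sufficient_structure: bool   # prong (C) is met when this is False
    spec_structure: str = ""             # corresponding structure per the specification


def interpreted_under_112f(lim: Limitation) -> bool:
    """Return True when all three prongs of the test are met."""
    return (
        lim.uses_means_or_placeholder
        and lim.has_functional_language
        and not lim.recites_sufficient_structure
    )


# Findings as stated in this Office action for claim 13.
CLAIM_13_LIMITATIONS = [
    Limitation("computation device", True, True, False,
               "processor (paragraph 0046)"),
    Limitation("data collection device", True, True, False,
               "cameras and sensors (paragraphs 0014-0016)"),
    Limitation("sensing devices", True, True, False,
               "camera (paragraph 0012)"),
    Limitation("mixed reality device", True, True, False,
               "VR/AR device or other human-usable interface (paragraph 0048)"),
]

for lim in CLAIM_13_LIMITATIONS:
    if interpreted_under_112f(lim):
        print(f"{lim.term}: 112(f), construed as {lim.spec_structure}")
```

Under this reading, an amendment that recites sufficient structure for a limitation would flip the third finding and take that limitation out of 112(f), which corresponds to option (1) offered to applicant above.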

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claim(s) 1-11, 13-17, and 19-20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Song et al. (US 20220096197 A1), hereinafter Song in view of Rose et al. (US 20240253224 A1), hereinafter Rose and Bank et al. (US 20200030979 A1), hereinafter Bank.
Regarding claim 1, Song teaches:
1. A method for testing and/or improving a robot control system, the method comprising:
…
converting the high-level instructions into human-operated robot tasks; (Paragraph 0027, "During an example procedure (surgery), the patient 6 is prepped and draped in a sterile fashion to achieve anesthesia. Initial access to the surgical site may be performed manually while the arms of the robotic system 1 are in a stowed configuration or withdrawn configuration (to facilitate access to the surgical site.) Once access is completed, initial positioning or preparation of the robotic system 1 including its arms 4 may be performed. For example, the remote operator 9 at the user console 2 or the bedside operator 8 may use the handheld UIDs 14 to move the arm 4 from the stowed configuration to a preparation position above the patient 6 during the pre-operative setup. Alternatively, a surgeon or bedside personnel with a direct view of the table 5 may wear the AR headset disclosed herein to receive guidance on moving the arm 4. For example, the AR headset may render a virtual image of the actual stowed configuration of the arm 4, a virtual image of the desired preparation position, and a series of waypoints to guide the surgeon or bedside personnel on how to move the arm 4 from the current stowed configuration to the preparation position.")
providing the human-operated robot tasks to a mixed reality device worn by a human data collector, the mixed reality device rendering the human-operated robot tasks (Paragraph 0021, "Disclosed is an augmented reality (AR) headset that provides a wearer with spatial, system, and temporal contextual information of components of a surgical robotic system to guide the wearer in configuring and troubleshooting the surgical robotic system prior to, during, or after surgery. The spatial context information may be rendered to display spatially-fixed 3D-generated virtual models of the robotic arms, instruments, bed, and other components of the surgical robotic system that match the real-time actual position or orientation of the surgical robotic system in the AR headset's coordinate frame. A simultaneous localization and mapping (SLAM) algorithm may run on the AR headset to localize the position and orientation of the AR headset so the virtual models of the surgical robotic system are rendered to maintain the actual position and orientation of the surgical robotic system as the wearer moves about in the operating room. In one embodiment, virtual models representing the desired or target position and orientation of the robotic arm may be rendered to overlay the actual position and orientation of the robotic arm. The virtual models may be used to guide the wearer of the AR headset to move the robotic arm from the current to the target position and orientation.") in a manner that shows the human data collector how to perform the human-operated robot task; (Paragraph 0027, "During an example procedure (surgery), the patient 6 is prepped and draped in a sterile fashion to achieve anesthesia. Initial access to the surgical site may be performed manually while the arms of the robotic system 1 are in a stowed configuration or withdrawn configuration (to facilitate access to the surgical site.) 
Once access is completed, initial positioning or preparation of the robotic system 1 including its arms 4 may be performed. For example, the remote operator 9 at the user console 2 or the bedside operator 8 may use the handheld UIDs 14 to move the arm 4 from the stowed configuration to a preparation position above the patient 6 during the pre-operative setup. Alternatively, a surgeon or bedside personnel with a direct view of the table 5 may wear the AR headset disclosed herein to receive guidance on moving the arm 4. For example, the AR headset may render a virtual image of the actual stowed configuration of the arm 4, a virtual image of the desired preparation position, and a series of waypoints to guide the surgeon or bedside personnel on how to move the arm 4 from the current stowed configuration to the preparation position.")
…
Song does not specifically teach utilizing a library of prompt templates and a generative AI model to trigger a robot to perform tasks according to the library of prompts or using feedback from the mixed reality device to update the control of the robot. However, Rose, in the same field of endeavor of robotics, teaches:
… providing at a computation device a library of prompt templates that define one or more steps of one or more robot control tasks; (Paragraph 0032, "In some implementations, a robot system or control module may employ a finite Instruction Set comprising generalized reusable work primitives that can be combined (in various combinations and/or permutations) to execute a task. For example, a robot control system may store a library of reusable work primitives each corresponding to a respective basic sub-task or sub-action that the robot is operative to autonomously perform (hereafter referred to as an Instruction Set). A work objective may be analyzed to determine a sequence (i.e., a combination and/or permutation) of reusable work primitives that, when executed by the robot, will complete the work objective. The robot may execute the sequence of reusable work primitives to complete the work objective. In this way, a finite Instruction Set may be used to execute a wide range of different types of tasks and work objectives across a wide range of industries.")
providing a prompt to one or more generative AI models, the one or more generative AI models generating high-level instructions, the high-level instructions configured to trigger, when executed, a robot to perform the one or more robot control tasks comprising one or more steps defined by the library of prompt templates; (Paragraph 0049, "In some implementations of the present systems, methods, control modules, and computer program products, an LLM is used to assist in determining a sequence of reusable work primitives (hereafter “Instructions”), selected from a finite library of reusable work primitives (hereafter “Instruction Set”), that when executed by a robot will cause or enable the robot to complete a task. In some implementations, an LLM is used to assist in determining a “workflow”. For example, a robot control system may take a Natural Language (NL) command as input and return a Task Plan formed of a sequence of allowed Instructions drawn from an Instruction Set whose completion achieves the intent of the NL input. Throughout this specification and the appended claims, unless the specific context requires otherwise a Task Plan may comprise, or consist of, a workflow depending on the specific implementation. Take as an exemplary application the task of “kitting” a chess set comprising sixteen white chess pieces and sixteen black chess pieces. A person could say, or type, to the robot, e.g., “Put all the white pieces in the right hand bin and all the black pieces in the left hand bin” and an LLM could support a fully autonomous system that converts this input into a sequence of allowed Instructions that successfully performs the task. In this case, the LLM may help to allow the robot to perform general tasks specified in NL. General tasks include but are not limited to all work in the current economy.") …
However, Bank, in the same field of endeavor of robotics, teaches:
… receiving feedback data in response to the human data collector attempting to perform the human-operated robot tasks; (Paragraph 0032, "The MR simulation may be rerun to test the adjusted application, and with additional iterations as necessary, until the simulated operation of the robotic device is successful. The above simulation provides an example of programming the robotic device to learn various possible paths, such as a robotic arm with gripper, for which instructions are to be executed for motion control in conjunction with feedback from various sensor inputs.") and
updating the robot control system using the feedback data. (Paragraphs 0031-0032, "For the example simulation, the real object 214 may act as an obstacle for the virtual workpiece 222. For an initial programming of a spatial-related application, the MR device 115 may be used to observe the path of the virtual workpiece as it travels along a designed path 223 to a target 225. The user 101 may initially setup the application program using initial parameters to allow programming of the application. For example, the initial spatial parameters and constraints may be estimated with the knowledge that adjustments can be made in subsequent trials until a spatial tolerance threshold is met. Entry of the initial parameters may be input via the GUI device 105 or by an interface of the MR device 115. Adjustments to spatial parameters of the virtual robotic unit 231 and object 222 may be implemented using an interface tool displayed by the MR device 115. For example, spatial and orientation coordinates of the virtual gripper 224 may be set using a visual interface application running on the MR device 115. As the simulated operation is executed, one or more application modules 122 may receive inputs from the sensors, compute motion of virtual robotic unit 231 and/or gripper 224, monitor for obstacles based on additional sensor inputs, receive coordinates of obstacle object 214 from the vision system 212, and compute a new path if necessary. Should the virtual workpiece fail to follow the path 223 around the obstacle object 214, the user may interrupt the simulation, such as by using a hand gesture with the MR device 115, then modify the application using GUI 105 to make necessary adjustments.
The MR simulation may be rerun to test the adjusted application, and with additional iterations as necessary, until operation of the simulated robotic unit 231 is successful according to constraints, such as a spatial tolerance threshold. For example, the path 223 may be required to remain within a set of spatial boundaries to avoid collision with surrounding structures. As another example, the placement of object 222 may be constrained by a spatial range surrounding target location 225 based on coordination with subsequent tasks to be executed upon the object 222.")
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine the robotic control system and methods of operating as taught by Song with the ability to use a library of tasks to control the robot to achieve a high level command as taught by Rose and to utilize feedback data to update the control of the robot as taught by Bank. This would allow the system to improve operations and act more efficiently as well as effectively while performing target operations.
Regarding claim 2, where all the limitations of claim 1 are discussed above, Song further teaches:
2. The method of claim 1, wherein the rendering of the human-operated robot tasks at the mixed reality device (Paragraph 0021, "Disclosed is an augmented reality (AR) headset that provides a wearer with spatial, system, and temporal contextual information of components of a surgical robotic system to guide the wearer in configuring and troubleshooting the surgical robotic system prior to, during, or after surgery. The spatial context information may be rendered to display spatially-fixed 3D-generated virtual models of the robotic arms, instruments, bed, and other components of the surgical robotic system that match the real-time actual position or orientation of the surgical robotic system in the AR headset's coordinate frame. A simultaneous localization and mapping (SLAM) algorithm may run on the AR headset to localize the position and orientation of the AR headset so the virtual models of the surgical robotic system are rendered to maintain the actual position and orientation of the surgical robotic system as the wearer moves about in the operating room. In one embodiment, virtual models representing the desired or target position and orientation of the robotic arm may be rendered to overlay the actual position and orientation of the robotic arm. The virtual models may be used to guide the wearer of the AR headset to move the robotic arm from the current to the target position and orientation.") includes one or more of (1) input/output variables such as location of objects, (Paragraph 0035, "FIG. 2 shows the information exchange between an augmented reality headset and a surgical robotic system for the AR headset to display spatial, system, and temporal information of the components of the surgical robotic system based on establishing a common or global coordinate frame between the AR headset and the surgical robotic system using image sensors, in accordance with aspects of the subject technology. 
The AR headset may have one or more cameras that capture color and depth information of real scene objects. For example, the AR headset may have RGB and depth (RGBD) sensors to capture color images and depth-image information of the arms 4 and table 5 of the surgical robotic system 1 from the perspective of the wearer of the AR headset. The RGBD image captured by the AR headset is thus an image of the real-scene arms 4 and table 5 based on the coordinate frame of the AR headset. In one embodiment, the AR headset may run an object recognition algorithm to recognize the arms 4 and table 5. The surgical robotic system 1 may have a suite of RGBD sensors installed at various locations to capture color images and depth information of the configuration of the arms 4 and table 5. The RGBD images captured by the surgical robotic system 1 are thus the images of the arms 4 and the 5 based on the coordinate frame of the surgical robotic system 1. For the AR headset to render virtual recreation of the arms 4 and table 5 that matches the real-time real-scene positions and orientations of the arms 4 and table 5, or to render virtual images of the arms 4 and table 5 that may be fused with the real-time real-scene positions and orientations of the arms 4 and table 5, a common coordinate frame may be established between the AR headset and the surgical robotic system 1. In one embodiment, the surgical robotic system 1 may have other types of sensors such as infrared sensors to capture images and other information of the arms 4 and table 5 of the surgical robotic system 1 or the patient.") (2) visual/voice instructions for display on the mixed reality device, and (3) checkable events or metrics as a result of tasks that can be examined/scored by the mixed reality device or by external sensors/systems. (Paragraph 0042, "FIG. 4 shows the actual pose and the target pose of a robotic arm that may be used to generate waypoints as rendered on the augmented reality headset to guide medical staff in moving the robotic arm from the actual pose to the target pose, in accordance with aspects of the subject technology. The arm in its current pose 309 may be rendered as a virtual image or projected as a real-scene object captured by the RGBD sensor. The arm is also rendered as a virtual arm 311 in its target pose. The arm is to be moved along a trajectory 310 to the target pose of the virtual arm 311. The AR headset may render the virtual arm 311 at the same anchor/mount point as the arm in its current pose 309, thus giving the real and virtual robotic arms a common anchored reference point. In one embodiment, the AR headset may render the trajectory 310 as a series of waypoints or a series of virtual images of the arm. The bedside personnel wearing the AR headset may maneuver the arm from the current pose 309 along the waypoints or to align with the virtual images of the trajectory 310 until the arm finally aligns with the virtual arm 311 in its target pose. When the arm is at or within a tolerance of the target pose, the AR headset may respond with an indication to the user such as by highlighting the arm. The AR headset may also render visual cues or generate audio cues to help the user to maneuver the arm along the trajectory 310.")
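As a rough illustration of the "checkable event" relied upon in Song paragraph 0042, the following Python sketch generates a series of waypoints between a current and a target position and tests whether the arm is within a tolerance of the target. It is a minimal sketch under stated assumptions: poses are reduced to 3D positions (orientation is ignored), the waypoints are linearly interpolated, and the function names and tolerance value are hypothetical. Song does not disclose how its waypoints or tolerance check are computed.

```python
import math

# Assumption: a pose is reduced to an (x, y, z) position; orientation is ignored.
Point = tuple[float, float, float]


def waypoints(current: Point, target: Point, n: int) -> list[Point]:
    """Return n evenly spaced points from current toward target, ending at target.

    Linear interpolation is an assumption made for this sketch only.
    """
    return [
        tuple(c + (t - c) * k / n for c, t in zip(current, target))
        for k in range(1, n + 1)
    ]


def within_tolerance(pose: Point, target: Point, tol: float) -> bool:
    """Checkable event: True when the pose is at or within tol of the target."""
    return math.dist(pose, target) <= tol


current_pose = (0.0, 0.0, 0.0)
target_pose = (1.0, 2.0, 4.0)

for wp in waypoints(current_pose, target_pose, 4):
    reached = within_tolerance(wp, target_pose, tol=0.05)
    print(wp, "target reached" if reached else "keep moving")
```

In this sketch only the final waypoint satisfies the tolerance check, which corresponds to the point at which the headset in Song would indicate alignment, for example by highlighting the arm.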
Regarding claim 3, where all the limitations of claim 1 are discussed above, Song does not specifically teach generating a prompt from a combination of a template from the library and a user input. However, Rose, in the same field of endeavor of robotic control, teaches:
3. The method of claim 1, further comprising generating the prompt from a combination of a prompt template from the library of prompt templates (Paragraph 0037, "In accordance with the present robots, systems, control modules, computer program products, and methods, a catalog of reusable work primitives may be defined, identified, developed, or constructed such that any given work objective across multiple different work objectives may be completed by executing a corresponding workflow comprising a particular combination and/or permutation of reusable work primitives selected from the catalog of reusable work primitives. Once such a catalog of reusable work primitives has been established, one or more robot(s) may be trained to autonomously or automatically perform each individual reusable work primitive in the catalog of reusable work primitives without necessarily including the context of: i) a particular workflow of which the particular reusable work primitive being trained is a part, and/or ii) any other reusable work primitive that may, in a particular workflow, precede or succeed the particular reusable work primitive being trained. In this way, a semi-autonomous robot may be operative to autonomously or automatically perform each individual reusable work primitive in a catalog of reusable work primitives and only require instruction, direction, or guidance from another party (e.g., from an operator, user, or pilot) when it comes to deciding which reusable work primitive(s) to perform and/or in what order. In other words, an operator, user, pilot, or LLM module may provide a workflow consisting of reusable work primitives to a semi-autonomous robot system and the semi-autonomous robot system may autonomously or automatically execute the reusable work primitives according to the workflow to complete a work objective. 
For example, a semi-autonomous humanoid robot may be operative to autonomously look left when directed to look left, autonomously open its right end effector when directed to open its right end effector, and so on, without relying upon detailed low-level control of such functions by a third party. Such a semi-autonomous humanoid robot may autonomously complete a work objective once given instructions regarding a workflow detailing which reusable work primitives it must perform, and in what order, in order to complete the work objective. Furthermore, in accordance with the present robots, systems, methods, control modules and computer program products, a robot system may operate fully autonomously if it is trained or otherwise configured to (e.g. via consultation with an LLM module, which can be included in the robot system) analyze a work objective and independently define a corresponding workflow itself by deconstructing the work objective into a set of reusable work primitives from a library of reusable work primitives that the robot system is operative to autonomously perform.") and a user prompt. (Paragraph 00146, "Various implementations of the present systems, methods, control modules, and computer program products involve using NL expressions (descriptions) (e.g., via a NL prompt, which may be entered directly in text by a user or may be spoken vocally by a user and converted to text by an intervening voice-to-text system) to control functions and operations of a robot, where an LLM module may provide an interface between the NL expressions and the robot control system. This framework can be particularly advantageous when certain elements of the robot control architecture employ programming and/or instructions that can be expressed in NL. A suitable, but non-limiting, example of this is the aforementioned Instruction Set. 
For example, as mentioned earlier, a task plan output of an LLM module can be parsed (e.g., autonomously by the robot control system) by looking for a word match to Instruction Set commands, and the arguments of the Instruction Set can be found by string matching within the input NL prompt (e.g. by a text-string matching module as discussed earlier). In some implementations, a 1-1 map may be generated between the arguments used in the robot control system and NL variants, in order to increase the chance of the LLM module processing the text properly. For example, even though an object is represented in the robot control system (e.g., in a world model environment portion of the robot control system) as chess_pawn_54677, it may be referred to in the NL prompt as “chess pawn 1”. In this case, if the returned task plan contains the phrase “grasp chess pawn 1”, this may be matched to Instruction Set “grasp” and the object “chess pawn 1” so the phrase may be mapped to grasp(chess_pawn_54677). Such parsing and/or word matching (e.g. the text-string matching module) can be employed in any of the situations discussed herein where robot language is converted to natural language or vice-versa.")
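The parsing described in the cited Rose passage (word matching against the Instruction Set, plus a 1-1 map between natural-language names and internal object identifiers) can be sketched roughly as follows in Python. This is an illustrative sketch only, not Rose's implementation: the function name, the instructions other than "grasp", and the second map entry are hypothetical, and only the "grasp chess pawn 1" to grasp(chess_pawn_54677) example comes from the quoted paragraph.

```python
from typing import Optional

# Hypothetical Instruction Set; only "grasp" appears in the quoted passage.
INSTRUCTION_SET = {"grasp", "release", "look"}

# 1-1 map from NL names to internal identifiers.
# The first entry is from the quoted passage; the second is hypothetical.
NL_TO_ID = {
    "chess pawn 1": "chess_pawn_54677",
    "chess pawn 12": "chess_pawn_00012",
}


def parse_step(step: str) -> Optional[str]:
    """Map one NL task-plan step to an Instruction Set call, or None if no match."""
    text = step.lower().strip()
    # Word match against the Instruction Set commands.
    command = next((w for w in text.split() if w in INSTRUCTION_SET), None)
    if command is None:
        return None
    # String match for the argument; try longer names first so that
    # "chess pawn 12" is not mistaken for "chess pawn 1".
    for name in sorted(NL_TO_ID, key=len, reverse=True):
        if name in text:
            return f"{command}({NL_TO_ID[name]})"
    return None


print(parse_step("grasp chess pawn 1"))  # grasp(chess_pawn_54677)
```

Steps that contain no Instruction Set command, or no known object name, return None in this sketch; how Rose handles such cases is not stated in the quoted passage.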
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine the robotic control system and methods as taught by Song with the ability to control the system and generate instructions based on the library of operations as well as a user input as taught by Rose. This would allow a user to control the system using natural language inputs, so that less training is needed before the user can operate the system effectively.
Regarding claim 4, where all the limitations of claim 1 are discussed above, Song does not specifically teach using high level instructions which correspond to pre-defined low level libraries. However, Rose, in the same field of endeavor of robotics, teaches:
4. The method of claim 1, wherein the high-level instructions call predefined low-level libraries. (Paragraph 0037, "In accordance with the present robots, systems, control modules, computer program products, and methods, a catalog of reusable work primitives may be defined, identified, developed, or constructed such that any given work objective across multiple different work objectives may be completed by executing a corresponding workflow comprising a particular combination and/or permutation of reusable work primitives selected from the catalog of reusable work primitives. Once such a catalog of reusable work primitives has been established, one or more robot(s) may be trained to autonomously or automatically perform each individual reusable work primitive in the catalog of reusable work primitives without necessarily including the context of: i) a particular workflow of which the particular reusable work primitive being trained is a part, and/or ii) any other reusable work primitive that may, in a particular workflow, precede or succeed the particular reusable work primitive being trained. In this way, a semi-autonomous robot may be operative to autonomously or automatically perform each individual reusable work primitive in a catalog of reusable work primitives and only require instruction, direction, or guidance from another party (e.g., from an operator, user, or pilot) when it comes to deciding which reusable work primitive(s) to perform and/or in what order. In other words, an operator, user, pilot, or LLM module may provide a workflow consisting of reusable work primitives to a semi-autonomous robot system and the semi-autonomous robot system may autonomously or automatically execute the reusable work primitives according to the workflow to complete a work objective. 
For example, a semi-autonomous humanoid robot may be operative to autonomously look left when directed to look left, autonomously open its right end effector when directed to open its right end effector, and so on, without relying upon detailed low-level control of such functions by a third party. Such a semi-autonomous humanoid robot may autonomously complete a work objective once given instructions regarding a workflow detailing which reusable work primitives it must perform, and in what order, in order to complete the work objective. Furthermore, in accordance with the present robots, systems, methods, control modules and computer program products, a robot system may operate fully autonomously if it is trained or otherwise configured to (e.g. via consultation with an LLM module, which can be included in the robot system) analyze a work objective and independently define a corresponding workflow itself by deconstructing the work objective into a set of reusable work primitives from a library of reusable work primitives that the robot system is operative to autonomously perform.")
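For illustration only, the relationship between a high-level workflow and a library of reusable work primitives, as described in the quoted Rose passage, can be sketched as follows. The primitive names echo the "look left" and "open its right end effector" examples in the quote; the dictionary, the function name, and the returned status strings are assumptions made for this example.

```python
from typing import Callable, Dict, List

# Hypothetical library of reusable work primitives. Each entry stands in for a
# low-level routine the robot can already perform on its own.
PRIMITIVES: Dict[str, Callable[[], str]] = {
    "look_left": lambda: "head turned left",
    "open_right_end_effector": lambda: "right end effector opened",
}


def run_workflow(workflow: List[str]) -> List[str]:
    """Run the named primitives in order and return what each one reported."""
    unknown = [name for name in workflow if name not in PRIMITIVES]
    if unknown:
        # Reject the whole workflow before executing any primitive.
        raise KeyError(f"unknown primitives: {unknown}")
    return [PRIMITIVES[name]() for name in workflow]
```

In this sketch the workflow supplies only which primitives to run and in what order, which is the division of labor the quoted passage attributes to the operator, user, pilot, or LLM module.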
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine the robotic control system and methods of operating as taught by Song with the low-level operation libraries which may be used to accomplish high-level tasks as taught by Rose. This allows the system to re-use many low-level operations and efficiently program new tasks/actions in order for the system to complete high-level operations.
Regarding claim 5, where all the limitations of claim 4 are discussed above, Song does not specifically teach the libraries including computer vision, motion planning, or motion execution. However, Rose, in the same field of endeavor of robotics, teaches:
5. The method of claim 4, wherein the predefined low-level libraries include one or more of computer vision libraries, motion planning libraries, (Paragraph 0152, "The various implementations described herein include systems, methods, control modules, and computer program products for leveraging one or more LLM(s) in a robot control system, including for example establishing an NL interface between the LLM(s) and the robot control system and calling the LLM(s) to help autonomously instruct the robot what to do. Example applications of this approach include task planning, motion planning, reasoning about the robot's environment (e.g., “what could I do now?”), and so on. Such implementations are particularly well-suited in robot control systems for which at least some control parameters and/or instructions (e.g., the Instruction Set described previously) are amenable to being specified in NL. Thus, some implementations may include converting or translating robot control instructions and/or parameters into NL for communicating such with the LLM(s) via the NL interface.") and motion execution libraries. (Paragraph 0078, "FIG. 2 is a flowchart diagram which illustrates an exemplary method 200 of operation of a robot system. Method 200 in FIG. 2 is similar in at least some respects to the method 100 of FIG. 1. In general, method 200 in FIG. 2 describes detailed implementations by which method 100 in FIG. 1 can be achieved. Method 200 is a method of operation of a robot system (such as robot system 700 discussed with reference to FIG. 7). In general, throughout this specification and the appended claims, a method of operation of a robot system is a method in which at least some, if not all, of the various acts are performed by the robot system. 
For example, certain acts of a method of operation of a robot system may be performed by at least one processor or processing unit (hereafter “processor”) of the robot system communicatively coupled to a non-transitory processor-readable storage medium of the robot system (collectively a robot controller of the robot system) and, in some implementations, certain acts of a method of operation of a robot system may be performed by peripheral components of the robot system that are communicatively coupled to the at least one processor, such as one or more physically actuatable components (e.g., arms, legs, end effectors, grippers, hands), one or more sensors (e.g., optical sensors, audio sensors, tactile sensors, haptic sensors), mobility systems (e.g., wheels, legs), communications and networking hardware (e.g., receivers, transmitters, transceivers), and so on. The non-transitory processor-readable storage medium of the robot system may store data (including, e.g., at least one library of reusable work primitives and at least one library of associated percepts) and/or processor-executable instructions that, when executed by the at least one processor, cause the robot system to perform the method and/or cause the at least one processor to perform those acts of the method that are performed by the at least one processor. The robot system may communicate, via communications and networking hardware communicatively coupled to the robot system's at least one processor, with remote systems and/or remote non-transitory processor-readable storage media. 
Thus, unless the specific context requires otherwise, references to a robot system's non-transitory processor-readable storage medium, as well as data and/or processor-executable instructions stored in a non-transitory processor-readable storage medium, are not intended to be limiting as to the physical location of the non-transitory processor-readable storage medium in relation to the at least one processor of the robot system and the rest of the robot hardware. In other words, a robot system's non-transitory processor-readable storage medium may include non-transitory processor-readable storage media located on-board a robot body of the robot system and/or non-transitory processor-readable storage media located remotely from the robot body, unless the specific context requires otherwise. Further, a method of operation of a robot system such as method 200 (or any of the other methods discussed herein) can be implemented as a robot control module or computer program product. Such a control module or computer program product comprises processor-executable instructions or data that, when the control module or computer program product is stored on a non-transitory processor-readable storage medium of the robot system, and the control module or computer program product is executed by at least one processor of the robot system, the control module or computer program product (or the processor-executable instructions or data thereof) cause the robot system to perform acts of the method.")
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine the robotic control system and methods of operating as taught by Song with the low-level operation libraries defining motion planning and motion execution which may be used to accomplish high-level tasks as taught by Rose. This allows the system to re-use many low-level operations and efficiently program new tasks/actions in order for the system to complete high-level operations.
Regarding claim 6, where all the limitations of claim 1 are discussed above, Song does not specifically teach replacing the task based on feedback data. However, Bank, in the same field of endeavor of robotics, teaches:
6. The method of claim 1, wherein the feedback data comprises an overwrite of one or more of the human-operated robot tasks by the human data collector. (Paragraphs 0030-0031, "For the example simulation, the real object 214 may act as an obstacle for the virtual workpiece 222. For an initial programming of a spatial-related application, the MR device 115 may be used to observe the path of the virtual workpiece as it travels along a designed path 223 to a target 225. The user 101 may initially setup the application program using initial parameters to allow programming of the application. For example, the initial spatial parameters and constraints may be estimated with the knowledge that adjustments can be made in subsequent trials until a spatial tolerance threshold is met. Entry of the initial parameters may be input via the GUI device 105 or by an interface of the MR device 115. Adjustments to spatial parameters of the virtual robotic unit 231 and object 222 may be implemented using an interface tool displayed by the MR device 115. For example, spatial and orientation coordinates of the virtual gripper 224 may be set using a visual interface application running on the MR device 115. As the simulated operation is executed, one or more application modules 122 may receive inputs from the sensors, compute motion of virtual robotic unit 231 and/or gripper 224, monitor for obstacles based on additional sensor inputs, receive coordinates of obstacle object 214 from the vision system 212, and compute a new path if necessary. Should the virtual workpiece fail to follow the path 223 around the obstacle object 214, the user may interrupt the simulation, such as by using a hand gesture with the MR device 115, then modify the application using GUI 105 to make necessary adjustments.
The MR simulation may be rerun to test the adjusted application, and with additional iterations as necessary, until operation of the simulated robotic unit 231 is successful according to constraints, such as a spatial tolerance threshold. For example, the path 223 may be required to remain within a set of spatial boundaries to avoid collision with surrounding structures. As another example, the placement of object 222 may be constrained by a spatial range surrounding target location 225 based on coordination with subsequent tasks to be executed upon the object 222." Examiner Note: The process is modified upon failure of the simulation. This anticipates the concept of replacing a portion of the plan based on feedback results.)
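For illustration only, the rerun-and-adjust loop described in the quoted Bank passage can be sketched as follows: a simulation is repeated, with adjustments between runs, until the result falls within a spatial tolerance threshold. The function name, its parameters, and the iteration cap are assumptions made for this example; Bank does not specify how the adjustment is computed.

```python
from typing import Callable, Tuple


def tune_until_within_tolerance(
    simulate: Callable[[float], float],
    adjust: Callable[[float, float], float],
    params: float,
    tolerance: float,
    max_iters: int = 20,
) -> Tuple[float, int]:
    """Rerun a simulation, adjusting parameters after each failed run.

    `simulate` returns the spatial error for the current parameters and
    `adjust` returns updated parameters; both are hypothetical callables
    supplied by the caller.
    """
    for iteration in range(1, max_iters + 1):
        error = simulate(params)
        if abs(error) <= tolerance:
            # Success: the run satisfies the spatial tolerance threshold.
            return params, iteration
        params = adjust(params, error)
    raise RuntimeError("tolerance not met within max_iters runs")
```

The success check in this sketch is a pass or fail test against the threshold, matching the quoted description of rerunning until the operation "is successful according to constraints."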
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine the robotic control system and methods as taught by Song with the ability to replace a task based on feedback data as taught by Bank. This would allow the system to fine-tune the operation of the robot for more effective control.
Regarding claim 7, where all the limitations of claim 1 are discussed above, Song does not specifically teach feedback data comprising a score. However, Bank, in the same field of endeavor of robotics, teaches:
7. The method of claim 1, wherein the feedback data comprises a score for one or more of the human-operated robot tasks provided by the human data collector. (Paragraphs 0030-0031, "For the example simulation, the real object 214 may act as an obstacle for the virtual workpiece 222. For an initial programming of a spatial-related application, the MR device 115 may be used to observe the path of the virtual workpiece as it travels along a designed path 223 to a target 225. The user 101 may initially setup the application program using initial parameters to allow programming of the application. For example, the initial spatial parameters and constraints may be estimated with the knowledge that adjustments can be made in subsequent trials until a spatial tolerance threshold is met. Entry of the initial parameters may be input via the GUI device 105 or by an interface of the MR device 115. Adjustments to spatial parameters of the virtual robotic unit 231 and object 222 may be implemented using an interface tool displayed by the MR device 115. For example, spatial and orientation coordinates of the virtual gripper 224 may be set using a visual interface application running on the MR device 115. As the simulated operation is executed, one or more application modules 122 may receive inputs from the sensors, compute motion of virtual robotic unit 231 and/or gripper 224, monitor for obstacles based on additional sensor inputs, receive coordinates of obstacle object 214 from the vision system 212, and compute a new path if necessary. Should the virtual workpiece fail to follow the path 223 around the obstacle object 214, the user may interrupt the simulation, such as by using a hand gesture with the MR device 115, then modify the application using GUI 105 to make necessary adjustments.
The MR simulation may be rerun to test the adjusted application, and with additional iterations as necessary, until operation of the simulated robotic unit 231 is successful according to constraints, such as a spatial tolerance threshold. For example, the path 223 may be required to remain within a set of spatial boundaries to avoid collision with surrounding structures. As another example, the placement of object 222 may be constrained by a spatial range surrounding target location 225 based on coordination with subsequent tasks to be executed upon the object 222." Examiner Note: The failure or success of an operation may be understood to be a binary score result provided as feedback.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine the robotic control methods and system as taught by Song with the ability to monitor the operation and assign a “score” to the operation as taught by Bank. This would allow the system to fine-tune the operation of the robot for more effective control.
Regarding claim 8, where all the limitations of claim 1 are discussed above, Song further teaches:
8. The method of claim 1, wherein the human data collector performs the human-operated robot tasks using a human-machine interface. (Paragraph 0005, "During surgery, control of the robotic arms may require control inputs from a user (e.g., surgeon or other operator) via one or more user interface devices that translate manipulations or commands from the user into control of the robotic arms. For example, in response to user commands, a tool driver having one or more motors may actuate one or more degrees of freedom of a surgical tool when the surgical tool is positioned at the surgical site in the patient.")
Regarding claim 9, where all the limitations of claim 8 are discussed above, Song further teaches:
9. The method of claim 8, wherein the human-machine interface comprises at least one input interface that receives user input (Paragraph 0025, "Generally, a remote operator 9, such as a surgeon or another person, may use the user console 2 to remotely manipulate the arms 4 and/or the attached surgical tools 7, e.g., teleoperation. The user console 2 may be located in the same operating room as the rest of the system 1, as shown in FIG. 1. In other environments however, the user console 2 may be located in an adjacent or nearby room, or it may be at a remote location, e.g., in a different building, city, or country. The user console 2 may comprise a seat 10, foot-operated controls 13, one or more handheld user input devices, UID 14, and at least one user display 15 that is configured to display, for example, a view of the surgical site inside the patient 6. In the example user console 2, the remote operator 9 is sitting in the seat 10 and viewing the user display 15 while manipulating a foot-operated control 13 and a handheld UID 14 in order to remotely control the arms 4 and the surgical tools 7 (that are mounted on the distal ends of the arms 4).") and provides corresponding control commands to a microcontroller unit to control the operation of one or more robotic components. (Paragraph 0029, " In one embodiment, the remote operator 9 holds and moves the UID 14 to provide an input command to move a robot arm actuator 17 in the robotic system 1. The UID 14 may be communicatively coupled to the rest of the robotic system 1, e.g., via a console computer system 16. The UID 14 can generate spatial state signals corresponding to movement of the UID 14, e.g. position and orientation of the handheld housing of the UID, and the spatial state signals may be input signals to control a motion of the robot arm actuator 17. The robotic system 1 may use control signals derived from the spatial state signals, to control proportional motion of the actuator 17. 
In one embodiment, a console processor of the console computer system 16 receives the spatial state signals and generates the corresponding control signals. Based on these control signals, which control how the actuator 17 is energized to move a segment or link of the arm 4, 
Prosecution Timeline

Jun 10, 2024
Application Filed
Nov 17, 2025
Non-Final Rejection — §103 (current)
Precedent Cases

Applications granted by this same examiner with similar technology

18/215,235
Patent 12600035
INFORMATION PROCESSING METHOD, INFORMATION PROCESSING APPARATUS, ROBOT SYSTEM, MANUFACTURING METHOD OF PRODUCT, AND STORAGE MEDIUM
2y 5m to grant Granted Apr 14, 2026
18/407,440
Patent 12583123
ITERATIVE CONTROL OF ROBOT FOR TARGET OBJECT
2y 5m to grant Granted Mar 24, 2026
18/254,268
Patent 12576539
INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND PROGRAM
2y 5m to grant Granted Mar 17, 2026
17/802,937
Patent 12562076
LEARNING ASSISTANCE SYSTEM, LEARNING ASSISTANCE METHOD, AND LEARNING ASSISTANCE STORAGE MEDIUM
2y 5m to grant Granted Feb 24, 2026
18/231,888
Patent 12558780
MULTI-PURPOSE ROBOTS AND COMPUTER PROGRAM PRODUCTS, AND METHODS FOR OPERATING THE SAME
2y 5m to grant Granted Feb 24, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.
Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 78%
With Interview (+22.1%): 99%
Median Time to Grant: 2y 7m
PTA Risk: Low
Based on 102 resolved cases by this examiner. Grant probability derived from career allow rate.