Last updated: April 19, 2026
Application No. 18/954,334
MANIPULATION TASK SOLVER

Non-Final OA §102 §103
Filed: Nov 20, 2024
Examiner: KENIRY, HEATHER J
Art Unit: 3657
Tech Center: 3600 (Transportation & Electronic Commerce)
Assignee: Honda Motor Co. Ltd.
OA Round: 1 (Non-Final)
Interview Optional

+22.1% interview lift. This examiner has a relatively high allow rate; a written response may suffice.
Based on 102 resolved cases, 2023–2026
Examiner Intelligence

KENIRY, HEATHER J
Career Allow Rate: 78% (above average); 80 granted / 102 resolved; +26.4% vs TC avg
Interview Lift: +22.1% (resolved cases with interview)
Avg Prosecution: 2y 7m; 32 currently pending
Total Applications: 134 across all art units
Statute-Specific Performance

§101: 13.1% (-26.9% vs TC avg)
§103: 50.8% (+10.8% vs TC avg)
§102: 14.8% (-25.2% vs TC avg)
§112: 18.9% (-21.1% vs TC avg)
Based on career data from 102 resolved cases
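As a rough consistency check on the examiner figures above, the sketch below recomputes the career allow rate from the granted and resolved counts and backs out the Tech Center average implied by each statute-specific delta. It assumes, without confirmation from the source, that each "vs TC avg" figure is a percentage-point difference (examiner rate minus Tech Center average); the function and variable names are illustrative only.

```python
# Figures copied from the examiner statistics above.
GRANTED = 80
RESOLVED = 102

# statute -> (examiner rate in %, delta vs TC avg)
STATUTE_STATS = {
    "101": (13.1, -26.9),
    "103": (50.8, +10.8),
    "102": (14.8, -25.2),
    "112": (18.9, -21.1),
}


def allow_rate_pct(granted: int, resolved: int) -> float:
    """Allow rate as a percentage, rounded to one decimal place."""
    return round(100.0 * granted / resolved, 1)


def implied_tc_avg(rate: float, delta: float) -> float:
    # ASSUMPTION: delta = examiner rate - TC average, in percentage points.
    return round(rate - delta, 1)


career_rate = allow_rate_pct(GRANTED, RESOLVED)
implied = {s: implied_tc_avg(r, d) for s, (r, d) in STATUTE_STATS.items()}

print(career_rate)
print(implied)
```

Under that assumption, the recomputed allow rate rounds to the displayed 78%, and all four statute deltas imply the same Tech Center baseline.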
Office Action

§102 §103
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION
This is the first Office action on the merits. Claims 1-20 are currently pending and addressed below.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 12/11/2024 has been received. The submission is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.
The information disclosure statement (IDS) submitted on 12/11/2024 has been received. The submission is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.
The information disclosure statement (IDS) submitted on 02/11/2025 has been received. The submission is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Claim Objections
Claim 20 is objected to because of the following informalities:
Claim 20 appears to be a replica of claim 6. It is unclear if this is meant to depend from claim 1 or if this was a typographical error and claim 20 is intended to depend from claim 11 or claim 16.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-2, 4-5, 11, 13-14, 16, and 18-19 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Oleynik et al. (US 20160059412 A1), hereinafter Oleynik.
Regarding claim 1, Oleynik teaches:
1. A manipulation task solver system, comprising:
a robot appendage;
a sensor sensing an object associated with a task including two or more sub-tasks, a state of an environment, a state of the robot appendage, and an action associated with the robot appendage; (Paragraph 0440, "The robotic hand 72 includes a camera sensor 684, such as an RGB-D sensor, an imaging sensor or a visual sensing device, placed in or near the middle of the palm for detecting the distance and shape of an object, as well as the distance of the object, and for handling a kitchen tool. The imaging sensor 682f provides guidance to the robotic hand 72 in moving the robotic hand 72 towards the direction of the object and to make necessary adjustments to grab an object. In addition, a sonar sensor, such as a tactile pressure sensor, may be placed near the palm of the robotic hand 72, for detecting the distance and shape of the object. The sonar sensor 682f can also guide the robotic hand 72 to move toward the object. Each of the sonar sensors 682a, 682b, 682c, 682d, 682e, 682f, 682g includes ultrasonic sensors, laser, radio frequency identification (RFID), and other suitable sensors. In addition, each of the sonar sensors 682a, 682b, 682c, 682d, 682e, 682f, 682g serves as a feedback mechanism to determine whether the robotic hand 72 continues to exert additional pressure to grab the object at such point where there is sufficient pressure to grab and lift the object. In addition, the sonar sensor 682f in the palm of the robotic hand 72 provides tactile sensing function to handle a kitchen tool. For example, when the robotic hand 72 grabs a knife to cut beef, the amount of pressure that the robotic hand 72 exerts on the knife and applies to the beef, allows the tactile sensor to detect when the knife finishes slicing the beef, i.e., when the knife has no resistance. The distributed pressure is not only to secure the object, but also so as not to exert too much pressure so as to, for example, not to break an egg). 
Furthermore, each finger on the robotic hand 72 has a sensor on the finger tip, as shown by the first sensor 682a on the finger tip of the thumb, the second sensor 682b on the finger tip of the index finger, the third sensor 682c on the finger tip of the middle finger, the fourth sensor 682d on the finger tip of the ring finger, and the fifth sensor 682f on the finger tip of the pinky. Each of the sensors 682a, 682b, 682c, 682d, 682e provide sensing capability on the distance and shape of the object, sensing capability for temperature or moisture, as well as tactile feedback capability.")
a memory storing one or more instructions; and
a processor executing one or more of the instructions stored on the memory to perform: (Paragraph 0757, "The computer devices 16 may represent any or the entire server, or any network intermediary devices. Further, while only a single machine is illustrated, the term “machine” shall also be taken to include any collection of machines that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methodologies discussed herein. The example computer system 3624 includes a processor 3626 (e.g., a central processing unit (CPU), a graphics processing unit (GPU), or both), a main memory 3628 and a static memory 3630, which communicate with each other via a bus 3632. The computer system 3624 may further include a video display unit 3634 (e.g., a liquid crystal display (LCD)). The computer system 3624 also includes an alphanumeric input device 3636 (e.g., a keyboard), a cursor control device 3638 (e.g., a mouse), a disk drive unit 3640, a signal generation device 3642 (e.g., a speaker), and a network interface device 3648.")
implementing the task based on a high-level policy including two or more low-level policies; (Paragraph 0024, "Implementation of the above movements (described by articulating joint positions and velocities) and environment interactions (described by joint/interface torques and forces) is achieved by having computer playback desirable values for all required variables (positions/velocities and forces/torques) and feeding these to a controller system that faithfully implements them on each joint as a function of time at each time step. These variables and their sequence and feedback loops (hence not just data files, but also control programs), to ascertain the fidelity of the commanded movement/interactions, are all described in data-files that are combined into multi-level MMLs, which can be accessed and combined in multiple ways to allow a humanoid robot to execute multiple actions, such as cooking a meal, playing a piece of classical music on a piano, lifting an infirm person into/out-of a bed, etc. There are MMLs that describe simple rudimentary movement/interactions, which are then used as building-blocks for ever higher-level MMLs that describe ever-higher levels of manipulation, such as ‘grasp’, ‘lift’, ‘cut’ to higher level primitives, such as ‘stir liquid in pot’/‘pluck harp-string to g-flat’ or even high-level actions, such as ‘make a vinaigrette dressing’/‘paint a rural Brittany summer landscape’/‘play Bach's Piano-concerto #1’, etc. Higher level commands are simply a combination towards a sequence of serial/parallel lower- and mid-level MM primitives that are executed along a common timed stepped sequence, which is overseen by a combination of a set of planners running sequence/path/interaction profiles with feedback controllers to ensure the required execution fidelity (as defined in the output data contained within each MM sequence).") and
implementing the two or more sub-tasks based on the two or more low-level policies, (Paragraph 0026, "Embodiments of the present disclosure are directed to the technical features relating to the ability of being able to create complex robotic humanoid movements, actions, and interactions with tools and the instrumented environment by automatically building movements for the humanoid; actions and behaviors of the humanoid based on a set of computer-encoded robotic movement and action primitives. The primitives are defined by motions/actions of articulated degrees of freedom that range in complexity from simple to complex, and which can be combined in any form in serial/parallel fashion. These motion-primitives are termed to be minimanipulations and each has a clear time-indexed command input-structure and output behavior/performance profile that is intended to achieve a certain function. Minimanipulations comprise a new way of creating a general programmable-by-example platform for humanoid robots. One or more minimanipulation electronic libraries provide a large suite of higher-level sensing-and-execution sequences that are common building blocks for complex tasks, such as cooking, taking care of the infirm, or other tasks performed by the next generation of humanoid robots. Another way would be (again by way of an automated computer-controlled process employing specialized algorithms) to learn from online data (videos, pictures, sound logs, etc.) 
how to build a required sequence of actionable sequences using existing low-level MMLs to build the proper sequence and combinations to generate a task-specific MML.") wherein a first low-level policy and a second low-level policy of the two or more low-level policies are trained (Paragraph 0297, "The minimanipulation library is a command-software repository, where motion behaviors and processes are stored based on an off-line learning process, where the arm/wrist/finger motions and sequences to successfully complete a particular abstract task (grab the knife and then slice; grab the spoon and then stir; grab the pot with one hand and then use other hand to grab spatula and get under meat and flip it inside the pan; etc.). This repository has been built up to contain the learned sequences of successful sensor-driven motion-profiles and sequenced behaviors for the hand/wrist (and sometimes also arm-position corrections), to ensure successful completions of object (appliance, equipment, tools) and ingredient manipulation tasks that are described in a more abstract language, such as “grab the knife and slice the vegetable”, “crack the egg into the bowl”, “flip the meat over in the pan”, etc. The learning process is iterative and is based on multiple trials of a chef-taught motion-profile from the chef studio, which is then executed and iteratively modified by the offline learning algorithm module, until an acceptable execution-sequence can be shown to have been achieved. The minimanipulation library (command software repository) is intended to have been populated (a-priori and offline) with all the necessary elements to allow the robotic-kitchen system to successfully interact with all equipment (appliances, tools, etc.) and main ingredients that require processing (steps beyond just dispensing) during the cooking process. 
While the human chef wore gloves with embedded haptic sensors (proximity, touch, contact-location/-force) for the fingers and palm, the robotic hands are outfitted with similar sensor-types in locations to allow their data to be used to create, modify and adapt motion-profiles to execute successfully the desired motion-profiles and handling-commands.") using different types of machine learning approaches or model-based control approaches. (Paragraph 0269, "Machine Learning—refers to the technology wherein a software component or program improves its performance based on experience and feedback. One kind of machine learning often used in robotics is reinforcement learning, where desirable actions are rewarded and undesirable ones are penalized. Another kind is case-based learning, where previous solutions, e.g. sequences of actions by a human teacher or by the robot itself are remembered, together with any constraints or reasons for the solutions, and then are applied or reused in new settings. There are also additional kinds of machine learning, such as inductive and transductive methods.")
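The claim 1 limitations mapped above describe a two-level arrangement: a high-level policy that carries out a task by invoking low-level policies, each tied to a sub-task and each trained by a different kind of approach. A minimal sketch of that arrangement follows. It is illustrative only and is taken from neither the application nor Oleynik; every class name, function name, threshold, and action string is hypothetical, and the training approaches appear only as labels because nothing is trained here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, float]  # hypothetical sensed state, e.g. distance to the object


@dataclass
class LowLevelPolicy:
    sub_task: str
    training_approach: str  # label only; no training happens in this sketch
    act: Callable[[State], str]


@dataclass
class HighLevelPolicy:
    low_level: List[LowLevelPolicy]

    def implement_task(self, state: State) -> List[str]:
        # Invoke each sub-task's policy in order and collect the chosen actions.
        return [f"{p.sub_task}:{p.act(state)}" for p in self.low_level]


GRASP_RANGE_M = 0.05  # hypothetical distance at which grasping becomes possible


def reach_act(state: State) -> str:
    # Stand-in for a controller derived from a model-based control approach.
    return "move_toward_object" if state["distance_m"] > GRASP_RANGE_M else "hold"


def grasp_act(state: State) -> str:
    # Stand-in for a policy learned by reinforcement or imitation learning.
    return "close_gripper" if state["distance_m"] <= GRASP_RANGE_M else "wait"


solver = HighLevelPolicy([
    LowLevelPolicy("reach", "model-based control", reach_act),
    LowLevelPolicy("grasp", "reinforcement learning", grasp_act),
])

far_actions = solver.implement_task({"distance_m": 0.30})
near_actions = solver.implement_task({"distance_m": 0.02})
print(far_actions)
print(near_actions)
```

The labels assigned to the two policies mirror the pairing recited in dependent claims 4 and 5 (reaching with a model-based control approach, grasping with a reinforcement or imitation learning approach).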
Regarding claim 2, where all the limitations of claim 1 are discussed above, Oleynik further teaches:
2. The manipulation task solver system of claim 1, wherein the two or more sub-tasks include reaching for the object, grasping the object, or reorienting the object after the object is grasped. (Paragraph 0021, "Thus, MMs are software-based and represented by input and output data sets and inherent processing algorithms and performance descriptors, akin to individual programs with input/output data files and subroutines, contained within individual run-time source-code, which when compiled generates object-code that can be compiled and collected within various different software libraries, termed as a collection of various Minimanipulation-Libraries (MMLs). MMLs can be grouped in to multiple groupings, whether these be associated to (i) particular hardware elements (finger/hand, wrist, arm, torso, foot, legs, etc.), (ii) behavioral elements (contacting, grasping, handling, etc.), or even (iii) application-domains (cooking, painting, playing a musical instrument, etc.). Furthermore, within each of these groupings, MMLs can be arranged based on multiple levels (simple to complex) relating to the complexity of behavior desired.")
Regarding claim 4, where all the limitations of claim 1 are discussed above, Oleynik further teaches:
4. The manipulation task solver system of claim 1, wherein the two or more sub-tasks include reaching for the object and wherein the first low-level policy is associated with reaching for the object (Paragraph 0021, "Thus, MMs are software-based and represented by input and output data sets and inherent processing algorithms and performance descriptors, akin to individual programs with input/output data files and subroutines, contained within individual run-time source-code, which when compiled generates object-code that can be compiled and collected within various different software libraries, termed as a collection of various Minimanipulation-Libraries (MMLs). MMLs can be grouped in to multiple groupings, whether these be associated to (i) particular hardware elements (finger/hand, wrist, arm, torso, foot, legs, etc.), (ii) behavioral elements (contacting, grasping, handling, etc.), or even (iii) application-domains (cooking, painting, playing a musical instrument, etc.). Furthermore, within each of these groupings, MMLs can be arranged based on multiple levels (simple to complex) relating to the complexity of behavior desired.") and is trained based on a model-based control approach. (Paragraph 0344, "The minimanipulation library is a command-software repository, where motion behaviors and processes are stored based on an off-line learning process, where the arm/wrist/finger motions and sequences to successfully complete a particular abstract task (grab the knife and then slice; grab the spoon and then stir; grab the pot with one hand and then use other hand to grab spatula and get under meat and flip it inside the pan; etc.). 
This repository has been built up to contain the learned sequences of successful sensor-driven motion-profiles and sequenced behaviors for the hand/wrist (and sometimes also arm-position corrections), to ensure successful completions of object (appliance, equipment, tools) and ingredient manipulation tasks that are described in a more abstract language, such as “grab the knife and slice the vegetable”, “crack the egg into the bowl”, “flip the meat over in the pan”, etc. The learning process is iterative and is based on multiple trials of a chef-taught motion-profile from the chef studio, which is then executed and iteratively modified by the offline learning algorithm module, until an acceptable execution-sequence can be shown to have been achieved. The minimanipulation library (command software repository) is intended to have been populated (a-priori and offline) with all the necessary elements to allow the robotic-kitchen system to successfully interact with all equipment (appliances, tools, etc.) and main ingredients that require processing (steps beyond just dispensing) during the cooking process. While the human chef wore gloves with embedded haptic sensors (proximity, touch, contact-location/-force) for the fingers and palm, the robotic hands are outfitted with similar sensor-types in locations to allow their data to be used to create, modify and adapt motion-profiles to execute successfully the desired motion-profiles and handling-commands.")
Regarding claim 5, where all the limitations of claim 1 are discussed above, Oleynik further teaches:
5. The manipulation task solver system of claim 1, wherein the two or more sub-tasks include grasping the object and wherein the second low-level policy is associated with grasping the object (Paragraph 0021, "Thus, MMs are software-based and represented by input and output data sets and inherent processing algorithms and performance descriptors, akin to individual programs with input/output data files and subroutines, contained within individual run-time source-code, which when compiled generates object-code that can be compiled and collected within various different software libraries, termed as a collection of various Minimanipulation-Libraries (MMLs). MMLs can be grouped in to multiple groupings, whether these be associated to (i) particular hardware elements (finger/hand, wrist, arm, torso, foot, legs, etc.), (ii) behavioral elements (contacting, grasping, handling, etc.), or even (iii) application-domains (cooking, painting, playing a musical instrument, etc.). Furthermore, within each of these groupings, MMLs can be arranged based on multiple levels (simple to complex) relating to the complexity of behavior desired.") and is trained based on a reinforcement learning approach or an imitation learning approach. (Paragraph 0269, "Machine Learning—refers to the technology wherein a software component or program improves its performance based on experience and feedback. One kind of machine learning often used in robotics is reinforcement learning, where desirable actions are rewarded and undesirable ones are penalized. Another kind is case-based learning, where previous solutions, e.g. sequences of actions by a human teacher or by the robot itself are remembered, together with any constraints or reasons for the solutions, and then are applied or reused in new settings. There are also additional kinds of machine learning, such as inductive and transductive methods.")
Regarding claim 11, Oleynik teaches:
11. A manipulation task solver system, comprising:
a robot appendage including an actuator;
a sensor sensing an object associated with a task including three or more sub-tasks, a state of an environment, a state of the robot appendage, and an action associated with the robot appendage; (Paragraph 0440, "The robotic hand 72 includes a camera sensor 684, such as an RGB-D sensor, an imaging sensor or a visual sensing device, placed in or near the middle of the palm for detecting the distance and shape of an object, as well as the distance of the object, and for handling a kitchen tool. The imaging sensor 682f provides guidance to the robotic hand 72 in moving the robotic hand 72 towards the direction of the object and to make necessary adjustments to grab an object. In addition, a sonar sensor, such as a tactile pressure sensor, may be placed near the palm of the robotic hand 72, for detecting the distance and shape of the object. The sonar sensor 682f can also guide the robotic hand 72 to move toward the object. Each of the sonar sensors 682a, 682b, 682c, 682d, 682e, 682f, 682g includes ultrasonic sensors, laser, radio frequency identification (RFID), and other suitable sensors. In addition, each of the sonar sensors 682a, 682b, 682c, 682d, 682e, 682f, 682g serves as a feedback mechanism to determine whether the robotic hand 72 continues to exert additional pressure to grab the object at such point where there is sufficient pressure to grab and lift the object. In addition, the sonar sensor 682f in the palm of the robotic hand 72 provides tactile sensing function to handle a kitchen tool. For example, when the robotic hand 72 grabs a knife to cut beef, the amount of pressure that the robotic hand 72 exerts on the knife and applies to the beef, allows the tactile sensor to detect when the knife finishes slicing the beef, i.e., when the knife has no resistance. The distributed pressure is not only to secure the object, but also so as not to exert too much pressure so as to, for example, not to break an egg). 
Furthermore, each finger on the robotic hand 72 has a sensor on the finger tip, as shown by the first sensor 682a on the finger tip of the thumb, the second sensor 682b on the finger tip of the index finger, the third sensor 682c on the finger tip of the middle finger, the fourth sensor 682d on the finger tip of the ring finger, and the fifth sensor 682f on the finger tip of the pinky. Each of the sensors 682a, 682b, 682c, 682d, 682e provide sensing capability on the distance and shape of the object, sensing capability for temperature or moisture, as well as tactile feedback capability." As well as Paragraph 0017, “Broadly stated, a humanoid having a robot computer controller operated by robot operating system (ROS) with robotic instructions comprises a database having a plurality of electronic minimanipulation libraries, each electronic minimanipulation library including a plurality of minimanipulation elements. The plurality of electronic minimanipulation libraries can be combined to create one or more machine executable application-specific instruction sets, and the plurality of minimanipulation elements within a electronic minimanipulation library can be combined to create one or more machine executable application-specific instruction sets; a robotic structure having an upper body and a lower body connected to a head through an articulated neck, the upper body including torso, shoulder, arms, and hands; and a control system, communicatively coupled to the database, a sensory system, a sensor data interpretation system, a motion planner, and actuators and associated controllers, the control system executing application-specific instruction sets to operate the robotic structure.”)
a memory storing one or more instructions; and
a processor executing one or more of the instructions stored on the memory to perform: (Paragraph 0757, "The computer devices 16 may represent any or the entire server, or any network intermediary devices. Further, while only a single machine is illustrated, the term “machine” shall also be taken to include any collection of machines that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methodologies discussed herein. The example computer system 3624 includes a processor 3626 (e.g., a central processing unit (CPU), a graphics processing unit (GPU), or both), a main memory 3628 and a static memory 3630, which communicate with each other via a bus 3632. The computer system 3624 may further include a video display unit 3634 (e.g., a liquid crystal display (LCD)). The computer system 3624 also includes an alphanumeric input device 3636 (e.g., a keyboard), a cursor control device 3638 (e.g., a mouse), a disk drive unit 3640, a signal generation device 3642 (e.g., a speaker), and a network interface device 3648.")
implementing the task via the robot appendage and the actuator based on a high-level policy including three or more low-level policies; (Paragraph 0024, "Implementation of the above movements (described by articulating joint positions and velocities) and environment interactions (described by joint/interface torques and forces) is achieved by having computer playback desirable values for all required variables (positions/velocities and forces/torques) and feeding these to a controller system that faithfully implements them on each joint as a function of time at each time step. These variables and their sequence and feedback loops (hence not just data files, but also control programs), to ascertain the fidelity of the commanded movement/interactions, are all described in data-files that are combined into multi-level MMLs, which can be accessed and combined in multiple ways to allow a humanoid robot to execute multiple actions, such as cooking a meal, playing a piece of classical music on a piano, lifting an infirm person into/out-of a bed, etc. There are MMLs that describe simple rudimentary movement/interactions, which are then used as building-blocks for ever higher-level MMLs that describe ever-higher levels of manipulation, such as ‘grasp’, ‘lift’, ‘cut’ to higher level primitives, such as ‘stir liquid in pot’/‘pluck harp-string to g-flat’ or even high-level actions, such as ‘make a vinaigrette dressing’/‘paint a rural Brittany summer landscape’/‘play Bach's Piano-concerto #1’, etc. Higher level commands are simply a combination towards a sequence of serial/parallel lower- and mid-level MM primitives that are executed along a common timed stepped sequence, which is overseen by a combination of a set of planners running sequence/path/interaction profiles with feedback controllers to ensure the required execution fidelity (as defined in the output data contained within each MM sequence).") and
implementing the three or more sub-tasks via the robot appendage and the actuator based on the three or more low-level policies, (Paragraph 0026, "Embodiments of the present disclosure are directed to the technical features relating to the ability of being able to create complex robotic humanoid movements, actions, and interactions with tools and the instrumented environment by automatically building movements for the humanoid; actions and behaviors of the humanoid based on a set of computer-encoded robotic movement and action primitives. The primitives are defined by motions/actions of articulated degrees of freedom that range in complexity from simple to complex, and which can be combined in any form in serial/parallel fashion. These motion-primitives are termed to be minimanipulations and each has a clear time-indexed command input-structure and output behavior/performance profile that is intended to achieve a certain function. Minimanipulations comprise a new way of creating a general programmable-by-example platform for humanoid robots. One or more minimanipulation electronic libraries provide a large suite of higher-level sensing-and-execution sequences that are common building blocks for complex tasks, such as cooking, taking care of the infirm, or other tasks performed by the next generation of humanoid robots. Another way would be (again by way of an automated computer-controlled process employing specialized algorithms) to learn from online data (videos, pictures, sound logs, etc.) 
how to build a required sequence of actionable sequences using existing low-level MMLs to build the proper sequence and combinations to generate a task-specific MML.") wherein a first low-level policy, a second low-level policy, and a third low-level policy of the three or more low-level policies are each trained (Paragraph 0297, "The minimanipulation library is a command-software repository, where motion behaviors and processes are stored based on an off-line learning process, where the arm/wrist/finger motions and sequences to successfully complete a particular abstract task (grab the knife and then slice; grab the spoon and then stir; grab the pot with one hand and then use other hand to grab spatula and get under meat and flip it inside the pan; etc.). This repository has been built up to contain the learned sequences of successful sensor-driven motion-profiles and sequenced behaviors for the hand/wrist (and sometimes also arm-position corrections), to ensure successful completions of object (appliance, equipment, tools) and ingredient manipulation tasks that are described in a more abstract language, such as “grab the knife and slice the vegetable”, “crack the egg into the bowl”, “flip the meat over in the pan”, etc. The learning process is iterative and is based on multiple trials of a chef-taught motion-profile from the chef studio, which is then executed and iteratively modified by the offline learning algorithm module, until an acceptable execution-sequence can be shown to have been achieved. The minimanipulation library (command software repository) is intended to have been populated (a-priori and offline) with all the necessary elements to allow the robotic-kitchen system to successfully interact with all equipment (appliances, tools, etc.) and main ingredients that require processing (steps beyond just dispensing) during the cooking process. 
While the human chef wore gloves with embedded haptic sensors (proximity, touch, contact-location/-force) for the fingers and palm, the robotic hands are outfitted with similar sensor-types in locations to allow their data to be used to create, modify and adapt motion-profiles to execute successfully the desired motion-profiles and handling-commands.") using different types of machine learning approaches or model-based control approaches. (Paragraph 0269, "Machine Learning—refers to the technology wherein a software component or program improves its performance based on experience and feedback. One kind of machine learning often used in robotics is reinforcement learning, where desirable actions are rewarded and undesirable ones are penalized. Another kind is case-based learning, where previous solutions, e.g. sequences of actions by a human teacher or by the robot itself are remembered, together with any constraints or reasons for the solutions, and then are applied or reused in new settings. There are also additional kinds of machine learning, such as inductive and transductive methods.")
Regarding claim 13, where all the limitations of claim 11 are discussed above, Oleynik further teaches:
13. The manipulation task solver system of claim 11, wherein the three or more sub-tasks include reaching for the object and wherein the first low-level policy is associated with reaching for the object (Paragraph 0021, "Thus, MMs are software-based and represented by input and output data sets and inherent processing algorithms and performance descriptors, akin to individual programs with input/output data files and subroutines, contained within individual run-time source-code, which when compiled generates object-code that can be compiled and collected within various different software libraries, termed as a collection of various Minimanipulation-Libraries (MMLs). MMLs can be grouped in to multiple groupings, whether these be associated to (i) particular hardware elements (finger/hand, wrist, arm, torso, foot, legs, etc.), (ii) behavioral elements (contacting, grasping, handling, etc.), or even (iii) application-domains (cooking, painting, playing a musical instrument, etc.). Furthermore, within each of these groupings, MMLs can be arranged based on multiple levels (simple to complex) relating to the complexity of behavior desired.") and is trained based on a model-based control approach. (Paragraph 0344, "The minimanipulation library is a command-software repository, where motion behaviors and processes are stored based on an off-line learning process, where the arm/wrist/finger motions and sequences to successfully complete a particular abstract task (grab the knife and then slice; grab the spoon and then stir; grab the pot with one hand and then use other hand to grab spatula and get under meat and flip it inside the pan; etc.). 
This repository has been built up to contain the learned sequences of successful sensor-driven motion-profiles and sequenced behaviors for the hand/wrist (and sometimes also arm-position corrections), to ensure successful completions of object (appliance, equipment, tools) and ingredient manipulation tasks that are described in a more abstract language, such as “grab the knife and slice the vegetable”, “crack the egg into the bowl”, “flip the meat over in the pan”, etc. The learning process is iterative and is based on multiple trials of a chef-taught motion-profile from the chef studio, which is then executed and iteratively modified by the offline learning algorithm module, until an acceptable execution-sequence can be shown to have been achieved. The minimanipulation library (command software repository) is intended to have been populated (a-priori and offline) with all the necessary elements to allow the robotic-kitchen system to successfully interact with all equipment (appliances, tools, etc.) and main ingredients that require processing (steps beyond just dispensing) during the cooking process. While the human chef wore gloves with embedded haptic sensors (proximity, touch, contact-location/-force) for the fingers and palm, the robotic hands are outfitted with similar sensor-types in locations to allow their data to be used to create, modify and adapt motion-profiles to execute successfully the desired motion-profiles and handling-commands.")
Regarding claim 14, where all the limitations of claim 11 are discussed above, Oleynik further teaches:
14. The manipulation task solver system of claim 11, wherein the three or more sub-tasks include grasping the object and wherein the second low-level policy is associated with grasping the object (Paragraph 0021, "Thus, MMs are software-based and represented by input and output data sets and inherent processing algorithms and performance descriptors, akin to individual programs with input/output data files and subroutines, contained within individual run-time source-code, which when compiled generates object-code that can be compiled and collected within various different software libraries, termed as a collection of various Minimanipulation-Libraries (MMLs). MMLs can be grouped in to multiple groupings, whether these be associated to (i) particular hardware elements (finger/hand, wrist, arm, torso, foot, legs, etc.), (ii) behavioral elements (contacting, grasping, handling, etc.), or even (iii) application-domains (cooking, painting, playing a musical instrument, etc.). Furthermore, within each of these groupings, MMLs can be arranged based on multiple levels (simple to complex) relating to the complexity of behavior desired.") and is trained based on a reinforcement learning approach or an imitation learning approach. (Paragraph 0269, "Machine Learning—refers to the technology wherein a software component or program improves its performance based on experience and feedback. One kind of machine learning often used in robotics is reinforcement learning, where desirable actions are rewarded and undesirable ones are penalized. Another kind is case-based learning, where previous solutions, e.g. sequences of actions by a human teacher or by the robot itself are remembered, together with any constraints or reasons for the solutions, and then are applied or reused in new settings. There are also additional kinds of machine learning, such as inductive and transductive methods.")
Regarding claim 16, Oleynik further teaches:
16. A computer-implemented method for manipulation task solving, comprising:
sensing an object associated with a task including two or more sub-tasks, a state of an environment, a state of a robot appendage, and an action associated with the robot appendage; (Paragraph 0440, "The robotic hand 72 includes a camera sensor 684, such as an RGB-D sensor, an imaging sensor or a visual sensing device, placed in or near the middle of the palm for detecting the distance and shape of an object, as well as the distance of the object, and for handling a kitchen tool. The imaging sensor 682f provides guidance to the robotic hand 72 in moving the robotic hand 72 towards the direction of the object and to make necessary adjustments to grab an object. In addition, a sonar sensor, such as a tactile pressure sensor, may be placed near the palm of the robotic hand 72, for detecting the distance and shape of the object. The sonar sensor 682f can also guide the robotic hand 72 to move toward the object. Each of the sonar sensors 682a, 682b, 682c, 682d, 682e, 682f, 682g includes ultrasonic sensors, laser, radio frequency identification (RFID), and other suitable sensors. In addition, each of the sonar sensors 682a, 682b, 682c, 682d, 682e, 682f, 682g serves as a feedback mechanism to determine whether the robotic hand 72 continues to exert additional pressure to grab the object at such point where there is sufficient pressure to grab and lift the object. In addition, the sonar sensor 682f in the palm of the robotic hand 72 provides tactile sensing function to handle a kitchen tool. For example, when the robotic hand 72 grabs a knife to cut beef, the amount of pressure that the robotic hand 72 exerts on the knife and applies to the beef, allows the tactile sensor to detect when the knife finishes slicing the beef, i.e., when the knife has no resistance. The distributed pressure is not only to secure the object, but also so as not to exert too much pressure so as to, for example, not to break an egg). 
Furthermore, each finger on the robotic hand 72 has a sensor on the finger tip, as shown by the first sensor 682a on the finger tip of the thumb, the second sensor 682b on the finger tip of the index finger, the third sensor 682c on the finger tip of the middle finger, the fourth sensor 682d on the finger tip of the ring finger, and the fifth sensor 682f on the finger tip of the pinky. Each of the sensors 682a, 682b, 682c, 682d, 682e provide sensing capability on the distance and shape of the object, sensing capability for temperature or moisture, as well as tactile feedback capability.")
implementing the task based on a high-level policy including two or more low-level policies; (Paragraph 0024, "Implementation of the above movements (described by articulating joint positions and velocities) and environment interactions (described by joint/interface torques and forces) is achieved by having computer playback desirable values for all required variables (positions/velocities and forces/torques) and feeding these to a controller system that faithfully implements them on each joint as a function of time at each time step. These variables and their sequence and feedback loops (hence not just data files, but also control programs), to ascertain the fidelity of the commanded movement/interactions, are all described in data-files that are combined into multi-level MMLs, which can be accessed and combined in multiple ways to allow a humanoid robot to execute multiple actions, such as cooking a meal, playing a piece of classical music on a piano, lifting an infirm person into/out-of a bed, etc. There are MMLs that describe simple rudimentary movement/interactions, which are then used as building-blocks for ever higher-level MMLs that describe ever-higher levels of manipulation, such as ‘grasp’, ‘lift’, ‘cut’ to higher level primitives, such as ‘stir liquid in pot’/‘pluck harp-string to g-flat’ or even high-level actions, such as ‘make a vinaigrette dressing’/‘paint a rural Brittany summer landscape’/‘play Bach's Piano-concerto #1’, etc. Higher level commands are simply a combination towards a sequence of serial/parallel lower- and mid-level MM primitives that are executed along a common timed stepped sequence, which is overseen by a combination of a set of planners running sequence/path/interaction profiles with feedback controllers to ensure the required execution fidelity (as defined in the output data contained within each MM sequence).") and
implementing the two or more sub-tasks based on the two or more low-level policies, (Paragraph 0026, "Embodiments of the present disclosure are directed to the technical features relating to the ability of being able to create complex robotic humanoid movements, actions, and interactions with tools and the instrumented environment by automatically building movements for the humanoid; actions and behaviors of the humanoid based on a set of computer-encoded robotic movement and action primitives. The primitives are defined by motions/actions of articulated degrees of freedom that range in complexity from simple to complex, and which can be combined in any form in serial/parallel fashion. These motion-primitives are termed to be minimanipulations and each has a clear time-indexed command input-structure and output behavior/performance profile that is intended to achieve a certain function. Minimanipulations comprise a new way of creating a general programmable-by-example platform for humanoid robots. One or more minimanipulation electronic libraries provide a large suite of higher-level sensing-and-execution sequences that are common building blocks for complex tasks, such as cooking, taking care of the infirm, or other tasks performed by the next generation of humanoid robots. Another way would be (again by way of an automated computer-controlled process employing specialized algorithms) to learn from online data (videos, pictures, sound logs, etc.) 
how to build a required sequence of actionable sequences using existing low-level MMLs to build the proper sequence and combinations to generate a task-specific MML.") wherein a first low-level policy and a second low-level policy of the two or more low-level policies are trained (Paragraph 0297, "The minimanipulation library is a command-software repository, where motion behaviors and processes are stored based on an off-line learning process, where the arm/wrist/finger motions and sequences to successfully complete a particular abstract task (grab the knife and then slice; grab the spoon and then stir; grab the pot with one hand and then use other hand to grab spatula and get under meat and flip it inside the pan; etc.). This repository has been built up to contain the learned sequences of successful sensor-driven motion-profiles and sequenced behaviors for the hand/wrist (and sometimes also arm-position corrections), to ensure successful completions of object (appliance, equipment, tools) and ingredient manipulation tasks that are described in a more abstract language, such as “grab the knife and slice the vegetable”, “crack the egg into the bowl”, “flip the meat over in the pan”, etc. The learning process is iterative and is based on multiple trials of a chef-taught motion-profile from the chef studio, which is then executed and iteratively modified by the offline learning algorithm module, until an acceptable execution-sequence can be shown to have been achieved. The minimanipulation library (command software repository) is intended to have been populated (a-priori and offline) with all the necessary elements to allow the robotic-kitchen system to successfully interact with all equipment (appliances, tools, etc.) and main ingredients that require processing (steps beyond just dispensing) during the cooking process. 
While the human chef wore gloves with embedded haptic sensors (proximity, touch, contact-location/-force) for the fingers and palm, the robotic hands are outfitted with similar sensor-types in locations to allow their data to be used to create, modify and adapt motion-profiles to execute successfully the desired motion-profiles and handling-commands.") using different types of machine learning approaches or model-based control approaches. (Paragraph 0269, "Machine Learning—refers to the technology wherein a software component or program improves its performance based on experience and feedback. One kind of machine learning often used in robotics is reinforcement learning, where desirable actions are rewarded and undesirable ones are penalized. Another kind is case-based learning, where previous solutions, e.g. sequences of actions by a human teacher or by the robot itself are remembered, together with any constraints or reasons for the solutions, and then are applied or reused in new settings. There are also additional kinds of machine learning, such as inductive and transductive methods.")
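For illustration only, the arrangement recited in claim 16 (a high-level policy that carries out a task by dispatching sub-tasks to separately obtained low-level policies) can be sketched as below. This sketch is not taken from the application or from Oleynik. Every name in it (State, reach_policy, grasp_policy, high_level_policy, run_task), the one-dimensional state, the gain, and the thresholds are hypothetical assumptions chosen to keep the example small. The reaching policy stands in for a model-based controller and the grasping policy stands in for a learned policy.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class State:
    hand_pos: float    # hypothetical 1-D hand position
    object_pos: float  # hypothetical 1-D object position
    grip: float        # 0.0 = fully open, 1.0 = fully closed


def reach_policy(s: State) -> State:
    """Stand-in for a model-based controller: proportional step toward the object."""
    gain = 0.5  # assumed gain, for illustration only
    return State(s.hand_pos + gain * (s.object_pos - s.hand_pos), s.object_pos, s.grip)


def grasp_policy(s: State) -> State:
    """Stand-in for a learned policy (e.g. RL or imitation): close the grip a little."""
    return State(s.hand_pos, s.object_pos, min(1.0, s.grip + 0.25))


def high_level_policy(s: State) -> str:
    """Pick the next sub-task from the current state."""
    if abs(s.object_pos - s.hand_pos) > 0.01:
        return "reach"
    if s.grip < 1.0:
        return "grasp"
    return "done"


# Mapping from sub-task name to the low-level policy that implements it.
LOW_LEVEL: Dict[str, Callable[[State], State]] = {
    "reach": reach_policy,
    "grasp": grasp_policy,
}


def run_task(s: State, max_steps: int = 100) -> Tuple[State, List[str]]:
    """Run the task; return the final state and the sequence of sub-tasks chosen."""
    trace: List[str] = []
    for _ in range(max_steps):
        sub_task = high_level_policy(s)
        if sub_task == "done":
            break
        trace.append(sub_task)
        s = LOW_LEVEL[sub_task](s)
    return s, trace


if __name__ == "__main__":
    final_state, trace = run_task(State(hand_pos=0.0, object_pos=1.0, grip=0.0))
    print(trace)
    print(final_state)
```

In this toy setup the high-level policy keeps selecting the reaching policy until the hand is within the assumed tolerance of the object, and only then selects the grasping policy. The two low-level policies are independent functions, so each could be obtained by a different approach without changing the dispatcher.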
Regarding claim 18, where all the limitations of claim 16 are discussed above, Oleynik further teaches:
18. The computer-implemented method for manipulation task solving of claim 16, wherein the two or more sub-tasks include reaching for the object and wherein the first low-level policy is associated with reaching for the object (Paragraph 0021, "Thus, MMs are software-based and represented by input and output data sets and inherent processing algorithms and performance descriptors, akin to individual programs with input/output data files and subroutines, contained within individual run-time source-code, which when compiled generates object-code that can be compiled and collected within various different software libraries, termed as a collection of various Minimanipulation-Libraries (MMLs). MMLs can be grouped in to multiple groupings, whether these be associated to (i) particular hardware elements (finger/hand, wrist, arm, torso, foot, legs, etc.), (ii) behavioral elements (contacting, grasping, handling, etc.), or even (iii) application-domains (cooking, painting, playing a musical instrument, etc.). Furthermore, within each of these groupings, MMLs can be arranged based on multiple levels (simple to complex) relating to the complexity of behavior desired.") and is trained based on a model-based control approach. (Paragraph 0344, "The minimanipulation library is a command-software repository, where motion behaviors and processes are stored based on an off-line learning process, where the arm/wrist/finger motions and sequences to successfully complete a particular abstract task (grab the knife and then slice; grab the spoon and then stir; grab the pot with one hand and then use other hand to grab spatula and get under meat and flip it inside the pan; etc.). 
This repository has been built up to contain the learned sequences of successful sensor-driven motion-profiles and sequenced behaviors for the hand/wrist (and sometimes also arm-position corrections), to ensure successful completions of object (appliance, equipment, tools) and ingredient manipulation tasks that are described in a more abstract language, such as “grab the knife and slice the vegetable”, “crack the egg into the bowl”, “flip the meat over in the pan”, etc. The learning process is iterative and is based on multiple trials of a chef-taught motion-profile from the chef studio, which is then executed and iteratively modified by the offline learning algorithm module, until an acceptable execution-sequence can be shown to have been achieved. The minimanipulation library (command software repository) is intended to have been populated (a-priori and offline) with all the necessary elements to allow the robotic-kitchen system to successfully interact with all equipment (appliances, tools, etc.) and main ingredients that require processing (steps beyond just dispensing) during the cooking process. While the human chef wore gloves with embedded haptic sensors (proximity, touch, contact-location/-force) for the fingers and palm, the robotic hands are outfitted with similar sensor-types in locations to allow their data to be used to create, modify and adapt motion-profiles to execute successfully the desired motion-profiles and handling-commands.")
Regarding claim 19, where all the limitations of claim 16 are discussed above, Oleynik further teaches:
19. The computer-implemented method for manipulation task solving of claim 16, wherein the two or more sub-tasks include grasping the object and wherein the second low-level policy is associated with grasping the object (Paragraph 0021, "Thus, MMs are software-based and represented by input and output data sets and inherent processing algorithms and performance descriptors, akin to individual programs with input/output data files and subroutines, contained within individual run-time source-code, which when compiled generates object-code that can be compiled and collected within various different software libraries, termed as a collection of various Minimanipulation-Libraries (MMLs). MMLs can be grouped in to multiple groupings, whether these be associated to (i) particular hardware elements (finger/hand, wrist, arm, torso, foot, legs, etc.), (ii) behavioral elements (contacting, grasping, handling, etc.), or even (iii) application-domains (cooking, painting, playing a musical instrument, etc.). Furthermore, within each of these groupings, MMLs can be arranged based on multiple levels (simple to complex) relating to the complexity of behavior desired.") and is trained based on a reinforcement learning approach or an imitation learning approach. (Paragraph 0269, "Machine Learning—refers to the technology wherein a software component or program improves its performance based on experience and feedback. One kind of machine learning often used in robotics is reinforcement learning, where desirable actions are rewarded and undesirable ones are penalized. Another kind is case-based learning, where previous solutions, e.g. sequences of actions by a human teacher or by the robot itself are remembered, together with any constraints or reasons for the solutions, and then are applied or reused in new settings. There are also additional kinds of machine learning, such as inductive and transductive methods.")

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claim(s) 3, 12, and 17 is/are rejected under 35 U.S.C. 103 as being unpatentable over Oleynik in view of Bottero et al. (US 20240198518 A1), hereinafter Bottero.
Regarding claim 3, where all the limitations of claim 1 are discussed above, Oleynik further teaches:
3. The manipulation task solver system of claim 1, wherein the high-level policy is trained (Paragraph 0297, "The minimanipulation library is a command-software repository, where motion behaviors and processes are stored based on an off-line learning process, where the arm/wrist/finger motions and sequences to successfully complete a particular abstract task (grab the knife and then slice; grab the spoon and then stir; grab the pot with one hand and then use other hand to grab spatula and get under meat and flip it inside the pan; etc.). This repository has been built up to contain the learned sequences of successful sensor-driven motion-profiles and sequenced behaviors for the hand/wrist (and sometimes also arm-position corrections), to ensure successful completions of object (appliance, equipment, tools) and ingredient manipulation tasks that are described in a more abstract language, such as “grab the knife and slice the vegetable”, “crack the egg into the bowl”, “flip the meat over in the pan”, etc. The learning process is iterative and is based on multiple trials of a chef-taught motion-profile from the chef studio, which is then executed and iteratively modified by the offline learning algorithm module, until an acceptable execution-sequence can be shown to have been achieved. The minimanipulation library (command software repository) is intended to have been populated (a-priori and offline) with all the necessary elements to allow the robotic-kitchen system to successfully interact with all equipment (appliances, tools, etc.) and main ingredients that require processing (steps beyond just dispensing) during the cooking process. 
While the human chef wore gloves with embedded haptic sensors (proximity, touch, contact-location/-force) for the fingers and palm, the robotic hands are outfitted with similar sensor-types in locations to allow their data to be used to create, modify and adapt motion-profiles to execute successfully the desired motion-profiles and handling-commands.") …
Oleynik does not specifically teach the use of a long-horizon task Markov Decision Process. However, Bottero, in the same field of endeavor of robotics, teaches:
… by formulating the task as a long-horizon task Markov Decision Process (MDP). (Paragraph 0043, "In the following, an agent is considered acting in an infinite-horizon MDP custom-character={M,S,p,ρ,r,γ} with finite state space |S|=S, finite action space |A|=A, unknown transition function p:S×A -> Δ(S) that maps states and actions to the S-dimensional probability simplex, an initial state distribution ρ:S -> [0,1], a known and bounded reward function r:S×A -> R, and a discount factor γ∈[0,1).")
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine the robotic system and methods of operation as taught by Oleynik with the ability to train using an infinite-horizon MDP as taught by Bottero. This would allow the system to efficiently train for long-lasting, high-level tasks.
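As background on the cited formulation, the following sketch shows how state values are computed in a discounted infinite-horizon MDP with a bounded reward and a discount factor in [0, 1). It is illustrative only and is not taken from Bottero or from the application. The Bottero passage states that the transition function is unknown, whereas this sketch assumes a known transition table so that plain value iteration can be used. The function name value_iteration, the two-state example, and the tolerance are hypothetical.

```python
from typing import List


def value_iteration(
    P: List[List[List[float]]],  # P[s][a][t]: probability of moving from s to t under action a
    R: List[List[float]],        # R[s][a]: bounded reward for taking action a in state s
    gamma: float,                # discount factor, 0 <= gamma < 1
    tol: float = 1e-10,          # assumed stopping tolerance
) -> List[float]:
    """Return approximate optimal state values for a finite, discounted MDP."""
    n_states = len(P)
    V = [0.0] * n_states
    while True:
        # Bellman optimality backup for every state.
        new_V = [
            max(
                R[s][a] + gamma * sum(p * V[t] for t, p in enumerate(P[s][a]))
                for a in range(len(P[s]))
            )
            for s in range(n_states)
        ]
        if max(abs(x - y) for x, y in zip(new_V, V)) < tol:
            return new_V
        V = new_V


if __name__ == "__main__":
    # Hypothetical two-state example: only staying in state 1 earns a reward of 1.
    P = [
        [[1.0, 0.0], [0.0, 1.0]],  # state 0: action 0 stays, action 1 moves to state 1
        [[0.0, 1.0], [1.0, 0.0]],  # state 1: action 0 stays, action 1 moves to state 0
    ]
    R = [
        [0.0, 0.0],
        [1.0, 0.0],
    ]
    print(value_iteration(P, R, gamma=0.9))
```

Because the discount factor is strictly below 1, the infinite sum of rewards stays finite. In the example, remaining in state 1 forever is worth 1 / (1 - 0.9) = 10, and state 0 is worth 0.9 times that amount because one step is spent getting there.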
Regarding claim 12, where all the limitations of claim 11 are discussed above, Oleynik further teaches:
12. The manipulation task solver system of claim 11, wherein the high-level policy is trained (Paragraph 0297, "The minimanipulation library is a command-software repository, where motion behaviors and processes are stored based on an off-line learning process, where the arm/wrist/finger motions and sequences to successfully complete a particular abstract task (grab the knife and then slice; grab the spoon and then stir; grab the pot with one hand and then use other hand to grab spatula and get under meat and flip it inside the pan; etc.). This repository has been built up to contain the learned sequences of successful sensor-driven motion-profiles and sequenced behaviors for the hand/wrist (and sometimes also arm-position corrections), to ensure successful completions of object (appliance, equipment, tools) and ingredient manipulation tasks that are described in a more abstract language, such as “grab the knife and slice the vegetable”, “crack the egg into the bowl”, “flip the meat over in the pan”, etc. The learning process is iterative and is based on multiple trials of a chef-taught motion-profile from the chef studio, which is then executed and iteratively modified by the offline learning algorithm module, until an acceptable execution-sequence can be shown to have been achieved. The minimanipulation library (command software repository) is intended to have been populated (a-priori and offline) with all the necessary elements to allow the robotic-kitchen system to successfully interact with all equipment (appliances, tools, etc.) and main ingredients that require processing (steps beyond just dispensing) during the cooking process. 
While the human chef wore gloves with embedded haptic sensors (proximity, touch, contact-location/-force) for the fingers and palm, the robotic hands are outfitted with similar sensor-types in locations to allow their data to be used to create, modify and adapt motion-profiles to execute successfully the desired motion-profiles and handling-commands.") …
Oleynik does not specifically teach the use of a long-horizon task Markov Decision Process. However, Bottero, in the same field of endeavor of robotics, teaches:
… by formulating the task as a long-horizon task Markov Decision Process (MDP). (Paragraph 0043, "In the following, an agent is considered acting in an infinite-horizon MDP custom-character={M,S,p,ρ,r,γ} with finite state space |S|=S, finite action space |A|=A, unknown transition function p:S×A -> Δ(S) that maps states and actions to the S-dimensional probability simplex, an initial state distribution ρ:S -> [0,1], a known and bounded reward function r:S×A -> R, and a discount factor γ∈[0,1).")
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine the robotic system and methods of operation as taught by Oleynik with the ability to train using an infinite-horizon MDP as taught by Bottero. This would allow the system to efficiently train for long-lasting, high-level tasks.
Regarding claim 17, where all the limitations of claim 16 are discussed above, Oleynik further teaches:
17. The computer-implemented method for manipulation task solving of claim 16, wherein the high-level policy is trained (Paragraph 0297, "The minimanipulation library is a command-software repository, where motion behaviors and processes are stored based on an off-line learning process, where the arm/wrist/finger motions and sequences to successfully complete a particular abstract task (grab the knife and then slice; grab the spoon and then stir; grab the pot with one hand and then use other hand to grab spatula and get under meat and flip it inside the pan; etc.). This repository has been built up to contain the learned sequences of successful sensor-driven motion-profiles and sequenced behaviors for the hand/wrist (and sometimes also arm-position corrections), to ensure successful completions of object (appliance, equipment, tools) and ingredient manipulation tasks that are described in a more abstract language, such as “grab the knife and slice the vegetable”, “crack the egg into the bowl”, “flip the meat over in the pan”, etc. The learning process is iterative and is based on multiple trials of a chef-taught motion-profile from the chef studio, which is then executed and iteratively modified by the offline learning algorithm module, until an acceptable execution-sequence can be shown to have been achieved. The minimanipulation library (command software repository) is intended to have been populated (a-priori and offline) with all the necessary elements to allow the robotic-kitchen system to successfully interact with all equipment (appliances, tools, etc.) and main ingredients that require processing (steps beyond just dispensing) during the cooking process. 
While the human chef wore gloves with embedded haptic sensors (proximity, touch, contact-location/-force) for the fingers and palm, the robotic hands are outfitted with similar sensor-types in locations to allow their data to be used to create, modify and adapt motion-profiles to execute successfully the desired motion-profiles and handling-commands.") …
Oleynik does not specifically teach the use of a long-horizon task Markov Decision Process. However, Bottero, in the same field of endeavor of robotics, teaches:
… by formulating the task as a long-horizon task Markov Decision Process (MDP). (Paragraph 0043, "In the following, an agent is considered acting in an infinite-horizon MDP custom-character={M,S,p,ρ,r,γ} with finite state space |S|=S, finite action space |A|=A, unknown transition function p:S×A -> Δ(S) that maps states and actions to the S-dimensional probability simplex, an initial state distribution ρ:S -> [0,1], a known and bounded reward function r:S×A -> R, and a discount factor γ∈[0,1).")
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine the robotic system and methods of operation as taught by Oleynik with the ability to train using an infinite-horizon MDP as taught by Bottero. This would allow the system to efficiently train for long-lasting, high-level tasks.
Claim(s) 6-9, 15, and 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Oleynik in view of Horowitz et al. (US 20240198518 A1), hereinafter Horowitz.
Regarding claim 6, where all the limitations of claim 1 are discussed above, Oleynik further teaches:
6. The manipulation task solver system of claim 1, wherein the two or more sub-tasks include reorienting the object after the object is grasped (Paragraph 0021, "Thus, MMs are software-based and represented by input and output data sets and inherent processing algorithms and performance descriptors, akin to individual programs with input/output data files and subroutines, contained within individual run-time source-code, which when compiled generates object-code that can be compiled and collected within various different software libraries, termed as a collection of various Minimanipulation-Libraries (MMLs). MMLs can be grouped in to multiple groupings, whether these be associated to (i) particular hardware elements (finger/hand, wrist, arm, torso, foot, legs, etc.), (ii) behavioral elements (contacting, grasping, handling, etc.), or even (iii) application-domains (cooking, painting, playing a musical instrument, etc.). Furthermore, within each of these groupings, MMLs can be arranged based on multiple levels (simple to complex) relating to the complexity of behavior desired.") and wherein a third low-level policy is associated with reorienting the object after the object is grasped (Paragraph 0326, "The standardized robotic kitchen module 50 has as one objective: the standardization of the kitchen module 50 and various components with the kitchen module itself to ensure consistency in both the chef kitchen 44 and the robotic kitchen 48 to maximize the preciseness of recipe replication while minimizing the risks of deviations from precise replication of a recipe dish between the chef kitchen 44 and the robotic kitchen 48. One main purpose of having the standardization of the kitchen module 50 is to obtain the same result of the cooking process (or the same dish) between a first food dish prepared by the chef and a subsequent replication of the same recipe process via the robotic kitchen. 
Conceiving a standardized platform in the standardized robotic kitchen module 50 between the chef kitchen 44 and the robotic kitchen 48 has several key considerations: same timeline, same program or mode, and quality check. The same timeline in the standardized robotic kitchen 50 where the chef prepares a food dish at the chef kitchen 44 and the replication process by the robotic hands in the robotic kitchen 48 refers to the same sequence of manipulations, the same initial and ending time of each manipulation, and the same speed of moving an object between handling operations. The same program or mode in the standardized robotic kitchen 50 refers to the use and operation of standardized equipment during each manipulation recording and execution step. The quality check refers to three-dimensional vision sensors in the standardized robotic kitchen 50, which monitor and adjust in real time each manipulation action during the food preparation process to correct any deviation and avoid a flawed result. The adoption of the standardized robotic kitchen module 50 reduces and minimizes the risks of not obtaining the same result between the chef's prepared food dish and the food dish prepared by the robotic kitchen using robotic arms and hands. 
Without the standardization of a robotic kitchen module and the components within the robotic kitchen module, the increased variations between the chef kitchen 44 and the robotic kitchen 48 increase the risks of not being able to obtain the same result between the chef's prepared food dish and the food dish prepared by the robotic kitchen because more elaborate and complex adjustment algorithms will be required with different kitchen modules, different kitchen equipment, different kitchenware, different kitchen tools, and different ingredients between the chef kitchen 44 and the robotic kitchen 48.") and is trained (Paragraph 0297, "The minimanipulation library is a command-software repository, where motion behaviors and processes are stored based on an off-line learning process, where the arm/wrist/finger motions and sequences to successfully complete a particular abstract task (grab the knife and then slice; grab the spoon and then stir; grab the pot with one hand and then use other hand to grab spatula and get under meat and flip it inside the pan; etc.). This repository has been built up to contain the learned sequences of successful sensor-driven motion-profiles and sequenced behaviors for the hand/wrist (and sometimes also arm-position corrections), to ensure successful completions of object (appliance, equipment, tools) and ingredient manipulation tasks that are described in a more abstract language, such as “grab the knife and slice the vegetable”, “crack the egg into the bowl”, “flip the meat over in the pan”, etc. The learning process is iterative and is based on multiple trials of a chef-taught motion-profile from the chef studio, which is then executed and iteratively modified by the offline learning algorithm module, until an acceptable execution-sequence can be shown to have been achieved. 
The minimanipulation library (command software repository) is intended to have been populated (a-priori and offline) with all the necessary elements to allow the robotic-kitchen system to successfully interact with all equipment (appliances, tools, etc.) and main ingredients that require processing (steps beyond just dispensing) during the cooking process. While the human chef wore gloves with embedded haptic sensors (proximity, touch, contact-location/-force) for the fingers and palm, the robotic hands are outfitted with similar sensor-types in locations to allow their data to be used to create, modify and adapt motion-profiles to execute successfully the desired motion-profiles and handling-commands.") …
Oleynik does not specifically teach using a knowledge distillation or teacher-student model approach in training. However, Horowitz, in the same field of endeavor of robotics, teaches:
… based on a knowledge distillation or teacher-student model approach. (Paragraph 0051, "In some embodiments, model training logic 202 is configured to enable both a scalable compute/storage framework for the development of large-scale machine learning models and a distributed sorting facility approach to ensure the broadest possible dataset for the training. In material sorting, the breadth of possible object types coupled with the domain of possible material characteristics for each object represents a vast data set that requires an innovative approach to data management and machine learning model training. Typical storage and computation available to a local object recognition system represents potential barriers to entirely local or on-facility systems. Furthermore, the data set available to an individual sorting facility is in itself limited to the subset of objects and characteristics available on a regular basis within that sorting facility. In some embodiments, model training logic 202 is configured to create an offline “parent” model against a very large and diverse dataset aggregated across multiple sorting facilities. The parent approach creates ongoing high-confidence machine learning models using virtually unlimited computational resources, regressive training, ensemble techniques (e.g., voting-by-consensus), all without the on-site latency constraints inherent in a live sorting environment at a particular sorting facility. The data set used for training is sourced across all child/sorting facility sites, in addition to including data from manufacturers of objects and any other available third-party sources. Once created, model training logic 202 is configured to dynamically propagate the parent machine learning model to compute nodes and/or sorting devices for real-time implementation at the sorting facilities. 
An advantage of this approach is that the compute nodes and/or sorting devices at the sorting facilities can use a variety of techniques (e.g., bounding box jitter, temporal disagreement, low confidence, etc.) to surface problem areas to the parent model. This, in turn, can then refine the model and provide the machine learning capabilities at the sorting facilities with high-quality corrections to its own predictions, enabling it to train and improve over time, based on the parent model's classifications. At this point, the sorting facility components can retrain the parent models against these failure or adverse scenarios, improving them over time. In some embodiments, the parent model that has been received at a sorting facility is retrained (e.g., at the cloud sorting server by model training logic 202 or by a compute node at the sorting facility) on a dataset that comprises primarily data from within that facility, or similar sorting facilities within the same geographic region, allowing the machine learning model to refine itself against the expected material within a facility or within a region. A further advantage is that the parent model at the cloud sorting server also improves with each failure case, as the parent model changes are propagated not just to the sorting facility experiencing the failure scenario, but to all sorting facilities. In some embodiments, the cloud and facility software architecture is configured to support a large set of output layers trained for each material characteristic of each target object. In some embodiments, a “noisy student” approach is taken to utilize the large quantities of data captured by components (e.g., object recognition devices, sorting devices, compute nodes) in the sorting facilities. In such embodiments, the core “teacher” model is trained by model training logic 202 on a known set of labeled data to build the “teacher” model with a configurable error threshold. 
At this point, one or more “student” models are created from the teacher model, and trained using the much larger data set encountered by many components in the sorting facilities. In this second training process, “noise” is added to the new data, requiring the student model to learn more general predictions, in order to compensate for the inconsistency in the data caused by noising. This results in a net improvement in object recognition accuracy and robustness. This process may be implemented one or more times (e.g., by model training logic 202) to reach a desired accuracy level, and the parent model can then be augmented with the student model. Note that as more data is gathered by the sorting facility components, this process may be run repeatedly by model training logic 202, resulting in both increased accuracy and increased model capabilities. An adjunct benefit of the parent-child model is the auto-learning capability inherent in this system. A baseline machine learning model can be created using sourced sample materials (e.g., from laboratories, reverse search, manual labeling, etc.). When this seed machine learning model is brought online, the base model is augmented with the data obtained from each sorting facility and as a result, each problem identification encountered at the sorting facility is presented as an opportunity to augment training of the base model. Model metadata (such as described below) is uploaded on a regular or continuous basis to the cloud sorting server. During anomalous events (e.g., difficult target identification, errors, etc.), metadata is augmented with full image, raw sensor data, and even video data associated with the event. This data can then be used to annotate the parent model, either manually (e.g., human intervention) or automatically (e.g., automatic retraining based on the new data). 
Given the large datasets involved, an optimization offered by this implementation is the ability to manage and support the system using only metadata (very small data structures), and only requiring large data transmissions during anomalies.")
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine the robotic system and methods of operation as taught by Oleynik with the ability to use a teacher-student model during training as taught by Horowitz. Doing so would allow more efficient processing by the student model while maintaining a high level of accuracy.
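For illustration only, the "noisy student" procedure described in the cited Horowitz paragraph 0051 (a teacher trained on labeled data, then a student trained on a larger noised data set labeled by the teacher) can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the method of either reference: the one-dimensional threshold classifier and the names `fit_threshold`, `predict`, and `noisy_student` are hypothetical and chosen only to keep the example self-contained.

```python
import random


def fit_threshold(xs, ys):
    """Fit a toy 1-D classifier: the threshold is the midpoint of the class means.

    Assumes both classes (0 and 1) appear in ys.
    """
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict(threshold, x):
    """Predict class 1 when x is at or above the threshold, else class 0."""
    return 1 if x >= threshold else 0


def noisy_student(labeled_x, labeled_y, unlabeled_x, noise=0.1, rounds=2, seed=0):
    """Hypothetical teacher-student loop in the spirit of a 'noisy student' scheme."""
    rng = random.Random(seed)
    # The teacher is trained on the known labeled data only.
    teacher = fit_threshold(labeled_x, labeled_y)
    for _ in range(rounds):
        # The teacher labels the clean unlabeled data (pseudo-labels).
        pseudo = [predict(teacher, x) for x in unlabeled_x]
        # The student is trained on noised inputs, so it must generalize.
        noised = [x + rng.gauss(0, noise) for x in unlabeled_x]
        student = fit_threshold(labeled_x + noised, labeled_y + pseudo)
        # The student becomes the teacher for the next round.
        teacher = student
    return teacher


threshold = noisy_student([0.0, 1.0], [0, 1], [0.1, 0.2, 0.8, 0.9])
print(predict(threshold, 0.05), predict(threshold, 0.95))
```

The final model is smaller or simpler than the teacher only if the student architecture is chosen that way; the sketch uses the same model class for both and shows only the training flow.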
Regarding claim 7, where all the limitations of claim 6 are discussed above, Oleynik does not specifically teach using a teacher-student model approach. However, Horowitz, in the same field of endeavor of robotics, teaches:
7. The manipulation task solver system of claim 6, wherein the teacher-student model approach includes a teacher model and a student model. (Paragraph 0051, quoted in full above with respect to claim 6, including: "In such embodiments, the core “teacher” model is trained by model training logic 202 on a known set of labeled data to build the “teacher” model with a configurable error threshold. At this point, one or more “student” models are created from the teacher model, and trained using the much larger data set encountered by many components in the sorting facilities.")
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine the robotic system and methods of operation as taught by Oleynik with the ability to use a teacher-student model during training as taught by Horowitz. Doing so would allow more efficient processing by the student model while maintaining a high level of accuracy.
Regarding claim 8, where all the limitations of claim 7 are discussed above, Oleynik further teaches:
8. The manipulation task solver system of claim 7, wherein … based on a pose of the robot appendage, a velocity of the robot appendage, a torque associated with of the robot appendage, (Paragraph 0460, "FIG. 27 is a flow diagram illustrating the process 932 for testing and learning of minimanipulations. At step 934, the computer performs a food preparation task composition analysis in which each cooking operation (e.g. cracking an egg with a knife) is analyzed, decomposed, and constructed into a sequence of action primitives or minimanipulations. In one embodiment, a minimanipulation refers to a sequence of one or more action primitives that accomplish a basic functional outcome (e.g., the egg has been cracked, or a vegetable sliced) that advances toward a specific result in preparing a food dish. In this embodiment, a minimanipulation can be further described as a low-level minimanipulation or a high-level minimanipulation where a low-level minimanipulation refers to a sequence of action primitives that requires minimal interaction forces and relies almost exclusively on the use of the robotic apparatus 75, and a high-level minimanipulation refers to a sequence of action primitives requiring a substantial amount of interaction and interaction forces and control thereof. The process loop 936 focuses on minimanipulation and learning steps and comprises tests, which are repeated many times (e.g. 100 times) to ensure the reliability of minimanipulations. At step 938, the robotic food preparation engine 56 is configured to assess the knowledge of all possibilities to perform a food preparation stage or a minimanipulation, where each minimanipulation is tested with respect to orientations, positions/velocities, angles, forces, pressures, and speeds with a particular minimanipulation. A minimanipulation or an action primitive may involve the robotic hand 72 and a standard object, or the robotic hand 72 and a nonstandard object. 
At step 940, the robotic food preparation engine 56 is configured to execute the minimanipulation and determine if the outcome can be deemed successful or a failure. At step 942, the computer 16 conducts an automated analysis and reasoning about the failure of the minimanipulation. For example, the multimodal sensors may provide sensing feedback data on the success or failure of the minimanipulation. At step 944, the computer 16 is configured to make a real-time adjustment and adjusts the parameters of the minimanipulation execution process. At step 946, the computer 16 adds new information about the success or failure of the parameter adjustment to the minimanipulation library as a learning mechanism to the robotic food preparation engine 56.") one or more previous actions taken by the robot appendage, (Paragraph 0668, "A separate MM library access manager 3169 is responsible for checking-out proper libraries and their associated datasets (parameters, time-histories, performance metrics, etc.) 3169-1 to pass onto a remote robotic replication system, as well as checking back in updated minimanipulation motion primitives (parameters, performance metrics, etc.) 3169-2 based on learned and optimized minimanipulation executions by one or more same/different remote robotic systems. This ensures the library continually grows and is optimized by a growing number of remote robotic execution platforms.") tactile information associated with the robot appendage, (Paragraph 0670, "At a high level, this is achieved by downloading the task-descriptive libraries containing the complete set of minimanipulation datasets required by the robotic system, and providing them to a robot controller for execution. 
The robot controller generates the required command and motion sequences that the execution module interprets and carries out, while receiving feedback from the entire system to allow it to follow profiles established for joint and limb positions and velocities as well as (internal and external) forces and torques. A parallel performance monitoring process uses task-descriptive functional and performance metrics to track and process the robot's actions to ensure the required task-fidelity. A minimanipulation learning-and-adaptation process is allowed to take any minimanipulation parameter-set and modify it should a particular functional result not be satisfactory, to allow the robot to successfully complete each task or motion-primitive. Updated parameter data is then used to rebuild the modified minimanipulation parameter set for re-execution as well as for updating/rebuilding a particular minimanipulation routine, which is provided back to the original library routines as a modified/re-tuned library for future use by other robotic systems. The system monitors all minimanipulation steps until the final result is achieved and once completed, exits the robotic execution loop to await further commands or human input.") a pose of the object, a velocity of the object, a goal pose for the object or the robot appendage, and a distance from the goal pose. (Paragraph 0343, "FIG. 5D is a block diagram illustrating software elements for object-manipulation (or object handling) in the standardized robotic kitchen 50, which shows the structure and flow 250 of the object-manipulation portion of the robotic kitchen execution of a robotic script, using the notion of motion-replication coupled-with/aided-by minimanipulation steps. In order for automated robotic-arm/-hand-based cooking to be viable, it is insufficient to monitor every single joint in the arm and hands/fingers. 
In many cases just the position and orientation of the hand/wrist are known (and able to be replicated), but then manipulating an object (identifying location, orientation, pose, grab-location, grabbing-strategy and task-execution) requires that local-sensing and learned behaviors and strategies for the hand and fingers be used to complete the grabbing/manipulating task successfully. These motion-profiles (sensor-based/-driven) behaviors and sequences are stored within the mini hand-manipulation library software repository in the robotic-kitchen system. The human chef could be wearing complete arm-exoskeleton or an instrumented/target-fitted motion-vest allowing the computer via built-in sensors or through camera-tracking to determine the exact 3D position of the hands and wrists at all times. Even if the ten fingers on both hands had all their joints instrumented (more than 30 DoFs (Degrees of Freedom) for both hands and very awkward to wear and use, and thus unlikely to be used), a simple motion-based playback of all joint positions would not guarantee successful (interactive) object manipulation.")
Oleynik does not specifically teach using a teacher-student model for training. However, Horowitz, in the same field of endeavor of robotics, teaches:
… the teacher model is trained (Paragraph 0051, quoted in full above with respect to claim 6, including: "In such embodiments, the core “teacher” model is trained by model training logic 202 on a known set of labeled data to build the “teacher” model with a configurable error threshold.") …
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine the robotic system and methods of operation as taught by Oleynik with the ability to use a teacher-student model during training as taught by Horowitz. Doing so would allow more efficient processing by the student model while maintaining a high level of accuracy.
Regarding claim 9, where all the limitations of claim 7 are discussed above, Oleynik further teaches:
9. The manipulation task solver system of claim 7, wherein … real-world demonstrations, (Paragraph 0018, "Minimanipulations comprise a new way of creating a general programmable-by-example platform for humanoid robots. The state of the art largely requires explicit development of control software by expert programmers for each and every step of a robotic action or action sequence. The exception to the above are for very repetitive low level tasks, such as factory assembly, where the rudiments of learning-by-imitation are present. A minimanipulation library provides a large suite of higher-level sensing-and-execution sequences that are common building blocks for complex tasks, such as cooking, taking care of the infirm, or other tasks performed by the next generation of humanoid robots. More specifically, unlike the previous art, the present disclosure provides the following distinctive features. First, a potentially very large library of pre-defined/pre-learned sensing-and-action sequences called minimanipulations. Second, each mini-manipulation encodes preconditions required for the sensing-and-action sequences to produce successfully the desired functional results (i.e. the postconditions) with a well-defined probability of success (e.g. 100% or 97% depending on the complexity and difficulty of the minimanipulation). Third, each minimanipulation references a set of variables whose values may be set a-priori or via sensing operations, before executing the minimanipulation actions. Fourth, each minimanipulation changes the value of a set of variables to represent the functional result (the postconditions) of executing the action sequence in the minimanipulation. Fifth, minimanipulations may be acquired by repeated observation of a human tutor (e.g. an expert chef) to determine the sensing-and-action sequence, and to determine the range of acceptable values for the variables. 
Sixth, minimanipulations may be composed into larger units to perform end-to-end tasks, such as preparing a meal, or cleaning up a room. These larger units are multi-stage applications of minimanipulations either in a strict sequence, in parallel, or respecting a partial order wherein some steps must occur before others, but not in a total ordered sequence (e.g. to prepare a given dish, three ingredients need to be combined in exact amounts into a mixing bowl, and then mixed; the order of putting each ingredient into the bowl is not constrained, but all must be placed before mixing). Seventh, the assembly of minimanipulations into end-to-end-tasks is performed by robotic planning, taking into account the preconditions and postconditions of the component minimanipulations. Eighth, case-based reasoning wherein observation of humans performing end-to-end tasks, or other robots doing so, or the same robot's past experience can be used to acquire a library of reusable robotic plans form cases (specific instances of performing an end-to-end task), both successful ones to replicate, and unsuccessful ones to learn what to avoid.") and one or more sensor inputs. (Paragraph 0460, "FIG. 27 is a flow diagram illustrating the process 932 for testing and learning of minimanipulations. At step 934, the computer performs a food preparation task composition analysis in which each cooking operation (e.g. cracking an egg with a knife) is analyzed, decomposed, and constructed into a sequence of action primitives or minimanipulations. In one embodiment, a minimanipulation refers to a sequence of one or more action primitives that accomplish a basic functional outcome (e.g., the egg has been cracked, or a vegetable sliced) that advances toward a specific result in preparing a food dish. 
In this embodiment, a minimanipulation can be further described as a low-level minimanipulation or a high-level minimanipulation where a low-level minimanipulation refers to a sequence of action primitives that requires minimal interaction forces and relies almost exclusively on the use of the robotic apparatus 75, and a high-level minimanipulation refers to a sequence of action primitives requiring a substantial amount of interaction and interaction forces and control thereof. The process loop 936 focuses on minimanipulation and learning steps and comprises tests, which are repeated many times (e.g. 100 times) to ensure the reliability of minimanipulations. At step 938, the robotic food preparation engine 56 is configured to assess the knowledge of all possibilities to perform a food preparation stage or a minimanipulation, where each minimanipulation is tested with respect to orientations, positions/velocities, angles, forces, pressures, and speeds with a particular minimanipulation. A minimanipulation or an action primitive may involve the robotic hand 72 and a standard object, or the robotic hand 72 and a nonstandard object. At step 940, the robotic food preparation engine 56 is configured to execute the minimanipulation and determine if the outcome can be deemed successful or a failure. At step 942, the computer 16 conducts an automated analysis and reasoning about the failure of the minimanipulation. For example, the multimodal sensors may provide sensing feedback data on the success or failure of the minimanipulation. At step 944, the computer 16 is configured to make a real-time adjustment and adjusts the parameters of the minimanipulation execution process. At step 946, the computer 16 adds new information about the success or failure of the parameter adjustment to the minimanipulation library as a learning mechanism to the robotic food preparation engine 56.")
Oleynik does not specifically teach using a teacher-student model for training. However, Horowitz, in the same field of endeavor of robotics, teaches:
… the student model is trained based on supervision from the teacher model, (Paragraph 0051, "In some embodiments, model training logic 202 is configured to enable both a scalable compute/storage framework for the development of large-scale machine learning models and a distributed sorting facility approach to ensure the broadest possible dataset for the training. In material sorting, the breadth of possible object types coupled with the domain of possible material characteristics for each object represents a vast data set that requires an innovative approach to data management and machine learning model training. Typical storage and computation available to a local object recognition system represents potential barriers to entirely local or on-facility systems. Furthermore, the data set available to an individual sorting facility is in itself limited to the subset of objects and characteristics available on a regular basis within that sorting facility. In some embodiments, model training logic 202 is configured to create an offline “parent” model against a very large and diverse dataset aggregated across multiple sorting facilities. The parent approach creates ongoing high-confidence machine learning models using virtually unlimited computational resources, regressive training, ensemble techniques (e.g., voting-by-consensus), all without the on-site latency constraints inherent in a live sorting environment at a particular sorting facility. The data set used for training is sourced across all child/sorting facility sites, in addition to including data from manufacturers of objects and any other available third-party sources. Once created, model training logic 202 is configured to dynamically propagate the parent machine learning model to compute nodes and/or sorting devices for real-time implementation at the sorting facilities. 
An advantage of this approach is that the compute nodes and/or sorting devices at the sorting facilities can use a variety of techniques (e.g., bounding box jitter, temporal disagreement, low confidence, etc.) to surface problem areas to the parent model. This, in turn, can then refine the model and provide the machine learning capabilities at the sorting facilities with high-quality corrections to its own predictions, enabling it to train and improve over time, based on the parent model's classifications. At this point, the sorting facility components can retrain the parent models against these failure or adverse scenarios, improving them over time. In some embodiments, the parent model that has been received at a sorting facility is retrained (e.g., at the cloud sorting server by model training logic 202 or by a compute node at the sorting facility) on a dataset that comprises primarily data from within that facility, or similar sorting facilities within the same geographic region, allowing the machine learning model to refine itself against the expected material within a facility or within a region. A further advantage is that the parent model at the cloud sorting server also improves with each failure case, as the parent model changes are propagated not just to the sorting facility experiencing the failure scenario, but to all sorting facilities. In some embodiments, the cloud and facility software architecture is configured to support a large set of output layers trained for each material characteristic of each target object. In some embodiments, a “noisy student” approach is taken to utilize the large quantities of data captured by components (e.g., object recognition devices, sorting devices, compute nodes) in the sorting facilities. In such embodiments, the core “teacher” model is trained by model training logic 202 on a known set of labeled data to build the “teacher” model with a configurable error threshold. 
At this point, one or more “student” models are created from the teacher model, and trained using the much larger data set encountered by many components in the sorting facilities. In this second training process, “noise” is added to the new data, requiring the student model to learn more general predictions, in order to compensate for the inconsistency in the data caused by noising. This results in a net improvement in object recognition accuracy and robustness. This process may be implemented one or more times (e.g., by model training logic 202) to reach a desired accuracy level, and the parent model can then be augmented with the student model. Note that as more data is gathered by the sorting facility components, this process may be run repeatedly by model training logic 202, resulting in both increased accuracy and increased model capabilities. An adjunct benefit of the parent-child model is the auto-learning capability inherent in this system. A baseline machine learning model can be created using sourced sample materials (e.g., from laboratories, reverse search, manual labeling, etc.). When this seed machine learning model is brought online, the base model is augmented with the data obtained from each sorting facility and as a result, each problem identification encountered at the sorting facility is presented as an opportunity to augment training of the base model. Model metadata (such as described below) is uploaded on a regular or continuous basis to the cloud sorting server. During anomalous events (e.g., difficult target identification, errors, etc.), metadata is augmented with full image, raw sensor data, and even video data associated with the event. This data can then be used to annotate the parent model, either manually (e.g., human intervention) or automatically (e.g., automatic retraining based on the new data). 
Given the large datasets involved, an optimization offered by this implementation is the ability to manage and support the system using only metadata (very small data structures), and only requiring large data transmissions during anomalies.") …
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine the robotic system and methods of operation as taught by Oleynik with the ability to use a teacher-student model during training as taught by Horowitz. This would allow for more efficient processing using the student model so that the system may operate efficiently while maintaining a high level of accuracy.
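For illustration only, the "noisy student" procedure quoted from Paragraph 0051 of Horowitz can be sketched as follows: a teacher is fit on labeled data, the teacher labels a larger unlabeled set, noise is added to that data, and a student is fit on the result. Horowitz does not specify a model type, so the nearest-centroid classifier, the one-dimensional feature, the material labels, and all function names below are assumptions made for this sketch, not the reference's implementation.

```python
import random

def fit_centroids(samples):
    """Fit a nearest-centroid classifier: mean feature value per label."""
    sums, counts = {}, {}
    for x, label in samples:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))

def noisy_student_round(labeled, unlabeled, noise_std, rng):
    """One teacher-to-student round (assumed, simplified form)."""
    # Step 1: train the teacher on the known labeled set.
    teacher = fit_centroids(labeled)
    # Step 2: the teacher supervises the student by pseudo-labeling new data.
    pseudo = [(x, predict(teacher, x)) for x in unlabeled]
    # Step 3: add noise to the new data so the student must generalize.
    noised = [(x + rng.gauss(0.0, noise_std), label) for x, label in pseudo]
    # Step 4: train the student on labeled plus noised pseudo-labeled data.
    student = fit_centroids(labeled + noised)
    return teacher, student

# Hypothetical data: one scalar feature per object, two material classes.
rng = random.Random(0)
labeled = [(0.0, "plastic"), (1.0, "plastic"), (9.0, "metal"), (10.0, "metal")]
unlabeled = [0.5, 1.5, 2.0, 8.0, 8.5, 9.5]
teacher, student = noisy_student_round(labeled, unlabeled, noise_std=0.1, rng=rng)
```

In this sketch the student's only exposure to the unlabeled data is through the teacher's predictions, which is the sense in which the student is trained under supervision from the teacher.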
Regarding claim 15, where all the limitations of claim 11 are discussed above, Oleynik further teaches:
15. The manipulation task solver system of claim 11, wherein the three or more sub-tasks include reorienting the object after the object is grasped (Paragraph 0021, "Thus, MMs are software-based and represented by input and output data sets and inherent processing algorithms and performance descriptors, akin to individual programs with input/output data files and subroutines, contained within individual run-time source-code, which when compiled generates object-code that can be compiled and collected within various different software libraries, termed as a collection of various Minimanipulation-Libraries (MMLs). MMLs can be grouped in to multiple groupings, whether these be associated to (i) particular hardware elements (finger/hand, wrist, arm, torso, foot, legs, etc.), (ii) behavioral elements (contacting, grasping, handling, etc.), or even (iii) application-domains (cooking, painting, playing a musical instrument, etc.). Furthermore, within each of these groupings, MMLs can be arranged based on multiple levels (simple to complex) relating to the complexity of behavior desired.") and wherein the third low-level policy is associated with reorienting the object after the object is grasped (Paragraph 0326, "The standardized robotic kitchen module 50 has as one objective: the standardization of the kitchen module 50 and various components with the kitchen module itself to ensure consistency in both the chef kitchen 44 and the robotic kitchen 48 to maximize the preciseness of recipe replication while minimizing the risks of deviations from precise replication of a recipe dish between the chef kitchen 44 and the robotic kitchen 48. One main purpose of having the standardization of the kitchen module 50 is to obtain the same result of the cooking process (or the same dish) between a first food dish prepared by the chef and a subsequent replication of the same recipe process via the robotic kitchen. 
Conceiving a standardized platform in the standardized robotic kitchen module 50 between the chef kitchen 44 and the robotic kitchen 48 has several key considerations: same timeline, same program or mode, and quality check. The same timeline in the standardized robotic kitchen 50 where the chef prepares a food dish at the chef kitchen 44 and the replication process by the robotic hands in the robotic kitchen 48 refers to the same sequence of manipulations, the same initial and ending time of each manipulation, and the same speed of moving an object between handling operations. The same program or mode in the standardized robotic kitchen 50 refers to the use and operation of standardized equipment during each manipulation recording and execution step. The quality check refers to three-dimensional vision sensors in the standardized robotic kitchen 50, which monitor and adjust in real time each manipulation action during the food preparation process to correct any deviation and avoid a flawed result. The adoption of the standardized robotic kitchen module 50 reduces and minimizes the risks of not obtaining the same result between the chef's prepared food dish and the food dish prepared by the robotic kitchen using robotic arms and hands. 
Without the standardization of a robotic kitchen module and the components within the robotic kitchen module, the increased variations between the chef kitchen 44 and the robotic kitchen 48 increase the risks of not being able to obtain the same result between the chef's prepared food dish and the food dish prepared by the robotic kitchen because more elaborate and complex adjustment algorithms will be required with different kitchen modules, different kitchen equipment, different kitchenware, different kitchen tools, and different ingredients between the chef kitchen 44 and the robotic kitchen 48.") and is trained (Paragraph 0297, "The minimanipulation library is a command-software repository, where motion behaviors and processes are stored based on an off-line learning process, where the arm/wrist/finger motions and sequences to successfully complete a particular abstract task (grab the knife and then slice; grab the spoon and then stir; grab the pot with one hand and then use other hand to grab spatula and get under meat and flip it inside the pan; etc.). This repository has been built up to contain the learned sequences of successful sensor-driven motion-profiles and sequenced behaviors for the hand/wrist (and sometimes also arm-position corrections), to ensure successful completions of object (appliance, equipment, tools) and ingredient manipulation tasks that are described in a more abstract language, such as “grab the knife and slice the vegetable”, “crack the egg into the bowl”, “flip the meat over in the pan”, etc. The learning process is iterative and is based on multiple trials of a chef-taught motion-profile from the chef studio, which is then executed and iteratively modified by the offline learning algorithm module, until an acceptable execution-sequence can be shown to have been achieved. 
The minimanipulation library (command software repository) is intended to have been populated (a-priori and offline) with all the necessary elements to allow the robotic-kitchen system to successfully interact with all equipment (appliances, tools, etc.) and main ingredients that require processing (steps beyond just dispensing) during the cooking process. While the human chef wore gloves with embedded haptic sensors (proximity, touch, contact-location/-force) for the fingers and palm, the robotic hands are outfitted with similar sensor-types in locations to allow their data to be used to create, modify and adapt motion-profiles to execute successfully the desired motion-profiles and handling-commands.") …
Oleynik does not specifically teach using a knowledge distillation or teacher-student model approach in training. However, Horowitz, in the same field of endeavor of robotics, teaches:
… based on a knowledge distillation or teacher-student model approach. (Paragraph 0051, "In some embodiments, model training logic 202 is configured to enable both a scalable compute/storage framework for the development of large-scale machine learning models and a distributed sorting facility approach to ensure the broadest possible dataset for the training. In material sorting, the breadth of possible object types coupled with the domain of possible material characteristics for each object represents a vast data set that requires an innovative approach to data management and machine learning model training. Typical storage and computation available to a local object recognition system represents potential barriers to entirely local or on-facility systems. Furthermore, the data set available to an individual sorting facility is in itself limited to the subset of objects and characteristics available on a regular basis within that sorting facility. In some embodiments, model training logic 202 is configured to create an offline “parent” model against a very large and diverse dataset aggregated across multiple sorting facilities. The parent approach creates ongoing high-confidence machine learning models using virtually unlimited computational resources, regressive training, ensemble techniques (e.g., voting-by-consensus), all without the on-site latency constraints inherent in a live sorting environment at a particular sorting facility. The data set used for training is sourced across all child/sorting facility sites, in addition to including data from manufacturers of objects and any other available third-party sources. Once created, model training logic 202 is configured to dynamically propagate the parent machine learning model to compute nodes and/or sorting devices for real-time implementation at the sorting facilities. 
An advantage of this approach is that the compute nodes and/or sorting devices at the sorting facilities can use a variety of techniques (e.g., bounding box jitter, temporal disagreement, low confidence, etc.) to surface problem areas to the parent model. This, in turn, can then refine the model and provide the machine learning capabilities at the sorting facilities with high-quality corrections to its own predictions, enabling it to train and improve over time, based on the parent model's classifications. At this point, the sorting facility components can retrain the parent models against these failure or adverse scenarios, improving them over time. In some embodiments, the parent model that has been received at a sorting facility is retrained (e.g., at the cloud sorting server by model training logic 202 or by a compute node at the sorting facility) on a dataset that comprises primarily data from within that facility, or similar sorting facilities within the same geographic region, allowing the machine learning model to refine itself against the expected material within a facility or within a region. A further advantage is that the parent model at the cloud sorting server also improves with each failure case, as the parent model changes are propagated not just to the sorting facility experiencing the failure scenario, but to all sorting facilities. In some embodiments, the cloud and facility software architecture is configured to support a large set of output layers trained for each material characteristic of each target object. In some embodiments, a “noisy student” approach is taken to utilize the large quantities of data captured by components (e.g., object recognition devices, sorting devices, compute nodes) in the sorting facilities. In such embodiments, the core “teacher” model is trained by model training logic 202 on a known set of labeled data to build the “teacher” model with a configurable error threshold. 
At this point, one or more “student” models are created from the teacher model, and trained using the much larger data set encountered by many components in the sorting facilities. In this second training process, “noise” is added to the new data, requiring the student model to learn more general predictions, in order to compensate for the inconsistency in the data caused by noising. This results in a net improvement in object recognition accuracy and robustness. This process may be implemented one or more times (e.g., by model training logic 202) to reach a desired accuracy level, and the parent model can then be augmented with the student model. Note that as more data is gathered by the sorting facility components, this process may be run repeatedly by model training logic 202, resulting in both increased accuracy and increased model capabilities. An adjunct benefit of the parent-child model is the auto-learning capability inherent in this system. A baseline machine learning model can be created using sourced sample materials (e.g., from laboratories, reverse search, manual labeling, etc.). When this seed machine learning model is brought online, the base model is augmented with the data obtained from each sorting facility and as a result, each problem identification encountered at the sorting facility is presented as an opportunity to augment training of the base model. Model metadata (such as described below) is uploaded on a regular or continuous basis to the cloud sorting server. During anomalous events (e.g., difficult target identification, errors, etc.), metadata is augmented with full image, raw sensor data, and even video data associated with the event. This data can then be used to annotate the parent model, either manually (e.g., human intervention) or automatically (e.g., automatic retraining based on the new data). 
Given the large datasets involved, an optimization offered by this implementation is the ability to manage and support the system using only metadata (very small data structures), and only requiring large data transmissions during anomalies.")
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine the robotic system and methods of operation as taught by Oleynik with the ability to use a teacher-student model during training as taught by Horowitz. This would allow for more efficient processing using the student model so that the system may operate efficiently while maintaining a high level of accuracy.
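For illustration only, the iterative learning process quoted from Paragraph 0297 of Oleynik (a motion-profile that is executed and iteratively modified until an acceptable execution sequence is achieved) can be sketched as a simple trial loop. The single "force" parameter, the fixed-step adjustment rule, the success window, and the function names are all hypothetical; Oleynik does not disclose this particular rule.

```python
def learn_minimanipulation(execute, initial_force, step, max_trials):
    """Repeat trials, adjusting one parameter until the execution succeeds."""
    library = []  # record of (parameter, succeeded) for every trial
    force = initial_force
    for _ in range(max_trials):
        succeeded = execute(force)
        library.append((force, succeeded))
        if succeeded:
            return force, library
        force += step  # hypothetical adjustment rule: raise force by a fixed step
    return None, library  # no acceptable execution found within the trial budget

def crack_egg(force):
    """Hypothetical stand-in for a sensed outcome: succeeds inside a force window."""
    return 3.0 <= force <= 4.0

accepted, history = learn_minimanipulation(
    crack_egg, initial_force=1.0, step=1.0, max_trials=10
)
```

Here `history` plays the role of the stored record of successful and unsuccessful trials, and `accepted` is the first parameter value for which the execution was deemed successful.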
Regarding claim 20, where all the limitations of claim 1 are discussed above, Oleynik further teaches:
20. The manipulation task solver system of claim 1, wherein the two or more sub-tasks include reorienting the object after the object is grasped (Paragraph 0021, "Thus, MMs are software-based and represented by input and output data sets and inherent processing algorithms and performance descriptors, akin to individual programs with input/output data files and subroutines, contained within individual run-time source-code, which when compiled generates object-code that can be compiled and collected within various different software libraries, termed as a collection of various Minimanipulation-Libraries (MMLs). MMLs can be grouped in to multiple groupings, whether these be associated to (i) particular hardware elements (finger/hand, wrist, arm, torso, foot, legs, etc.), (ii) behavioral elements (contacting, grasping, handling, etc.), or even (iii) application-domains (cooking, painting, playing a musical instrument, etc.). Furthermore, within each of these groupings, MMLs can be arranged based on multiple levels (simple to complex) relating to the complexity of behavior desired.") and wherein a third low-level policy is associated with reorienting the object after the object is grasped (Paragraph 0326, "The standardized robotic kitchen module 50 has as one objective: the standardization of the kitchen module 50 and various components with the kitchen module itself to ensure consistency in both the chef kitchen 44 and the robotic kitchen 48 to maximize the preciseness of recipe replication while minimizing the risks of deviations from precise replication of a recipe dish between the chef kitchen 44 and the robotic kitchen 48. One main purpose of having the standardization of the kitchen module 50 is to obtain the same result of the cooking process (or the same dish) between a first food dish prepared by the chef and a subsequent replication of the same recipe process via the robotic kitchen. 
Conceiving a standardized platform in the standardized robotic kitchen module 50 between the chef kitchen 44 and the robotic kitchen 48 has several key considerations: same timeline, same program or mode, and quality check. The same timeline in the standardized robotic kitchen 50 where the chef prepares a food dish at the chef kitchen 44 and the replication process by the robotic hands in the robotic kitchen 48 refers to the same sequence of manipulations, the same initial and ending time of each manipulation, and the same speed of moving an object between handling operations. The same program or mode in the standardized robotic kitchen 50 refers to the use and operation of standardized equipment during each manipulation recording and execution step. The quality check refers to three-dimensional vision sensors in the standardized robotic kitchen 50, which monitor and adjust in real time each manipulation action during the food preparation process to correct any deviation and avoid a flawed result. The adoption of the standardized robotic kitchen module 50 reduces and minimizes the risks of not obtaining the same result between the chef's prepared food dish and the food dish prepared by the robotic kitchen using robotic arms and hands. 
Without the standardization of a robotic kitchen module and the components within the robotic kitchen module, the increased variations between the chef kitchen 44 and the robotic kitchen 48 increase the risks of not being able to obtain the same result between the chef's prepared food dish and the food dish prepared by the robotic kitchen because more elaborate and complex adjustment algorithms will be required with different kitchen modules, different kitchen equipment, different kitchenware, different kitchen tools, and different ingredients between the chef kitchen 44 and the robotic kitchen 48.") and is trained (Paragraph 0297, "The minimanipulation library is a command-software repository, where motion behaviors and processes are stored based on an off-line learning process, where the arm/wrist/finger motions and sequences to successfully complete a particular abstract task (grab the knife and then slice; grab the spoon and then stir; grab the pot with one hand and then use other hand to grab spatula and get under meat and flip it inside the pan; etc.). This repository has been built up to contain the learned sequences of successful sensor-driven motion-profiles and sequenced behaviors for the hand/wrist (and sometimes also arm-position corrections), to ensure successful completions of object (appliance, equipment, tools) and ingredient manipulation tasks that are described in a more abstract language, such as “grab the knife and slice the vegetable”, “crack the egg into the bowl”, “flip the meat over in the pan”, etc. The learning process is iterative and is based on multiple trials of a chef-taught motion-profile from the chef studio, which is then executed and iteratively modified by the offline learning algorithm module, until an acceptable execution-sequence can be shown to have been achieved. 
The minimanipulation library (command software repository) is intended to have been populated (a-priori and offline) with all the necessary elements to allow the robotic-kitchen system to successfully interact with all equipment (appliances, tools, etc.) and main ingredients that require processing (steps beyond just dispensing) during the cooking process. While the human chef wore gloves with embedded haptic sensors (proximity, touch, contact-location/-force) for the fingers and palm, the robotic hands are outfitted with similar sensor-types in locations to allow their data to be used to create, modify and adapt motion-profiles to execute successfully the desired motion-profiles and handling-commands.") …
Oleynik does not specifically teach using a knowledge distillation or teacher-student model approach in training. However, Horowitz, in the same field of endeavor of robotics, teaches:
… based on a knowledge distillation or teacher-student model approach. (Paragraph 0051, "In some embodiments, model training logic 202 is configured to enable both a scalable compute/storage framework for the development of large-scale machine learning models and a distributed sorting facility approach to ensure the broadest possible dataset for the training. In material sorting, the breadth of possible object types coupled with the domain of possible material characteristics for each object represents a vast data set that requires an innovative approach to data management and machine learning model training. Typical storage and computation available to a local object recognition system represents potential barriers to entirely local or on-facility systems. Furthermore, the data set available to an individual sorting facility is in itself limited to the subset of objects and characteristics available on a regular basis within that sorting facility. In some embodiments, model training logic 202 is configured to create an offline “parent” model against a very large and diverse dataset aggregated across multiple sorting facilities. The parent approach creates ongoing high-confidence machine learning models using virtually unlimited computational resources, regressive training, ensemble techniques (e.g., voting-by-consensus), all without the on-site latency constraints inherent in a live sorting environment at a particular sorting facility. The data set used for training is sourced across all child/sorting facility sites, in addition to including data from manufacturers of objects and any other available third-party sources. Once created, model training logic 202 is configured to dynamically propagate the parent machine learning model to compute nodes and/or sorting devices for real-time implementation at the sorting facilities. 
An advantage of this approach is that the compute nodes and/or sorting devices at the sorting facilities can use a variety of techniques (e.g., bounding box jitter, temporal disagreement, low confidence, etc.) to surface problem areas to the parent model. This, in turn, can then refine the model and provide the machine learning capabilities at the sorting facilities with high-quality corrections to its own predictions, enabling it to train and improve over time, based on the parent model's classifications. At this point, the sorting facility components can retrain the parent models against these failure or adverse scenarios, improving them over time. In some embodiments, the parent model that has been received at a sorting facility is retrained (e.g., at the cloud sorting server by model training logic 202 or by a compute node at the sorting facility) on a dataset that comprises primarily data from within that facility, or similar sorting facilities within the same geographic region, allowing the machine learning model to refine itself against the expected material within a facility or within a region. A further advantage is that the parent model at the cloud sorting server also improves with each failure case, as the parent model changes are propagated not just to the sorting facility experiencing the failure scenario, but to all sorting facilities. In some embodiments, the cloud and facility software architecture is configured to support a large set of output layers trained for each material characteristic of each target object. In some embodiments, a “noisy student” approach is taken to utilize the large quantities of data captured by components (e.g., object recognition devices, sorting devices, compute nodes) in the sorting facilities. In such embodiments, the core “teacher” model is trained by model training logic 202 on a known set of labeled data to build the “teacher” model with a configurable error threshold. 
At this point, one or more “student” models are created from the teacher model, and trained using the much larger data set encountered by many components in the sorting facilities. In this second training process, “noise” is added to the new data, requiring the student model to learn more general predictions, in order to compensate for the inconsistency in the data caused by noising. This results in a net improvement in object recognition accuracy and robustness. This process may be implemented one or more times (e.g., by model training logic 202) to reach a desired accuracy level, and the parent model can then be augmented with the student model. Note that as more data is gathered by the sorting facility components, this process may be run repeatedly by model training logic 202, resulting in both increased accuracy and increased model capabilities. An adjunct benefit of the parent-child model is the auto-learning capability inherent in this system. A baseline machine learning model can be created using sourced sample materials (e.g., from laboratories, reverse search, manual labeling, etc.). When this seed machine learning model is brought online, the base model is augmented with the data obtained from each sorting facility and as a result, each problem identification encountered at the sorting facility is presented as an opportunity to augment training of the base model. Model metadata (such as described below) is uploaded on a regular or continuous basis to the cloud sorting server. During anomalous events (e.g., difficult target identification, errors, etc.), metadata is augmented with full image, raw sensor data, and even video data associated with the event. This data can then be used to annotate the parent model, either manually (e.g., human intervention) or automatically (e.g., automatic retraining based on the new data). 
Given the large datasets involved, an optimization offered by this implementation is the ability to manage and support the system using only metadata (very small data structures), and only requiring large data transmissions during anomalies.")
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine the robotic system and methods of operation as taught by Oleynik with the ability to use a teacher-student model during training as taught by Horowitz. This would allow for more efficient processing using the student model so that the system may operate efficiently while maintaining a high level of accuracy.
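For illustration only, the noisy-student procedure described in the Horowitz passage quoted above (train a teacher on labeled data, pseudo-label a larger unlabeled set, train a student on noised inputs, then repeat with the student as the new teacher) can be sketched as follows. This is a minimal sketch under stated assumptions, not Horowitz's implementation: the nearest-centroid classifier, the one-dimensional feature, the Gaussian input noise, and every name used (`fit_centroids`, `predict`, `noisy_student`, `rounds`, `noise_std`) are hypothetical and chosen only to keep the example self-contained.

```python
import random


def fit_centroids(xs, ys):
    """Fit a nearest-centroid classifier: one mean value per class label."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))


def noisy_student(labeled_x, labeled_y, unlabeled_x,
                  rounds=2, noise_std=0.3, seed=0):
    """Hypothetical noisy-student loop (assumed details, see lead-in)."""
    rng = random.Random(seed)
    # Step 1: train the teacher on the small labeled set.
    teacher = fit_centroids(labeled_x, labeled_y)
    for _ in range(rounds):
        # Step 2: the teacher pseudo-labels the larger unlabeled set.
        pseudo_y = [predict(teacher, x) for x in unlabeled_x]
        # Step 3: add noise to the new data before training the student.
        noisy_x = [x + rng.gauss(0.0, noise_std) for x in unlabeled_x]
        student = fit_centroids(list(labeled_x) + noisy_x,
                                list(labeled_y) + pseudo_y)
        # Step 4: the student becomes the teacher for the next round.
        teacher = student
    return teacher


if __name__ == "__main__":
    data_rng = random.Random(1)
    unlabeled = ([data_rng.gauss(0.0, 0.5) for _ in range(50)]
                 + [data_rng.gauss(4.0, 0.5) for _ in range(50)])
    model = noisy_student([0.0, 0.2, 3.8, 4.0], [0, 0, 1, 1], unlabeled)
    print(model, predict(model, 0.1), predict(model, 3.9))
```

The sketch noises only the student's inputs; other forms of noise (for example dropout or data augmentation) would be equally consistent with the quoted description, which does not specify the noising method.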
Claim(s) 10 is/are rejected under 35 U.S.C. 103 as being unpatentable over Oleynik in view of Horowitz and in further view of Ozay et al. (US 20230351203 A1), hereinafter Ozay.
Regarding claim 10, where all the limitations of claim 1 are discussed above, Oleynik does not specifically teach using a teacher-student model wherein the student model is trained using fewer inputs than the teacher model. However, Ozay, in the same field of endeavor of robotics, teaches:
10. The manipulation task solver system of claim 7, wherein the student model is trained based on fewer inputs than the teacher model. (Paragraph 0014, "According to an aspect of the disclosure, there is provided a system for knowledge distillation between machine learning, ML, models that perform object recognition, the system comprising: a pre-trained teacher ML model, trained using a first training dataset comprising a plurality of images of objects, the pre-trained teacher ML model comprising first model parameters; a pre-trained student ML model, trained using a second training dataset, where the second training dataset is a subset of the first training dataset or is a different training dataset, the pre-trained student ML model comprising second model parameters; a condenser machine learning, ML, model parameterised by a set of parameters; and at least one processor coupled to memory, for: inputting, into the condenser ML model, a third training dataset, the third training dataset comprising the first model parameters, the second model parameters, the first training dataset and the second training dataset; and training the condenser ML model, using the third training dataset, to learn a parameter mapping function that models a relationship between the first model parameters and the second model parameters, and to output the second model parameters from an input comprising the first model parameters.")
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine the robotic system and methods of operation as taught by Oleynik in combination with Horowitz with the ability to train the student model to use less information than the teacher model in order to make a determination as taught by Ozay. This would ensure an efficient model which will not require unnecessary resources to operate.

Conclusion
The Examiner has cited particular paragraphs or columns and line numbers in the references applied to the claims above for the convenience of the Applicant. Although the specified citations are representative of the teachings of the art and are applied to specific limitations within the individual claim, other passages and figures may apply as well. It is respectfully requested of the Applicant in preparing responses, to fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the Examiner. See MPEP 2141.02 [R-07.2015] VI. A prior art reference must be considered in its entirety, i.e., as a whole, including portions that would lead away from the claimed invention. W.L. Gore & Associates, Inc. v. Garlock, Inc., 721 F.2d 1540, 220 USPQ 303 (Fed. Cir. 1983), cert. denied, 469 U.S. 851 (1984). See also MPEP §2123.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HEATHER KENIRY whose telephone number is (571)270-5468. The examiner can normally be reached M-F 7:30-5:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Adam Mott can be reached at (571) 270-5376. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/H.J.K./ Examiner, Art Unit 3657

/ADAM R MOTT/ Supervisory Patent Examiner, Art Unit 3657
Prosecution Timeline

Nov 20, 2024: Application Filed
Feb 25, 2026: Non-Final Rejection (§102, §103; current)
Precedent Cases

Applications granted by this same examiner with similar technology

18/215,235, Patent 12600035: INFORMATION PROCESSING METHOD, INFORMATION PROCESSING APPARATUS, ROBOT SYSTEM, MANUFACTURING METHOD OF PRODUCT, AND STORAGE MEDIUM (2y 5m to grant; granted Apr 14, 2026)
18/407,440, Patent 12583123: ITERATIVE CONTROL OF ROBOT FOR TARGET OBJECT (2y 5m to grant; granted Mar 24, 2026)
18/254,268, Patent 12576539: INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND PROGRAM (2y 5m to grant; granted Mar 17, 2026)
17/802,937, Patent 12562076: LEARNING ASSISTANCE SYSTEM, LEARNING ASSISTANCE METHOD, AND LEARNING ASSISTANCE STORAGE MEDIUM (2y 5m to grant; granted Feb 24, 2026)
18/231,888, Patent 12558780: MULTI-PURPOSE ROBOTS AND COMPUTER PROGRAM PRODUCTS, AND METHODS FOR OPERATING THE SAME (2y 5m to grant; granted Feb 24, 2026)
Based on the 5 most recent grants.
Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 78%
With Interview (+22.1%): 99%
Median Time to Grant: 2y 7m
PTA Risk: Low
Based on 102 resolved cases by this examiner. Grant probability derived from career allow rate.
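
As a rough check on the figures above, the arithmetic can be sketched as follows. The counts (80 granted of 102 resolved) come from the examiner statistics reported for this application. How the with-interview figure is derived is not stated, so the additive lift in percentage points and the 99% cap are assumptions, and the function names are hypothetical.

```python
def allow_rate(granted: int, resolved: int) -> float:
    """Career allow rate as a percentage of resolved cases."""
    return 100.0 * granted / resolved


def with_interview(base_pct: float, lift_pts: float, cap_pct: float = 99.0) -> float:
    """Add the interview lift in percentage points, capped at cap_pct.

    ASSUMPTION: additive lift and a 99% cap; the source states neither.
    """
    return min(base_pct + lift_pts, cap_pct)


base = round(allow_rate(80, 102))   # 80 / 102 is about 78.4%, shown as 78%
print(base)                         # 78
print(with_interview(base, 22.1))   # 99.0 (78 + 22.1 exceeds the assumed cap)
```

Under these assumptions the sketch reproduces the 78% and 99% values shown; a different derivation could give the same displayed numbers.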