DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Status of Claims
Pending: 1-21
Rejected under 35 U.S.C. 102: 1-2, 9, 11-12, 19, 21
Rejected under 35 U.S.C. 103: 3-8, 10, 13-18, 20
Response to Amendment
This Office Action is in response to Applicant’s arguments and amendments filed 12/19/2025, which are in response to the USPTO Office Action mailed 09/23/2025. Applicant’s arguments and amendments have been considered, with the results that follow: THIS ACTION IS MADE FINAL.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claim(s) 1-2, 9, 11-12, 19, 21 is/are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Hausman et al. (US 2023/0311335 A1, “Hausman”).
Regarding claim 1: Hausman teaches: A processing system comprising: ([0056]) memory comprising processor-executable instructions; and one or more processors configured to execute the processor-executable instructions and cause the processing system to: ([0109]; [0108])
access sensor data depicting a physical environment; ([0054])
generate a set of output affordance maps based on processing the sensor data using an ensemble machine learning model, ([0026]; [0066]-[0067]; [0059]; [0040]; [0046])
wherein each respective output affordance map of the set of output affordance maps indicates a respective probability that a first action can be performed at at least a first location in the physical environment using a respective set of action parameters; ([0066]-[0067]; [0069]; [0112]; [0042], [0045], [0049], [0065], [0068], [0071])
select, based on the set of output affordance maps, a first set of action parameters and the first location; and ([0072]; [0112]; [0042], [0045], [0049], [0065], [0068], [0071])
cause a device to perform the first action at the first location in accordance with the first set of action parameters ([0071]; [0102]; [0112]; [0124]-[0125]; [0042], [0045], [0049], [0065], [0068], [0071]).
Regarding claim 2: Hausman further teaches: The processing system of claim 1, wherein: the one or more processors are configured to further execute the processor-executable instructions to cause the processing system to generate a set of uncertainty maps based on the set of output affordance maps; ([0056], [0108], [0109], [0026], [0069])
to generate the set of uncertainty maps, the one or more processors are configured to execute the processor-executable instructions to cause the processing system to evaluate divergence between the set of output affordance maps; and ([0026]; [0066]-[0067]; [0046], [0069]; here, probability includes uncertainty)
the first set of action parameters and the first location are selected based further on the set of uncertainty maps ([0066]-[0067]; [0112], [0069]).
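As an illustrative sketch only, and not part of the record, the uncertainty-map generation discussed for claim 2 can be modeled as computing per-location divergence across the ensemble's output affordance maps; here per-location variance stands in as a simple divergence measure, and all names, shapes, and values are hypothetical:

```python
import numpy as np

def uncertainty_from_ensemble(affordance_maps):
    """Return an H x W uncertainty map from a list of H x W affordance
    maps (one per ensemble member), using per-location variance as a
    simple measure of divergence between the ensemble's outputs."""
    stacked = np.stack(affordance_maps, axis=0)  # shape (N, H, W)
    return stacked.var(axis=0)

# Three hypothetical ensemble members agree at location (0, 0)
# and diverge at location (0, 1).
maps = [np.array([[0.5, 0.1]]),
        np.array([[0.5, 0.5]]),
        np.array([[0.5, 0.9]])]
uncertainty = uncertainty_from_ensemble(maps)
# uncertainty[0, 0] == 0.0 (agreement); uncertainty[0, 1] > 0 (divergence)
```

Under this sketch, a selection step could then weigh high affordance probability against high uncertainty when choosing the first location and action parameters.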
Regarding claim 9: Hausman further teaches: The processing system of claim 1, wherein the one or more processors are configured to further execute the processor-executable instructions to cause the processing system to: ([0056]; [0108]-[0109])
generate a success value based on a performance of the first action at the first location in accordance with the first set of action parameters; and ([0034]-[0035]; [0045])
update one or more parameters of the ensemble machine learning model based on the success value ([0035]; [0115]; [0093]; [0099]).
Regarding claim 11: Claim 11 corresponds in scope to claim 1 and is similarly rejected. Hausman further teaches: A processor-implemented method of selecting and performing actions using machine learning, comprising ([0056]; [0021]; [0117] machine learning model).
Regarding claim 12: Claim 12 corresponds in scope to claim 2 and is similarly rejected.
Regarding claim 19: Claim 19 corresponds in scope to claim 9 and is similarly rejected.
Regarding claim 21: Hausman further teaches: The processing system of claim 1, wherein to cause the device to perform the first action at the first location, the one or more processors are configured to execute the processor-executable instructions and cause the processing system to cause one or more robotic manipulators to move to the first location and perform the first action ([0112] in response to determining to implement the robotic skill, causing the robot to implement the robotic skill in the current environment. [0042] robotic skills, such as manipulation, navigation, picking, placing and rearranging objects, opening and closing drawers, navigating to various locations, and placing objects in a specific configuration. [0053] Robot base with wheels for movement; robot arm with end effector. [0065] skill descriptions. [0081] considers both the world-grounding measures and task-grounding measures in selecting a robotic skill (“pick up the pear”) and sends an indication of the selected robotic skill. In response, controls the robot based on the selected robotic skill; control the robot using a grasping policy).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claim(s) 3-8 and 13-18 is/are rejected under 35 U.S.C. 103 as being unpatentable over Hausman et al. (US 2023/0311335 A1) in view of Mousavian et al. (US 2020/0361083 A1, “Mousavian”).
Regarding claim 3: Hausman further teaches: The processing system of claim 2, wherein, to generate the set of output affordance maps, the one or more processors are configured to execute the processor-executable instructions to cause the processing system to ([0056], [0108], [0109], [0026]):
generate [data] using a first encoder of the ensemble machine learning model ([0003]; [0059]; [0040]; [0046]);
generate a first plurality of interim affordance maps […] using a first decoder of the ensemble machine learning model ([0003]; [0059]; [0040]; [0046]; [0026]).
However, Hausman does not explicitly teach: generate a first latent tensor based on processing the sensor data using a first encoder of the ensemble machine learning model; generate a plurality of aggregated latent tensors based on combining each of a plurality of action parameter tensors with the first latent tensor; and generate a first plurality of interim affordance maps based on processing each of the plurality of aggregated latent tensors using a first decoder of the ensemble machine learning model.
Mousavian teaches: generate a first latent tensor based on processing the sensor data using a first encoder of the ensemble machine learning model ([0013] FIG. 10; [0071]; [0082]);
generate a plurality of aggregated latent tensors based on combining each of a plurality of action parameter tensors with the first latent tensor ([0111]; [0113]); and
generate a first plurality of interim affordance maps based on processing each of the plurality of aggregated latent tensors using a first decoder of the ensemble machine learning model ([0092]; [0093]; [0072]).
Hausman and Mousavian are analogous art to the claimed invention since they are from a similar field of training network models for autonomous systems. It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify the invention of Hausman with the aspects of Mousavian to create, with a reasonable expectation of success, a processing system that generates a first latent tensor by processing sensor data with a first encoder, aggregated latent tensors from the combination of action parameter tensors and the first latent tensor, and interim affordance maps from processing aggregated latent tensors using a first decoder. The motivation for modification would have been to improve the precision of the robot grasp samples, such as by moving them out of collision or ensuring that the gripper is properly aligned with the object (Mousavian, [0072]).
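As an illustrative sketch only, and not part of the record, the encoder/decoder flow recited in claim 3 (sensor data → first latent tensor → aggregated latent tensors → interim affordance maps) can be modeled as follows; the linear "encoder" and "decoder", the latent size, and concatenation as the combining step are all hypothetical stand-ins:

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder(sensor_data):
    # Hypothetical stand-in encoder: project flattened sensor data
    # to a small latent vector ("first latent tensor").
    W = rng.standard_normal((sensor_data.size, 8))
    return sensor_data.reshape(-1) @ W  # shape (8,)

def decoder(aggregated, out_shape=(4, 4)):
    # Hypothetical stand-in decoder: project an aggregated latent back
    # to an affordance map with values in [0, 1] via a sigmoid.
    W = rng.standard_normal((aggregated.size, int(np.prod(out_shape))))
    logits = aggregated @ W
    return (1.0 / (1.0 + np.exp(-logits))).reshape(out_shape)

sensor = rng.standard_normal((4, 4))            # toy "sensor data"
latent = encoder(sensor)                        # first latent tensor
action_params = [rng.standard_normal(8) for _ in range(3)]

# Combine each action-parameter tensor with the latent (here: by
# concatenation), then decode each aggregate into an interim map.
aggregated = [np.concatenate([latent, p]) for p in action_params]
interim_maps = [decoder(a) for a in aggregated]
```

One interim affordance map is produced per action-parameter tensor; an ensemble would repeat this with additional encoder/decoder members before aggregation.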
Regarding claim 4: Hausman-Mousavian further teach: The processing system of claim 3, wherein, to generate the set of output affordance maps, the one or more processors are configured to further execute the processor-executable instructions to cause the processing system to (Hausman: [0056], [0108], [0109], [0026]): generate a second plurality of interim affordance maps based on a plurality of decoders of the ensemble machine learning model (Hausman: [0003]; [0059]; [0046]; [0026]; Mousavian: [0072]; [0092]; [0093]); and generate the set of output affordance maps based on aggregating the first and second pluralities of interim affordance maps (Hausman: [0046]; [0026]; Mousavian: [0092]; [0111]). The motivation for modification would have been to improve the precision of the robot grasp samples, such as by moving them out of collision or ensuring that the gripper is properly aligned with the object (Mousavian, [0072]).
Regarding claim 5: Hausman-Mousavian further teach: The processing system of claim 4, wherein, to select the first set of action parameters and the first location, the one or more processors are configured to execute the processor-executable instructions to cause the processing system to (Hausman: [0056]; [0108]-[0109]) determine, based on the set of output affordance maps and the set of uncertainty maps, that performing the first action at the first location will maximize predicted success while minimizing uncertainty (Hausman: [0069]; [0045]; [0099]; Mousavian: [0013] FIG. 10; [0071]; [0111]). The motivation for modification would have been to improve the precision of the robot grasp samples, such as by moving them out of collision or ensuring that the gripper is properly aligned with the object (Mousavian, [0072]).
Regarding claim 6: Hausman-Mousavian further teach: The processing system of claim 3, wherein: the first decoder is selected, from a plurality of decoders, with at least an element of randomness (Hausman: [0003]; [0069]; Mousavian: [0082]); and to select the first set of action parameters and the first location, the one or more processors are configured to further execute the processor-executable instructions to cause the processing system to (Hausman: [0056]; [0108]-[0109]) determine, based on the set of output affordance maps and the set of uncertainty maps, that performing the first action at the first location will maximize predicted success while maximizing uncertainty (Hausman: [0069]; [0045]; [0099]; Mousavian: [0013] FIG. 10; [0071]; [0111]). The motivation for modification would have been to improve the precision of the robot grasp samples, such as by moving them out of collision or ensuring that the gripper is properly aligned with the object (Mousavian, [0072]).
Regarding claim 7: Hausman-Mousavian further teach: The processing system of claim 3, wherein each of the plurality of action parameter tensors corresponds to at least one of: (i) an action orientation, (ii) an action force, or (iii) an action direction (Hausman: [0065]; [0126]; Mousavian: [0071]; [0111]; [0143]). The motivation for modification would have been to improve the precision of the robot grasp samples, such as by moving them out of collision or ensuring that the gripper is properly aligned with the object (Mousavian, [0072]).
Regarding claim 8: Hausman-Mousavian further teach: The processing system of claim 7, wherein the action orientation comprises a grasp orientation for a robotic grasper (Hausman: [0053]; [0065]; Mousavian: [0071]; [0111]). The motivation for modification would have been to improve the precision of the robot grasp samples, such as by moving them out of collision or ensuring that the gripper is properly aligned with the object (Mousavian, [0072]).
Regarding claim 13: Claim 13 corresponds in scope to claim 3 and is similarly rejected. The motivation for modification of Hausman with Mousavian is the same as that stated in claim 3.
Regarding claim 14: Claim 14 corresponds in scope to claim 4 and is similarly rejected. The motivation for modification of Hausman with Mousavian is the same as that stated in claim 4.
Regarding claim 15: Claim 15 corresponds in scope to claim 5 and is similarly rejected. The motivation for modification of Hausman with Mousavian is the same as that stated in claim 5.
Regarding claim 16: Claim 16 corresponds in scope to claim 6 and is similarly rejected. The motivation for modification of Hausman with Mousavian is the same as that stated in claim 6.
Regarding claim 17: Claim 17 corresponds in scope to claim 7 and is similarly rejected. The motivation for modification of Hausman with Mousavian is the same as that stated in claim 7.
Regarding claim 18: Claim 18 corresponds in scope to claim 8 and is similarly rejected. The motivation for modification of Hausman with Mousavian is the same as that stated in claim 8.
Claim(s) 10 and 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Hausman et al. (US 2023/0311335 A1) in view of Pandarinath et al. (US 2021/0406695 A1, “Pandarinath”).
Regarding claim 10: Hausman further teaches: The processing system of claim 9, wherein, to update the one or more parameters of the ensemble machine learning model (see at least [0035], [0115], [0093], [0099]), the one or more processors are configured to further execute the processor-executable instructions to ([0056], [0108], [0109], [0026]).
However, Hausman does not explicitly teach: cause the processing system to perform a masked backpropagation operation based on the first location such that one or more other parameters of the ensemble machine learning model corresponding to locations other than the first location are not updated based on the success value.
Pandarinath teaches: cause the processing system to perform a masked backpropagation operation based on the first location such that one or more other parameters of the ensemble machine learning model corresponding to locations other than the first location are not updated based on the success value ([0052] FIG. 3).
Hausman and Pandarinath are analogous art to the claimed invention since they are from a similar field of training network models for autonomous systems. It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify the invention of Hausman with the aspects of Pandarinath to create, with a reasonable expectation of success, a processing system that performs a masked backpropagation operation based on the first location such that one or more other parameters of the ensemble machine learning model corresponding to locations other than the first location are not updated based on the success value, in order to update parameters of the ensemble machine learning model. The motivation for modification would have been to obtain models that have improved performance, provide better de-noising of the data or better prediction of other variables correlated with the data, can be trained in less time, do not require careful manual evaluation and intervention, and eliminate the need for users to have specialized expertise to train networks (Pandarinath, [0006]).
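As an illustrative sketch only, and not part of the record, a masked backpropagation of the kind recited in claim 10 can be modeled as zeroing the gradient everywhere except the acted-on location, so only the parameter at that location is updated from the success value; the toy one-layer "model" and squared-error-style update are hypothetical:

```python
import numpy as np

def masked_update(param_map, success_value, location, lr=0.1):
    """Apply a gradient-style update only at `location`; a binary mask
    zeroes the update everywhere else (masked-backpropagation sketch)."""
    mask = np.zeros_like(param_map)
    mask[location] = 1.0
    # Toy "gradient": push the parameter at the acted-on location
    # toward the observed success value; the mask blocks all others.
    grad = (param_map - success_value) * mask
    return param_map - lr * grad

params = np.full((3, 3), 0.5)
updated = masked_update(params, success_value=1.0, location=(1, 2))
# Only params[1, 2] changes; all other entries stay at 0.5.
```

In a real network the mask would be applied to the loss or output gradient before backpropagation, so upstream parameters tied to other locations receive no update.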
Regarding claim 20: Claim 20 corresponds in scope to claim 10 and is similarly rejected. The motivation for modification of Hausman with Pandarinath is the same as that stated in claim 10.
Response to Arguments
Applicant's arguments filed 12/19/2025 have been fully considered but they are not persuasive.
Regarding the 101 Rejections:
Applicant’s arguments and amendments with respect to claims 1-20 (and new claim 21) have been fully considered and are persuasive. The rejection of claims 1-20 under 35 U.S.C. 101 has been withdrawn.
Regarding the 102/103 Rejections:
Applicant states:
Hausman does not anticipate or suggest "generating a set of output affordance maps based on processing the sensor data using an ensemble machine learning model, wherein each respective output affordance map of the set of output affordance maps indicates a respective probability that a first action can be performed at at least a first location in the physical environment using a respective set of action parameters" and "selecting, based on the set of output affordance maps, a first set of action parameters and the first location.” Notably, the measures are not indicative of a location in the physical environment at which an action can be performed. Rather, the measures are indicative of actions which can be performed with respect to an object. For example, in Figure 2B of Hausman, the listed actions (i.e., "Skill Descriptions 207") include "Go to the table", "Go to the sink", "Pick up a bottle," etc. None of the actions include a location in the physical environment at which an action is to be, or can be, performed. Rather, the actions are based on objects which can be interacted with, such as a table, sink, or bottle. However, a table, sink, or bottle are not locations in a physical environment (i.e., a robot cannot "Go to the table" at the table, nor can a robot "Go to the sink" at the sink). Thus, because objects are not equivalent to locations in a physical environment, Hausman does not teach "generating a set of output affordance maps based on processing the sensor data using an ensemble machine learning model, wherein each respective output affordance map of the set of output affordance maps indicates a respective probability that a first action can be performed at at least a first location in the physical environment using a respective set of action parameters," as recited in Claim 11 and similar features in Claim 1.
Examiner response:
Examiner respectfully disagrees with Applicant. Regarding the claim limitation "generating a set of output affordance maps based on processing the sensor data using an ensemble machine learning model, wherein each respective output affordance map of the set of output affordance maps indicates a respective probability that a first action can be performed at at least a first location in the physical environment using a respective set of action parameters”, Hausman recites:
[0042] Various robotic skills can be utilized in implementations disclosed herein, such as manipulation and navigation skills using a mobile manipulator robot. Such skills can include picking, placing and rearranging objects, opening and closing drawers, navigating to various locations, and placing objects in a specific configuration.
[0045] It is noted that the flexibility of implementations disclosed herein, allows mixing and matching of policies and affordances from different methods. For example: for pick manipulation skills a single multi-task, language-conditioned policy can be used; for place manipulation skills a scripted policy, with an affordance based on the gripper state, can be used; and for navigation policies a planning-based approach, which is aware of the location(s) where specific object(s) can be found and corresponding distance measure(s), can be used. In some implementations, in order to avoid a situation where a skill is chosen but has already been performed or will have no effect, a cap for the affordances can be set indicating that the skill has been completed and the reward received.
[0049] An example affordance function for “go to”/“navigate” skills follows. The affordance function of go to skills is based on the distance d (in meters) to the location.
[0065] For example, “go to the table” can be descriptive of a “navigate to table” skill that the robot can perform by utilizing a trained navigation policy with a navigation target of “table” (or of a location corresponding to a “table”). As another example, “go to the sink” can be descriptive of a “navigate to sink” skill that the robot can perform by utilizing the trained navigation policy with a navigation target of “sink” (or of a location corresponding to a “sink”).
[0068] As another example, a value function for a “navigate to [object/location]” robotic skill can define the world-grounding measure as a function of the distance between the robot and the object/location, as determined based on environmental state data 209A and robot state data 210A. A “terminate” robotic skill (i.e., signifying the task is complete) may always have a fixed measure, such as 0.1 or 0.2.
[0071] In response, the implementation engine 136 controls the robot 110 based on the selected robotic skill A. For example, the implementation engine 136 can control the robot using a navigation policy with a navigation target of “table” (or of a location corresponding to a “table”).
Each of the cited paragraphs states information regarding “locations” where actions can be performed in the physical environment using a respective set of action parameters. For example, as described in [0042], the robot can pick up an object in a first location, can move to a second location, and place the object in the second location. Further, the claim recites “a first action can be performed”, which can be as simple as the robot moving itself to the first location, where the ‘moving’ action is the first action to be performed. Thus, Hausman discloses "generating a set of output affordance maps based on processing the sensor data using an ensemble machine learning model, wherein each respective output affordance map of the set of output affordance maps indicates a respective probability that a first action can be performed at at least a first location in the physical environment using a respective set of action parameters”.
Applicant states:
Additionally, the Examiner maps Hausman's description of "multiplying the world-grounding measures 221A and the task-grounding measures 208A, and select[ing] the robot skill A based on it having the highest overall measures 212A" to the claimed "select, based on the set of output affordance maps, a first set of action parameters and the first location." Office Action, p. 9. However, as stated above, Hausman does not teach generating affordance maps, each indicating a respective probability that an action can be performed at a location. For the same reasons, it follows that Hausman cannot teach "selecting . . . a first set of action parameters and [a] first location" based on such affordance maps. Thus, Hausman also fails to teach "selecting, based on the set of output affordance maps, a first set of action parameters and the first location," as recited in Claim 11 and similar features recited in Claim 1. Accordingly, Applicant submits that Claims 1 and 11, as well as claims dependent thereon, are allowable and respectfully requests withdrawal of this rejection.
Examiner response:
Examiner respectfully disagrees with Applicant. As shown above, Hausman does teach generating affordance maps indicating respective probabilities that an action can be performed at a location. Regarding the limitation “selecting, based on the set of output affordance maps, a first set of action parameters and the first location”, Hausman recites:
[0066] For example, task-grounding measure A, “0.85”, reflects the probability of the word sequence “go to the table” in the LLM output 206A. As another example, task-grounding measure B, “0.20”, reflects the probability of the word sequence “go to the sink” in the LLM output 206A.
[0067] In generating the world-grounding measures 211A for at least some of the robotic skills, the world-grounding engine 134 can generate the world-grounding measure based on environmental state data 209A and, optionally, further based on robot state data 210A and/or corresponding ones of the skill descriptions 207.
[0069] In some implementations, the world-grounding engine 134 can additionally or alternatively, for some robotic skill(s), generate world-grounding measures based on a corresponding one of the value function model(s) 152 that is a trained value function model . . . For example, the world-grounding engine 134 can, in generating a world-grounding measure for a robotic skill, process, using the language-conditioned model, a corresponding one of the skill descriptions 207 for the robotic skill, along with the environmental state data 209A and optionally along with the robot state data 210A, to generate a value that reflects a probability of the robotic skill being successful based on the current state data . . . In some versions of those implementations, a candidate robotic action is also processed using the language conditioned model and along with the corresponding one of the skill descriptions 207, the environmental state data 209A, and optionally the robot state data 210A . . . In those versions, the world-grounding engine 134 can generate the world-grounding measure based on, for example, the generated value that reflects the highest probability of success . . . For example, N robotic actions can be randomly sampled initially and values generated for each, then N additional robotic actions sampled from around one of the initial N robotic actions based on that initial robotic action having the highest generated value.
[0112] In some implementations, a method implemented by one or more processors is provided that includes identifying an instruction and processing the instruction using a language model (LM) (e.g., a large language model (LLM)) to generate LM output . . . The generated LM output can model a probability distribution, over candidate word compositions, that is dependent on the instruction. The method further includes identifying a robotic skill, that is performable by a robot, and a skill description that is a natural language description of the robotic skill. The method further includes generating, based on the LM output and the skill description, a task-grounding measure for the robotic skill. The task grounding measure can reflect a probability of the skill description in the probability distribution of the LM output. The method further includes generating, based on the robotic skill and current environmental state data, a world-grounding measure for the robotic skill. The world grounding measure can reflect a probability of the robotic skill being successful based on the current environmental state data. The current environmental state data can include sensor data captured by one or more sensor components of the robot in a current environment of the robot. The method further includes determining, based on both the task-grounding measure and the world-grounding measure, to implement the robotic skill in lieu of additional robotic skills that are each performable by the robot. The method further includes, in response to determining to implement the robotic skill, causing the robot to implement the robotic skill in the current environment.
As described above, Hausman teaches: identifying a robotic skill that is performable by a robot; generating a task-grounding measure for the robotic skill which reflects a probability of the skill description in the probability distribution of the LM output; using the robotic skill and current environmental state data to generate a world-grounding measure for the robotic skill which reflects a probability of the robotic skill being successful based on the current environmental state data; current environmental state data includes sensor data from robot sensor components in a current environment of the robot; determining to implement the robotic skill based on both the task-grounding measure and the world-grounding measure; and in response to determining to implement the robotic skill, causing the robot to implement the robotic skill in the current environment.
The system determines the robot’s skills, the robot’s current state, and the robot’s current environment, and uses this information to estimate the likelihood of successful completion of a given robot skill if it were to be implemented. Based on this prediction, a skill is selected, which includes action parameters and a location. The skill is then performed by the robot. That is, Hausman teaches “selecting, based on the set of output affordance maps, a first set of action parameters and the first location”.
Applicant states:
As another example of failing to describe each and every element as set forth in the claims, Applicant submits that Hausman does not anticipate or suggest "generating a set of uncertainty maps based on the set of output affordance maps, comprising evaluating divergence between the set of output affordance maps, wherein the first set of action parameters and the first location are selected based further on the set of uncertainty maps," as recited in Claim 12 and similar features recited in Claim 2. The Examiner maps Hausman's description of world-grounding measures and task-grounding measures to the claimed "uncertainty maps." Office Action, p. 10. Additionally, the Examiner maps Hausman's description of the measures being "based on probability of [a] corresponding skill description" to the claimed "evaluating divergence between the set of output affordance maps." Id. However, the measures in Hausman are not generated or based on "evaluating divergence between the set of output affordance maps." When the measures are generated, there is no input or influence from one measure on the other. That is, generating the task-grounding measure is independent from generating the world-grounding measure and vice versa. Thus, because the task-grounding measures and world-grounding measures are generated independently, Hausman fails to describe "generating a set of uncertainty maps based on the set of output affordance maps, comprising evaluating divergence between the set of output affordance maps, wherein the first set of action parameters and the first location are selected based further on the set of uncertainty maps," as recited in Claim 12 and similar features recited in Claim 2.
Examiner responds:
Examiner respectfully disagrees with Applicant. Regarding the limitation "generating a set of uncertainty maps based on the set of output affordance maps, comprising evaluating divergence between the set of output affordance maps, wherein the first set of action parameters and the first location are selected based further on the set of uncertainty maps", Hausman recites, and Examiner interprets, as follows:
Para. / Hausman recites / Examiner interpretation:
[0026]
In various implementations disclosed herein, a robot is equipped with a repertoire of learned robotic skills for atomic behaviors that are capable of low-level visuomotor control. Some of those implementations, instead of only prompting an LLM to simply interpret an FF NL [free-form natural language] high-level instruction, utilize the LLM output generated by the prompting to generate task-grounding measures that each quantify the likelihood that a corresponding robotic skill makes progress towards completing the high-level instruction.
The value describes the probability that a specific skill can help complete an overall goal.
Further, a corresponding affordance function (e.g., a learned value function) for each of the robotic skills can be utilized to generate a world-grounding measure, for the robotic skill, that quantifies how likely it is to succeed from the current state.
Each robot skill is assigned a value describing the probability of the skill succeeding if it were implemented in the current environment with the current robot state.
Yet further, both the task-grounding measures and the world-grounding measures can be utilized in determining which robotic skill to implement next in achieving the task(s) reflected by the high-level instruction.
The probability that the skill helps complete the overall goal and the probability that the skill can be performed successfully in the specific environment for the specific robot state are BOTH used to decide which robot skill to use.
In these and other manners, implementations leverage that the LLM output describes the probability that each skill contributes to completing the instruction, and the affordance function describes the probability that each skill will succeed—and combining the two provides the probability that each skill will perform the instruction successfully.
Combining the probability of skill success with the probability of the skill helping yields a probability (i.e., a degree of uncertainty) that the skill, performed in that moment with the current robot state and environment, will help (or hurt) completion of the overall task.
The affordance functions enable real-world grounding to be considered in addition to the task-grounding of the LLM output, and constraining the completions to the skill descriptions enables the LLM output to be considered in a manner that is aware of the robot's capabilities (e.g., in view of its repertoire of learned robotic skills). Furthermore, this combination results in a fully explainable sequence of steps that the robot will execute to accomplish a high-level instruction (e.g., the descriptions of the robotic skill(s) selected for implementation) — an interpretable plan that is expressed through language.
The (probability + probability = uncertainty) calculation is performed for a variety of robot skills with respect to the current environment and robot state.
If (probability + probability = HIGH uncertainty), this shows a divergence in the probability of success.
If (probability + probability = LOW uncertainty), this shows a convergence in the probability of success.
[0069]
For example, the world-grounding engine 134 can, in generating a world-grounding measure for a robotic skill, process, using the language-conditioned model, a corresponding one of the skill descriptions 207 for the robotic skill, along with the environmental state data 209A and optionally along with the robot state data 210A, to generate a value that reflects a probability of the robotic skill being successful based on the current state data.
(world-grounding measure for skill) + (skill description) + (environmental state) + (robot state) = (probability of skill being successful based on current state)
In some versions of those implementations, a candidate robotic action is also processed using the language conditioned model and along with the corresponding one of the skill descriptions 207, the environmental state data 209A, and optionally the robot state data 210A.
(world-grounding measure for skill) + (skill description (robotic action)) + (environmental state) + (robot state) = (probability of skill being successful based on current state)
In some of those versions, multiple values are generated, where each is generated based on processing a different candidate robotic action, but utilizing the same corresponding one of the skill descriptions 207, the same environmental state data 209A, and optionally the same robot state data 210A.
(world-grounding measure for skill) + (skill description (robotic action)) + (environmental state) + (robot state) = (probability of skill being successful based on current state)
The robotic action is varied, but the rest of the equation stays the same during each iteration.
In those versions, the world-grounding engine 134 can generate the world-grounding measure based on, for example, the generated value that reflects the highest probability of success. In some of those versions, the different candidate robotic actions can be selected using, for example, the cross entropy method.
(world-grounding measure for skill (use high success probability when available)) + (skill description (robotic action)) + (environmental state) + (robot state) = (probability of skill being successful based on current state)
For example, N robotic actions can be randomly sampled initially and values generated for each, then N additional robotic actions sampled from around one of the initial N robotic actions based on that initial robotic action having the highest generated value. A trained value function model, used by the world-grounding engine 134 in generating world grounding measure(s) for a robotic skill, can also be utilized in actual implementation of the robotic skill by the robot 110.
(world-grounding measure for skill) + (skill description (N robotic actions)) + (environmental state) + (robot state) = (probability of skill being successful based on current state)
Pick the highest probability of success and the respective action.
Then perform the calculation again with N additional actions sampled around the initial action that previously had the highest success probability.
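The iterative sampling of [0069], as interpreted above, can be sketched roughly as follows. This is a hypothetical Python sketch of a cross-entropy-method-style loop; the evaluation function is a toy stand-in for Hausman's language-conditioned value model, and all names are assumptions.

```python
import random

def evaluate(action, skill_description, env_state, robot_state):
    """Stand-in for the language-conditioned value model: returns
    P(skill succeeds from current state given this candidate action).
    A toy function peaking at action == 0.5 is used for illustration."""
    return max(0.0, 1.0 - abs(action - 0.5))

def sample_best_action(skill_description, env_state, robot_state, n=8, seed=0):
    rng = random.Random(seed)
    # Round 1: randomly sample N candidate robotic actions and score each.
    candidates = [rng.random() for _ in range(n)]
    values = [evaluate(a, skill_description, env_state, robot_state)
              for a in candidates]
    best = candidates[values.index(max(values))]
    # Round 2: sample N more actions around the current best candidate
    # (CEM-style refinement), scoring with the same skill description,
    # environmental state, and robot state.
    refined = [best + rng.gauss(0, 0.05) for _ in range(n)]
    candidates += refined
    values += [evaluate(a, skill_description, env_state, robot_state)
               for a in refined]
    # The world-grounding measure uses the highest-probability value found.
    best_value = max(values)
    best = candidates[values.index(best_value)]
    return best, best_value

action, world_grounding = sample_best_action(
    "pick up the can", env_state=None, robot_state=None)
```

The second sampling round reuses the same skill description and state data while varying only the candidate action, mirroring the interpretation rows above.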
Overall, Hausman discloses mapping uncertainty based on affordance maps, including comparing action-environment-state combinations that trend toward a successfully completed task with action-environment-state combinations that trend away from a successfully completed task (divergence between the set of output affordance maps). As such, Hausman does teach "generating a set of uncertainty maps based on the set of output affordance maps, comprising evaluating divergence between the set of output affordance maps, wherein the first set of action parameters and the first location are selected based further on the set of uncertainty maps". Thus, Examiner maintains the prior art rejections of the claims.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Florencio et al. US 20140249676 A1: Technologies pertaining to human-robot interaction are described herein. The robot includes a computer-readable memory that comprises a model that, with respect to successful completions of a task, is fit to observed data, where at least some of such observed data pertains to a condition that is controllable by the robot, such as position of the robot or distance between the robot and a human. A task that is desirably performed by the robot is to cause the human to engage with the robot. The model is updated while the robot is online, such that behavior of the robot adapts over time to increase the likelihood that the robot will successfully complete the task.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MADISON B EMMETT whose telephone number is (303)297-4231. The examiner can normally be reached Monday - Friday 9:00 - 5:00 ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tommy Worden can be reached at (571)272-4876. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MADISON B EMMETT/Examiner, Art Unit 3658
/JASON HOLLOWAY/Primary Examiner, Art Unit 3658