Prosecution Insights
Last updated: April 19, 2026
Application No. 18/425,253

ROBOT SYSTEMS, METHODS, CONTROL MODULES, AND COMPUTER PROGRAM PRODUCTS THAT LEVERAGE LARGE LANGUAGE MODELS

Non-Final OA: §101, §102, §103
Filed: Jan 29, 2024
Examiner: EMMETT, MADISON B
Art Unit: 3658
Tech Center: 3600 (Transportation & Electronic Commerce)
Assignee: Sanctuary Cognitive Systems Corporation
OA Round: 1 (Non-Final)
Grant Probability: 79% (Favorable)
OA Rounds: 1-2
Time to Grant: 2y 9m
Grant Probability with Interview: 90%

Examiner Intelligence

Career Allow Rate: 79% (above average; 125 granted / 158 resolved; +27.1% vs TC avg)
Interview Lift: +11.4% (moderate; resolved cases with vs. without interview)
Avg Prosecution: 2y 9m (typical timeline); 35 applications currently pending
Career History: 193 total applications across all art units

Statute-Specific Performance

§101: 19.2% (-20.8% vs TC avg)
§103: 45.3% (+5.3% vs TC avg)
§102: 26.1% (-13.9% vs TC avg)
§112: 8.2% (-31.8% vs TC avg)
Comparisons are against Tech Center average estimates • Based on career data from 158 resolved cases

Office Action

DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims

Pending: 1-20
Rejected under 35 U.S.C. 101: 1-20
Rejected under 35 U.S.C. 102: 1, 6-7, 9-13, 18-20
Rejected under 35 U.S.C. 103: 2-5, 8, 14-17

Priority

Applicant's indication of Domestic Benefit/National Stage information based on provisional application 63/441,897 filed 01/30/2023 is acknowledged.

Information Disclosure Statement

The information disclosure statements (IDSs) submitted on 04/02/2024 and 07/05/2024 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Claim 1 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
The claim recites: "A method of operation of a robot system, the method comprising: identifying, by the robot system, a person in an environment of the robot system; accessing, by the robot system, information about the person; generating a first natural language (NL) query by the robot system, the first NL query including a NL description of the information about the person, a NL description of contextual information, and a NL request for an outbound verbalization for the robot system to deliver to the person; providing the first NL query to a large language model (LLM) module of the robot system; receiving, from the LLM module, the outbound verbalization for the robot system to deliver to the person; and delivering, by the robot system, the outbound verbalization to the person".

These limitations, as drafted, are simple processes that, under their broadest reasonable interpretation, cover performance in the human mind but for the recitation of "of a robot system; by the robot system; providing the first NL query to a large language model (LLM) module of the robot system; receiving, from the LLM module, the outbound verbalization for the robot system to deliver to the person; and delivering, by the robot system, the outbound verbalization to the person". That is, other than reciting the quoted limitations above, nothing in the claim elements precludes the steps from being performed in the mind. For example, a human can, in their mind, perform a method comprising: identifying a person in an environment; accessing information about the person; and generating a first natural language (NL) query, the first NL query including a NL description of the information about the person, a NL description of contextual information, and a NL request for an outbound verbalization to deliver to the person.

This judicial exception is not integrated into a practical application.
The claim recites the additional elements quoted above. The "of a robot system" and "by the robot system" limitations are recited at a high level of generality and merely link the use of the abstract idea to a particular technological environment (see MPEP 2106.05(h)). The providing the first NL query, receiving the outbound verbalization, and delivering the outbound verbalization steps are recited at a high level of generality and amount to mere data gathering, manipulation, and transmission, which is a form of insignificant extra-solution activity (see MPEP 2106.05(g)). Accordingly, even in combination, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.

The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The additional elements "of a robot system" and "by the robot system" are no more than mere generic linking of the abstract idea to a technological environment, which cannot provide an inventive concept. The additional elements of providing the first NL query, receiving the outbound verbalization, and delivering the outbound verbalization are mere data gathering, manipulation, and transmission, and are well-understood, routine, and conventional functions (see MPEP 2106.05(d); Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092-93), and thus are no more than insignificant extra-solution activity (see MPEP 2106.05(g); OIP Techs., 788 F.3d at 1362-63, 115 USPQ2d at 1092-93). Thus, the limitations do not provide an inventive concept, and the claim contains ineligible subject matter.

Claim 13 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
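[Editor's note] The query-construction flow recited in claim 1 (assemble person information, contextual information, and a request into a natural language query; submit it to an LLM; deliver the returned verbalization) can be sketched as follows. This is an illustrative sketch only, not part of the Office Action or the application; every name (Person, build_nl_query, llm_module) is a hypothetical assumption, and the LLM call is stubbed rather than invoking a real model.

```python
# Hypothetical sketch of the query-construction flow recited in claim 1.
# All names are illustrative assumptions; the LLM module is a stub.
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    notes: str  # accessed information about the person, e.g. visit history


def build_nl_query(person: Person, context: str) -> str:
    """Assemble the three recited parts: person info, context, and a request."""
    return (
        f"Person: {person.name}. Known information: {person.notes}. "
        f"Context: {context}. "
        "Request: compose a short greeting for the robot to say to this person."
    )


def llm_module(nl_query: str) -> str:
    """Stand-in for the robot system's LLM module (a real system would call a model)."""
    return "Welcome back! Can I help you find anything today?"


def deliver_outbound_verbalization(person: Person, context: str) -> str:
    query = build_nl_query(person, context)  # generate the first NL query
    verbalization = llm_module(query)        # provide query, receive verbalization
    return verbalization                     # deliver, e.g. via text-to-speech


print(deliver_outbound_verbalization(
    Person("A. Visitor", "frequent customer"), "retail store, morning"))
```

The sketch mirrors the claim's ordering of steps; whether that combination is more than mental-process-plus-generic-computer is exactly the §101 dispute above.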
The claim recites: "A method of operation of a robot system, the method comprising: receiving, by the robot system, an inbound verbalization from a person in an environment of the robot system; identifying the person by the robot system; accessing, by the robot system, information about the person; generating a natural language (NL) query by the robot system, the NL query including a NL description of the information about the person, a NL description of contextual information, a NL transcription of the inbound verbalization received from the person by the robot system, and a NL request for a response verbalization for the robot system to deliver to the person; providing the NL query to a large language model (LLM) module of the robot system; receiving, from the LLM module, the response verbalization for the robot system to deliver to the person; and delivering, by the robot system, the response verbalization to the person".

These limitations, as drafted, are simple processes that, under their broadest reasonable interpretation, cover performance in the human mind but for the recitation of "of a robot system; receiving, by the robot system, an inbound verbalization from a person in an environment of the robot system; providing the NL query to a large language model (LLM) module of the robot system; receiving, from the LLM module, the response verbalization for the robot system to deliver to the person; and delivering, by the robot system, the response verbalization to the person". That is, other than reciting the quoted limitations above, nothing in the claim elements precludes the steps from being performed in the mind.
For example, a human can, in their mind, perform a method comprising: identifying the person; accessing information about the person; and generating a natural language (NL) query, the NL query including a NL description of the information about the person, a NL description of contextual information, a NL transcription of the inbound verbalization received from the person, and a NL request for a response verbalization to deliver to the person.

This judicial exception is not integrated into a practical application. The claim recites the additional elements quoted above. The "of a robot system" and "by the robot system" limitations are recited at a high level of generality and merely link the use of the abstract idea to a particular technological environment (see MPEP 2106.05(h)). The receiving an inbound verbalization, providing the NL query, receiving the response verbalization, and delivering the response verbalization steps are recited at a high level of generality and amount to mere data gathering, manipulation, and transmission, which is a form of insignificant extra-solution activity (see MPEP 2106.05(g)). Accordingly, even in combination, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.

The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The additional elements "of a robot system" and "by the robot system" are no more than mere generic linking of the abstract idea to a technological environment, which cannot provide an inventive concept.
The additional elements of receiving an inbound verbalization, providing the NL query, receiving the response verbalization, and delivering the response verbalization are mere data gathering, manipulation, and transmission, and are well-understood, routine, and conventional functions (see MPEP 2106.05(d); Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092-93), and thus are no more than insignificant extra-solution activity (see MPEP 2106.05(g); OIP Techs., 788 F.3d at 1362-63, 115 USPQ2d at 1092-93). Thus, the limitations do not provide an inventive concept, and the claim contains ineligible subject matter.

Claims 2-4, 11, and 14-15 recite limitations that are no more than the abstract idea recited in claims 1 and 13. These claims recite identifying, determining, and accessing information, and generating queries, descriptions, and requests, steps which can reasonably be performed in the human mind. These claims recite "by the robot system", "by at least one sensor", "the database stored in a non-transitory processor-readable storage medium", and "at least one camera" at a high level of generality to generically link the use of the abstract idea to a particular technological environment. These claims recite scanning an identifier, retrieving digital information, receiving a verbalization, providing queries, receiving responses, delivering responses, capturing image data, and scanning an identifier of a person, steps which are mere data gathering, manipulation, and transmission, are well-understood, routine, and conventional functions, and thus are no more than insignificant extra-solution activity. See MPEP 2106.05(g). Thus, these claims contain ineligible subject matter.

Claims 5, 9-10, 16-17, and 20 recite limitations that are no more than the abstract idea recited in claims 1 and 13.
These claims recite "by the robot system" and "the database stored in a non-transitory processor-readable storage medium" at a high level of generality to generically link the use of the abstract idea to a particular technological environment. These claims recite retrieving digital information, defining the digital information, delivering the outbound verbalization, defining the content of the outbound verbalization, accessing information, and delivering the response verbalization to the person, steps which are mere data gathering, manipulation, and transmission, are well-understood, routine, and conventional functions, and thus are no more than insignificant extra-solution activity. See MPEP 2106.05(g). Thus, these claims contain ineligible subject matter.

Claims 6-8, 12, and 18-19 recite limitations that are no more than the abstract idea recited in claims 1 and 13. These claims recite defining the NL description of contextual information to include the environment, the robot's role, the person's role, news data, and NL queries, steps which can reasonably be performed in the human mind. Thus, these claims contain ineligible subject matter.

Claim Rejections - 35 USC § 102

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of the appropriate paragraphs of 35 U.S.C.
102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless -

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 6-7, 9-13, and 18-20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Bailey (US 2024/0205174 A1, hereinafter "Bailey").

Regarding claim 1, Bailey teaches:

A method of operation of a robot system, the method comprising (Abstract: utilizing a large language model (LLM), input based on sensor data, from sensors of client device, to generate LLM output and causing output, based on generated LLM output, to be rendered by an interactive chatbot);

identifying, by the robot system, a person in an environment of the robot system ([0049] user-input detection engine to determine whether user input is received from non-acoustic sensors; vision sensors capture images, videos, and certain motions in a field of view of the vision sensors; can determine that user input is received when a gesture from a user is detected from the vision data received via the one or more vision sensors; [0082] trained LLM utilized to process received non-acoustic sensor data, other data (user utterance, speech recognition of user utterance, chat history, user preference), as input);

accessing, by the robot system, information about the person ([0042] client device can include data storage storing user data (account data, user preference data, user historical data), device data (sensor data), and application data (chat history of local interactive chatbot); [0082] trained LLM utilized to process received non-acoustic sensor data, other data (audio data capturing user utterance, speech recognition of user utterance, chat history, user preference), as input; [0087] metadata (chat history, user preference, historical data, description of client device, interactive chatbot, virtual character));

generating a first natural language (NL) query by the robot system, the first NL query including a NL description of the information about the person, a NL description of contextual information, and a NL request for an outbound verbalization for the robot system to deliver to the person ([0094] generate, based on generated LLM output, NL statement responsive to received sensor data; LLM output generated based on user utterance and non-acoustic sensor data indicating that the room in which the user stays is bright and warm and historical data indicating the user is a frequent visitor of cinemas, and applied to generate NL statement, "looks like you are having a cozy day, any plans to watch a movie?", and the voice to deliver such synthetic speech can be controlled based on LLM output to be delightful and have moderate voice volume; [0061] NLU engine can perform, using NLU models and grammar-based rules, NL understanding on textual representation ("where is Kentucky Derby held") recognized from spoken utterance generated by ASR engine, to generate NLU output; NLU output can include search query that requests return of search results for location at which Kentucky Derby will be held; search result for location where Kentucky Derby is held can be processed using TTS engine into corresponding synthesized speech, to be rendered audibly via client device);

providing the first NL query to a large language model (LLM) module of the robot system ([0091] receives sensor data from sensors of client device, where sensor data includes acoustic sensor data that captures user utterance; [0092] perform speech recognition on acoustic sensor data that captures user utterance, to generate speech recognition of user utterance; [0093] system can use an LLM to process speech recognition of user utterance and NL description that describes non-acoustic sensor data as input, to generate corresponding LLM output);

receiving, from the LLM module, the outbound verbalization for the robot system to deliver to the person ([0094] generate, based on generated LLM output, NL statement responsive to received sensor data; LLM output can be generated based on user utterance and non-acoustic sensor data indicating that the room in which the user stays is bright and warm and historical data indicating the user is a frequent visitor of cinemas, and applied to generate NL statement, and the voice to deliver such synthetic speech can be controlled based on LLM output to be delightful and have moderate voice volume); and

delivering, by the robot system, the outbound verbalization to the person ([0094] generate NL statement, "looks like you are having a cozy day, any plans to watch a movie?", and the voice to deliver such synthetic speech can be controlled based on LLM output to be delightful and have moderate voice volume; [0095] cause synthetic speech to be audibly rendered via client device).
Regarding claim 6, Bailey further teaches: The method of claim 1 wherein the NL description of contextual information includes a NL description of at least a portion of the environment ([0094] generate, based on generated LLM output, NL statement responsive to received sensor data; LLM output generated based on (1) user utterance ("What to do now?") and (2) non-acoustic sensor data indicating that a room in which a user stays is bright and warm (and/or (3) historical data indicating the user is a frequent visitor of cinemas), and applied to generate a natural language statement, i.e., "looks like you are having a cozy day, any plans to watch a movie?", and the voice to deliver such synthetic speech can be controlled based on LLM output to be delightful and have moderate voice volume).

Regarding claim 7, Bailey further teaches: The method of claim 1 wherein the NL description of contextual information includes a NL description of a respective role of each of the robot system and the person ([0094] generate, based on generated LLM output, NL statement responsive to received sensor data; LLM output generated based on (1) user utterance ("What to do now?") and (2) non-acoustic sensor data indicating that the room in which the user stays is bright and warm and (3) historical data indicating the user is a frequent visitor of cinemas, and applied to generate NL statement, "looks like you are having a cozy day, any plans to watch a movie?", and the voice to deliver such synthetic speech controlled based on LLM output to be delightful and have moderate voice volume; [0061] interpret spoken utterance ("could you provide some latest music?") provided to chatbot, to derive an intent ("play music" being intent) of user or desired action by user; perform, using NLU models and grammar-based rules, NL understanding on textual representation ("where is Kentucky Derby held") recognized from spoken utterance generated by ASR engine, to generate NLU output which can include search query that requests return of search results for location at which Kentucky Derby will be held; search result processed into synthesized speech, to be rendered audibly via client device).

Regarding claim 9, Bailey further teaches: The method of claim 1 wherein delivering, by the robot system, the outbound verbalization to the person includes verbalizing the outbound verbalization by the robot system ([0094] generate NL statement, "looks like you are having a cozy day, any plans to watch a movie?", and the voice to deliver such synthetic speech can be controlled based on LLM output to be delightful and have moderate voice volume; [0095] cause synthetic speech to be audibly rendered via client device).

Regarding claim 10, Bailey further teaches: The method of claim 9 wherein the outbound verbalization includes a question about the person, and wherein verbalizing the outbound verbalization by the robot system includes verbally asking the person a question by the robot system ([0094] generate, based on generated LLM output, NL statement responsive to received sensor data; LLM output generated based on (1) user utterance ("What to do now?") and (2) non-acoustic sensor data indicating that the room in which the user stays is bright and warm and (3) historical data indicating the user is a frequent visitor of cinemas, and applied to generate NL statement, "looks like you are having a cozy day, any plans to watch a movie?", and the voice to deliver such synthetic speech controlled based on LLM output to be delightful and have moderate voice volume).
Regarding claim 11, Bailey further teaches: The method of claim 1, further comprising:

receiving, by the robot system, an inbound verbalization from the person ([0091] receives sensor data from sensors of client device, where sensor data includes acoustic sensor data that captures user utterance; [0061] can perform, using NLU models and grammar-based rules, NL understanding on textual representation ("where is Kentucky Derby for the year 2023") recognized from the spoken utterance generated by ASR engine);

generating a second NL query by the robot system, the second NL query including a NL transcription of the inbound verbalization received from the person by the robot system, a NL description of the outbound verbalization delivered from the robot system to the person, a NL description of the first NL query, and a NL request for a response to the inbound verbalization received from the person by the robot system ([0092] perform speech recognition on acoustic sensor data that captures user utterance, to generate speech recognition of user utterance; [0093] system can use an LLM to process speech recognition of user utterance and NL description that describes non-acoustic sensor data as input, to generate corresponding LLM output; [0094] generate, based on generated LLM output, NL statement responsive to received sensor data; [0061] can perform, using NLU models and grammar-based rules, NL understanding on textual representation ("where is Kentucky Derby for the year 2023") recognized from the spoken utterance generated by ASR engine, to generate NLU output);

providing the second NL query to the LLM module of the robot system ([0093] system can use an LLM to process speech recognition of user utterance and NL description that describes non-acoustic sensor data as input, to generate corresponding LLM output; [0061] can perform, using NLU models and grammar-based rules, NL understanding on textual representation ("where is Kentucky Derby for the year 2023") recognized from the spoken utterance generated by ASR engine, to generate NLU output);

receiving, from the LLM module, the response ([0094] generate, based on generated LLM output, NL statement responsive to received sensor data; [0061] can perform, using NLU models and grammar-based rules, NL understanding on textual representation recognized from the spoken utterance generated by ASR engine, to generate NLU output which can include a search query that requests the return of search results for a location at which the Kentucky Derby will be held in 2023); and

delivering the response to the person by the robot system ([0095] cause synthetic speech to be audibly rendered via client device; [0061] search result processed using TTS engine into a corresponding synthesized speech, to be rendered audibly via the client device).

Regarding claim 12, Bailey further teaches: The method of claim 11 wherein the NL description of the first NL query includes at least one NL description selected from a group consisting of: a NL summary of the first NL query, a NL excerpt from the first NL query, and a NL copy of the first NL query ([0091]-[0095] receives sensor data from sensors of client device, includes acoustic data of user utterance; perform speech recognition on acoustic data of user utterance, generate speech recognition of user utterance; use an LLM to process speech recognition of user utterance and NL description that describes non-acoustic sensor data as input, generate corresponding LLM output; generate, based on generated LLM output, NL statement responsive to received sensor data; cause synthetic speech to be audibly rendered via client device; [0061] perform, using NLU models and grammar-based rules, NL understanding on textual representation ("where is Kentucky Derby held") recognized from spoken utterance generated by ASR engine, to generate NLU output which can include search query that requests return of search results for location at which Kentucky Derby will be held; search result processed into synthesized speech, to be rendered audibly via client device).

Regarding claim 13, Bailey teaches:

A method of operation of a robot system, the method comprising (Abstract: utilizing a large language model (LLM), input based on sensor data, from sensors of client device, to generate LLM output and causing output, based on generated LLM output, to be rendered by an interactive chatbot);

receiving, by the robot system, an inbound verbalization from a person in an environment of the robot system ([0082] trained LLM utilized to process received non-acoustic sensor data, other data (user utterance, speech recognition of user utterance, chat history, user preference), as input; [0091] receives sensor data from sensors of client device, where sensor data includes acoustic sensor data that captures user utterance);

identifying the person by the robot system ([0049] user-input detection engine to determine whether user input is received from non-acoustic sensors; vision sensors capture images, videos, and certain motions in a field of view of the vision sensors; can determine that user input is received when a gesture from a user is detected from the vision data received via the one or more vision sensors; [0082] trained LLM utilized to process received non-acoustic sensor data, other data (user utterance, speech recognition of user utterance, chat history, user preference), as input);

accessing, by the robot system, information about the person ([0042] client device can include data storage storing user data (account data, user preference data, user historical data), device data (sensor data), and application data (chat history of local interactive chatbot); [0082] trained LLM utilized to process received non-acoustic sensor data, other data (audio data capturing user utterance, speech recognition of user utterance, chat history, user preference), as input; [0087] metadata (chat history, user preference, historical data, description of client device, interactive chatbot, virtual character));

generating a natural language (NL) query by the robot system, the NL query including a NL description of the information about the person, a NL description of contextual information, a NL transcription of the inbound verbalization received from the person by the robot system, and a NL request for a response verbalization for the robot system to deliver to the person ([0094] generate, based on generated LLM output, NL statement responsive to received sensor data; LLM output generated based on user utterance and non-acoustic sensor data indicating that the room in which the user stays is bright and warm and historical data indicating the user is a frequent visitor of cinemas, and applied to generate NL statement, "looks like you are having a cozy day, any plans to watch a movie?", and the voice to deliver such synthetic speech can be controlled based on LLM output to be delightful and have moderate voice volume; [0061] NLU engine can perform, using NLU models and grammar-based rules, NL understanding on textual representation ("where is Kentucky Derby held") recognized from spoken utterance generated by ASR engine, to generate NLU output; NLU output can include search query that requests return of search results for location at which Kentucky Derby will be held; search result for location where Kentucky Derby is held can be processed using TTS engine into corresponding synthesized speech, to be rendered audibly via client device);

providing the NL query to a large language model (LLM) module of the robot system ([0091] receives sensor data from sensors of client device, where sensor data includes acoustic sensor data that captures user utterance; [0092] perform speech recognition on acoustic sensor data that captures user utterance, to generate speech recognition of user utterance; [0093] system can use an LLM to process speech recognition of user utterance and NL description that describes non-acoustic sensor data as input, to generate corresponding LLM output);

receiving, from the LLM module, the response verbalization for the robot system to deliver to the person ([0094] generate, based on generated LLM output, NL statement responsive to received sensor data; LLM output can be generated based on user utterance and non-acoustic sensor data indicating that the room in which the user stays is bright and warm and historical data indicating the user is a frequent visitor of cinemas, and applied to generate NL statement, and the voice to deliver such synthetic speech can be controlled based on LLM output to be delightful and have moderate voice volume); and

delivering, by the robot system, the response verbalization to the person ([0094] generate NL statement, "looks like you are having a cozy day, any plans to watch a movie?", and the voice to deliver such synthetic speech can be controlled based on LLM output to be delightful and have moderate voice volume; [0095] cause synthetic speech to be audibly rendered via client device).

Regarding claim 18, Bailey further teaches: The method of claim 13 wherein the NL description of contextual information includes a NL description of at least a portion of the environment ([0094] generate, based on generated LLM output, NL statement responsive to received sensor data; LLM output generated based on (1) user utterance ("What to do now?") and (2) non-acoustic sensor data indicating that a room in which a user stays is bright and warm (and/or (3) historical data indicating the user is a frequent visitor of cinemas), and applied to generate a natural language statement, i.e., "looks like you are having a cozy day, any plans to watch a movie?", and the voice to deliver such synthetic speech can be controlled based on LLM output to be delightful and have moderate voice volume).
Regarding claim 19, Bailey further teaches: The method of claim 13 wherein the NL description of contextual information includes a NL description of a respective role of each of the robot system and the person ([0094] generate, based on generated LLM output, NL statement responsive to received sensor data; LLM output generated based on (1) user utterance ("What to do now?") and (2) non-acoustic sensor data indicating that the room in which the user stays is bright and warm and (3) historical data indicating the user is a frequent visitor of cinemas, and applied to generate NL statement, "looks like you are having a cozy day, any plans to watch a movie?", and the voice to deliver such synthetic speech controlled based on LLM output to be delightful and have moderate voice volume; [0061] interpret spoken utterance ("could you provide some latest music?") provided to chatbot, to derive an intent ("play music" being intent) of user or desired action by user; perform, using NLU models and grammar-based rules, NL understanding on textual representation ("where is Kentucky Derby held") recognized from spoken utterance generated by ASR engine, to generate NLU output which can include search query that requests return of search results for location at which Kentucky Derby will be held; search result processed into synthesized speech, to be rendered audibly via client device).

Regarding claim 20, Bailey further teaches: The method of claim 13 wherein delivering, by the robot system, the response verbalization to the person includes verbalizing the response verbalization by the robot system ([0094] generate NL statement, "looks like you are having a cozy day, any plans to watch a movie?", and the voice to deliver such synthetic speech can be controlled based on LLM output to be delightful and have moderate voice volume; [0095] cause synthetic speech to be audibly rendered via client device).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:

1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors. In considering patentability of the claims, the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 2-5, 8, and 14-17 are rejected under 35 U.S.C. 103 as being unpatentable over Bailey (US 2024/0205174 A1, hereinafter "Bailey") in view of Cui et al. (US 2019/0206400 A1, hereinafter "Cui").
Regarding claim 2: Bailey further teaches: The method of claim 1 wherein identifying, by the robot system, the person in the environment of the robot system includes (see at least [0049], [0082], described in claim 1). However, Bailey does not explicitly teach, but Cui teaches: capturing, by at least one camera, an image of a face of the person ([0067] input detected may be voice input, visual input (presentation of a face), and is received from an autonomous robot; [0168] Vision apps: phone camera and processor to implement face detection, face recognition, face tracking); and determining an identity of the person based on the image of the face of the person ([0096] autonomous robotic system partitions stored knowledge data by user or agent; includes user information related to that user's conversation history with robotic system; [0313] the user is identified by face recognition). Bailey and Cui are analogous art to the claimed invention since they are from the similar field of context aware and language modeling robots. It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify the invention of Bailey with the aspects of Cui to create, with a reasonable expectation for success, a robot system and method that captures, by at least one camera, an image of a face of the person, and determines an identity of the person based on the image of the face of the person. The motivation for modification would have been to improve the performance of the system, improve the user experience through increased comfort in the robot-human interaction, improve the system's abilities such as reasoning and planning, reduce errors accumulated over time from use, and reduce the system response time for solving complex problems (Cui, [0070], [0265], [0309], [0082]). 
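The identification step the rejection maps to Cui (capture a face image, then determine the person's identity from it) reduces in practice to a nearest-match lookup over stored face representations. A minimal, self-contained sketch under that assumption, using toy embedding vectors in place of a camera and a face-recognition model (all names and values hypothetical, not from Cui):

```python
# Hedged sketch: identify a person by matching a face embedding against
# stored embeddings of known people. Toy 3-dimensional vectors stand in
# for real face-recognition model outputs.
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def identify(face_embedding, known_people, threshold=0.9):
    """Return the best-matching known identity, or None if no
    stored embedding exceeds the similarity threshold."""
    best_name, best_score = None, threshold
    for name, ref in known_people.items():
        score = cosine_similarity(face_embedding, ref)
        if score > best_score:
            best_name, best_score = name, score
    return best_name

known = {"alice": [0.9, 0.1, 0.0], "bob": [0.1, 0.9, 0.1]}
print(identify([0.88, 0.12, 0.01], known))  # prints "alice"
```

The threshold guards against misidentifying strangers: an embedding dissimilar to every stored reference returns None rather than the least-bad match.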
Regarding claim 3: Bailey further teaches: The method of claim 1 wherein identifying, by the robot system, the person in the environment of the robot system includes (see at least [0049], [0082], described in claim 1). However, Bailey does not explicitly teach, but Cui teaches: scanning, by at least one sensor, an identifier associated with the person ([0296] context aware interactive robot scans the environment with lidar and RGB-D cameras; [0067] input detected may be voice input, visual input (presentation of a face), and is received from an autonomous robot; [0168] Vision apps: phone camera and processor to implement face detection, face recognition, face tracking); and determining an identity of the person based on the identifier associated with the person ([0115] memory graph data structure includes UserID node, which is the root node for a particular user, includes a unique user identifier for the referenced user and supporting knowledge associated with the user; [0096] a speaker related to the sentence input is identified and the source node corresponding to the speaker is selected; knowledge data accessed by speaker identity; source node for Alice is selected in response to a query initiated by Alice and source node for Bob is selected in response to a query initiated by Bob; autonomous robotic system partitions stored knowledge data by user or agent; includes user information related to that user's conversation history with robotic system; [0313] the user is identified by face recognition). Bailey and Cui are analogous art to the claimed invention since they are from the similar field of context aware and language modeling robots. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify the invention of Bailey with the aspects of Cui to create, with a reasonable expectation for success, a robot system and method that scans, by at least one sensor, an identifier associated with the person, and determines an identity of the person based on the identifier associated with the person. The motivation for modification would have been to improve the performance of the system, improve the user experience through increased comfort in the robot-human interaction, improve the system's abilities such as reasoning and planning, reduce errors accumulated over time from use, and reduce the system response time for solving complex problems (Cui, [0070], [0265], [0309], [0082]). Regarding claim 4: Bailey further teaches: The method of claim 1 wherein accessing, by the robot system, information about the person includes retrieving, by the robot system, digital information about the person from a database of digital information […], the database stored in a non-transitory processor-readable storage medium ([0028] "dialog session": logically-self-contained exchange between user and chatbot (sometimes other human participants); [0042] client device includes sensors, local interactive chatbot that is in communication with cloud-based interactive chatbot at server computing device, data storage storing user data (account data, user preference and historical data), device data, and application data (chat history of chatbot); [0087] input to LLM can include metadata (chat history, user preference, historical data, description of client device, interactive chatbot, virtual character); [0128] processors, CPUs, GPUs, TPUs; non-transitory computer readable storage media storing computer instructions executable by processors to perform methods). 
However, Bailey does not explicitly teach, but Cui teaches: digital information about the person from a database of digital information about multiple people ([0115] memory graph data structure includes UserID node, which is the root node for a particular user, includes a unique user identifier for the referenced user and supporting knowledge associated with the user; [0096] knowledge data accessed by speaker identity; source node for Alice is selected in response to a query initiated by Alice and source node for Bob is selected in response to a query initiated by Bob; autonomous robotic system partitions stored knowledge data by user or agent; includes user information related to that user's conversation history with robotic system). Bailey and Cui are analogous art to the claimed invention since they are from the similar field of context aware and language modeling robots. It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify the invention of Bailey with the aspects of Cui to create, with a reasonable expectation for success, a robot system with digital information about the person from a database of digital information about multiple people. The motivation for modification would have been to improve the performance of the system, improve the user experience through increased comfort in the robot-human interaction, improve the system's abilities such as reasoning and planning, reduce errors accumulated over time from use, and reduce the system response time for solving complex problems (Cui, [0070], [0265], [0309], [0082]). 
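Cui's memory graph, as characterized in the citation above, keys stored knowledge to a unique user identifier so that each person's conversation history and profile are partitioned from everyone else's. A minimal dictionary-based sketch of that idea (a hypothetical structure for illustration, not Cui's actual implementation):

```python
# Illustrative sketch: a per-user knowledge store of the kind the rejection
# attributes to Cui's memory graph (one root entry per unique user ID,
# holding that user's profile and conversation history). Hypothetical names.

class UserKnowledgeStore:
    def __init__(self):
        # user_id -> {"profile": dict, "history": list of utterances}
        self._users = {}

    def add_user(self, user_id, profile):
        """Create the root entry for a newly identified person."""
        self._users[user_id] = {"profile": profile, "history": []}

    def record_utterance(self, user_id, utterance):
        """Append to this person's conversation history only."""
        self._users[user_id]["history"].append(utterance)

    def lookup(self, user_id):
        """Retrieve stored knowledge for one identified person,
        or None if the person is unknown."""
        return self._users.get(user_id)

store = UserKnowledgeStore()
store.add_user("alice", {"preference": "cinema"})
store.record_utterance("alice", "What to do now?")
print(store.lookup("alice")["history"])  # prints ['What to do now?']
```

Because lookups are keyed by identity, a query initiated by one person never surfaces another person's history, matching the Alice/Bob partitioning described in Cui [0096].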
Regarding claim 5: Bailey-Cui further teach: The method of claim 4 wherein retrieving, by the robot system, digital information about the person from a database of digital information about multiple people includes retrieving, by the robot system, digital information about the person from a database of digital information about multiple people (see at least Bailey: [0028], [0042], [0087], [0128] and Cui: [0115], [0096], described in claim 4), the digital information about multiple people collected through multiple channels including at least one channel selected from a group consisting of: purchasing histories of the multiple people; location histories of the multiple people; internet browsing histories of the multiple people; account profiles of the multiple people; event history of the environment; and information about past interactions between the robot system and the multiple people (Bailey: [0042] storing user data (account data, user preference data, user historical data), device data, and application data (chat history of local interactive chatbot); [0062] ranks, based on user preference or historical data, hypotheses generated as NLU output; [0082] other data (audio data of user utterance, speech recognition of user utterance, chat history, user preference); [0087] metadata (chat history, user preference, historical data, description of client device); Cui: [0115] memory graph data structure includes UserID node, which is the root node for a particular user, includes a unique user identifier for the referenced user and supporting knowledge associated with the user; case node is associated with cases for previously saved problems and their respective solutions along with additional context; [0311] list of complementary objects input is predicted based on previous buying history of user or other users; [0069] organizes data previously learned including data from sources such as conversations, actions, and observations; [0096] knowledge data associated with 
each user includes user information related to that user's conversation history with the autonomous robotic system; [0097] knowledge store includes information captured by sensors such as location, time, weather; declarative memory holds the system's knowledge about the world and itself; [0312] optimal route from the current robot position to the goal position). The motivation for modification would have been to improve the performance of the system, improve the user experience through increased comfort in the robot-human interaction, improve the system's abilities such as reasoning and planning, reduce errors accumulated over time from use, and reduce the system response time for solving complex problems (Cui, [0070], [0265], [0309], [0082]). Regarding claim 8: Bailey further teaches: The method of claim 1 wherein the NL description of contextual information includes a NL description of information ([0094] generate, based on generated LLM output, NL statement responsive to received sensor data; see also [0061]). However, Bailey does not explicitly teach, but Cui teaches: accessed by the robot system from at least one source selected from a group consisting of: a local news report, a national news report, an international news report, and a weather report ([0097] knowledge store includes information captured by sensors such as location, time, weather; declarative memory holds the system's knowledge about world and itself; [0231] stores a history of conversations that agents were engaged in, plus information captured by its sensors about environment such as location, time, weather; can be in at least two forms: given to the agent in the form of an ontology or factual knowledge; or inferred by agent based on content of its episodic memory). Bailey and Cui are analogous art to the claimed invention since they are from the similar field of context aware and language modeling robots. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify the invention of Bailey with the aspects of Cui to create, with a reasonable expectation for success, a robot system and method wherein the NL description of contextual information includes a NL description of information accessed by the robot system from at least one source selected from a group consisting of: a local news report, a national news report, an international news report, and a weather report. The motivation for modification would have been to improve the performance of the system, improve the user experience through increased comfort in the robot-human interaction, improve the system's abilities such as reasoning and planning, reduce errors accumulated over time from use, and reduce the system response time for solving complex problems (Cui, [0070], [0265], [0309], [0082]). Regarding claim 14: Bailey further teaches: The method of claim 13 wherein identifying the person by the robot system includes (see at least [0049], [0082], described in claim 13). However, Bailey does not explicitly teach, but Cui teaches: capturing, by at least one camera, an image of a face of the person ([0067] input detected may be voice input, visual input (presentation of a face), and is received from an autonomous robot; [0168] Vision apps: phone camera and processor to implement face detection, face recognition, face tracking); and determining an identity of the person based on the image of the face of the person ([0096] autonomous robotic system partitions stored knowledge data by user or agent; includes user information related to that user's conversation history with robotic system; [0313] the user is identified by face recognition). Bailey and Cui are analogous art to the claimed invention since they are from the similar field of context aware and language modeling robots. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify the invention of Bailey with the aspects of Cui to create, with a reasonable expectation for success, a robot system and method that captures, by at least one camera, an image of a face of the person, and determines an identity of the person based on the image of the face of the person. The motivation for modification would have been to improve the performance of the system, improve the user experience through increased comfort in the robot-human interaction, improve the system's abilities such as reasoning and planning, reduce errors accumulated over time from use, and reduce the system response time for solving complex problems (Cui, [0070], [0265], [0309], [0082]). Regarding claim 15: Bailey further teaches: The method of claim 13 wherein identifying the person by the robot system includes (see at least [0049], [0082], described in claim 13). However, Bailey does not explicitly teach, but Cui teaches: scanning, by at least one sensor, an identifier associated with the person ([0296] context aware interactive robot scans the environment with lidar and RGB-D cameras; [0067] input detected may be voice input, visual input (presentation of a face), and is received from an autonomous robot; [0168] Vision apps: phone camera and processor to implement face detection, face recognition, face tracking); and determining an identity of the person based on the identifier associated with the person ([0115] memory graph data structure includes UserID node, which is the root node for a particular user, includes a unique user identifier for the referenced user and supporting knowledge associated with the user; [0096] a speaker related to the sentence input is identified and the source node corresponding to the speaker is selected; knowledge data accessed by speaker identity; source node for Alice is selected in response to a query initiated by Alice and 
source node for Bob is selected in response to a query initiated by Bob; autonomous robotic system partitions stored knowledge data by user or agent; includes user information related to that user's conversation history with robotic system; [0313] the user is identified by face recognition). Bailey and Cui are analogous art to the claimed invention since they are from the similar field of context aware and language modeling robots. It would have been obvious to one of ordinary skill in the art […]

Prosecution Timeline

Jan 29, 2024
Application Filed
Sep 20, 2025
Non-Final Rejection — §101, §102, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12594945: DETECTION AND REMEDIATION OF AN INSTABILITY CONDITION IN A VEHICLE-TRAILER SYSTEM (granted Apr 07, 2026; 2y 5m to grant)
Patent 12583108: GRASP SELECTION (granted Mar 24, 2026; 2y 5m to grant)
Patent 12573296: ROAD INFORMATION DISPLAY SYSTEM AND METHOD (granted Mar 10, 2026; 2y 5m to grant)
Patent 12572162: SYSTEM AND METHOD FOR PRECISE FORCE CONTROL OF ROBOT (granted Mar 10, 2026; 2y 5m to grant)
Patent 12559122: STEERING INPUT WITH LIGHT SOURCE (granted Feb 24, 2026; 2y 5m to grant)
Study what changed to get these applications past this examiner. Based on the 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 79%
Grant Probability With Interview: 90% (+11.4% lift)
Median Time to Grant: 2y 9m
PTA Risk: Low
Based on 158 resolved cases by this examiner. Grant probability derived from career allow rate.
