Prosecution Insights
Last updated: April 19, 2026
Application No. 18/794,249

APPARATUS FOR CONTROLLING ROBOT AND METHOD THEREOF

Status: Non-Final OA (§103)
Filed: Aug 05, 2024
Examiner: RAMIREZ, ELLIS B
Art Unit: 3658
Tech Center: 3600 — Transportation & Electronic Commerce
Assignee: Kia Corporation
OA Round: 1 (Non-Final)

Grant Probability: 80% (Favorable)
OA Rounds: 1-2
To Grant: 3y 3m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 80%, above average (156 granted / 194 resolved; +28.4% vs TC avg)
Interview Lift: +18.2% (strong; resolved cases with interview)
Avg Prosecution: 3y 3m (typical timeline)
Currently Pending: 39
Total Applications: 233 (career history, across all art units)

Statute-Specific Performance

§101: 9.1% (-30.9% vs TC avg)
§103: 62.0% (+22.0% vs TC avg)
§102: 14.1% (-25.9% vs TC avg)
§112: 7.4% (-32.6% vs TC avg)

Tech Center averages are estimates. Based on career data from 194 resolved cases.
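The headline figures above follow from the raw counts. A quick arithmetic check (a sketch only; the +28.4% delta is taken from the report, so the derived Tech Center average is an implied estimate, not a reported value):

```python
# Recompute the examiner's career allowance rate from the raw counts
# shown in the report, and the Tech Center average implied by the delta.
granted = 156
resolved = 194

allow_rate = granted / resolved          # career allow rate
implied_tc_avg = allow_rate - 0.284      # report states +28.4% vs TC avg

print(f"Career allow rate: {allow_rate:.1%}")    # 80.4%, shown as 80% in the report
print(f"Implied TC average: {implied_tc_avg:.1%}")
```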

Office Action

§103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims

This is in response to applicant's filing date of August 5, 2024. Claims 1-20 are currently pending.

Priority

Acknowledgment is made of applicant's claim for foreign priority to Application KR10-2024-0043456, filed on March 29, 2024. The certified copy of the application as required by 37 CFR 1.55 has been received.

Information Disclosure Statement

The information disclosure statement (IDS) submitted on August 5, 2024, is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Claim Rejections -- 35 U.S.C. § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1-20 are rejected under 35 U.S.C.
103 as being unpatentable over Sergiy Vasylyev (US-20240412720-A1) ("Vasylyev") and TORISAWA et al. (US-20240265200-A) ("Torisawa").

[Greyscale reproduction of Vasylyev Figure 7 omitted.]

As per claim 1, Vasylyev discloses a robot control apparatus (Figure 7 (above) and Figure 1) comprising: at least one processor (Vasylyev at Figure 1, processor 122); and a storage medium storing computer-readable instructions that, when executed by the at least one processor, enable the at least one processor (Vasylyev at Figure 1, RAM 124 and system memory 118, and at least Para. [0661], which discloses the processing of instructions to cause the processor to perform programmed functions: "processor 122 is configured to execute instructions to process this voice input using the transformer-based language model for contextual understanding. The high-priority tag may then be associated with the task's representation within contextual memory unit 116.") to: obtain a feature vector for providing a target service to a user according to an input sentence based on identifying the input sentence including requirements of the user (Vasylyev at Para. [0206] discloses the vectorization of an input sentence to obtain a target feature of a service requested by the user: "embeddings may be high-dimensional vector representations that capture the semantic meaning and relationships between the words and phrases in the context. The system uses these embeddings as input to a deep neural network, such as a recurrent neural network (RNN) or a long short-term memory (LSTM) network, which is trained to predict the most likely next user request based on the conversation history and context."), and provide the target service that is paired with a target vector and that includes a specific service according to the input sentence (Vasylyev at Para. [0103] discloses the processing of audio so that the assistant (robot) 2 can provide the requested service: "identifying the active user, audio processing unit 125 enables assistant system 2 to provide personalized responses and services tailored to the individual user's preferences and context."), based on the target vector being determined through the candidate score of the candidate vector (Vasylyev at Figure 3, process for contextual understanding 838, and at Para. [0102], disclosing that the utterance is further processed, such as by natural language models, to customize the response to a request by the user: "the contextual memory unit 116, which can maintain separate conversation histories and contextual data for each identified user. Furthermore, the user identification results can be passed to processor 122, which can then utilize user-specific language models, knowledge bases, and response generation strategies to ensure that the system's conversational responses are customized and relevant to the identified user.").

While disclosing the use of a database primarily for ascertaining a user profile from a user profile database within the non-volatile system memory unit (see Para. [0012]), Vasylyev does not explicitly disclose a process to obtain a candidate score of a candidate vector based on the feature vector and the candidate vector stored in a database. Torisawa, in the same field of endeavor of answering a user's questions through a conversation device, teaches a dialogue apparatus implemented by a neural network designed to generate an output sentence for an input sentence in a natural language, by using the training data samples stored in the training data preparing means. In particular, Torisawa discloses a process to obtain a candidate score of a candidate vector based on the feature vector and the candidate vector stored in a database (Torisawa at Para.
[0091] discloses comparing an utterance (sentence) with a prior stored utterance from the user to determine a closeness, which is a score representing the level of relatedness: "Utterance selector 350 has a neural network that is pre-trained to receive a word vector sequence obtained by concatenating a user input 102 and a past user input stored in user input storage 346 with a prescribed separator token and to output a value representing the degree of relatedness between these.").

Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Vasylyev further in view of Torisawa to allow for finding a match, from a database, between a query and a question/answer pair, and answering the query with the matching answer. Motivation to do so would be to reduce the negative effects of service waiting time in a human-machine interaction by providing a more natural interaction during the service waiting time (Torisawa at Para. [0018]).

As per claim 2, Vasylyev and Torisawa disclose the apparatus of claim 1, wherein the instructions further enable the at least one processor to: translate an input language of the input sentence by translating the input language of the input sentence into a target language in response to the input language of the input sentence not being the target language (Vasylyev at Para. [0045] discloses input translation when the language of the user is different from the target language: "Assistant system 2 may be configured to provide on-the-fly translation between languages to support meeting participants that speak different languages. This feature can be particularly advantageous in international meetings, where language barriers might hinder effective communication, decision-making, and collaboration."); obtain at least one input sentence keyword from the input sentence by removing a stopword of the input sentence (Vasylyev at Para. [0275] discloses processing the input vector so as to prune the input, which under the broadest reasonable interpretation is removal of stopwords: "the model might use techniques like dimensionality reduction, quantization, or pruning to reduce the size of the context vectors without significantly compromising their ability to represent the conversation context."); and obtain a target keyword of the input sentence from a first service table based on the at least one input sentence keyword and the first service table regarding synonyms mapping (Vasylyev at Para. [0146] discloses the use of a similarity word, which under the broadest reasonable interpretation is a synonym word, together with keywords to obtain a target keyword: "retrieval process may employ a hybrid approach combining semantic similarity matching and keyword-based retrieval. The AI Assistant may be configured to use advanced natural language processing techniques, such as word embeddings and transformer-based models, to encode the search query and the documents in the knowledge base into dense vector representations. The AI Assistant may then perform a similarity search to find the most relevant documents based on their semantic proximity to the query.").

As per claim 3, Vasylyev and Torisawa disclose the apparatus of claim 2, wherein the instructions further enable the at least one processor to: obtain a guidance sentence corresponding to the target keyword based on a second service table regarding service mapping (Vasylyev at Para. [0421] discloses providing guidance to a user based on the parsing of an input utterance: "navigation system, powered by mapping technologies, calculates the optimal route from the user's current location to the restaurant, considering real-time traffic conditions and road restrictions.
Assistant system 2 then provides turn-by-turn voice guidance to the user, keeping the user informed about the estimated time of arrival and any relevant updates."); and obtain the feature vector by applying the guidance sentence to a feature extraction model trained to extract a feature of a given sentence (Vasylyev at Figure 3, control signal detection 852 and command processing 860, and Para. [0603], disclosing extracting a feature from a sentence to generate a guidance action like navigating to a particular location: "multimodal fusion module 310 may employ an early fusion technique where the module concatenates the feature vectors extracted from the different modalities at the input level and creates a single multimodal representation that is then processed by the subsequent layers of the model. Early fusion may be particularly suitable for scenarios where the different modalities are tightly coupled and synchronized, such as audio-visual speech recognition, for example.").

As per claim 4, Vasylyev and Torisawa disclose the apparatus of claim 1, wherein the instructions further enable the at least one processor to: obtain a token by performing word-tokenization from a corpus including documents including at least one sentence (Vasylyev at Para. [0037] discloses the tokenization of input words: "the tokenization process, the model used by the Assistant receives a textual representation of the conversation as input and breaks down the received text into a sequence of tokens. Each token is then converted into a high-dimensional vector using the model's learned embedding layer. This layer is configured to act as a lookup table that assigns each unique token in the model's vocabulary to a specific vector."); determine a first frequency value of the token regarding a term frequency, at which the token is included in the corpus, based on the corpus (Vasylyev at Para. [0200] discloses a frequency for a term: "the tokenization process may include splitting words into individual characters, counting the frequency of each pair of characters (or character sequences) in the text, merging the most frequently occurring pair to create a new token, and repeating the process until a pre-defined number of tokens is reached, or the most frequent pairs are too infrequent."); determine a second frequency value of the token regarding an inverse document frequency, at which the token is included in the documents, based on the corpus (Vasylyev at Para. [0079] discloses a second frequency based on an inverse document frequency: "it may extract Term Frequency-Inverse Document Frequency (TF-IDF) features, or use word embeddings."); determine a target weight of the token based on the first frequency value and the second frequency value (Vasylyev at Para. [0112] discloses applying and determining weights to a token based on the determined frequency: "assistant system 2 may assign weights to each normalized factor based on their relative importance. These weights can be adjusted dynamically based on the current context and user behavior. For example: w_user_preference=0.4; w_memory=0.3; w_processor_speed=0.2; and w_latency=0.1. The weights may be chosen so that they sum up to 1."); and determine the candidate vector of a given sentence including the token based on the target weight of the token (Vasylyev at Para. [0113] discloses using the vector determined from the weighting of the token: "assistant system 2 may calculate the weighted average of the normalized factors to obtain the overall utility score.").

As per claim 5, Vasylyev and Torisawa disclose the apparatus of claim 1, wherein the instructions further enable the at least one processor to obtain the candidate score of the candidate vector by applying the feature vector and the candidate vector to a score calculation model (Vasylyev at Para.
[0120] discloses applying a normalized factor to score the candidate vector: "assistant system 2 normalizes the factors (values shown are post-normalization) as follows."), wherein the score calculation model is trained to extract a similarity score related to similarity based on a Euclidean scalar product (Vasylyev at Para. [0146] discloses applying a similarity search: "AI Assistant may then perform a similarity search to find the most relevant documents based on their semantic proximity to the query."; and at Para. [0209] discloses that the similarity search can be based on a Euclidean distance: "embedding of the actual user request and compares it with the predicted request embeddings stored in the cache memory using a similarity metric, such as cosine similarity or Euclidean distance.").

As per claim 6, Vasylyev and Torisawa disclose the apparatus of claim 1, wherein the instructions further enable the at least one processor to: identify at least one database vector from the database in which the candidate vector is stored (Vasylyev at Para. [0285] discloses analyzing a conversation to identify attributes that are stored and used in later interactions: "analyzing the topics, sentiment, and language style of a user's conversations, assistant system 2 can identify their interests, emotional tendencies, and communication preferences. These inferred attributes are then incorporated into the user model to inform future personalization decisions."); obtain a database vector score of the at least one database vector based on the feature vector and the at least one database vector (Vasylyev at Para. [0741] discloses scoring the feature and candidate to determine a confidence: "robot may be configured to assign confidence scores to its speech recognition and language understanding outputs, indicating the level of certainty in its interpretations. If the confidence score falls below a predefined threshold, the robot can seek clarification from the user or provide alternative suggestions."); and determine the target vector based on the database vector score of the at least one database vector and a threshold score (Vasylyev at Para. [0741] discloses the use of a threshold to determine a confidence that there is a match: "confidence score falls below a predefined threshold, the robot can seek clarification from the user or provide alternative suggestions.").

As per claim 7, Vasylyev and Torisawa disclose the apparatus of claim 6, wherein the instructions further enable the at least one processor to: determine an output vector group that exceeds the threshold score and that includes the target vector, by comparing the database vector score of the at least one database vector with the threshold score (Vasylyev at Para. [0212] discloses providing an output to a request that meets a confidence level: "Assistant system 2 may further incorporate uncertainty estimation techniques to quantify the confidence of its predicted requests and pre-generated responses. This may allow assistant system 2 to prioritize the delivery of high-confidence responses and to prompt the user for clarification or additional information when the confidence is low. Assistant system 2 may further employ contextual pruning techniques to filter out predicted requests and pre-generated responses that are not relevant to the current conversation context."); and provide a vector-related service paired with each vector included in the output vector group (Vasylyev at Para. [0212], providing a service to a user based on the confidence of the request: "considering factors such as the topic, tone, and user's intent, the system can eliminate unnecessary or inappropriate responses and focus on delivering the most pertinent information to the user." See Para.
[0467]-[0471] for application as an eldercarebot that interacts with the user and provides answers and actions based on speech recognition.).

As per claim 8, Vasylyev and Torisawa disclose the apparatus of claim 1, wherein the instructions further enable the at least one processor to: obtain an additional feature vector from an additional input sentence based on identifying the additional input sentence including additional requirements of the user after identifying the input sentence (Vasylyev at Para. [0473] discloses that a robot can be trained for multiple commands, which would require the processing of additional features: "ElderlyCareBot may be configured for multimodal input/output. For example, it may be equipped with a camera (which can be exemplified by camera 168) for capturing video and further provided with the ability to interpret the captured video to further enhance its interaction with Mr. Smith. The AI engine of the ElderlyCareBot may be trained to recognize and contextually interpret certain events that can be translated into actions useful for Mr. Smith."); and provide a vector-related service that is paired with the target vector and that is according to the additional input sentence (Vasylyev at Para. [0103] discloses the processing of audio so that the assistant (robot) 2 can provide the requested service: "identifying the active user, audio processing unit 125 enables assistant system 2 to provide personalized responses and services tailored to the individual user's preferences and context."), based on the target vector being determined through the candidate score of the candidate vector (Vasylyev at Figure 3, process for contextual understanding 838, and at Para. [0102], disclosing that the utterance is further processed, such as by natural language models, to customize the response to a request by the user: "the contextual memory unit 116, which can maintain separate conversation histories and contextual data for each identified user. Furthermore, the user identification results can be passed to processor 122, which can then utilize user-specific language models, knowledge bases, and response generation strategies to ensure that the system's conversational responses are customized and relevant to the identified user.").

While disclosing the use of a database primarily for ascertaining a user profile from a user profile database within the non-volatile system memory unit (see Para. [0012]), Vasylyev does not explicitly disclose a process to obtain a candidate score of a candidate vector based on the feature vector and the candidate vector stored in a database. Torisawa, in the same field of endeavor of answering a user's questions through a conversation device, teaches a dialogue apparatus implemented by a neural network designed to generate an output sentence for an input sentence in a natural language, by using the training data samples stored in the training data preparing means. In particular, Torisawa discloses a process to obtain a candidate score of a candidate vector based on the feature vector and the candidate vector stored in a database (Torisawa at Para. [0091] discloses comparing an utterance (sentence) with a prior stored utterance from the user to determine a closeness, which is a score representing the level of relatedness: "Utterance selector 350 has a neural network that is pre-trained to receive a word vector sequence obtained by concatenating a user input 102 and a past user input stored in user input storage 346 with a prescribed separator token and to output a value representing the degree of relatedness between these."). Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Vasylyev further in view of Torisawa to allow for finding a match, from a database, between a query and a question/answer pair, and answering the query with the matching answer.
Motivation to do so would be to reduce the negative effects of service waiting time in a human-machine interaction by providing a more natural interaction during the service waiting time (Torisawa at Para. [0018]).

As per claim 9, Vasylyev and Torisawa disclose the apparatus of claim 1, wherein the instructions further enable the at least one processor to store a vector-related service that is paired with the feature vector and that is according to the input sentence, in the database by pairing the vector-related service according to the input sentence with the feature vector (Vasylyev at Para. [0584] discloses storing and retrieving information so that it can serve as vector-related services: "assistant system 2 may be configured to identify and store information that is or can potentially be relevant to the current conversation, but storing which in the main context window immediately available to the LLM is not necessary or practical. For example, this may include facts, experiences, preferences, etc., about the user(s) or assistant system 2 that are of less priority than the other information present in the main context window and storing which would require a size of the main context window which is either beyond what is available or what is optimal.").

As per claim 10, Vasylyev discloses a robot control method (Figure 2), the method comprising: obtaining a feature vector for providing a target service to a user according to an input sentence based on identifying the input sentence including requirements of the user (Vasylyev at Para. [0206] discloses the vectorization of an input sentence to obtain a target feature of a service requested by the user: "embeddings may be high-dimensional vector representations that capture the semantic meaning and relationships between the words and phrases in the context. The system uses these embeddings as input to a deep neural network, such as a recurrent neural network (RNN) or a long short-term memory (LSTM) network, which is trained to predict the most likely next user request based on the conversation history and context."), and providing the target service that is paired with a target vector and that includes a specific service according to the input sentence (Vasylyev at Para. [0103] discloses the processing of audio so that the assistant (robot) 2 can provide the requested service: "identifying the active user, audio processing unit 125 enables assistant system 2 to provide personalized responses and services tailored to the individual user's preferences and context."), based on the target vector being determined through the candidate score of the candidate vector (Vasylyev at Figure 3, process for contextual understanding 838, and at Para. [0102], disclosing that the utterance is further processed, such as by natural language models, to customize the response to a request by the user: "the contextual memory unit 116, which can maintain separate conversation histories and contextual data for each identified user. Furthermore, the user identification results can be passed to processor 122, which can then utilize user-specific language models, knowledge bases, and response generation strategies to ensure that the system's conversational responses are customized and relevant to the identified user.").

While disclosing the use of a database primarily for ascertaining a user profile from a user profile database within the non-volatile system memory unit (see Para. [0012]), Vasylyev does not explicitly disclose a process of obtaining a candidate score of a candidate vector based on the feature vector and the candidate vector stored in a database. Torisawa, in the same field of endeavor of answering a user's questions through a conversation device, teaches a dialogue apparatus implemented by a neural network designed to generate an output sentence for an input sentence in a natural language, by using the training data samples stored in the training data preparing means. In particular, Torisawa discloses a process to obtain a candidate score of a candidate vector based on the feature vector and the candidate vector stored in a database (Torisawa at Para. [0091] discloses comparing an utterance (sentence) with a prior stored utterance from the user to determine a closeness, which is a score representing the level of relatedness: "Utterance selector 350 has a neural network that is pre-trained to receive a word vector sequence obtained by concatenating a user input 102 and a past user input stored in user input storage 346 with a prescribed separator token and to output a value representing the degree of relatedness between these."). Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Vasylyev further in view of Torisawa to allow for finding a match, from a database, between a query and a question/answer pair, and answering the query with the matching answer. Motivation to do so would be to reduce the negative effects of service waiting time in a human-machine interaction by providing a more natural interaction during the service waiting time (Torisawa at Para. [0018]).

As per claim 11, Vasylyev and Torisawa disclose the method of claim 10, wherein the obtaining of the feature vector includes: translating an input language of the input sentence by translating the input language of the input sentence into a target language in response to the input language of the input sentence not being the target language (Vasylyev at Para. [0045] discloses input translation when the language of the user is different from the target language: "Assistant system 2 may be configured to provide on-the-fly translation between languages to support meeting participants that speak different languages. This feature can be particularly advantageous in international meetings, where language barriers might hinder effective communication, decision-making, and collaboration."); obtaining at least one input sentence keyword from the input sentence by removing a stopword of the input sentence (Vasylyev at Para. [0275] discloses processing the input vector so as to prune the input, which under the broadest reasonable interpretation is removal of stopwords: "the model might use techniques like dimensionality reduction, quantization, or pruning to reduce the size of the context vectors without significantly compromising their ability to represent the conversation context."); and obtaining a target keyword of the input sentence from a first service table based on the at least one input sentence keyword and the first service table regarding synonyms mapping (Vasylyev at Para. [0146] discloses the use of a similarity word, which under the broadest reasonable interpretation is a synonym word, together with keywords to obtain a target keyword: "retrieval process may employ a hybrid approach combining semantic similarity matching and keyword-based retrieval. The AI Assistant may be configured to use advanced natural language processing techniques, such as word embeddings and transformer-based models, to encode the search query and the documents in the knowledge base into dense vector representations. The AI Assistant may then perform a similarity search to find the most relevant documents based on their semantic proximity to the query.").
As per claim 12, Vasylyev and Torisawa disclose a method of claim 11, wherein the obtaining of the feature vector includes: obtaining a guidance sentence corresponding to the target keyword based on a second service table regarding service mapping (Vasylyev at Para. [0421] discloses providing guidance to a user based on the parsing of an input utterance:” navigation system, powered by mapping technologies, calculates the optimal route from the user's current location to the restaurant, considering real-time traffic conditions and road restrictions. Assistant system 2 then provides turn-by-turn voice guidance to the user, keeping the user informed about the estimated time of arrival and any relevant updates.”); and obtaining the feature vector by applying the guidance sentence to a feature extraction model trained to extract a feature of a given sentence (Vasylyev at Figure 3, control signal detection 852 and command processing 860, and Para. [0603] disclosing extracting a feature from a sentence to generate a guidance action like navigating to a particular location:” multimodal fusion module 310 may employ an early fusion technique where the module concatenates the feature vectors extracted from the different modalities at the input level and creates a single multimodal representation that is then processed by the subsequent layers of the model. Early fusion may be particularly suitable for scenarios where the different modalities are tightly coupled and synchronized, such as audio-visual speech recognition, for example”.). As per claim 13, Vasylyev and Torisawa disclose a method of claim 10, wherein the obtaining of the candidate score of the candidate vector includes: obtaining a token by performing word-tokenization from a corpus including documents including at least one sentence (Vasylyev at Para. 
[0037] discloses the tokenization of input words:” the tokenization process, the model used by the Assistant receives a textual representation of the conversation as input and breaks down the received text into a sequence of tokens. Each token is then converted into a high-dimensional vector using the model's learned embedding layer. This layer is configured to act as a lookup table that assigns each unique token in the model's vocabulary to a specific vector.”); determining a first frequency value of the token regarding a term frequency, at which the token is included in the corpus, based on the corpus (Vasylyev at Para. [0200] discloses a frequency for a term:” the tokenization process may include splitting words into individual characters, counting the frequency of each pair of characters (or character sequences) in the text, merging the most frequently occurring pair to create a new token, and repeating the process until a pre-defined number of tokens is reached, or the most frequent pairs are too infrequent.”); determining a second frequency value of the token regarding an inverse document frequency, at which the token is included in the documents, based on the corpus (Vasylyev at Para. [0079] discloses a second frequency based on an inverse document frequency:” it may extract Term Frequency-Inverse Document Frequency (TF-IDF) features, or use word embeddings.”); determining a target weight of the token based on the first frequency value and the second frequency value (Vasylyev at Para. [0112] discloses applying and determining weights to a token based the determined frequency:” assistant system 2 may assign weights to each normalized factor based on their relative importance. These weights can be adjusted dynamically based on the current context and user behavior. For example: w_user_preference=0.4; w_memory=0.3; w_processor_speed=0.2; and w_latency=0.1. 
The weights may be chosen so that they sum up to 1."); and determining the candidate vector of a given sentence including the token based on the target weight of the token (Vasylyev at Para. [0113] discloses using the vector determined from the weighted token: "assistant system 2 may calculate the weighted average of the normalized factors to obtain the overall utility score.").

As per claim 14, Vasylyev and Torisawa disclose the method of claim 10, wherein the instructions further enable the at least one processor to obtain the candidate score of the candidate vector by applying the feature vector and the candidate vector to a score calculation model (Vasylyev at Para. [0120] discloses applying a normalized factor to score the candidate vector: "assistant system 2 normalizes the factors (values shown are post-normalization) as follows."), wherein the score calculation model is trained to extract a similarity score related to similarity based on Euclidean scalar product (Vasylyev at Para. [0146] discloses applying a similarity search: "AI Assistant may then perform a similarity search to find the most relevant documents based on their semantic proximity to the query."; and at Para. [0209] discloses that the similarity search can be based on a Euclidean distance: "embedding of the actual user request and compares it with the predicted request embeddings stored in the cache memory using a similarity metric, such as cosine similarity or Euclidean distance.").

As per claim 15, Vasylyev and Torisawa disclose the method of claim 10, wherein the providing of the target service includes: identifying at least one database vector from the database in which the candidate vector is stored (Vasylyev at Para.
[0285] discloses analyzing a conversation to identify attributes that are stored and used in later interactions: "analyzing the topics, sentiment, and language style of a user's conversations, assistant system 2 can identify their interests, emotional tendencies, and communication preferences. These inferred attributes are then incorporated into the user model to inform future personalization decisions."); obtaining a database vector score of the at least one database vector based on the feature vector and the at least one database vector (Vasylyev at Para. [0741] discloses scoring the feature and candidate to determine a confidence: "robot may be configured to assign confidence scores to its speech recognition and language understanding outputs, indicating the level of certainty in its interpretations. If the confidence score falls below a predefined threshold, the robot can seek clarification from the user or provide alternative suggestions."); and determining the target vector based on the database vector score of the at least one database vector and a threshold score (Vasylyev at Para. [0741] discloses the use of a threshold to determine a confidence that there is a match: "confidence score falls below a predefined threshold, the robot can seek clarification from the user or provide alternative suggestions.").

As per claim 16, Vasylyev and Torisawa disclose the method of claim 15, wherein the providing of the target service includes: determining an output vector group that exceeds the threshold score and that includes the target vector, by comparing the database vector score of the at least one database vector with the threshold score (Vasylyev at Para. [0212] discloses providing an output to a request that meets a confidence level: "Assistant system 2 may further incorporate uncertainty estimation techniques to quantify the confidence of its predicted requests and pre-generated responses.
This may allow assistant system 2 to prioritize the delivery of high-confidence responses and to prompt the user for clarification or additional information when the confidence is low. Assistant system 2 may further employ contextual pruning techniques to filter out predicted requests and pre-generated responses that are not relevant to the current conversation context."); and providing a vector-related service paired with each vector included in the output vector group (Vasylyev at Para. [0212], providing a service to a user based on the confidence of the request: "considering factors such as the topic, tone, and user's intent, the system can eliminate unnecessary or inappropriate responses and focus on delivering the most pertinent information to the user." See Para. [0467]-[0471] for application as an ElderlyCareBot that interacts with the user and provides answers and actions based on speech recognition.).

As per claim 17, Vasylyev and Torisawa disclose the method of claim 10, wherein the providing of the target service includes: obtaining an additional feature vector from an additional input sentence based on identifying the additional input sentence including additional requirements of the user after identifying the input sentence (Vasylyev at Para. [0473] discloses that a robot can be trained for multiple commands, which would require the processing of additional features: "ElderlyCareBot may be configured for multimodal input/output. For example, it may be equipped with a camera (which can be exemplified by camera 168) for capturing video and further provided with the ability to interpret the captured video to further enhance its interaction with Mr. Smith. The AI engine of the ElderlyCareBot may be trained to recognize and contextually interpret certain events that can be translated into actions useful for Mr.
Smith."); and providing a vector-related service that is paired with the target vector and that is according to the additional input sentence (Vasylyev at Para. [0103] discloses the processing of audio so that the assistant (robot) 2 can provide the requested service: "identifying the active user, audio processing unit 125 enables assistant system 2 to provide personalized responses and services tailored to the individual user's preferences and context."), based on the target vector being determined through the candidate score of the candidate vector (Vasylyev at Figure 3, process for contextual understanding 838, and at Para. [0102], disclosing that the utterance is further processed, such as by natural language models, to customize the response to a request by the user: "the contextual memory unit 116, which can maintain separate conversation histories and contextual data for each identified user. Furthermore, the user identification results can be passed to processor 122, which can then utilize user-specific language models, knowledge bases, and response generation strategies to ensure that the system's conversational responses are customized and relevant to the identified user.").

While disclosing the use of a database primarily for ascertaining a user profile from a user profile database within the non-volatile system memory unit, see Para. [0012], Vasylyev does not explicitly disclose a process for obtaining a candidate score of a candidate vector based on the feature vector and the candidate vector stored in a database. Torisawa, in the same field of endeavor of answering users' questions through a conversation device, teaches a dialogue apparatus implemented by a neural network designed to generate an output sentence in response to an input sentence in a natural language, using the training data samples stored in the training data preparing means.
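The candidate-vector computation mapped for claim 13 above combines a term frequency (first frequency value) and an inverse document frequency (second frequency value) into a per-token target weight. A minimal sketch of that weighting, assuming a plain TF-IDF formulation (the function and variable names are illustrative only and do not come from either reference):

```python
import math
from collections import Counter

def tf_idf_weights(corpus: list[list[str]], doc_index: int) -> dict[str, float]:
    """Per-token TF-IDF weights for one tokenized document in a corpus.

    tf  = raw count of the token in the document (first frequency value)
    idf = log(N / df), where df is the number of documents in the corpus
          that contain the token (second frequency value)
    The target weight of a token is tf * idf.
    """
    doc = corpus[doc_index]
    tf = Counter(doc)
    n_docs = len(corpus)
    weights = {}
    for token in tf:
        df = sum(1 for d in corpus if token in d)
        weights[token] = tf[token] * math.log(n_docs / df)
    return weights

# Toy corpus of three tokenized "documents" (hypothetical utterances).
corpus = [
    ["guide", "me", "to", "the", "restaurant"],
    ["guide", "me", "home"],
    ["play", "music"],
]
w = tf_idf_weights(corpus, 0)
# "restaurant" appears in only one of the three documents, so it is
# weighted more heavily than "me", which appears in two.
```

A candidate vector for a sentence can then be formed from these weights, e.g. as a weighted average of token embeddings, which parallels Vasylyev's weighted-average utility score at Para. [0113].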
In particular, Torisawa discloses a process for obtaining a candidate score of a candidate vector based on the feature vector and the candidate vector stored in a database (Torisawa at Para. [0091] discloses comparing an utterance (sentence) with a prior stored utterance from the user to determine a closeness, which is a score representing the level of relatedness: "Utterance selector 350 has a neural network that is pre-trained to receive a word vector sequence obtained by concatenating a user input 102 and a past user input stored in user input storage 346 with a prescribed separator token and to output a value representing the degree of relatedness between these.").

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Vasylyev further in view of Torisawa to allow for finding a match, from a database, between a query and a question/answer pair, and answering the query with the matching answer. Motivation to do so would be to reduce the negative effects of service waiting time in a human-machine interaction by providing a more natural interaction during the service waiting time (Torisawa at Para. [0018]).

As per claim 18, Vasylyev and Torisawa disclose the method of claim 10, wherein the instructions further enable the at least one processor to store a vector-related service that is paired with the feature vector and that is according to the input sentence, in the database by pairing the vector-related service according to the input sentence with the feature vector (Vasylyev at Para. [0584] discloses storing and retrieving information that can serve as vector-related services: "assistant system 2 may be configured to identify and store information that is or can potentially be relevant to the current conversation, but storing which in the main context window immediately available to the LLM is not necessary or practical.
For example, this may include facts, experiences, preferences, etc., about the user(s) or assistant system 2 that are of less priority than the other information present in the main context window and storing which would require a size of the main context window which is either beyond what is available or what is optimal.").

As per claim 19, Vasylyev discloses a robot control method (Figure 2), the method comprising: translating an input language of the input sentence by translating the input language of the input sentence into a target language in response to the input language of the input sentence not being the target language (Vasylyev at Para. [0045] discloses input translation when the language of the user is different from the target language: "Assistant system 2 may be configured to provide on-the-fly translation between languages to support meeting participants that speak different languages. This feature can be particularly advantageous in international meetings, where language barriers might hinder effective communication, decision-making, and collaboration."); obtaining a feature vector for providing a target service to a user according to an input sentence based on identifying the input sentence including requirements of the user (Vasylyev at Para. [0206] discloses the vectorization of an input sentence to obtain a target feature of a service requested by the user: "embeddings may be high-dimensional vector representations that capture the semantic meaning and relationships between the words and phrases in the context. The system uses these embeddings as input to a deep neural network, such as a recurrent neural network (RNN) or a long short-term memory (LSTM) network, which is trained to predict the most likely next user request based on the conversation history and context."); and providing the target service that is paired with a target vector and that includes a specific service according to the input sentence (Vasylyev at Para.
[0103] discloses the processing of audio so that the assistant (robot) 2 can provide the requested service: "identifying the active user, audio processing unit 125 enables assistant system 2 to provide personalized responses and services tailored to the individual user's preferences and context."), based on the target vector being determined through the candidate score of the candidate vector (Vasylyev at Figure 3, process for contextual understanding 838, and at Para. [0102], disclosing that the utterance is further processed, such as by natural language models, to customize the response to a request by the user: "the contextual memory unit 116, which can maintain separate conversation histories and contextual data for each identified user. Furthermore, the user identification results can be passed to processor 122, which can then utilize user-specific language models, knowledge bases, and response generation strategies to ensure that the system's conversational responses are customized and relevant to the identified user.").

While disclosing the use of a database primarily for ascertaining a user profile from a user profile database within the non-volatile system memory unit, see Para. [0012], Vasylyev does not explicitly disclose a process for obtaining a candidate score of a candidate vector based on the feature vector and the candidate vector stored in a database. Torisawa, in the same field of endeavor of answering users' questions through a conversation device, teaches a dialogue apparatus implemented by a neural network designed to generate an output sentence in response to an input sentence in a natural language, using the training data samples stored in the training data preparing means. In particular, Torisawa discloses a process for obtaining a candidate score of a candidate vector based on the feature vector and the candidate vector stored in a database (Torisawa at Para.
[0091] discloses comparing an utterance (sentence) with a prior stored utterance from the user to determine a closeness, which is a score representing the level of relatedness: "Utterance selector 350 has a neural network that is pre-trained to receive a word vector sequence obtained by concatenating a user input 102 and a past user input stored in user input storage 346 with a prescribed separator token and to output a value representing the degree of relatedness between these.").

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Vasylyev further in view of Torisawa to allow for finding a match, from a database, between a query and a question/answer pair, and answering the query with the matching answer. Motivation to do so would be to reduce the negative effects of service waiting time in a human-machine interaction by providing a more natural interaction during the service waiting time (Torisawa at Para. [0018]).

As per claim 20, Vasylyev and Torisawa disclose the method of claim 19, wherein the providing of the target service includes: identifying at least one database vector from the database in which the candidate vector is stored (Vasylyev at Para. [0285] discloses analyzing a conversation to identify attributes that are stored and used in later interactions: "analyzing the topics, sentiment, and language style of a user's conversations, assistant system 2 can identify their interests, emotional tendencies, and communication preferences. These inferred attributes are then incorporated into the user model to inform future personalization decisions."); obtaining a database vector score of the at least one database vector based on the feature vector and the at least one database vector (Vasylyev at Para.
[0741] discloses scoring the feature and candidate to determine a confidence: "robot may be configured to assign confidence scores to its speech recognition and language understanding outputs, indicating the level of certainty in its interpretations. If the confidence score falls below a predefined threshold, the robot can seek clarification from the user or provide alternative suggestions."); and determining the target vector based on the database vector score of the at least one database vector and a threshold score (Vasylyev at Para. [0741] discloses the use of a threshold to determine a confidence that there is a match: "confidence score falls below a predefined threshold, the robot can seek clarification from the user or provide alternative suggestions."); determining an output vector group that exceeds the threshold score and that includes the target vector, by comparing the database vector score of the at least one database vector with the threshold score (Vasylyev at Para. [0212] discloses providing an output to a request that meets a confidence level: "Assistant system 2 may further incorporate uncertainty estimation techniques to quantify the confidence of its predicted requests and pre-generated responses. This may allow assistant system 2 to prioritize the delivery of high-confidence responses and to prompt the user for clarification or additional information when the confidence is low. Assistant system 2 may further employ contextual pruning techniques to filter out predicted requests and pre-generated responses that are not relevant to the current conversation context."); and providing a vector-related service paired with each vector included in the output vector group (Vasylyev at Para.
[0212], providing a service to a user based on the confidence of the request: "considering factors such as the topic, tone, and user's intent, the system can eliminate unnecessary or inappropriate responses and focus on delivering the most pertinent information to the user." See Para. [0467]-[0471] for application as an ElderlyCareBot that interacts with the user and provides answers and actions based on speech recognition.).

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:

BACHRACH et al (US-20190236155-A1) discloses processing text data that is exchanged between computing devices and providing feedback for a conversational agent, where the conversational agent uses a predictive model to compute responses.

Wang et al (US-20190163985-A1) discloses deploying a frontend system (FIES); activating input streams from on-site cameras located at a current deployment location of the FIES; and automatically generating a first product recommendation, including: if the inspection event meets enhanced inspection criteria, which are met when a second inspection event of the first user exists in previously stored inspection events associated with the respective first sample product, automatically adding a product-specific description of the first sample product in the first product recommendation.

Tian et al (CN-109408622-A) discloses a process for determining the order of the different candidate answers relative to the input sentence according to an ordering result, selecting the candidate answers that satisfy the ordering condition from the candidate answers, filling the selected candidate answers into the corresponding slot positions of the combined named entity, and marking them as the expression command of the input sentence.
Nobuo Yamato (US-20180085928-A1) discloses a robot that recognizes an emotion of a user and controls an operation in a conversation by exchanging messages between the user and the robot. The robot is able to communicate with a mobile terminal carried by the user and includes a storage unit configured to store, in advance, a plurality of applications to control motions of the robot.

Hashimoto et al (US-20160357854-A1) discloses an apparatus and method for collecting elements as a basis for generating a social scenario useful for people to make well-balanced, good decisions.

CHO et al (US-20160164815-A1) discloses a terminal device that includes a communication interface configured to perform communication with an external device; a display configured to display, as chatting begins, chatting messages sent and received through the communication interface; a storage; and a processor configured to classify the chatting messages into a plurality of dialogue sessions, store keywords defining the respective classified dialogue sessions on the storage, and provide the dialogue session matching at least one keyword through the display when an event associated with at least one keyword among the plurality of keywords stored on the storage occurs.

Shiina et al (US-20130115578-A1) discloses a method of performing sign language by a robot, and more specifically, a sign language action generating device which generates a sign language action by connecting each sign language action corresponding to a word used in sign language by a robot, and a communication robot equipped with the sign language action generating device.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ELLIS B. RAMIREZ whose telephone number is (571) 272-8920. The examiner can normally be reached 7:30 am to 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Ramon Mercado, can be reached at 571-270-5744. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/ELLIS B. RAMIREZ/
Examiner, Art Unit 3658
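For reference, the score-and-threshold pattern that runs through the claim 14-16 and claim 20 mappings above (score each stored database vector against the feature vector, then keep the vectors whose score exceeds a threshold) reduces to a few lines. The sketch below uses cosine similarity, one of the two metrics Vasylyev's Para. [0209] names, built on the Euclidean scalar product recited in claim 14; all function and variable names are illustrative, not taken from either reference:

```python
import math

def dot(a: list[float], b: list[float]) -> float:
    # Euclidean scalar product of two vectors.
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # Scalar product normalized by the vectors' magnitudes.
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def output_vector_group(feature, database, threshold):
    """Return (index, score) for every database vector whose similarity
    score against the feature vector exceeds the threshold score."""
    scores = ((i, cosine_similarity(feature, v)) for i, v in enumerate(database))
    return [(i, s) for i, s in scores if s > threshold]

# Toy example: one feature vector scored against three stored vectors.
feature = [1.0, 0.0, 1.0]
database = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 0.1, 0.9]]
group = output_vector_group(feature, database, threshold=0.9)
# The orthogonal second vector scores 0 and is filtered out; the other
# two exceed the threshold and form the output vector group.
```

In the claimed arrangement, each vector surviving the threshold is paired with a vector-related service, so the returned indices would map directly to the services to provide.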

Prosecution Timeline

Aug 05, 2024
Application Filed
Feb 18, 2026
Non-Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12600034
Compensation of Positional Tolerances in the Robot-assisted Surface Machining
2y 5m to grant Granted Apr 14, 2026
Patent 12584758
VEHICLE DISPLAY DEVICE, VEHICLE DISPLAY PROCESSING METHOD, AND NON-TRANSITORY STORAGE MEDIUM
2y 5m to grant Granted Mar 24, 2026
Patent 12571639
SYSTEM AND METHOD FOR IDENTIFYING TRIP PAIRS
2y 5m to grant Granted Mar 10, 2026
Patent 12551302
CONTROLLING A SURGICAL INSTRUMENT
2y 5m to grant Granted Feb 17, 2026
Patent 12552018
INTEGRATING ROBOTIC PROCESS AUTOMATIONS INTO OPERATING AND SOFTWARE SYSTEMS
2y 5m to grant Granted Feb 17, 2026
Based on this examiner's 5 most recent grants.


Prosecution Projections

1-2
Expected OA Rounds
80%
Grant Probability
99%
With Interview (+18.2%)
3y 3m
Median Time to Grant
Low
PTA Risk
Based on 194 resolved cases by this examiner. Grant probability derived from career allow rate.
