Prosecution Insights
Last updated: April 19, 2026
Application No. 17/710,385

MULTI-FACETED BOT SYSTEM AND METHOD THEREOF

Status: Final Rejection (§103)
Filed: Mar 31, 2022
Examiner: STANLEY, JEREMY L
Art Unit: 2127
Tech Center: 2100 — Computer Architecture & Software
Assignee: Jio Platforms Limited
OA Round: 2 (Final)

Grant Probability: 48% (Moderate)
OA Rounds: 3-4
To Grant: 3y 2m
With Interview: 92%

Examiner Intelligence

Career Allow Rate: 48% (131 granted / 276 resolved; -7.5% vs TC avg)
Interview Lift: strong, +44.7% on resolved cases with an interview
Avg Prosecution: 3y 2m (typical timeline)
Career History: 304 total applications across all art units; 28 currently pending

Statute-Specific Performance

§101: 10.2% (-29.8% vs TC avg)
§103: 49.1% (+9.1% vs TC avg)
§102: 13.5% (-26.5% vs TC avg)
§112: 17.1% (-22.9% vs TC avg)

Tech Center averages are estimates. Based on career data from 276 resolved cases.
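As a sanity check on the figures above, the statute deltas and the career allow rate can be reproduced from the raw data. The 40% Tech Center average used below is back-derived from the listed deltas (e.g. 49.1% − 9.1% = 40.0%); it is an inference from this page, not an official USPTO statistic.

```python
# Reproduce the statute-specific deltas and career allow rate shown above.
# TC_AVG is back-derived from the listed deltas, not an official figure.
TC_AVG = 40.0  # implied Tech Center average rate (%)

statute_rates = {"101": 10.2, "103": 49.1, "102": 13.5, "112": 17.1}

def delta_vs_tc(rate: float, tc_avg: float = TC_AVG) -> float:
    """Signed difference between the examiner's rate and the TC average."""
    return round(rate - tc_avg, 1)

# Career allow rate from resolved cases: 131 granted out of 276 resolved,
# which is ~47.5% (displayed as 48% on the card above).
allow_rate = 131 / 276

for statute, rate in statute_rates.items():
    print(f"§{statute}: {rate}% ({delta_vs_tc(rate):+}% vs TC avg)")
```

Each computed delta matches the dashboard line for that statute, which suggests the page derives all four deltas from a single TC-wide baseline.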

Office Action (§103)
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This action is responsive to the Amendment filed on November 3, 2025. Claims 1, 2, 4, 6-8, 10, 11, 15-17, 21, 27, 28, 33, 34, and 37 are amended. Claims 1-37 are pending in the case. Claims 1 and 21 are the independent claims. This action is non-final.

Applicant's Response

In the response filed on November 3, 2025, Applicant amended the claims and provided arguments in response to the objections to the claims and rejections of the claims under 35 USC 102, 103, and 112 in the previous office action.

Response to Argument/Amendment

Applicant's amendments to the claims in response to the objections to the claims in the previous office action are acknowledged, and Applicant's associated arguments have been fully considered. The amendments to the claims render the objections moot. Therefore, the objections are withdrawn.

Applicant's amendments to the claims in response to the rejection of the claims under 35 USC 112 in the previous office action are acknowledged, and Applicant's associated arguments have been fully considered. The amendments to the claims render the rejections moot. Therefore, the rejection is withdrawn.

Applicant's amendments to the claims in response to the rejections of the claims under 35 USC 102 and 103 in the previous office action are acknowledged, and Applicant's associated arguments have been fully considered.

Applicant first argues that Polleri does not teach various aspects of the instant application, providing a list of elements which appear to summarize and paraphrase various limitations in the independent claims (see Applicant's remarks filed November 3, 2025, pages 9-10). To the extent that this paraphrased set of features may include elements which are not explicitly recited in the claims, Examiner notes that some of the elements relied upon are not explicitly recited in the claims (i.e., receiving a user query, as opposed to a set of data packets corresponding to a user query; thereafter receiving the knowledgebase (where the claim language does not impose such a timing requirement); outputting the predicted one or more responses, to the extent that this is distinct from converting). Further, Examiner notes that with respect to the majority of the features listed by Applicant in this portion of the remarks, Applicant does not provide any further explanation or argument regarding how the cited teachings of Polleri fail to teach these features. With respect to those features, Applicant's argument appears to amount to a mere allegation of patentability, and Examiner reiterates the teachings of Polleri as previously cited with respect to the limitations actually recited.

Applicant does provide specific argument with respect to a subset of the limitations of the independent claims, i.e., "Polleri does not disclose that for identification of 'primary potential intent', one or more responses corresponding to the user query also need to be considered/factored into. Also Polleri does not disclose that based on the identified 'primary potential intent', a trained model is generated (which is thereafter used to predict one or more responses to the user query). Polleri does not disclose generation of a trained model based on the identified primary potential intent, and also at the time of identifying said primary potential intent, it does not consider the potential one or more responses that may be applicable to the corresponding user query."

However, the claims do not appear to require any consideration/factoring in of "responses" when identifying the primary potential intent. Instead, the claims recite "identifies a primary potential intent among said one or more potential intents for said user query by calculating probability for each potential intent among said one or more potential intents for said set of expressions associated with said user query." None of this language indicates that "responses" are also considered, factored, or otherwise utilized in identifying the primary potential intent. Instead, the claims merely recite that "one or more responses" are processed and predicted through an ML engine, without any particular language which indicates that the responses are somehow also utilized in identifying the intent. Similarly, the claims do not appear to actually recite any language requiring "at the time of identifying said primary potential intent" considering "the potential one or more responses that may be applicable to the corresponding user query." Therefore, these arguments also appear to be made with respect to limitations which are not actually recited in the claims. Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).

With respect to the limitations actually recited in the claims, Polleri clearly teaches:

processing, through an ML engine, training data comprising the user query, one or more responses corresponding to the user query, and said one or more potential intents that are mapped to said user query, and wherein said ML engine identifies a primary potential intent among said one or more potential intents for said user query by calculating probability for each potential intent among said one or more potential intents for said set of expressions associated with said user query to generate a trained model (e.g. paragraph 0111, typical user requests and statements, referred to as utterances, intents created by providing name that illustrates user action and compiling set of real-life user statements/utterances commonly associated with triggering the action; rich dataset of utterances; collectively intents and utterances that belong to them make up training corpus for the chatbot; training model with the corpus, improving acuity of chatbot's cognition through rounds of intent testing and training; paragraph 0125, using permutations of utterances to train an intent classifier of the chatbot; paragraph 0129, performing processing to understand meaning of utterance, which involves identifying one or more intents and one or more entities corresponding to the utterance; paragraph 0132, using NLP engine or machine learning model such as intent classifier to map end user utterances to specific intents; paragraph 0139, selecting appropriate skill bot for handling request; paragraph 0140, identifying/predicting specific skill bot which can best handle user request; paragraphs 0143-0154, indicating that the skill bot may be custom created for the user, including by configuring settings and intents for the skill bot and training the skill bot, where this training may be performed using training data including intents and example utterances; skill bot represented by model that is trained using the training data, using machine learning techniques, etc.; paragraph 0168, digital assistant evaluates the received user input and computes confidence scores for the skill bots and system intents; any system intent or skill bot with associated confidence score exceeding a threshold value is selected as a candidate for further evaluation; the digital assistant then selects, from the identified candidates, a particular system intent or skill bot for further handling of the user input; paragraph 0174, determining end user intent based on parsed message; paragraph 0196, final intent classification result/identified intent, along with confidence scores associated with each respective intent in the set of intents; i.e. the model is trained (resulting in a generated trained model) using a corpus (i.e. training data) which includes utterances corresponding to user queries, corresponding responses/actions, and corresponding intents (i.e. user queries, responses corresponding to the queries, and corresponding/mapped intents), where the model is trained to identify the corresponding intents, including intents having confidence scores exceeding a threshold and from among these a particular intent which is to be selected, analogous to a primary potential intent);

wherein said knowledgebase is used to train a neural net to create sentence vectors, said sentence vectors being used to calculate said probability for each potential intent (e.g. paragraph 0111, utterance or message refers to set of words such as one or more sentences; creating intents from data set that is robust and varied; intents and utterances that belong to them collectively making up training corpus for the chatbot; training model with the corpus to turn model into reference tool for resolving end user input to a single intent; paragraph 0132, performing sentence parsing, mapping end user utterances to specific intents; NLP engine learning to understand and categorize natural language conversations from end users and to extract information; syntax and sentence structure of sentence identified using parser, part-of-speech tagger, and/or named entity recognizer; paragraph 0183, determining end user intents associated with end user utterances; normalizing message; paragraph 0184, after normalization, probability that occurrence of word may signify certain intent determined; paragraph 0185, normalizing every sentence in training dataset to rule; template rules returning particular probability; paragraph 0186, if particular word or set of words is important to an intent, probabilities manipulated by having more examples of the word and synonyms; paragraph 0222, hierarchical classification model having tree structure that includes a plurality of nodes on multiple layers; paragraphs 0223-0224, classification model may be neural network classifier/logistic regression classifier; paragraph 0281, trained models including logistic regression model, etc.; i.e. the model, which may be a neural network, is trained using utterances, which may be sentences/sentence vectors, where the utterances/sentence vectors have associated intents, such that the model determines probabilities for associated intents for a given utterance/sentence vector);

predicting, using the ML engine, said one or more responses in any or a combination of said textual form, said audio form, and video form based on the extracted set of attributes and the generated trained model (e.g. paragraph 0135, digital assistant skills implemented with individual skill bots; paragraph 0136, conversation including text/audio responses provided by skill bots; paragraph 0140, identifying/predicting specific skill bot to handle user request; paragraph 0168, if system intent is selected, actions are performed according to the selected system intent; if skill bot selected, user input routed to skill bot for further processing; paragraph 0172, content exchanged between end user and bot system may include text, emojis, audio, media, etc.; paragraph 0175, after end user intent is determined, the message (and parameters associated with intent) is sent to action engine, which determines an action to perform based on the intent, such as sending outbound content as the response via messaging application; paragraph 0194, message sent by bot system to end user device may include content of the message (text or HTML of the message), time sent, language of the message, etc.; i.e. where the responses are provided by the digital assistant and/or skill bots, which are themselves trained models, these responses are analogous to predictions by the models implementing the digital assistant/skill bots);

and converting, using the ML engine, said one or more responses to any or a combination of textual form, audio form, and video form from any or a combination of textual form, said audio form, and said video form based on any user and system requirement (e.g. paragraph 0120, bot performing conversations, responding to natural language messages through messaging application/channel, which may be user preferred messaging application which the end user has already installed and is familiar with, and may include various different messaging channels, virtual private assistants, extensions that extend apps/applications with chat capabilities, or voice based input; paragraph 0136, conversation including text/audio responses provided by skill bots; paragraph 0168, if system intent is selected, actions are performed according to the selected system intent; paragraph 0172, content exchanged between end user and bot system may include text, emoji, audio, media, etc.; paragraph 0173, bot system using connector that acts as interface between messaging application system and bot system, and normalizes content between the messaging application system and bot system so that bot system can analyze content across different messaging application systems, including formatting content from each type of messaging application to a common format for processing; paragraph 0177, sending content to end user using connector and messaging application system; i.e. the bot system utilizes a connector which converts/reformats content as appropriate for communication via the end user's preferred messaging application while also permitting the bot to perform processing of messages received across a variety of applications in a normalized format, analogous to converting the responses (i.e. from the bot) to a textual, audio, etc., format based on a user's preference/requirement for a particular messaging application).

Therefore, these arguments are not persuasive.

Applicant notes that the independent claims have been amended to further recite "wherein said knowledgebase is used to train a neural net such that each layer of the neural net extracts information during training and passes to logistic regression (LR) to create sentence vectors from the set of expressions, said sentence vectors being used to calculate said probability for each potential intent," and that the cited references do not teach this limitation. Examiner agrees that the cited references do not appear to explicitly disclose at least "each layer of the neural net extracts information during training and passes to logistic regression (LR) to create sentence vectors." Therefore, this argument is persuasive, and the rejections are withdrawn. New grounds of rejection are provided below.

Claim Rejections – 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
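As an illustrative aside, the amended limitation quoted earlier ("each layer of the neural net extracts information during training and passes to logistic regression (LR) to create sentence vectors ... used to calculate said probability for each potential intent") can be sketched numerically. This is a minimal NumPy illustration of that claim language, not a reproduction of the application's, Polleri's, or Chen's actual implementation; all layer sizes, weights, and the softmax-style logistic regression head are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions -- not taken from the application or references.
VOCAB_DIM, HIDDEN_DIM, N_LAYERS, N_INTENTS = 32, 16, 3, 4

# Per-layer weights of a small feed-forward "neural net".
layer_weights = [
    rng.normal(size=(VOCAB_DIM if i == 0 else HIDDEN_DIM, HIDDEN_DIM))
    for i in range(N_LAYERS)
]

# Logistic-regression head over the concatenated per-layer features.
lr_weights = rng.normal(size=(N_LAYERS * HIDDEN_DIM, N_INTENTS))

def sentence_vector(expression_features: np.ndarray) -> np.ndarray:
    """Each layer extracts features and passes them on; the per-layer
    outputs are concatenated into one sentence vector for the LR head."""
    feats, h = [], expression_features
    for w in layer_weights:
        h = np.tanh(h @ w)   # layer extracts information...
        feats.append(h)      # ...and passes it along to the LR stage
    return np.concatenate(feats)

def intent_probabilities(expression_features: np.ndarray) -> np.ndarray:
    """Softmax logistic regression over the sentence vector, yielding a
    probability for each potential intent."""
    logits = sentence_vector(expression_features) @ lr_weights
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

probs = intent_probabilities(rng.normal(size=VOCAB_DIM))
primary_intent = int(np.argmax(probs))  # the "primary potential intent"
```

The argmax over the per-intent probabilities plays the role of identifying the primary potential intent among the potential intents for a given expression.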
The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:

1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors. In considering patentability of the claims under pre-AIA 35 U.S.C. 103(a), the examiner presumes that the subject matter of the various claims was commonly owned at the time any inventions covered therein were made absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and invention dates of each claim that was not commonly owned at the time a later invention was made in order for the examiner to consider the applicability of pre-AIA 35 U.S.C. 103(c) and potential pre-AIA 35 U.S.C. 102(e), (f), or (g) prior art under pre-AIA 35 U.S.C. 103(a).

Claims 1, 2, 4, 17-21, and 34-37 are rejected under 35 U.S.C. 103 as being unpatentable over Polleri et al. (US 20210081848 A1) in view of Chen et al. (US 20200334520 A1).

With respect to claims 1 and 21, Polleri teaches a system for generating an executable multi-faceted bot of an entity, said system comprising a processor that executes a set of executable instructions that are stored in a memory, upon which execution, the processor causes the system to perform a method (e.g. paragraph 0126, Fig. 4, digital assistant builder platform (DABP) 402 in distributed environment 400 including chatbot, that enables enterprises to create and deploy digital assistants for their users; digital assistant is also referred to as a chatbot system; paragraph 0432, Fig. 22, computer system 220 in which described invention is implemented; paragraphs 0434-0435, processing unit 2204 executing programs in response to program code resident in storage and providing described functionalities; paragraphs 0439-0444, describing storage subsystem which stores program instructions providing described functionality); and the method for generating an executable and multi-faceted bot of an entity, said method comprising:

receiving, by a bot maker engine, a first set of data packets corresponding to a user query of a user, and receive, from a database coupled to a server, a knowledgebase comprising a set of expressions associated with one or more potential intents corresponding to said user query (e.g. paragraph 0111, user statement/utterance; paragraph 0120, natural language messages such as questions; bot system communicating with user through messaging application; paragraph 0122, end user interacting with bot system through conversational interaction; paragraph 0128, user inputs in natural language referred to as an utterance; paragraph 0171, routing content such as message or information from message from mobile device to bot system using the internet; paragraph 0173, bot system receiving content);

extracting, by a bot maker engine, a set of attributes corresponding to form of the user query, wherein the form of the user query is selected from any or a combination of a textual form, an audio form, and a video form (e.g. paragraph 0123, user messages including text, audio, image, video, etc., content; paragraph 0128, user utterance can be in text form, audio input/speech form, etc.; paragraph 0131, utterance received as input goes through processing steps, including parsing the utterance; paragraph 0132, sentence parsing including tokenizing, lemmatizing, identifying part-of-speech tags, identifying named entities, generating dependency trees to represent sentence structure, splitting sentence into clauses, analyzing individual clauses, etc.; paragraph 0172, content exchanged between end user and bot system may include text, emojis, audio, media (picture, video, link), etc.; paragraph 0174, message processor parsing message and performing semantic analysis, including identifying subject, predicate/action, object, etc.; parameters associated with intent, referred to as entities, also extracted from message);

processing, through an ML engine, training data comprising the user query, one or more responses corresponding to the user query, and said one or more potential intents that are mapped to said user query, and wherein said ML engine identifies a primary potential intent among said one or more potential intents for said user query by calculating probability for each potential intent among said one or more potential intents for said set of expressions associated with said user query to generate a trained model (e.g. paragraph 0111, typical user requests and statements, referred to as utterances, intents created by providing name that illustrates user action and compiling set of real-life user statements/utterances commonly associated with triggering the action; rich dataset of utterances; collectively intents and utterances that belong to them make up training corpus for the chatbot; training model with the corpus, improving acuity of chatbot's cognition through rounds of intent testing and training; paragraph 0125, using permutations of utterances to train an intent classifier of the chatbot; paragraph 0129, performing processing to understand meaning of utterance, which involves identifying one or more intents and one or more entities corresponding to the utterance; paragraph 0132, using NLP engine or machine learning model such as intent classifier to map end user utterances to specific intents; paragraph 0139, selecting appropriate skill bot for handling request; paragraph 0140, identifying/predicting specific skill bot which can best handle user request; paragraphs 0143-0154, indicating that the skill bot may be custom created for the user, including by configuring settings and intents for the skill bot and training the skill bot, where this training may be performed using training data including intents and example utterances; skill bot represented by model that is trained using the training data, using machine learning techniques, etc.; paragraph 0168, digital assistant evaluates the received user input and computes confidence scores for the skill bots and system intents; any system intent or skill bot with associated confidence score exceeding a threshold value is selected as a candidate for further evaluation; the digital assistant then selects, from the identified candidates, a particular system intent or skill bot for further handling of the user input; paragraph 0174, determining end user intent based on parsed message; paragraph 0196, final intent classification result/identified intent, along with confidence scores associated with each respective intent in the set of intents);

wherein said knowledgebase is used to train a neural net to create sentence vectors, said sentence vectors being used to calculate said probability for each potential intent (e.g. paragraph 0111, utterance or message refers to set of words such as one or more sentences; creating intents from data set that is robust and varied; intents and utterances that belong to them collectively making up training corpus for the chatbot; training model with the corpus to turn model into reference tool for resolving end user input to a single intent; paragraph 0132, performing sentence parsing, mapping end user utterances to specific intents; NLP engine learning to understand and categorize natural language conversations from end users and to extract information; syntax and sentence structure of sentence identified using parser, part-of-speech tagger, and/or named entity recognizer; paragraph 0183, determining end user intents associated with end user utterances; normalizing message; paragraph 0184, after normalization, probability that occurrence of word may signify certain intent determined; paragraph 0185, normalizing every sentence in training dataset to rule; template rules returning particular probability; paragraph 0186, if particular word or set of words is important to an intent, probabilities manipulated by having more examples of the word and synonyms; paragraph 0222, hierarchical classification model having tree structure that includes a plurality of nodes on multiple layers; paragraphs 0223-0224, classification model may be neural network classifier/logistic regression classifier; paragraph 0281, trained models including logistic regression model, etc.; i.e. the model, which may be a neural network, is trained using utterances, which may be sentences/sentence vectors, where the utterances/sentence vectors have associated intents, such that the model determines probabilities for associated intents for a given utterance/sentence vector);

predicting, using the ML engine, said one or more responses in any or a combination of said textual form, said audio form, and video form based on the extracted set of attributes and the generated trained model (e.g. paragraph 0135, digital assistant skills implemented with individual skill bots; paragraph 0136, conversation including text/audio responses provided by skill bots; paragraph 0140, identifying/predicting specific skill bot to handle user request; paragraph 0168, if system intent is selected, actions are performed according to the selected system intent; if skill bot selected, user input routed to skill bot for further processing; paragraph 0172, content exchanged between end user and bot system may include text, emojis, audio, media, etc.; paragraph 0175, after end user intent is determined, the message (and parameters associated with intent) is sent to action engine, which determines an action to perform based on the intent, such as sending outbound content as the response via messaging application; paragraph 0194, message sent by bot system to end user device may include content of the message (text or HTML of the message), time sent, language of the message, etc.; i.e. where the responses are provided by the digital assistant and/or skill bots, which are themselves trained models, these responses are analogous to predictions by the models implementing the digital assistant/skill bots);

and converting, using the ML engine, said one or more responses to any or a combination of textual form, audio form, and video form from any or a combination of textual form, said audio form, and said video form based on any user and system requirement (e.g. paragraph 0120, bot performing conversations, responding to natural language messages through messaging application/channel, which may be user preferred messaging application which the end user has already installed and is familiar with, and may include various different messaging channels, virtual private assistants, extensions that extend apps/applications with chat capabilities, or voice based input; paragraph 0136, conversation including text/audio responses provided by skill bots; paragraph 0168, if system intent is selected, actions are performed according to the selected system intent; paragraph 0172, content exchanged between end user and bot system may include text, emoji, audio, media, etc.; paragraph 0173, bot system using connector that acts as interface between messaging application system and bot system, and normalizes content between the messaging application system and bot system so that bot system can analyze content across different messaging application systems, including formatting content from each type of messaging application to a common format for processing; paragraph 0177, sending content to end user using connector and messaging application system; i.e. the bot system utilizes a connector which converts/reformats content as appropriate for communication via the end user's preferred messaging application while also permitting the bot to perform processing of messages received across a variety of applications in a normalized format, analogous to converting the responses (i.e. from the bot) to a textual, audio, etc., format based on a user's preference/requirement for a particular messaging application).

Polleri does not explicitly disclose that the neural net is trained such that each layer of the neural net extracts information during training and passes to logistic regression (LR) to create sentence vectors from the set of expressions.
However, Chen teaches that the neural net is trained such that each layer of the neural net extracts information during training and passes to logistic regression (LR) to create sentence vectors from the set of expressions (e.g. paragraphs 0034-0036, Fig. 3, transformer encoder 304 obtaining contextual information for each word, generating sequence of context embedding vectors; mapping input embedding vectors 306 into context embedding vectors; context embedding vectors used as shared representation of input sentences across different tasks; context embedding vectors 308 input into task-specific layers 310 including single-sentence classification layer 310 which outputs respective probabilities of class labels via logistic regression; paragraph 0037, pairwise text similarity layer performing regression task and outputting semantic similarity between sentences; paragraph 0044, Fig. 4, showing training process, including resulting outputs of regression layers, such as single-sentence classification output 312(1); i.e. input information is extracted into a sentence vector format and passed to a logistic regression layer, resulting in corresponding output including classification probabilities for the sentence vectors).

Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention having the teachings of Polleri and Chen in front of him to have modified the teachings of Polleri (directed to adaptive pipelining composition for machine learning, such as by using chatbots implemented using trained models) to incorporate the teachings of Chen (directed to multi-task machine learning architectures and training procedures) to include the capability to train the neural network such that each layer of the neural network extracts information during training and passes to logistic regression to create sentence vectors from a set of expressions (as taught by Chen). One of ordinary skill would have been motivated to perform such a modification in order to produce accurate multi-task machine learning models even when there is a limited amount of task-specific training data for individual tasks, while also avoiding overfitting, as described in Chen (paragraph 0018).

With respect to claim 2, Polleri in view of Chen teaches all of the limitations of claim 1 as previously discussed, and Polleri further teaches wherein the database coupled to the server is configured to store one or more users, bots, user queries, video forms, audio forms and textual messages associated with predefined topic with a time stamp (e.g. paragraph 0191, database used to store data for bot system, such as data for classification models, logs of conversation, etc.; paragraph 0195, attributes of dialog state execution event including user query statement, response statement, time of execution, communication language, device property, operating system property, geolocation property, identification information, time stamp, channel, etc.; paragraph 0202, collected events and information written to database, including location and end user associated with IP address).

With respect to claim 4, Polleri in view of Chen teaches all of the limitations of claim 1 as previously discussed, and Polleri further teaches wherein said user is identified, verified and then authorized to access the system (e.g. paragraph 0431, identity management module providing identity services including access management and authorization services, controlling information about customers who wish to utilize services, authenticates identities of customers and describes actions those customers are authorized to perform).
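The confidence-score routing that the rejection repeatedly cites from Polleri's paragraph 0168 (score every skill bot and system intent, keep those above a threshold as candidates, then select one for further handling) can be sketched as follows. The bot names, scores, and the 0.4 threshold are invented for illustration and do not appear in Polleri; the "further evaluation" step is collapsed to a highest-score-wins rule for brevity.

```python
# Sketch of Polleri's paragraph 0168 flow: threshold-filter the computed
# confidence scores, then select one candidate to handle the user input.
THRESHOLD = 0.4  # assumed cutoff for candidacy (illustrative only)

def select_handler(confidence_scores: dict, threshold: float = THRESHOLD):
    """Return the skill bot / system intent chosen for further handling,
    or None when no candidate clears the threshold."""
    candidates = {name: score for name, score in confidence_scores.items()
                  if score > threshold}
    if not candidates:
        return None
    # "Further evaluation" collapsed to highest-score-wins for brevity.
    return max(candidates, key=candidates.get)

scores = {"pizza_skill_bot": 0.82, "banking_skill_bot": 0.55, "exit_intent": 0.10}
handler = select_handler(scores)  # -> "pizza_skill_bot"
```

Note how this mirrors the examiner's reading: several intents may exceed the threshold, but a single one is ultimately selected, which is what the rejection maps to the claimed "primary potential intent."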
With respect to claims 17 and 34, Polleri in view of Chen teaches all of the limitations of claims 1 and 21 as previously discussed, and Polleri further teaches wherein the ML engine is configured with language processing engines to receive said user query in a first language and provide said response corresponding to said user query in a second language (e.g. paragraph 0133, digital assistant is capable of handling utterances in different languages; NLU processing is flexible and extensible for each language; providing language packs for individual languages). With respect to claims 18 and 35, Polleri in view of Chen teaches all of the limitations of claims 1 and 21 as previously discussed, and Polleri further teaches wherein an authoring portal engine coupled to said ML engine is configured to manage any or a combination of information associated with said users, a plurality of trained models, life cycle of each trained model of said plurality of trained models, sorting and searching said plurality of trained models, life cycle of a plurality of multi-faceted bots and generating executable instructions to invoke said multi-faceted bot among plurality of multi-faceted bots (e.g. 
paragraph 0142, DABP providing infrastructure and services that enable use of DABP to create digital assistant including skill bots, such as by cloning existing skill bots with or without modification, or from scratch using offered tools and services; DABP provides skills store/catalog offering multiple skill bots for performing various tasks; paragraph 0154, skill bot represented by model that is trained; paragraph 0403, services provided by system offered as cloud services; paragraph 0431, cloud infrastructure system may include identity management module 2128 providing identity services, controlling information about customers who wish to utilize the services, including information that authenticates identities of customers and describes authorized actions, and managing descriptive information about each customer). With respect to claims 19 and 36, Polleri in view of Chen teaches all of the limitations of claims 18 and 35 as previously discussed, and Polleri further teaches wherein management of the life cycle of said trained model by the authoring portal engine comprises creating a model, adding expressions and one or more potential intents to said model, training said model, testing said model and publishing said model (e.g. paragraphs 0142-0154, DABP providing services and features to enable creation of digital assistant including skill bots; creating skill bot through various methods, including based on existing skill bot or from scratch, where creating the skill bot includes configuring settings including related utterances to invoke the skill bot, configuring intents for the skill bot identifying tasks and associated sets of utterances, configuring entities related to intents for the skill bot, and training the skill bot; skill bot represented by model that is trained using training data; portion of training data used to train the skill bot model and another portion used to test or verify the model; paragraph 0165, testing and deploying the skill bot). 
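The life cycle recited in claims 19 and 36 — create a model, add expressions and intents, train, test, publish — can be pictured as a small state machine. The class and method names below are illustrative, not from any cited reference.

```python
from dataclasses import dataclass, field

@dataclass
class BotModel:
    """Toy life-cycle holder: created -> trained -> tested -> published."""
    name: str
    intents: list = field(default_factory=list)
    expressions: dict = field(default_factory=dict)  # intent -> utterances
    state: str = "created"

    def add_intent(self, intent, utterances):
        self.intents.append(intent)
        self.expressions[intent] = list(utterances)

    def train(self):
        if not self.intents:
            raise ValueError("need at least one intent before training")
        self.state = "trained"

    def test(self):
        if self.state != "trained":
            raise ValueError("model must be trained before testing")
        self.state = "tested"

    def publish(self):
        if self.state != "tested":
            raise ValueError("model must be tested before publishing")
        self.state = "published"

model = BotModel("faq-bot")
model.add_intent("order_status", ["where is my order", "track my package"])
model.train()
model.test()
model.publish()
```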
With respect to claim 20, Polleri in view of Chen teaches all of the limitations of claim 1 as previously discussed, and Polleri further teaches wherein said ML engine is configured with an event streaming module, wherein the event streaming module is configured to maintain a queue of expressions, containing information about predictions performed by the ML engine (e.g. paragraph 0196, intent resolution event resulting from execution of intent modeler using trained classification models to identify end user intents based on utterances; result of intent classification captured as intent resolution event attributes including final intent classification result and associated confidence scores for intents in the set of intents; paragraph 0201, collecting events and additional information as bot system conducts conversations with end users and generates corresponding events; sending the collected information to a queue; capturing dialog state attributes, intent resolution attributes, entity resolution attributes, etc.). With respect to claim 37, Polleri in view of Chen teaches all of the limitations of claim 21 as previously discussed, and Polleri further teaches wherein said ML engine is configured with an event streaming module, wherein the event streaming module is configured to maintain a queue of expressions, containing information about predictions performed by the ML engine (e.g. 
paragraph 0196, intent resolution event resulting from execution of intent modeler using trained classification models to identify end user intents based on utterances; result of intent classification captured as intent resolution event attributes including final intent classification result and associated confidence scores for intents in the set of intents; paragraph 0201, collecting events and additional information as bot system conducts conversations with end users and generates corresponding events; sending the collected information to a queue; capturing dialog state attributes, intent resolution attributes, entity resolution attributes, etc.). Claims 8, 9, 25, and 26 are rejected under 35 U.S.C. 103 as being unpatentable over Polleri in view of Chen, further in view of Howard (US 20210232632 A1). With respect to claims 8 and 25, Polleri in view of Chen teaches all of the limitations of claims 1 and 21 as previously discussed. Polleri does not explicitly disclose wherein said client side of the multi-faceted bot is represented in the form of any or a combination of an animated character, a personality character, or an actual representation of the entity character. However, Howard teaches wherein said client side of the multi-faceted bot is represented in the form of any or a combination of an animated character, a personality character, or an actual representation of the entity character (e.g. paragraph 0022, target identity can be a real person, a fictional person such as a character from a novel, movie, game, etc.; paragraph 0098, conversation with chatbot representing the target identity; paragraph 0184, conversation with virtual avatar or chatbot that emulates personality or conversational style of target identity; paragraph 0209, interactive scenarios between beholder and target identity such as a chatbot, or virtual avatar rendered in a virtual world; paragraph 0212, chatbot representation of the target identity). 
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention having the teachings of Polleri, Chen, and Howard in front of him to have modified the teachings of Polleri (directed to adaptive pipelining composition for machine learning, such as by using chatbots implemented using trained models) and Chen (directed to multi-task machine learning architectures and training procedures), to incorporate the teachings of Howard (directed to multi-modal virtual experiences of distributed content, such as interactions with chatbots representing people, characters, etc.) to include the capability to represent the client side of the chatbot (i.e. of Polleri) as a fictional character, person, etc. (as taught by Howard). One of ordinary skill would have been motivated to perform such a modification in order to build bespoke virtual experience containers with greater processing efficiency, lower user wait time, and decreased overall storage costs as described in Howard (paragraph 0004). With respect to claims 9 and 26, Polleri in view of Chen teaches all of the limitations of claims 1 and 21 as previously discussed, and Polleri further teaches, and wherein said responses pertaining to said textual form, said audio form and said video form are stored in the database coupled to said server (e.g. paragraph 0191, database used to store data for bot system, such as logs of conversation, etc.; paragraph 0195, attributes of dialog state execution event including response statement; paragraph 0202, collected events and information written to database). Polleri does not explicitly disclose wherein said responses pertaining to said audio form and said video form are manually recorded using a recording device. However, Howard teaches wherein said responses pertaining to said audio form and said video form are manually recorded using a recording device (e.g. 
paragraph 0049, beholder provided content including text, images, photographs, video, and sound recordings; paragraph 0058, virtual experience built using element sources including target identity content repositories 142; paragraph 0062, target identity content repositories 142 include media repositories containing photos, videos, voice recordings, etc.; digital media/content recorded by third parties in which target identity is shown; paragraph 0098, conversation with chatbot representing the target identity; paragraph 0184, conversation with virtual avatar or chatbot that emulates personality or conversational style of target identity; paragraph 0209, interactive scenarios between beholder and target identity such as a chatbot, or virtual avatar rendered in a virtual world; paragraph 0212, chatbot representation of the target identity). Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention having the teachings of Polleri, Chen, and Howard in front of him to have modified the teachings of Polleri (directed to adaptive pipelining composition for machine learning, such as by using chatbots implemented using trained models) and Chen (directed to multi-task machine learning architectures and training procedures), to incorporate the teachings of Howard (directed to multi-modal virtual experiences of distributed content, such as interactions with chatbots representing people, characters, etc.) to include the capability to represent the client side of the chatbot (i.e. of Polleri) as a fictional character, person, etc., and to further use, as part of the responses of the chatbot in audio or video form, manually recorded audio and video content, such as audio or video recordings of the character/person that the chatbot represents (as taught by Howard). 
One of ordinary skill would have been motivated to perform such a modification in order to build bespoke virtual experience containers with greater processing efficiency, lower user wait time, and decreased overall storage costs as described in Howard (paragraph 0004). Claims 6, 7, 23, and 24 are rejected under 35 U.S.C. 103 as being unpatentable over Polleri in view of Chen, further in view of Griffin (US 20180176269 A1). With respect to claims 6 and 23, Polleri in view of Chen teaches all of the limitations of claims 1 and 21 as previously discussed. Polleri does not explicitly disclose wherein the ML engine is configured to enable said user to switch to any said textual, said audio form and said video form from a current form to initiate said user query. However, Griffin teaches wherein the ML engine is configured to enable said user to switch to any said textual, said audio form and said video form from a current form to initiate said user query (e.g. paragraph 0010, Bot subsystem receiving multimodal input streams from user devices, processing them, and providing to user devices multimodal user responses including audio, video, and text; paragraphs 0019-0020, user devices are configured to generate multimodal content streams including audio, video, and text; streams may include concurrent audio, video, and text streams, or may include only one of these; user devices also configured to receive concurrent multimodal streams, including audio, video, and text and present to the user; CCS enables users of devices to interface with and control collaboration services on behalf of and in a way that is natural to the users; CCS receives concurrent multimodal content streams, performs cognitive processing to derive user intent, and converts to cognitive actions; CCS applies cognitive actions to appropriate collaboration services to control supported communication sessions; depending on derived user intent, providing concurrent multimodal content responses including 
audio, video, and text responses to user devices; claims 8 and 17, identifying for each user request a mode as audio, video, or text from which user request was primarily derived, and transmitting user response to user request using audio, video, or text using the mode indicated in the user request; i.e. the user is enabled to select, for each particular query/request, a text, audio, or video form/mode, and transmit the request/query using that form). Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention having the teachings of Polleri, Chen, and Griffin in front of him to have modified the teachings of Polleri (directed to adaptive pipelining composition for machine learning, such as by using chatbots implemented using trained models) and Chen (directed to multi-task machine learning architectures and training procedures), to incorporate the teachings of Griffin (directed to multi-modal stream processing cognitive collaboration system, using bots to respond to user requests) to include the capability to enable the user to select a text, audio, or video format/mode for a current query, and transmit the query/request to the chatbot using the selected format/mode (as taught by Griffin). One of ordinary skill would have been motivated to perform such a modification in order to provide a cognitive processing application which is not limited with respect to user input type, which is able to concurrently process multimodal input, and which provides greater flexibility, scaling, etc. as described in Griffin (paragraph 0002). With respect to claims 7 and 24, Polleri in view of Chen teaches all of the limitations of claims 1 and 21 as previously discussed. Polleri does not explicitly disclose wherein the ML engine is configured to enable said user to switch to any said textual, said audio form and said video form from a current form of response provided by the system/method. 
However, Griffin teaches wherein the ML engine is configured to enable said user to switch to any said textual, said audio form and said video form from a current form of response provided by the system/method (e.g. paragraph 0010, Bot subsystem receiving multimodal input streams from user devices, processing them, and providing to user devices multimodal user responses including audio, video, and text; paragraphs 0019-0020, user devices are configured to generate multimodal content streams including audio, video, and text; streams may include concurrent audio, video, and text streams, or may include only one of these; user devices also configured to receive concurrent multimodal streams, including audio, video, and text and present to the user; CCS enables users of devices to interface with and control collaboration services on behalf of and in a way that is natural to the users; CCS receives concurrent multimodal content streams, performs cognitive processing to derive user intent, and converts to cognitive actions; CCS applies cognitive actions to appropriate collaboration services to control supported communication sessions; depending on derived user intent, providing concurrent multimodal content responses including audio, video, and text responses to user devices; claims 8 and 17, identifying for each user request a mode as audio, video, or text from which user request was primarily derived, and transmitting user response to user request using audio, video, or text using the mode indicated in the user request; i.e. the user is enabled to select, for each particular query/request, a text, audio, or video form/mode, and transmit the request/query using that form, and the response to the query is provided using the same selected mode/form, such that the user’s selection of the mode/form for the user query/request is also a selection of a mode/form for the corresponding response). 
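The per-request mode matching Griffin is cited for — each response is returned in the same text, audio, or video mode from which the user request was primarily derived — can be pictured as a small dispatch table. The function and key names below are hypothetical, not drawn from the reference.

```python
def answer(message):
    """Stand-in for the bot's response generation."""
    return f"echo: {message}"

# Render the same reply in whichever mode the request arrived in.
RENDERERS = {
    "text": lambda msg: {"mode": "text", "body": msg},
    "audio": lambda msg: {"mode": "audio", "body": f"<speech>{msg}</speech>"},
    "video": lambda msg: {"mode": "video", "body": f"<clip>{msg}</clip>"},
}

def respond(request):
    """Derive the reply, then transmit it in the request's own mode."""
    reply = answer(request["message"])
    return RENDERERS[request["mode"]](reply)

resp = respond({"mode": "audio", "message": "status?"})
```

The design point the rejection leans on is visible in `respond`: selecting the mode of the query implicitly selects the mode of the response, because the same key indexes both.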
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention having the teachings of Polleri, Chen, and Griffin in front of him to have modified the teachings of Polleri (directed to adaptive pipelining composition for machine learning, such as by using chatbots implemented using trained models) and Chen (directed to multi-task machine learning architectures and training procedures), to incorporate the teachings of Griffin (directed to multi-modal stream processing cognitive collaboration system, using bots to respond to user requests) to include the capability to enable the user to select a text, audio, or video format/mode for a current query, and transmit the query/request to the chatbot using the selected format/mode, and for the chatbot to transmit the response to the query/request to the user in the same selected format/mode, such that the user’s selection of mode/format is a selection of the mode/format of the response as well as the query/request (as taught by Griffin). One of ordinary skill would have been motivated to perform such a modification in order to provide a cognitive processing application which is not limited with respect to user input type, which is able to concurrently process multimodal input, and which provides greater flexibility, scaling, etc. as described in Griffin (paragraph 0002). Claims 3 and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Polleri in view of Chen, further in view of Williams et al. (US 20190347668 A1). With respect to claims 3 and 22, Polleri in view of Chen teaches all of the limitations of claims 1 and 21 as previously discussed. 
Polleri does not explicitly disclose wherein said bot maker engine extracts from the server a second set of data packets to initialize said multi-faceted bot, wherein said second set of data packets pertains to information comprising of said one or more potential intents, one or more video forms, and a set of trending queries. However, Williams teaches wherein said bot maker engine extracts from the server a second set of data packets to initialize said multi-faceted bot, wherein said second set of data packets pertains to information comprising of said one or more potential intents, one or more video forms, and a set of trending queries (e.g. paragraph 0091, frequently asked question item; paragraph 0106, analytics indicating people who engage frequently with cardio topic also engage frequently with running topic; offering suggested topics that are interesting to specific person; paragraph 0226, database 1620 storing records, including communication records, etc.; paragraph 0233, knowledge graph 1622 structuring knowledge base 1624 which is set of media assets including videos, etc. used to aid contact; paragraph 0245, obtaining information that is relevant to current communication session, including current issue/reason; paragraph 0252, system initializing chat bot with data relating to contact, script that is directed to handle a particular type of conversation, etc.; chatbot obtaining information from database 1620 and knowledge graph 1622 to engage in the conversation; using information obtained from database 1620, knowledge graph 1622, previous text in the chat, etc.; performing natural language processing to understand response of user; providing articles/videos as response; paragraph 0272, recommended content directed to common problems; i.e. 
to initialize the chat bot, data may be retrieved/received, including data relevant to the user/client’s purpose/intent (such as data reflective of the current issue/reason, data provided within the communication history/record, etc.), data relevant to video content/forms to be potentially communicated via the chat (such as information contained in a knowledgebase including video information which may be provided in response to user/client query), and data relevant to trending queries (such as previous conversational history, previous text in the chat, etc., where previous instances of conversational history/text in the chat are reflective of previous messages/queries of the client/user, and may therefore be trending with respect to that particular user; moreover, information related to common problems/issues and to be provided as recommended content is analogous to information which may pertain to trending queries (i.e. where the trending queries are reflective of the common problem)). Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention having the teachings of Polleri, Chen, and Williams in front of him to have modified the teachings of Polleri (directed to adaptive pipelining composition for machine learning, such as by using chatbots implemented using trained models) and Chen (directed to multi-task machine learning architectures and training procedures), to incorporate the teachings of Williams (directed to multi-client service system platform, such as utilizing chatbots) to include the capability to retrieve initialization information/packets for the chatbot (i.e. 
of Polleri), including information regarding the user’s intent/reason/issue for the conversation, information regarding video forms/content which are to be potentially communicated via the chat, and information regarding current and previous conversation history of the user, common problems/issues, frequently asked questions, and frequently accessed topics, which may be reflective of trending queries within the system (as taught by Williams). One of ordinary skill would have been motivated to perform such a modification in order to provide improvements to the functioning of computer systems, information networks, data stores, etc., by enabling, in a single database and system, the development and maintenance of a set of universal contact objects enabling use for a wide range of activities as described in Williams (paragraph 0004). Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Polleri in view of Chen, further in view of Kothari et al. (US 20190179608 A1). With respect to claim 5, Polleri in view of Chen teaches all of the limitations of claim 4 as previously discussed. Polleri does not explicitly disclose wherein said one or more responses are initiated once an authorized user generates said user query, and wherein the one or more responses corresponding to said user query that is mapped with the one or more potential intents is transmitted in real-time in the form of a third set of data packets to said user computing device from server side of the multi-faceted bot. However, Kothari teaches wherein said one or more responses are initiated once an authorized user generates said user query, and wherein the one or more responses corresponding to said user query that is mapped with the one or more potential intents is transmitted in real-time in the form of a third set of data packets to said user computing device from server side of the multi-faceted bot (e.g. 
paragraph 0029, chatbot provider components engaging with client computing device to create back and forth real-time voice or audio based conversation between the client computing device and the chatbot provider computing device; paragraph 0050, obtaining input audio signal, identifying request indicating intent; paragraph 0058, prior to launch/execution of chatbot, determining whether computing device is authorized to access the chatbot; granting computing device access to chatbot if associated with valid account or profile; selecting chatbot responsive to determination that the computing device is authorized to access it). Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention having the teachings of Polleri, Chen, and Kothari in front of him to have modified the teachings of Polleri (directed to adaptive pipelining composition for machine learning, such as by using chatbots implemented using trained models) and Chen (directed to multi-task machine learning architectures and training procedures), to incorporate the teachings of Kothari (directed to graphical user interface rendering management by voice-driven computing infrastructure, such as for implementing chatbots) to include the capability to initiate the responses once an authorized user generates the query, and to transmit response information in real-time, as part of a real-time conversation between the chatbot system/server and the user/client computing device (as taught by Kothari). One of ordinary skill would have been motivated to perform such a modification in order to integrate a digital assistant with a third party user experience, allowing the digital assistant to integrate with a chatbot application in order to provide user experience and interface consistent with the chatbot’s interface as described in Kothari (paragraph 0015). Claims 10, 11, 27, and 28 are rejected under 35 U.S.C. 
103 as being unpatentable over Polleri in view of Chen, further in view of Bhardwaj et al. (US 20220247700 A1). With respect to claims 10 and 27, Polleri in view of Chen teaches all of the limitations of claims 1 and 21 as previously discussed, and Polleri further teaches wherein the ML engine pre-processes the knowledgebase through a prediction engine for any or a combination of data cleansing, data correction, synonym formation, proper noun extraction, white space removal, stemming of words, punctuation removal, feature extraction, and special character removal, wherein the data pertains to a set of potential queries associated with the entity and corresponding any or a combination of textual form, audio form and video form responses (e.g. paragraphs 0182-0185, content/message normalization, including tagging parts of speech such as noun, adjective, verb, and normalizing messages; every sentence in training dataset, once normalized may become a rule, and new rules generated from rules via process of induction). Assuming arguendo that Polleri does not explicitly disclose this limitation, Bhardwaj teaches wherein the ML engine pre-processes the knowledgebase through a prediction engine for any or a combination of data cleansing, data correction, synonym formation, proper noun extraction, white space removal, stemming of words, punctuation removal, feature extraction, and special character removal, wherein the data pertains to the set of potential queries associated with the entity and corresponding any or a combination of textual form, audio form and video form responses (e.g. paragraph 0053, training using historical chat session data, which may be preprocessed to remove stop words, correct misspelled words, replace patterns like buyer offer price, phone numbers, emails, mileage, numerals, etc.; paragraph 0074, during training chat messages cleaned up such as by removing stop words, refining/clarifying misspellings, etc.). 
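The kinds of pre-processing steps cited here from Polleri and Bhardwaj — stop-word removal, misspelling/punctuation handling, white-space collapse — can be sketched in a few lines. The stop-word list and regular expression below are illustrative assumptions, not taken from either reference.

```python
import re

# Toy stop-word list for illustration only.
STOP_WORDS = {"the", "a", "an", "is", "to"}

def preprocess(message):
    """Toy cleansing pass: lowercase, strip punctuation/special characters,
    collapse white space, and remove stop words."""
    text = message.lower()
    text = re.sub(r"[^\w\s]", " ", text)   # punctuation/special char removal
    tokens = text.split()                  # splitting collapses white space
    return [t for t in tokens if t not in STOP_WORDS]
```

A real pipeline of the kind the references describe would add steps such as stemming, misspelling correction, and pattern replacement (prices, phone numbers, emails), but the shape is the same: raw chat text in, normalized token list out.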
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention having the teachings of Polleri, Chen, and Bhardwaj in front of him to have modified the teachings of Polleri (directed to adaptive pipelining composition for machine learning, such as by using chatbots implemented using trained models) and Chen (directed to multi-task machine learning architectures and training procedures), to incorporate the teachings of Bhardwaj (directed to an interactive chatbot for multi-way communication) to include the capability to pre-process the training data for the ML engine, such as by removing stop words and correcting misspelled words (as taught by Bhardwaj). One of ordinary skill would have been motivated to perform such a modification in order to provide a chatbot having the ability to learn how a user will respond in certain situations and transmit responses on the user’s behalf, to predict questions for users to ask, to output suggestions to users to provide assurances, etc. as described in Bhardwaj (paragraphs 0020 and 0025). With respect to claims 11 and 28, Polleri in view of Chen, further in view of Bhardwaj teaches all of the limitations of claims 10 and 27 as previously discussed, and Polleri further teaches wherein the ML model comprises a first component model having culmination of logistic regression model (e.g. paragraphs 0222-0223, using classification model for determining user intents based on input data; classifying input may include classifying by a binary classification model the input as belonging to classes; may further include classifying by a second binary classification model the input as belonging to additional classes; second binary classification model may include a logistic regression classifier). Polleri does not explicitly disclose wherein the first component model comprises a long term short term memory (LSTM) based model having culmination of neural network based bi-directional LSTM cells. 
However, Bhardwaj teaches wherein the first component model comprises a long term short term memory (LSTM) based model having culmination of neural network based bi-directional LSTM cells (e.g. paragraph 0049, intent classification model including bidirectional long short-term memory (LSTM) architecture; intent classification model receiving message from chat service and determining intent of the message). Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention having the teachings of Polleri, Chen, and Bhardwaj in front of him to have modified the teachings of Polleri (directed to adaptive pipelining composition for machine learning, such as by using chatbots implemented using trained models) and Chen (directed to multi-task machine learning architectures and training procedures), to incorporate the teachings of Bhardwaj (directed to an interactive chatbot for multi-way communication) to include the capability to implement the first component model of the ML model as a Bi-LSTM model (as taught by Bhardwaj). One of ordinary skill would have been motivated to perform such a modification in order to provide a chatbot having the ability to learn how a user will respond in certain situations and transmit responses on the user’s behalf, to predict questions for users to ask, to output suggestions to users to provide assurances, etc. as described in Bhardwaj (paragraph 0020, 0025). Claims 12, 13, 29, and 30 are rejected under 35 U.S.C. 103 as being unpatentable over Polleri in view of Chen, further in view of Bhardwaj, further in view of Shao et al. (US 20210200961 A1). With respect to claims 12 and 29, Polleri in view of Chen, further in view of Bhardwaj teaches all of the limitations of claims 11 and 28 as previously discussed, and Polleri and Bhardwaj further teach wherein the knowledgebase is used to train LSTM neural net, wherein the ML model facilitates supervised learning (e.g. 
Polleri paragraph 0111, intents and utterances making up training corpus for chatbot; paragraph 0125, training intent classifier of chatbot; paragraph 0154, training of skill bots; paragraph 0281, classification systems that execute supervised learning techniques; Bhardwaj paragraph 0027, trained intent classification model that can predict the intent of a message; paragraph 0029, intent classification model trained based on historical messages and intents; paragraph 0051, intent classification model developed using Bi-LSTM architecture). Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention having the teachings of Polleri, Chen, and Bhardwaj in front of him to have modified the teachings of Polleri (directed to adaptive pipelining composition for machine learning, such as by using chatbots implemented using trained models) and Chen (directed to multi-task machine learning architectures and training procedures), to incorporate the teachings of Bhardwaj (directed to an interactive chatbot for multi-way communication) to include the capability to implement the first component model of the ML model as a Bi-LSTM model (as taught by Bhardwaj). One of ordinary skill would have been motivated to perform such a modification in order to provide a chatbot having the ability to learn how a user will respond in certain situations and transmit responses on the user’s behalf, to predict questions for users to ask, to output suggestions to users to provide assurances, etc. as described in Bhardwaj (paragraph 0020, 0025). Polleri and Bhardwaj do not explicitly disclose that the training is performed using categorical cross entropy as loss function and an optimizer. However, Shao teaches that the training is performed using categorical cross entropy as loss function and an optimizer (e.g. 
paragraph 0017, context-based multi-turn dialogue method; paragraph 0033, BiLSTM used to model context information in natural language processing tasks, used to read historical dialog partial matching vector and candidate answer partial matching vector, and max and average pooling performed on hidden vector output by BILSTM; matching probability calculation model trained and obtained by minimizing cross-entropy loss in an end-to-end manner; paragraph 0037, selecting highest probability from all candidate answer matching probabilities; paragraph 0114, to-be-classified vector input into multi-layer perceptron and cross-entropy loss is minimized in an end-to-end manner to train a matching probability calculation model). Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention having the teachings of Polleri, Chen, Bhardwaj, and Shao in front of him to have modified the teachings of Polleri (directed to adaptive pipelining composition for machine learning, such as by using chatbots implemented using trained models), Chen (directed to multi-task machine learning architectures and training procedures), and Bhardwaj (directed to an interactive chatbot for multi-way communication), to incorporate the teachings of Shao (directed to context-based multi-turn dialogue) to include the capability to perform the training of the BiLSTM model (i.e. the model of Polleri, which may be implemented as a BiLSTM-based model as taught by Bhardwaj and Shao), using cross entropy as a loss function and optimizer (as taught by Shao). One of ordinary skill would have been motivated to perform such a modification in order to provide a multi-turn dialogue method that can realize sufficient matches between the context and the answer when the context and the answer sequence have different characteristics as described in Shao (paragraph 0004). 
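The Bi-LSTM intent classification and categorical cross-entropy training cited above from Bhardwaj and Shao can be sketched in outline. The following is an illustrative pure-Python toy, not the claimed system or any reference's actual implementation: the scalar hidden states, all weight values, and the two intent classes are invented for illustration.

```python
import math

# Toy Bi-LSTM-style encoder with scalar hidden states, a softmax output
# layer, and a categorical cross-entropy loss. All weights are invented.

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h, c, w):
    # Standard LSTM cell equations: input (i), forget (f), and output (o)
    # gates plus a candidate value (g) update the cell state c and hidden h.
    i = sigmoid(w["wi"] * x + w["ui"] * h + w["bi"])
    f = sigmoid(w["wf"] * x + w["uf"] * h + w["bf"])
    o = sigmoid(w["wo"] * x + w["uo"] * h + w["bo"])
    g = math.tanh(w["wg"] * x + w["ug"] * h + w["bg"])
    c = f * c + i * g
    return o * math.tanh(c), c

def encode_bidirectional(xs, w_fwd, w_bwd):
    # Run the cell left-to-right and right-to-left over the token values and
    # concatenate the two final hidden states, as a Bi-LSTM encoder does.
    h = c = 0.0
    for x in xs:
        h, c = lstm_step(x, h, c, w_fwd)
    hb = cb = 0.0
    for x in reversed(xs):
        hb, cb = lstm_step(x, hb, cb, w_bwd)
    return [h, hb]

def softmax(zs):
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, true_idx):
    # Categorical cross-entropy against a one-hot target.
    return -math.log(probs[true_idx])

W = dict(wi=0.5, ui=0.1, bi=0.0, wf=0.5, uf=0.1, bf=1.0,
         wo=0.5, uo=0.1, bo=0.0, wg=0.5, ug=0.1, bg=0.0)
OUT = [[1.0, -1.0], [-1.0, 1.0]]  # one output row per intent class

features = encode_bidirectional([0.1, 0.5, -0.3], W, W)
probs = softmax([sum(w * f for w, f in zip(row, features)) for row in OUT])
loss = cross_entropy(probs, 0)  # loss if the true intent is class 0
```

In a real system the loss would then be minimized end-to-end by an optimizer updating the gate weights, which is the training arrangement the rejection reads onto Shao.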
With respect to claims 13 and 30, Polleri in view of Chen, further in view of Bhardwaj, further in view of Shao teaches all of the limitations of claims 12 and 29 as previously discussed, and Shao further teaches wherein each layer of the LSTM neural net extracts information during the training to minimize loss function and to retrain one or more weights of the respective layer (e.g. paragraph 0033, matching probability calculation model trained and obtained by minimizing cross-entropy loss in an end-to-end manner; paragraph 0035, indicating that steps S106-S112 of Fig. 1 are repeated; paragraph 0091-0092, expanding on step 110 of Fig. 1, performing partial semantic relationship matching; performing attention weight calculations; paragraph 0114, to-be-classified vector input into multi-layer perceptron and cross-entropy loss is minimized in an end-to-end manner to train a matching probability calculation model; i.e. repetitively performing attention weight calculations during training of a multi-layer neural network including BiLSTM layers). Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention having the teachings of Polleri, Chen, Bhardwaj, and Shao in front of him to have modified the teachings of Polleri (directed to adaptive pipelining composition for machine learning, such as by using chatbots implemented using trained models), Chen (directed to multi-task machine learning architectures and training procedures), and Bhardwaj (directed to an interactive chatbot for multi-way communication), to incorporate the teachings of Shao (directed to context-based multi-turn dialogue) to include the capability to perform the training of the BiLSTM model (i.e. 
the model of Polleri, which may be implemented as a BiLSTM-based model as taught by Bhardwaj and Shao), in which layers of the LSTM extract information during the training to minimize loss function and to retrain/recalculate respective weights (as taught by Shao). One of ordinary skill would have been motivated to perform such a modification in order to provide a multi-turn dialogue method that can realize sufficient matches between the context and the answer when the context and the answer sequence have different characteristics as described in Shao (paragraph 0004). Claims 14 and 31 are rejected under 35 U.S.C. 103 as being unpatentable over Polleri in view of Chen, further in view of Bhardwaj, further in view of Shao, further in view of Pan et al. (US 20210083994 A1). With respect to claims 14 and 31, Polleri in view of Chen, further in view of Bhardwaj, further in view of Shao teaches all of the limitations of claims 13 and 30 as previously discussed. Polleri, Bhardwaj, and Shao do not explicitly disclose wherein the lowest layer of the LSTM neural net is passed to logistic regression (LR) to create sentence vectors from the set of potential queries, said sentence vectors acting as input for the LR to calculate probabilities for each intent mapped to a potential query such that the system/method estimates an output including the intent with highest probability. However, Pan teaches wherein the lowest layer of the LSTM neural net is passed to logistic regression (LR) to create sentence vectors from the set of potential queries, said sentence vectors acting as input for the LR to calculate probabilities for each intent mapped to a potential query such that the system/method estimates an output including the intent with highest probability (e.g. 
paragraph 0100, evaluating input utterance and computing confidence scores using logistic regression model for system intents and skill bots; paragraph 0106, classifier model generates input feature vector describing and representing input utterance; paragraph 0112, classifier using logistic regression model to determine associated confidence scores; paragraph 0115, bot classifier model using logistic regression to assign confidence levels for each chatbot intent; paragraph 0118, classifier model using feature vectors when determining whether input utterances are related to skill bots; paragraph 0119, feature vectors based on word embeddings; feature vectors used to represent sentences; paragraph 0140, assigning confidence scores using logistic regression model for input utterance; paragraph 0193, using logistic regression model with respect to input utterance to compute confidence scores; paragraph 0194, classifier model selecting skill bot whose associated training feature vectors shared clusters with input feature vector). 
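The logistic-regression scoring of sentence vectors attributed to Pan follows a simple pattern: build a fixed-size vector from the utterance, score it against each intent's weights, and keep the highest-confidence intent. The sketch below is a hypothetical toy, not Pan's actual classifier; the embeddings, intent names, and weights are all made up.

```python
import math

# Toy logistic-regression intent scorer over an averaged word-embedding
# sentence vector. All values below are illustrative placeholders.

EMBED = {  # invented 2-d word embeddings
    "check": [0.9, 0.1], "my": [0.2, 0.2], "balance": [0.8, 0.0],
    "book": [0.0, 0.9], "a": [0.1, 0.1], "flight": [0.1, 0.8],
}

INTENT_WEIGHTS = {  # one weight vector + bias per intent (invented)
    "account_balance": ([1.0, -0.5], 0.0),
    "travel_booking": ([-0.5, 1.0], 0.0),
}

def sentence_vector(utterance):
    # Average the word embeddings into a fixed-size sentence vector.
    vecs = [EMBED[w] for w in utterance.lower().split() if w in EMBED]
    if not vecs:
        return [0.0, 0.0]
    return [sum(v[d] for v in vecs) / len(vecs) for d in range(2)]

def intent_probabilities(utterance):
    sv = sentence_vector(utterance)
    scores = {}
    for intent, (w, b) in INTENT_WEIGHTS.items():
        z = sum(wi * xi for wi, xi in zip(w, sv)) + b
        scores[intent] = 1.0 / (1.0 + math.exp(-z))  # logistic confidence
    return scores

def predict_intent(utterance):
    # Select the intent with the highest confidence score.
    probs = intent_probabilities(utterance)
    return max(probs, key=probs.get)
```

For example, `predict_intent("check my balance")` resolves to `"account_balance"` with these toy weights, mirroring the highest-confidence selection the citations describe.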
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention having the teachings of Polleri, Chen, Bhardwaj, Shao, and Pan in front of him to have modified the teachings of Polleri (directed to adaptive pipelining composition for machine learning, such as by using chatbots implemented using trained models), Chen (directed to multi-task machine learning architectures and training procedures), Shao (directed to context-based multi-turn dialogue), and Bhardwaj (directed to an interactive chatbot for multi-way communication), to incorporate the teachings of Pan (directed to detecting unrelated utterances in a chatbot system) to include the capability to pass the user utterance to a logistic regression classifier model to create sentence vectors which are utilized by the logistic regression classifier model to calculate probabilities for intents so that a highest intent probability can be determined (as taught by Pan). One of ordinary skill would have been motivated to perform such a modification in order to provide the capability to determine whether a user utterance is, or is not, related to a chat bot, and to route the input to the appropriate chat bot, preventing computing resources from being wasted by providing early detection of unrelated user inputs as described in Pan (paragraph 0003, 0044). With respect to claims 16 and 33, Polleri in view of Chen, further in view of Bhardwaj, further in view of Shao, further in view of Pan teaches all of the limitations of claims 14 and 31 as previously discussed. 
Polleri and Bhardwaj do not explicitly disclose wherein during evaluation of the output, assessment is performed by the prediction engine based on a predetermined set of rules that screen through any or a combination of a pre-defined salutation and one or more attributes associated with the user query such that if the assessment indicates a negative response, the user query is converted into a mathematical representation of expressions using the trained model to identify a relevant intent associated with the user query for providing the output, wherein said prediction is done to estimate the predicted intent with highest probability in a manner that the any or a combination of textual, audio and video response that is mapped with the predicted intent is transmitted. However, Pan teaches wherein during evaluation of the output, assessment is performed by the prediction engine based on a predetermined set of rules that screen through any or a combination of a pre-defined salutation and one or more attributes associated with the user query such that if the assessment indicates a negative response, the user query is converted into a mathematical representation of expressions using the trained model to identify a relevant intent associated with the user query for providing the output, wherein said prediction is done to estimate the predicted intent with highest probability in a manner that the any or a combination of textual, audio and video response that is mapped with the predicted intent is transmitted (e.g. paragraph 0100, determining if input utterance explicitly defines skill bot and if so routing input utterance to the skill bot; if not, evaluating input utterance and computing confidence scores for intents and skill bots; selecting particular intent or skill bot based on confidence scores; paragraph 0112, selecting skill bot with highest confidence score; paragraph 0115, selecting intent with highest confidence score; paragraphs 0140-0142, Fig. 
8, determining whether input utterance includes only words found in training utterances; if yes, determining input utterance is related to at least one skill bot, and routing the utterance to the skill bot most closely matching the input utterance; if input utterance includes words not in training utterances or percentage of words greater than threshold, generating an input feature vector from the input utterance; converting input utterance/words into feature vector using various techniques such as one-hot encoding or other encoding; classifier model comparing the generated input feature vector to representations to determine whether input feature vector matches any skill bots; comparing point representing input feature vector to clusters to determine whether input feature vector falls inside any of the clusters and thus matches at least one skill bot, or comparing to composite feature vectors to determine whether sufficiently similar to any such composite feature vector than matches at least one skill bot; i.e. where a process such as that shown in Fig. 8 may provide a predetermined set of rules to screen through attributes associated with the user query, and include an assessment which may have a negative response, such as a determination that the input utterance does not explicitly define a particular intent or skill bot, or does not include only words found in training utterances; moreover, based on this negative response, the utterance is converted into a feature vector, analogous to a mathematical representation, which can then be used to identify a relevant intent according to a highest probability/confidence score). 
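The two-stage flow the rejection reads onto Pan's Fig. 8 (a rule-based screen first; on a negative assessment, conversion of the query into a feature vector matched against per-intent composite vectors) can be illustrated roughly as follows. This is a hedged sketch only: the salutation list, vocabulary, composite vectors, and function names are invented placeholders, not Pan's disclosure.

```python
# Toy two-stage router: rule screen, then one-hot feature-vector matching.

SALUTATIONS = {"hi", "hello", "hey"}
VOCAB = ["reset", "password", "track", "order", "status"]
COMPOSITE = {  # invented composite feature vector per intent
    "password_reset": [1, 1, 0, 0, 0],
    "order_tracking": [0, 0, 1, 1, 1],
}

def rule_screen(query):
    # Positive only when the query is a bare salutation or uses known
    # vocabulary exclusively; otherwise the assessment is negative.
    words = query.lower().split()
    if words and all(w in SALUTATIONS for w in words):
        return "salutation"
    if words and all(w in VOCAB for w in words):
        return "known_vocabulary"
    return None  # negative assessment: fall through to the trained model

def one_hot_vector(query):
    # Convert the query into a mathematical representation (one-hot
    # bag-of-words over the training vocabulary).
    words = set(query.lower().split())
    return [1 if w in words else 0 for w in VOCAB]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def route(query):
    # Rules first; on a negative response, match the feature vector
    # against the per-intent composites and keep the best score.
    if rule_screen(query):
        return ("rules", None)
    vec = one_hot_vector(query)
    best = max(COMPOSITE, key=lambda i: cosine(vec, COMPOSITE[i]))
    return ("model", best)
```

Here `route("hello")` is resolved by the rules alone, while `route("please reset my password")` fails the screen (it contains out-of-vocabulary words) and falls through to feature-vector matching, the same branch structure the office action maps onto the claim.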
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention having the teachings of Polleri, Chen, Bhardwaj, Shao, and Pan in front of him to have modified the teachings of Polleri (directed to adaptive pipelining composition for machine learning, such as by using chatbots implemented using trained models), Chen (directed to multi-task machine learning architectures and training procedures), Shao (directed to context-based multi-turn dialogue), and Bhardwaj (directed to an interactive chatbot for multi-way communication), to incorporate the teachings of Pan (directed to detecting unrelated utterances in a chatbot system) to include the capability to utilize a process providing a predetermined set of rules to screen through attributes associated with the user query, and including an assessment which may have a negative response, such as a determination that the input utterance does not explicitly define a particular intent or skill bot, or does not include only words found in training utterances, and then, based on this negative response, converting the utterance into a feature vector/mathematical representation, which can then be used to identify a relevant intent according to a highest probability/confidence score (as taught by Pan). One of ordinary skill would have been motivated to perform such a modification in order to provide the capability to determine whether a user utterance is, or is not, related to a chat bot, and to route the input to the appropriate chat bot, preventing computing resources from being wasted by providing early detection of unrelated user inputs as described in Pan (paragraph 0003, 0044). Claims 15 and 32 are rejected under 35 U.S.C. 103 as being unpatentable over Polleri in view of Chen, further in view of Bhardwaj, further in view of Shao, further in view of Pan, further in view of Crudele et al. (US 20190361977 A1). 
With respect to claims 15 and 32, Polleri in view of Chen, further in view of Bhardwaj, further in view of Shao, further in view of Pan teaches all of the limitations of claims 14 and 31 as previously discussed, and Polleri further teaches wherein the various components of the system are configured for a user query generated in audio form (e.g. paragraph 0111, rich set of utterances and intents that belong to them making up training corpus; paragraph 0128, user utterances in audio input or speech form; speech input; paragraph 0131, utterance received as input by digital assistant and goes through series or pipeline of processing steps; paragraph 0132, NLP engine or other resources such as parser, etc. used for processing utterances). Polleri does not explicitly disclose wherein the ML engine is configured with an L1L2 engine coupled to said knowledgebase to create variations of a word in the training set to increase the vocabulary of the trained model, wherein said L1L2 engine is configured for a user query generated in audio form. However, Crudele teaches wherein the ML engine is configured with an L1L2 engine coupled to said knowledgebase to create variations of a word in the training set to increase the vocabulary of the trained model, wherein said L1L2 engine is configured for a user query generated in audio form (e.g. paragraph 0002, chatbot, talkbot, etc. simulating natural language communication; expressed intent including text and utterances (i.e. 
where utterances appear to be distinguished from text, such as because they refer to spoken input); paragraph 0037, user input in the form of an utterance; paragraphs 0049-0050, updated training data generated based on initial training data; generating expressed intent in expanded set of expressions based on initial training data by derivation from one or more terms of the vocabulary of terms; expanded training set generated by derivation from initial training set, such as by permutation of synonyms corresponding to initial training set). Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention having the teachings of Polleri, Chen, Bhardwaj, Shao, Pan, and Crudele (with respect to claim 32, or just Polleri, Chen, and Crudele with respect to claim 15) in front of him to have modified the teachings of Polleri (directed to adaptive pipelining composition for machine learning, such as by using chatbots implemented using trained models), Chen (directed to multi-task machine learning architectures and training procedures), Shao (directed to context-based multi-turn dialogue), Bhardwaj (directed to an interactive chatbot for multi-way communication), and Pan (directed to detecting unrelated utterances in a chatbot system), to incorporate the teachings of Crudele (directed to developing training data for a natural language classifier) to include the capability to create variations of a word in the training set to increase the vocabulary of the trained model, including for a user query generated in audio form (as taught by Crudele). 
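The training-set expansion cited above from Crudele (an expanded set of expressions derived by permutation of synonyms over the initial training set) amounts to a cartesian product over per-word synonym lists. A minimal sketch, with an invented synonym table standing in for Crudele's vocabulary of terms:

```python
from itertools import product

# Toy expansion of a training utterance by synonym permutation. The synonym
# table is an invented placeholder, not data from any cited reference.
SYNONYMS = {
    "check": ["check", "verify", "review"],
    "balance": ["balance", "funds"],
}

def expand(utterance):
    # For each word, take its synonym list (or the word itself), then emit
    # every combination, growing the vocabulary seen during training.
    options = [SYNONYMS.get(w, [w]) for w in utterance.lower().split()]
    return [" ".join(combo) for combo in product(*options)]
```

With this table, `expand("check my balance")` yields six variants (3 x 1 x 2), including the original utterance, which is the kind of derived, enlarged training set the rejection relies on.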
One of ordinary skill would have been motivated to perform such a modification in order to provide the capability to efficiently develop training data to produce high degrees of natural language comprehension by and of an NLC of variously expressed intents as described in Crudele (paragraph 0020). It is noted that any citation to specific pages, columns, lines, or figures in the prior art references and any interpretation of the references should not be considered to be limiting in any way. “The use of patents as references is not limited to what the patentees describe as their own inventions or to the problems with which they are concerned. They are part of the literature of the art, relevant for all they contain,” In re Heck, 699 F.2d 1331, 1332-33, 216 USPQ 1038, 1039 (Fed. Cir. 1983) (quoting In re Lemelson, 397 F.2d 1006, 1009, 158 USPQ 275, 277 (CCPA 1968)). Further, a reference may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art, including nonpreferred embodiments. Merck & Co. v. Biocraft Laboratories, 874 F.2d 804, 10 USPQ2d 1843 (Fed. Cir.), cert. denied, 493 U.S. 975 (1989). See also Upsher-Smith Labs. v. Pamlab, LLC, 412 F.3d 1319, 1323, 75 USPQ2d 1213, 1215 (Fed. Cir. 2005); Celeritas Technologies Ltd. v. Rockwell International Corp., 150 F.3d 1354, 1361, 47 USPQ2d 1516, 1522-23 (Fed. Cir. 1998). Conclusion The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure. Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. 
In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. Any inquiry concerning this communication or earlier communications from the examiner should be directed to JEREMY L STANLEY whose telephone number is (469)295-9105. The examiner can normally be reached on Monday-Friday from 9:00 AM to 5:00 PM CST. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdullah Al Kawsar, can be reached at telephone number (571) 270-3169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of an application may be obtained from Patent Center and the Private Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from Patent Center or Private PAIR. Status information for unpublished applications is available through Patent Center and Private PAIR for authorized users only. Should you have questions about access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) Form at https://www.uspto.gov/patents/uspto-automated-interview-request-air-form. /JEREMY L STANLEY/ Primary Examiner, Art Unit 2127

Prosecution Timeline

Mar 31, 2022
Application Filed
Aug 04, 2025
Non-Final Rejection — §103
Nov 03, 2025
Response Filed
Feb 17, 2026
Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12591827
ETHICAL CONFIDENCE FABRICS: MEASURING ETHICAL ALGORITHM DEVELOPMENT
2y 5m to grant Granted Mar 31, 2026
Patent 12580783
CONFIGURING 360-DEGREE VIDEO WITHIN A VIRTUAL CONFERENCING SYSTEM
2y 5m to grant Granted Mar 17, 2026
Patent 12572266
ACCESSING AND DISPLAYING INFORMATION CORRESPONDING TO PAST TIMES AND FUTURE TIMES
2y 5m to grant Granted Mar 10, 2026
Patent 12561041
Systems, Methods, and Graphical User Interfaces for Interacting with Virtual Reality Environments
2y 5m to grant Granted Feb 24, 2026
Patent 12555684
ASSESSING A TREATMENT SERVICE BASED ON A MEASURE OF TRUST DYNAMICS
2y 5m to grant Granted Feb 17, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

3-4
Expected OA Rounds
48%
Grant Probability
92%
With Interview (+44.7%)
3y 2m
Median Time to Grant
Moderate
PTA Risk
Based on 276 resolved cases by this examiner. Grant probability derived from career allow rate.
