Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 10/31/2025 has been considered by the examiner.
Drawings
The drawings submitted on 04/19/2024 have been considered by the examiner.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claim(s) 1-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Zheng et al. (US 2022/0415320 A1).
Regarding Claims 1, 11, and 20, Zheng et al. teach: A system comprising: at least one processor; at least one memory storing instructions that, when executed by the at least one processor, cause operations comprising (Abstract: In one embodiment, a system includes an automatic speech recognition (ASR) module, a natural-language understanding (NLU) module, a dialog manager, one or more agents, an arbitrator, a delivery system, one or more processors, and a non-transitory memory coupled to the processors comprising instructions executable by the processors, the processors operable when executing the instructions to receive a user input, process the user input using the ASR module, the NLU module, the dialog manager, one or more of the agents, the arbitrator, and the delivery system, and provide a response to the user input.): classifying, by a conversational artificial intelligence agent (dialog manager 216a) in a first dialogue state (first operational mode (i.e., on-device mode) using client-side dialog state, i.e., current dialog state), an inquiry to generate a classified inquiry ([0089] In particular embodiments, as discussed above, an on-device orchestrator 206 on the client system 130 may coordinate receiving a user input and may determine, at one or more decision points in an example workflow, which of the operational modes described above should be used to process or continue processing the user input. As further discussed above, selection of an operational mode may be based at least in part on a device state, a task associated with a user input, and/or one or more additional factors. [0090] In particular embodiments, the on-device dialog manager 216a may comprise a dialog state tracker 218a and an action selector 222a. The on-device dialog manager 216a may have complex dialog logic and product-related business logic to manage the dialog state and flow of the conversation between the user and the assistant system 140.
The on-device dialog manager 216a may include full functionality for end-to-end integration and multi-turn support (e.g., confirmation, disambiguation). [0111] In particular embodiments, an assistant service module 305 may access a request manager 310 upon receiving a user input. In particular embodiments, the request manager 310 may comprise a context extractor 312 and a conversational understanding object generator (CU object generator) 314. The context extractor 312 may extract contextual information associated with the user input. [0112] In particular embodiments, the request manager 310 may send the generated CU objects to the NLU module 210. The NLU module 210 may then perform domain classification/selection 334 on user input based on the features resulted from the featurization 332 to classify the user input into predefined domains. In particular embodiments, the NLU module 210 may process the domain classification/selection results using an intent classifier 336b. The intent classifier 336b may determine the user's intent associated with the user input. [0113] In particular embodiments, the NLU module 210 may identify one or more of a domain, an intent, or a slot from the user input in a personalized and context-aware manner. [0126] Based on the current dialog state, a dialog policy 360 may choose a node to execute and generate the corresponding actions.); determining, by the conversational artificial intelligence agent, a second dialogue state (new state) to transition to from the first dialogue state based at least on one or more parameters extracted (one or more of a domain, an intent, or a slot from the user input) from the classified inquiry ([0113] In particular embodiments, the NLU module 210 may identify one or more of a domain, an intent, or a slot from the user input in a personalized and context-aware manner.
[0114] In particular embodiments, the output of the NLU module 210 may be sent to the entity resolution module 212 to resolve relevant entities. [0122] In particular embodiments, the output of the entity resolution module 212 may be sent to the dialog manager 216 to advance the flow of the conversation with the user. The dialog manager 216 may be an asynchronous state machine that repeatedly updates the state and selects actions based on the new state. [0129] After an event is processed and the state is updated, the action selector 222 may run a fast search algorithm (e.g., similarly to the Boolean satisfiability) to identify which policies should be triggered based on the current state. [0131] In particular embodiments, the action selector 222 may call different agents 228 for task execution. Meanwhile, the dialog manager 216 may receive an instruction to update the dialog state.); determining one or more response outputs to generate based at least on the one or more parameters and based at least on the second dialogue state ([0103] In particular embodiments, the local arbitrator 226a may generate a response based on the final result and send it to a render output module 232. [0124] In particular embodiments, the dialog manager 216 may map events determined by the context engine 220 to actions. To support processing of event streams, the dialog state tracker 218a may use an event handler (e.g., for disambiguation, confirmation, request) that may consume various types of events and update an internal assistant state. [0129] In particular embodiments, the action selector 222 may select an action based on one or more of the event determined by the context engine 220, the dialog intent and state, the associated content objects, and the guidance from dialog policies 360. [0130] In particular embodiments, the action selector 222 may take the dialog state update operators as part of the input to select the dialog action.
[0133] In particular embodiments, the determined actions by the action selector 222 may be sent to the delivery system 230. [0137] Besides determining how to process the user input, the orchestrator 206 may receive the results from the agents 228 and/or the results from the delivery system 230 provided by the dialog manager 216. The orchestrator 206 may then forward these results to the arbitrator 226. In particular embodiments, the render output module 232 may generate a response that is suitable for the client system 130.); determining one or more electronic messages to generate based at least on the one or more response outputs; and generating the one or more electronic messages to be conveyed, via one or more communication networks, to a computing device, wherein the one or more electronic messages are in response to the inquiry ([0089] As further discussed above, selection of an operational mode may be based at least in part on a device state, a task associated with a user input, and/or one or more additional factors. As another example and not by way of limitation, if a messaging task is not supported by on-device processing on the client system 130, the on-device orchestrator 206 may select the third operational mode (i.e., blended mode) to process the user input associated with a messaging request. [0100] As an example and not by way of limitation, the delivery system 230a/b may broadcast to all online devices that belong to one user. As another example and not by way of limitation, the delivery system 230a/b may deliver events to target-specific devices. [0119] In particular embodiments, a new or running task capable of handling the intent may be identified and provided with the intent (e.g., a message composition task for an intent to send a message to another user). 
[0121] As another example and not by way of limitation, for the utterance “send a message to John”, the entity resolution module 212 may easily determine “John” refers to a person that one can message. [0137] In particular embodiments, the render output module 232 may generate a response that is suitable for the client system 130. [0141] As an example and not by way of limitation, the context engine 220 may cause a push notification message to be displayed on a display screen of the user's client system 130. The user may interact with the push notification message, which may initiate a multi-modal event (e.g., an event workflow for replying to a message received from another user). As an example, and not by way of limitation, receiving a message may be a social event, which may trigger the task of reading the message to the user.).
Regarding Claims 2 and 12, Zheng et al. teach: The system of claim 1, wherein the operations further comprise providing the one or more parameters as inputs to a dialogue state tracking control layer logic tree (a tree-based policy, which is a pre-constructed dialog plan) (See rejection of claim 1 and [0126] In particular embodiments, the dialog state tracker 218 may communicate with the action selector 222 about the dialog intents and associated content objects. In particular embodiments, the action selector 222 may rank different dialog hypotheses for different dialog intents. The action selector 222 may take candidate operators of dialog state and consult the dialog policies 360 to decide what actions should be executed. In particular embodiments, a dialog policy 360 may be a tree-based policy, which is a pre-constructed dialog plan. Based on the current dialog state, a dialog policy 360 may choose a node to execute and generate the corresponding actions. As an example, and not by way of limitation, the tree-based policy may comprise topic grouping nodes and dialog action (leaf) nodes. In particular embodiments, a dialog policy 360 may also comprise a data structure that describes an execution plan of an action by an agent 228.).
Regarding Claims 3 and 13, Zheng et al. teach: The system of claim 2, wherein the one or more parameters are coupled to one or more conditional nodes (a dialog policy 360) within the dialogue state tracking control layer logic tree (topic grouping nodes and dialog action (leaf) nodes), and wherein one or more parameters comprise a topic (topic grouping node) and an object (dialog intents and associated content objects) of the inquiry (See rejection of claim 2).
Regarding Claims 4 and 14, Zheng et al. teach: The system of claim 1, wherein the operations further comprise managing any post-interaction events based at least on the one or more response outputs (See rejection of claim 1 and [0103] In particular embodiments, the local arbitrator 226a may generate a response based on the final result and send it to a render output module 232. [0124] The dialog manager 216 may also perform context tracking and interaction management. Context tracking may comprise aggregating real-time stream of events into a unified user state. Interaction management may comprise selecting optimal action in each state. [0137] Besides determining how to process the user input, the orchestrator 206 may receive the results from the agents 228 and/or the results from the delivery system 230 provided by the dialog manager 216. The orchestrator 206 may then forward these results to the arbitrator 226. The arbitrator 226 may aggregate these results, analyze them, select the best result, and provide the selected result to the render output module 232. In particular embodiments, the arbitrator 226 may consult with dialog policies 360 to obtain the guidance when analyzing these results. In particular embodiments, the render output module 232 may generate a response that is suitable for the client system 130.).
Regarding Claims 5 and 15, Zheng et al. teach: The system of claim 1, wherein the operations further comprise updating a first database (data store 330) in response to generating the one or more response outputs (See rejection of claim 1 and [0111] The processing result may be stored in the context engine 220 as part of the user profile. The analysis result may be stored in the context engine 220 also as part of the user profile. [0136] In particular embodiments, the delivery system 230 may perform different tasks based on the output of the CU composer 370. These tasks may include writing (i.e., storing/updating) the dialog state into the data store 330 using the dialog state writing component 382 and generating responses using the response generation component 380.).
Regarding Claims 6 and 16, Zheng et al. teach: The system of claim 1, wherein the operations further comprise: parsing the inquiry to generate a parsed inquiry; conveying the parsed inquiry to a large language model (NLU module) artificial intelligence agent; and receiving, from the large language model artificial intelligence agent, a first indication of a topic (contextual information or topic or domain) and one or more second indications of one or more objects (particular CU objects) of the parsed inquiry (See rejection of Claim 1 and [0105] In particular embodiments, the capability of audio cognition may enable the assistant system 140 to, for example, understand a user's input associated with various domains in different languages, understand and summarize a conversation, perform on-device audio cognition for complex commands, identify a user by voice, extract topics from a conversation and auto-tag sections of the conversation, enable audio interaction without a wake-word, filter and amplify user voice from ambient noise and conversations, and/or understand which client system 130 a user is talking to if multiple client systems 130 are in vicinity. [0111] In particular embodiments, an assistant service module 305 may access a request manager 310 upon receiving a user input. In particular embodiments, the request manager 310 may comprise a context extractor 312 and a conversational understanding object generator (CU object generator) 314. The context extractor 312 may extract contextual information associated with the user input. The CU object generator 314 may generate particular CU objects relevant to the user input. The CU objects may comprise dialog-session data and features associated with the user input, which may be shared with all the modules of the assistant system 140. [0112] In particular embodiments, the request manger 310 may send the generated CU objects to the NLU module 210. 
The NLU module 210 may then perform domain classification/selection 334 on user input based on the features resulted from the featurization 332 to classify the user input into predefined domains. In particular embodiments, the NLU module 210 may process the domain classification/selection results using an intent classifier 336b. The intent classifier 336b may determine the user's intent associated with the user input. [0113] In particular embodiments, the NLU module 210 may identify one or more of a domain, an intent, or a slot from the user input in a personalized and context-aware manner.).
Regarding Claims 7 and 17, Zheng et al. teach: The system of claim 1, wherein the operations further comprise conveying the one or more parameters (a set of valid or expected named slots may be conditioned on the classified intent) to a front-end template (slot) (See rejection of claim 1 and [0062] In particular embodiments, the assistant application 136 may include an assistant xbot functionality as a front-end interface for interacting with the user of the client system 130, including receiving user inputs and presenting outputs. [0112] In one procedure, the NLU module 210 may process the domain classification/selection results using a meta-intent classifier 336a. The meta-intent classifier 336a may determine categories that describe the user's intent. An intent may be an element in a pre-defined taxonomy of semantic intentions, which may indicate a purpose of a user interaction with the assistant system 140. The NLU module 210a may classify a user input into a member of the pre-defined taxonomy. For example, the user input may be "Play Beethoven's 5th," and the NLU module 210a may classify the input as having the intent [IN:play_music]. A slot may be a named sub-string corresponding to a character string within the user input representing a basic semantic entity. In particular embodiments, a set of valid or expected named slots may be conditioned on the classified intent. As an example, and not by way of limitation, for the intent [IN:play_music], a valid slot may be [SL:song_name]. In particular embodiments, the meta slot tagger 338a may tag generic slots such as references to items (e.g., the first), the type of slot, the value of the slot, etc. In particular embodiments, the NLU module 210 may process the domain classification/selection results using an intent classifier 336b. The intent classifier 336b may determine the user's intent associated with the user input.).
Regarding Claims 8 and 18, Zheng et al. teach: The system of claim 7, wherein the operations further comprise generating, by the front-end template, one or more first control signals (a goal) to be conveyed to a logic tree (See rejection of claim 7 and [0126] In particular embodiments, the dialog state tracker 218 may communicate with the action selector 222 about the dialog intents and associated content objects. In particular embodiments, a dialog policy 360 may be a tree-based policy, which is a pre-constructed dialog plan. Based on the current dialog state, a dialog policy 360 may choose a node to execute and generate the corresponding actions. As an example, and not by way of limitation, the tree-based policy may comprise topic grouping nodes and dialog action (leaf) nodes. In particular embodiments, a dialog policy 360 may also comprise a data structure that describes an execution plan of an action by an agent 228. A dialog policy 360 may further comprise multiple goals related to each other through logical operators. In particular embodiments, a goal may be an outcome of a portion of the dialog policy and it may be constructed by the dialog manager 216. A goal may be represented by an identifier (e.g., string) with one or more named arguments, which parameterize the goal. As an example, and not by way of limitation, a goal with its associated goal argument may be represented as {confirm_artist, args:{artist: "Madonna" }}. In particular embodiments, goals may be mapped to leaves of the tree of the tree-structured representation of the dialog policy 360. [0128] Once a task is active in the dialog state, the corresponding task policy 364 may be consulted to select right actions. [0131] An agent 228 may select among registered content providers to complete the action. In particular embodiments, the agents 228 may comprise first-party agents and third-party agents.
In particular embodiments, first-party agents may comprise internal agents that are accessible and controllable by the assistant system 140 (e.g. agents associated with services provided by the online social network, such as messaging services or photo-share services). In particular embodiments, third-party agents may comprise external agents that the assistant system 140 has no control over (e.g., third-party online music application agents, ticket sales agents).).
Regarding Claims 9 and 19, Zheng et al. teach: The system of claim 8, wherein the operations further comprise generating, by the logic tree based at least on the one or more first control signals (a goal), one or more second control signals (first-party agents or third-party agents) (See rejection of claim 8 and [0129] In particular embodiments, the action selector 222 may select an action based on one or more of the event determined by the context engine 220, the dialog intent and state, the associated content objects, and the guidance from dialog policies 360. Each dialog policy 360 may be subscribed to specific conditions over the fields of the state. [0131] The data structure may be constructed by the dialog manager 216 based on an intent and one or more slots associated with the intent. The first-party agents may be associated with first-party providers that provide content objects and/or services hosted by the social-networking system 160. The third-party agents may be associated with third-party providers that provide content objects and/or services hosted by the third-party system 170. In particular embodiments, each of the first-party agents or third-party agents may be designated for a particular domain.).
Regarding Claim 10, Zheng et al. teach: The system of claim 9, wherein the operations further comprise generating the one or more response outputs based at least on the one or more second control signals (See rejection of claim 9 and [0131] The first-party agents may be associated with first-party providers that provide content objects and/or services hosted by the social-networking system 160. The third-party agents may be associated with third-party providers that provide content objects and/or services hosted by the third-party system 170. In particular embodiments, each of the first-party agents or third-party agents may be designated for a particular domain. As an example, and not by way of limitation, the domain may comprise weather, transportation, music, shopping, social, videos, photos, events, locations, and/or work. In particular embodiments, the assistant system 140 may use a plurality of agents 228 collaboratively to respond to a user input. [0137] In particular embodiments, the arbitrator 226 may consult with dialog policies 360 to obtain the guidance when analyzing these results. In particular embodiments, the render output module 232 may generate a response that is suitable for the client system 130.).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. The prior art of record, Sethi et al. (US 2022/0374605 A1), teaches Continuous Learning For Natural-Language Understanding Models For Assistant Systems.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOHAMMAD K ISLAM whose telephone number is (571)270-5878. The examiner can normally be reached Monday-Friday, EST (IFP).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Paras Shah can be reached at 571-270-1650. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MOHAMMAD K ISLAM/Primary Examiner, Art Unit 2653