Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 11/18/2023 has been considered by the examiner.
Drawings
The drawings submitted on 11/18/2023 have been considered by the examiner.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claim(s) 1-4, 6-7, and 9-16 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Krishnan et al. (US 2022/0093101 A1).
Regarding Claim 1, Krishnan et al. teach: A dialog device comprising a processor configured to execute operations comprising: collecting question response data, the question response data (track a user input and the corresponding system generated response to the user input as a turn) including a state of a dialog (user input as a turn; i.e., multiple turns of user input and corresponding system generated responses), a question, and a response ([0074] The dialog manager component 272 may track a user input and the corresponding system generated response to the user input as a turn. The dialog session identifier may correspond to multiple turns of user input and corresponding system generated response. The dialog manager component 272 may transmit data identified by the dialog session identifier directly to the orchestrator component 230 or other component. Depending on system configuration the dialog manager 272 may determine the appropriate system generated response to give to a particular utterance or user input of a turn. Or creation of the system generated response may be managed by another component of the system (e.g., the language output component 293, NLG 279, orchestrator 230, etc.) while the dialog manager 272 selects the appropriate responses.); generating an utterance template (response template) associated with the state on a basis of the question response data ([0078] The NLG may use templates to formulate responses.); generating a system utterance by using the utterance template associated with a state of a current dialog ([0079] The NLG system may generate dialog data based on one or more response templates. Further continuing the example above, the NLG system may select a template in response to the question, “What is the weather currently like?” of the form: “The weather currently is $weather information$.” The NLG system may analyze the logical form of the template to produce one or more textual responses including markups and annotations to familiarize the response that is generated. Responsive audio data representing the response generated by the NLG system may then be generated using the text-to-speech component 280.); presenting the system utterance to a user ([0095] For example, the system 120, using a remote directive that is included in response data (e.g., a remote response), may instruct the device 110 to output an audible response (e.g., using TTS processing performed by an on-device TTS component 380) to a user's question via a loudspeaker(s) of (or otherwise associated with) the device 110, to output content (e.g., music) via the loudspeaker(s) of (or otherwise associated with) the device 110, to display content on a display of (or otherwise associated with) the device 110, and/or to send a directive to a secondary device (e.g., a directive to turn on a smart light).); receiving a user utterance uttered by the user; and causing the state of the current dialog to transition on a basis of the user utterance ([0073] The system(s) 100 may include a dialog manager component 272 that manages and/or tracks a dialog between a user and a device. As used herein, a “dialog” may refer to data transmissions (such as relating to multiple user inputs and system 100 outputs) between the system 100 and a user (e.g., through device(s) 110) that all relate to a single “conversation” between the system and the user that may have originated with a single user input initiating the dialog. Thus, the data transmissions of a dialog may be associated with a same dialog identifier, which may be used by components of the overall system 100 to track information across the dialog. For example, a user may open a dialog with the system 100 to request a food delivery in a spoken utterance and the system may respond by displaying images of food available for order and the user may speak a response (e.g., “item 1” or “that one”) or may gesture a response (e.g., point to an item on the screen or give a thumbs-up) or may touch the screen on the desired item to be selected. [0074] The dialog manager component 272 may track a user input and the corresponding system generated response to the user input as a turn.).
Regarding Claim 2, Krishnan et al. teach: A dialog device comprising a processor configured to execute operations comprising: collecting question response data including a first dialog act representing an utterance intention (What is the weather currently like?), a question, and a response; generating an utterance template associated with the first dialog act on a basis of the question response data; generating a system utterance by using the utterance template associated with a second dialog act to be performed next; presenting the system utterance to a user; receiving a user utterance uttered by the user; and determining the second dialog act to be performed next on a basis of the user utterance (See rejection of claim 1).
Regarding Claim 3, Krishnan et al. teach: The dialog device according to claim 2, the processor further configured to execute operations comprising: learning an utterance conversion model that uses an utterance as an input and outputs an utterance obtained by paraphrasing the utterance, by using paraphrase data including the system utterance and an utterance obtained by paraphrasing the system utterance; and inputting the system utterance into the utterance conversion model to obtain a converted system utterance obtained by paraphrasing the system utterance (See rejection of claim 1, specifically, [0079] The NLG system may generate dialog data based on one or more response templates. Further continuing the example above, the NLG system may select a template in response to the question, “What is the weather currently like?” of the form: “The weather currently is $weather information$.” The NLG system may analyze the logical form of the template to produce one or more textual responses including markups and annotations to familiarize the response that is generated. Responsive audio data representing the response generated by the NLG system may then be generated using the text-to-speech component 280.).
Regarding Claim 4, Krishnan et al. teach: The dialog device according to claim 3, the processor further configured to execute operations comprising: presenting the converted system utterance to a user (See rejection of claim 3, specifically [0079] The NLG system may analyze the logical form of the template to produce one or more textual responses including markups and annotations to familiarize the response that is generated. Responsive audio data representing the response generated by the NLG system may then be generated using the text-to-speech component 280. [0095] For example, the system 120, using a remote directive that is included in response data (e.g., a remote response), may instruct the device 110 to output an audible response (e.g., using TTS processing performed by an on-device TTS component 380) to a user's question via a loudspeaker(s) of (or otherwise associated with) the device 110, to output content (e.g., music) via the loudspeaker(s) of (or otherwise associated with) the device 110, to display content on a display of (or otherwise associated with) the device 110, and/or to send a directive to a secondary device (e.g., a directive to turn on a smart light).).
Regarding Claim 6, Krishnan et al. teach: A dialog method comprising: collecting question response data including a first dialog act representing an utterance intention, a question, and a response; generating an utterance template associated with the first dialog act on a basis of the question response data; generating a system utterance by using the utterance template associated with a second dialog act to be performed next; presenting the system utterance to a user; receiving a user utterance uttered by the user; and determining the second dialog act to be performed next on a basis of the user utterance (See rejection of claim 1).
Regarding Claim 7, Krishnan et al. teach: The dialog method according to claim 6, further comprising: collecting paraphrase data, the paraphrase data including the utterance and a paraphrased utterance obtained by paraphrasing the utterance; learning the utterance conversion model that uses an input utterance as an input and outputs an output utterance obtained by paraphrasing the input utterance, by using the paraphrase data; inputting the system utterance into the utterance conversion model to obtain a converted system utterance obtained by paraphrasing the system utterance; and presenting the converted system utterance to a user, by an utterance presentation unit (See rejection of claim 3).
Regarding Claim 9, Krishnan et al. teach: The dialog device according to claim 1, wherein the utterance is in natural language form (See rejection of claim 1 and [0077] The language output component 293 includes a natural language generation (NLG) component 279 and a text-to-speech (TTS) component 280. The NLG component 279 can generate text for purposes of TTS output to a user. For example, the NLG component 279 may generate text corresponding to instructions corresponding to a particular action for the user to perform. The NLG component 279 may generate appropriate text for various outputs as described herein. The NLG component 279 may include one or more trained models configured to output text appropriate for a particular input. The text output by the NLG component 279 may become input for the TTS component 280 (e.g., output text data 2110 discussed below). Alternatively, or in addition, the TTS component 280 may receive text data from a skill 290 or other system component for output.).
Regarding Claim 10, Krishnan et al. teach: The dialog device according to claim 1, wherein the generated utterance template enables a type of phrasing that represents a human-like character of the dialog device (See rejection of claim 1 and [0033] Text-to-speech (TTS) is a field of computer science concerning transforming textual and/or other data into audio data that is synthesized to resemble human speech. ASR, NLU, and TTS may be used together as part of a speech-processing system. [0079] The NLG system may generate dialog data based on one or more response templates. Further continuing the example above, the NLG system may select a template in response to the question, “What is the weather currently like?” of the form: “The weather currently is $weather information$.” The NLG system may analyze the logical form of the template to produce one or more textual responses including markups and annotations to familiarize the response that is generated. Responsive audio data representing the response generated by the NLG system may then be generated using the text-to-speech component 280. [0311] In this way the system may act more human-like as a natural participant in a conversation and may answer questions or interject information that may be helpful to the conversation, even if a user's statement/gesture, etc. as part of the conversation was directed at another user participant of the conversation rather than directly at the system. The conversation mode may be independent from or a part of a multi-user dialog mode as discussed herein.).
Regarding Claim 11, Krishnan et al. teach: The dialog device according to claim 2, wherein the utterance is in natural language form (See rejection of claim 9).
Regarding Claim 12, Krishnan et al. teach: The dialog device according to claim 2, wherein the generated utterance template enables a type of phrasing that represents a human-like character of the dialog device (See rejection of claim 10).
Regarding Claim 13, Krishnan et al. teach: The dialog device according to claim 3, wherein the paraphrase data indicates human character-likeness of the dialog device (See rejection of claim 10).
Regarding Claim 14, Krishnan et al. teach: The dialog method according to claim 6, wherein the utterance is in natural language form (See rejection of claim 9).
Regarding Claim 15, Krishnan et al. teach: The dialog method according to claim 6, wherein the generated utterance template enables a type of phrasing that represents a human-like character in the system utterance (See rejection of claim 10).
Regarding Claim 16, Krishnan et al. teach: The dialog method according to claim 7, wherein the paraphrase data indicates human character-likeness in the converted system utterance (See rejection of claim 10).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. The pertinent art of record, Vasylyev (US 2024/0412720 A1), teaches a real-time, contextually aware artificial intelligence (AI) assistant system and a method for providing a contextualized response to a user using AI.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOHAMMAD K ISLAM, whose telephone number is (571) 270-5878. The examiner can normally be reached Monday-Friday, EST (IFP).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Paras Shah, can be reached at 571-270-1650. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MOHAMMAD K ISLAM/Primary Examiner, Art Unit 2653