Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Response to Arguments
Applicant's arguments with respect to claims 1, 11, and 20 have been considered but are moot in view of the new ground(s) of rejection.
Applicant’s arguments are addressed by new citations to the existing art directed to the amended subject matter; please see below.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-3, 5-13, and 15-20 are rejected under 35 U.S.C. 103 as being unpatentable over US 11948563 B1 to Liu; Xiaohu et al. (hereinafter Liu) in view of US 20220199079 A1 to Hanson; Michael Robert et al. (hereinafter Hanson), and further in view of US 20220068255 A1 to Chen; Zhehuai et al. (hereinafter Chen).
Re claim 1, Liu teaches
1. A computer-implemented method for improving a generalization and adaptability of a task-oriented dialog system to multiple task domains (col 42 line 45 to col 43 line 37 with shortest path for optimization col 51 lines 7-59 for new tasks input by the user such as a new conversation col 46 lines 14-41), comprising:
obtaining, by a processor, training data for training a sequence-to-sequence language model, wherein the training data comprises: (sequence to sequence model to predict states col 42 line 45 to col 43 line 52)
an input prompt comprising an utterance labeled with a sequence of slot-value pairs, wherein the sequence of slot-value pairs is derived from a short labeled example dialogue demonstrating semantics of schema elements for a task, (using the shortest path for optimization analogous to the present invention, for instance an entire flight booking query by a user amounts to “BOOK_FLIGHT_DESTINATION”: “Menlo Park”, which shortens the construct in memory for the task or goal, a schema understood as a goal of a task per se col 51 lines 7-59 for new tasks input by the user such as a new conversation col 46 lines 14-41), and wherein the utterance relates to a task; (col 42 line 45 to col 43 line 7 semantic representations of user utterance in various states iterated back to the user using a sequence-to-sequence model for tasks/requests within the user utterance represented as key value pairs or slots in dialog with an agent as shown in fig. 5b-6d, expressly with an agent col 35 lines 3-21 in various contexts of requests e.g. food, travel, placing an order, reservations, purchases, etc. as in col 45 lines 3-56 with known dialog intent/contexts col 1 lines 19-39)
a contextual representation comprising a concatenation of a history of utterances exchanged between a user and a service agent, wherein the history of utterances describe a context for the task; (utterances become part of the conversation history in at least col 2 line 61 to col 3 line 20 and fig. 6b-6e… with col 33 lines 13-56 utilizing the dialog history as part of the machine learning process under BRI as concatenation is analogous to updated with, compared to, added to, or merged with, the outcomes from the user utterance interpretation… using intent extractions as in col 42 line 45 to col 43 line 7 semantic representations of user utterance in various states iterated back to the user using a sequence-to-sequence model for tasks/requests within the user utterance represented as key value pairs or slots in dialog with an agent as shown in fig. 5b-6d, expressly with an agent col 35 lines 3-21 in various contexts of requests e.g. food, travel, placing an order, reservations, purchases, etc. as in col 45 lines 3-56 with known dialog intent/contexts col 1 lines 19-39)
an input task comprising an additional utterance that is in addition to the utterance; (the user initially utters a request or sentence as part of a slot pair e.g. “pizza” may be [SL:dish] as in at least col 14 line 58 to col 15 line 24 supported with col 42 line 45 to col 43 line 37, the user later utters again which may not be relevant e.g. “wait a minute”, all of these utterances become part of the conversation history as in at least col 2 line 61 to col 3 line 20 and fig. 6b-6e… for alternative meaning, an input that does not have a value assigned is considered unlabeled per se without support for “unlabeled” in the specification, and understood by one of ordinary skill contextually as in a slot assigned a value or label col 24 line 39 to col 25 line 34)
training, by the processor, the sequence-to-sequence language model, wherein training the sequence-to-sequence language model comprises: (sequence to sequence model to predict states col 42 line 45 to col 43 line 52)
processing, by the processor, using the sequence-to-sequence language model, a concatenation of: (i) the input prompt comprising the utterance labeled with the sequence of slot-value pairs, (ii) the input task comprising the additional utterance that is in addition to the utterance, and (iii) the contextual representation comprising the concatenation of the history of utterances exchanged between the user and the service agent, to (using a sequence to sequence model expressly, also note in fig. 6e prediction is explicitly the goal to help the user accomplish a task, the user initially utters a request or sentence as part of a slot pair e.g. “pizza” may be [SL:dish] as in at least col 14 line 58 to col 15 line 24 supported with col 42 line 45 to col 43 line 37, the user later utters again which may not be relevant e.g. “wait a minute”, all of these utterances become part of the conversation history as in at least col 2 line 61 to col 3 line 20 and fig. 6b-6e… consider fig. 3 which shows context, user input, prompts, tasks per se, all of which are concatenated in some capacity to produce a final result i.e. the information thereof is merged to produce a result with tasks col 28 lines 1-58… prompts col 38 line 61 to col 39 line 30… context col 47 lines 30-67)
wherein the predicted sequence of dialog states comprises an assignment of values to slots for the additional utterance and for which the user has indicated a preference in dialog sequences corresponding to the input task… and (using a sequence to sequence model expressly, also note in fig. 6e prediction is explicitly the goal to help the user accomplish a task, the user initially utters a request or sentence as part of a slot pair e.g. “pizza” may be [SL:dish] as in at least col 14 line 58 to col 15 line 24 supported with col 42 line 45 to col 43 line 37, the user later utters again which may not be relevant e.g. “wait a minute”, all of these utterances become part of the conversation history as in at least col 2 line 61 to col 3 line 20 and fig. 6b-6e… prediction of states and assigning a value, where an input that does not have a value assigned is considered unlabeled per se without support for “unlabeled” in the specification, and understood by one of ordinary skill contextually as in a slot assigned a value or label col 24 line 39 to col 25 line 34 with col 33 lines 13-56 utilizing the dialog history as part of the machine learning process under BRI as concatenation is analogous to updated with, compared to, added to, or merged with, the outcomes from the user utterance interpretation… using intent extractions as in col 42 line 45 to col 43 line 7 semantic representations of user utterance in various states iterated back to the user using a sequence-to-sequence model for tasks/requests within the user utterance represented as key value pairs or slots in dialog with an agent as shown in fig. 5b-6d, expressly with an agent col 35 lines 3-21 in various contexts of requests e.g. food, travel, placing an order, reservations, purchases, etc. as in col 45 lines 3-56 with known dialog intent/contexts col 1 lines 19-39)
updating, based on the predicted sequence of dialog states … and the ground truth sequence of dialog states… (ground truth not taught by Liu) …the sequence-to-sequence language model; and providing, by the processor (the entire system is a sequence to sequence model to predict states col 42 line 45 to col 43 line 52… such as slot values, prediction of states and assigning a value, where an input that does not have a value assigned is considered unlabeled per se without support for “unlabeled” in the specification, and understood by one of ordinary skill contextually as in a slot assigned a value or label col 24 line 39 to col 25 line 34 with col 33 lines)
providing, by the processor, the trained sequence-to-sequence language model to a task processor via an application programming interface (API) to enable seamless integration and operation of the task-oriented dialog system with different APIs. (third party and multiple APIs can be called col 11 lines 15-67… an updated machine learning model is thereby provided using intent extractions as in col 42 line 45 to col 43 line 7 semantic representations of user utterance in various states iterated back to the user using a sequence-to-sequence model for tasks/requests within the user utterance represented as key value pairs or slots in dialog with an agent as shown in fig. 5b-6d, expressly with an agent col 35 lines 3-21 in various contexts of requests e.g. food, travel, placing an order, reservations, purchases, etc. as in col 45 lines 3-56 with known dialog intent/contexts col 1 lines 19-39)
However, while Liu teaches new tasks based on training data, using API’s, and schema in the context of goals per se, it fails to teach:
…thereby enabling the task-oriented dialog system to adapt to a new task domain or an unseen task domain without retraining for each task domain (Hanson 0244 and 0247)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Liu to incorporate the above claim limitations as taught by Hanson to allow for a simple substitution of one known element, such as trainable multi-domain intention extraction, for another, such as schema-based MultiWOZ or unseen schema concepts of domains, thereby improving the domain switch in Liu to handle not just new domains, but new domains that are not present in the training data per se; this allows for seen, unseen, and mixed (seen + unseen) services, handles services not seen during training, and uses a balanced BLEU score factor to increase a model's generalizability and train thereafter, applying an analogous shortest-path concept to that of Liu, now expressly with schema guidance.
However, while the combination teaches a suggestion of ground truth, such as observable or tangibly occurring data from a user, as well as seq-2-seq model learning, tasks, prompts, context, and unvalued (unlabeled) slot assignment, of which to handle inputs, it fails to teach:
a target comprising a ground truth sequence of dialog states for the input task (Chen 0072-0074 and analogously in a low level scope: seq-2-seq models 0031, prompts, context, and tasks fig. 3b 0036 0053, prediction states 0057-0058, and unlabeled per se 0039)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Liu in view of Hanson to incorporate the above claim limitations as taught by Chen to allow for a simple substitution of one known element, ground-truth states, for another, such as a suggestion of ground truth (observable or tangibly occurring data from a user), to obtain predictable results, causing an improvement of both accuracy and latency, improving the efficiency of large-scale unspoken text utterance learning and the matching between the selected subset of available unspoken text utterances and a target, which in turn reduces the computational resources required to exploit a large amount of non-domain-specific data, unseen data, or unlabeled/unvalued inputs.
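For illustration only, the claimed concatenation of (i) a labeled example prompt, (ii) the input task utterance, and (iii) the dialog history into a single sequence-to-sequence model input may be sketched as follows. All function names, delimiter tokens, and example strings below are hypothetical and do not appear in the record; this is not the applicant's or any reference's actual implementation:

```python
def build_model_input(prompt_utterance, slot_value_pairs, task_utterance, history):
    """Concatenate prompt, task, and context into one input sequence string."""
    # (i) example utterance labeled with its sequence of slot-value pairs
    prompt = prompt_utterance + " => " + "; ".join(
        f"{slot}={value}" for slot, value in slot_value_pairs
    )
    # (iii) history of user/agent turns, joined in order of occurrence
    context = " | ".join(f"{speaker}: {text}" for speaker, text in history)
    # full concatenated input for a sequence-to-sequence model
    return f"[PROMPT] {prompt} [TASK] {task_utterance} [CONTEXT] {context}"

example = build_model_input(
    "Book me a flight to Menlo Park",
    [("BOOK_FLIGHT_DESTINATION", "Menlo Park")],
    "Actually, make it for two people",
    [("user", "Book me a flight to Menlo Park"),
     ("agent", "When would you like to travel?")],
)
```

Under this sketch, the model would be trained to map such a concatenated string to a target sequence of dialog states (slot-value assignments), with the ground-truth target supplied as in Chen.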
Re claim 11, this claim recites a scope that is broader or narrower than claim 1 only by the general inclusion or omission of hardware (e.g. processor, memory, instructions), otherwise amounting to a virtually identical scope, and is therefore rejected for the reasons set forth above with respect to claim 1.
Re claim 20, this claim recites a scope that is broader or narrower than claim 1 only by the general inclusion or omission of hardware (e.g. processor, memory, instructions), otherwise amounting to a virtually identical scope, and is therefore rejected for the reasons set forth above with respect to claim 1.
Re claims 2 and 12, Liu teaches
2. The computer-implemented method of claim 1, wherein the input prompt comprises a sequence of utterances, and wherein the sequence of slot-value pairs indicates possible slots and values in the sequence of utterances. (intent extractions as in col 42 line 45 to col 43 line 7 semantic representations of user utterance in various states iterated back to the user using a sequence-to-sequence model for tasks/requests within the user utterance represented as key value pairs or slots in dialog with an agent as shown in fig. 5b-6d, expressly with an agent col 35 lines 3-21 in various contexts of requests e.g. food, travel, placing an order, reservations, purchases, etc. as in col 45 lines 3-56 with known dialog intent/contexts col 1 lines 19-39)
Re claims 3 and 13, Liu teaches
3. The computer-implemented method of claim 1, wherein the input prompt is a semantic representation of the schema descriptions associated with the task. (the agent responds reiterating the schema or goal of the user's intent e.g. travel reservations to and from at a time, using intent extractions as in col 42 line 45 to col 43 line 7 semantic representations of user utterance in various states iterated back to the user using a sequence-to-sequence model for tasks/requests within the user utterance represented as key value pairs or slots in dialog with an agent as shown in fig. 5b-6d, expressly with an agent col 35 lines 3-21 in various contexts of requests e.g. food, travel, placing an order, reservations, purchases, etc. as in col 45 lines 3-56 with known dialog intent/contexts col 1 lines 19-39)
Re claims 5 and 15, Liu teaches
5. The computer-implemented method of claim 1, further comprising: receiving, via an application programming interface (API) for a task processor, API schemata comprising schema descriptions associated with a particular task; and (API usage col 11 lines 15-64… using intent extractions as in col 42 line 45 to col 43 line 7 semantic representations of user utterance in various states iterated back to the user using a sequence-to-sequence model for tasks/requests within the user utterance represented as key value pairs or slots in dialog with an agent as shown in fig. 5b-6d, expressly with an agent col 35 lines 3-21 in various contexts of requests e.g. food, travel, placing an order, reservations, purchases, etc. as in col 45 lines 3-56 with known dialog intent/contexts col 1 lines 19-39)
applying the trained sequence-to-sequence language model to predict a particular sequence of dialog states for the particular task. (using intent extractions as in col 42 line 45 to col 43 line 7 semantic representations of user utterance in various states iterated back to the user using a sequence-to-sequence model for tasks/requests within the user utterance represented as key value pairs or slots in dialog with an agent as shown in fig. 5b-6d, expressly with an agent col 35 lines 3-21 in various contexts of requests e.g. food, travel, placing an order, reservations, purchases, etc. as in col 45 lines 3-56 with known dialog intent/contexts col 1 lines 19-39)
Re claims 6 and 16, Liu teaches
6. The computer-implemented method of claim 5, wherein the training of the sequence- to-sequence language model is based on a first type of task, and (various types or context of tasks as shown in fig. 5b-6d, expressly with an agent col 35 lines 3-21 in various contexts of requests e.g. food, travel, placing an order, reservations, purchases, etc. as in col 45 lines 3-56 with known dialog intent/contexts col 1 lines 19-39… the model is updated based on all inputs, the way a user speaks, a style of the user or user habits, history, etc.)
wherein the applying of the trained sequence-to-sequence language model is based on a second type of task different from the first type of task. (here we see that a travel reservation interaction produces various contexts such as preferred airline which can be independent of the actual scheduling of a departure and return time. Since all user inputs affect the model, any historical inputs will affect what the user says such as the system knowing who “Dave” is, i.e. Dave Sanchez, using fig. 5b-6e the user can include “Dave” as a travel companion when reserving a flight, not limited to mixing travel with a contact, various types or context of tasks as shown in fig. 5b-6d, expressly with an agent col 35 lines 3-21 in various contexts of requests e.g. food, travel, placing an order, reservations, purchases, etc. as in col 45 lines 3-56 with known dialog intent/contexts col 1 lines 19-39… the model is updated based on all inputs, the way a user speaks, a style of the user or user habits, history, etc.… using intent extractions as in col 42 line 45 to col 43 line 7 semantic representations of user utterance in various states iterated back to the user using a sequence-to-sequence model for tasks/requests within the user utterance represented as key value pairs or slots in dialog with an agent as shown in fig. 5b-6d)
Re claim 7, Liu teaches
7. The computer-implemented method of claim 6, wherein the first type of task corresponds to an airline reservation task, and wherein the second type of task corresponds to a blog post generation task. (utilizing the context of an article, social media post, analogous to a blog post per se as in col 1 lines 40-53, col 2 lines 5-38, col 8 lines 10-23, here we analogously see that information from a blog/article post can be utilized to book a flight for instance, that a travel reservation interaction produces various contexts such as preferred airline which can be independent of the actual scheduling of a departure and return time. Since all user inputs affect the model, any historical inputs will affect what the user says such as the system knowing who “Dave” is, i.e. Dave Sanchez, using fig. 5b-6e the user can include “Dave” as a travel companion when reserving a flight, not limited to mixing travel with a contact, various types or context of tasks as shown in fig. 5b-6d, expressly with an agent col 35 lines 3-21 in various contexts of requests e.g. food, travel, placing an order, reservations, purchases, etc. as in col 45 lines 3-56 with known dialog intent/contexts col 1 lines 19-39… the model is updated based on all inputs, the way a user speaks, a style of the user or user habits, history, etc.… using intent extractions as in col 42 line 45 to col 43 line 7 semantic representations of user utterance in various states iterated back to the user using a sequence-to-sequence model for tasks/requests within the user utterance represented as key value pairs or slots in dialog with an agent as shown in fig. 5b-6d)
Re claims 8 and 17, while Liu teaches dialog states, slots paired together, and maps or schemes of dialog intent from user inputs, the reference fails to teach:
8. The computer-implemented method of claim 1, wherein the training of the sequence- to-sequence language model is based on a Schema-guided Dialogue (SGD) dataset. (Hanson SGD and Multiwoz for task oriented dialogs)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Liu to incorporate the above claim limitations as taught by Hanson to allow for a simple substitution of a machine learning sequence-to-sequence model or SGD into an existing analogous large-scale dialog task model as in Liu to obtain predictable results, thereby updating the model with not only personal user interactions but on the order of 10,000 to 20,000 revolving updates of human-to-agent interactions in context with user intents, thus accounting for a larger combination of multiple intent inputs by a user, thereby reducing errors, model corruption with garbage data, and time spent by a user correcting mistakes.
Re claims 9 and 18, while Liu teaches dialog states, slots paired together, and maps or schemes of dialog intent from user inputs, the reference fails to teach:
9. The computer-implemented method of claim 1, wherein the training of the sequence- to-sequence language model is based on a MultiWOZ dataset. (Hanson SGD and Multiwoz for task-oriented dialogs 0210)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Liu to incorporate the above claim limitations as taught by Hanson to allow for a simple substitution of a machine learning sequence-to-sequence model or SGD into an existing analogous large-scale dialog task model as in Liu to obtain predictable results, thereby updating the model with not only personal user interactions but on the order of 10,000 to 20,000 revolving updates of human-to-agent interactions in context with user intents, thus accounting for a larger combination of multiple intent inputs by a user, thereby reducing errors, model corruption with garbage data, and time spent by a user correcting mistakes.
Re claims 10 and 19, while Liu teaches dialog states, slots paired together, and maps or schemes of dialog intent from user inputs, the reference fails to teach:
10. The computer-implemented method of claim 9, further comprising: applying a pre-processing script to the MultiWOZ dataset to correct one or more annotation errors. (Hanson using SGD and Multiwoz for task-oriented dialogs 0210, and for annotation handling including inevitable annotations error 0242)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Liu to incorporate the above claim limitations as taught by Hanson to allow for a simple substitution of a machine learning sequence-to-sequence model or SGD into an existing analogous large-scale dialog task model as in Liu to obtain predictable results, thereby updating the model with not only personal user interactions but on the order of 10,000 to 20,000 revolving updates of human-to-agent interactions in context with user intents, thus accounting for a larger combination of multiple intent inputs by a user, thereby reducing annotation errors where contradictions may still be correct i.e. false negatives/positives, model corruption with garbage data, and time spent by a user correcting mistakes.
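As a purely illustrative sketch of what such an annotation-correction pre-processing pass might look like (the slot-name and value mappings below are invented for illustration and are not taken from Hanson, the claims, or any actual MultiWOZ correction script):

```python
# Hypothetical correction tables: a mislabeled slot name and garbled slot values.
CORRECTIONS = {
    ("hotel", "price"): ("hotel", "pricerange"),  # normalize slot name
}
VALUE_FIXES = {
    "cen": "centre",              # truncated value
    "cheap|moderate": "cheap",    # ambiguous double annotation
}

def clean_state(state):
    """Return a dialog-state dict with slot names and values normalized."""
    cleaned = {}
    for (domain, slot), value in state.items():
        # remap known-bad (domain, slot) keys, then known-bad values
        domain, slot = CORRECTIONS.get((domain, slot), (domain, slot))
        cleaned[(domain, slot)] = VALUE_FIXES.get(value, value)
    return cleaned
```

Applying `clean_state` to each annotated turn before training would correct annotation errors of the kind Hanson acknowledges as inevitable (0242), without altering correctly annotated states.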
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
US 20230120940 A1 to Qiu; Liang et al., which explains dialog states and MultiWOZ.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAEL COLUCCI whose telephone number is (571)270-1847. The examiner can normally be reached on M-F 9 AM - 7 PM.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Flanders can be reached at (571)272-7516. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MICHAEL COLUCCI/Primary Examiner, Art Unit 2655 (571)-270-1847
Examiner FAX: (571)-270-2847
Michael.Colucci@uspto.gov