DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This Office action is responsive to communication(s): original application filed on 05/31/2023; said application claims a priority filing date of 02/28/2023. Claims 1-20 are pending. Claims 1, 9, and 19 are independent.
Drawings
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(5) because they do not include the following reference sign(s) mentioned in the description: 700 in ¶ [0107]. Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(5) because they include the following reference character(s) not mentioned in the description: 705 and 735 in FIG. 7. Corrected drawing sheets in compliance with 37 CFR 1.121(d), or amendment to the specification to add the reference character(s) in the description in compliance with 37 CFR 1.121(b) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(4) because (1) reference character “106” has been used to designate both "integration manager" in FIG. 1 and ¶¶ [0017]-[0018], [0030], [0035], [0041], [0047], [0049], [0051]-[0053], [0057], [0072], [0074], [0079], and [0082] and "data store" in ¶¶ [0033], [0044], [0052]-[0053], and [0082]; (2) reference character “730” has been used to designate both "PERIPHERAL DEVICE PORT" in FIG. 7 and "on-board camera" in ¶ [0110]; (3) reference character “821” has been used to designate both "MACHINE LEARNING MODEL" in FIG. 8 and "embedding object memory insertion engine" in ¶ [0114]; (4) reference character “822” has been used to designate both "DIRECTORY SERVICES" in FIG. 8 and "embedding object memory retrieval engine" in ¶ [0114]; and (5) reference character “824” has been used to designate both "WEB PORTAL" in FIG. 8 and "directory service" in ¶ [0113]. Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(4) because (1) reference characters "112" in FIG. 1 and ¶¶ [0017]-[0018], [0030], [0034], [0041], [0046]-[0047], [0049], [0051], [0053], [0062], [0064], [0068], [0074], [0079], and [0082] and "106" in ¶¶ [0033], [0044], [0052]-[0053], and [0082] have both been used to designate "data store"; (2) reference characters "822" in FIG. 8 and "824" in ¶ [0113] have both been used to designate "directory service"; and (3) reference characters "824" in FIG. 8 and "825" in ¶ [0113] have both been used to designate "web portal". Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.
Specification
The disclosure is objected to because of the following informalities:
in ¶ [0019], "… An interactive element is an aspect of the of the interactive environment 140 …" appears to be "… An interactive element is an aspect of the interactive environment 140 …";
in ¶ [0020], "… in a gaming environment a user asking a NL question to an NPC with …" appears to be "… in a gaming environment a user asking a natural language (NL) question to a non-player character (NPC) with …" (see also ¶ [0022] for indicating what is "NPC");
in ¶ [0022], it is unclear what the term "MMORPG" stands for in "… an interactive environment 140 may be a gaming environment such as a video game, online games, MMORPG, a virtual reality gaming environment, and/or other a game-like experience …" (NOTE: the term "MMORPG" is mentioned again in ¶ [0116], but it is still not defined there);
in ¶¶ [0033], [0044], [0052]-[0053], and [0082], "… data store 106 …" appears to be "… data store 112 …" (see also Drawing Objections);
in ¶ [0040], "… a NPC in a video game …" appears to be "… an NPC in a video game …";
in ¶ [0045], "… utilizes it to determine one or more a intent objectives …" appears to be "… utilizes it to determine one or more intent objectives …";
in ¶ [0046], "… refined to determine a one or more intent objectives for the input" appears to be "… refined to determine one or more intent objectives for the input";
in ¶ [0048], "… output may be one or more of a text file, an audio file, an image, a video, a NL output …" appears to be "… output may be one or more of a text file, an audio file, an image, a video, NL output …" or "… output may be one or more of a text file, an audio file, an image, a video, an NL output …" (see also ¶¶ [0021] and [0037]);
in ¶ [0072], "… on a computing device (e.g., user device 102 and/or developer device 105) …" appears to be "… on a computing device (e.g., user device 102 and/or developer device 108) …".
Appropriate correction is required.
The use of the terms "Bluetooth" and "Wi-Fi" in ¶ [0069], which are trade names or marks used in commerce, has been noted in this application. Each term should be accompanied by the generic terminology; furthermore, each term should be capitalized wherever it appears or, where appropriate, include a proper symbol indicating use in commerce such as ™, SM, or ® following the term.
Although the use of trade names and marks used in commerce (i.e., trademarks, service marks, certification marks, and collective marks) is permissible in patent applications, the proprietary nature of the marks should be respected and every effort made to prevent their use in any manner which might adversely affect their validity as commercial marks.
Claim Objections
Claims 1-6, 9-15, and 17-19 are objected to because of the following informalities:
in Claim 1, lines 12-13; Claim 9, lines 7-8; and Claim 19, lines 8-9, "… a intent objective based on one or more of the input, specific context and one or more environment guidelines …" appears to be "… an intent objective based on one or more of the input, the specific context and the one or more environment guidelines …" (see also 112(b) rejection to Claims 1, 9, and 19);
in Claim 1, lines 18-19; Claim 9, lines 13-14; and Claim 19, lines 14-15, "… the model output for responsiveness to the input and the environment guidelines …" appears to be "… the model output for responsiveness to the input and the one or more environment guidelines …" (see also 112(b) rejection to Claims 1, 9, and 19);
in Claim 2, lines 8-9; and Claim 10, lines 8-9, "… a second intent objective based on one or more of the subsequent input, specific context and one or more environment guidelines …" appears to be "… a second intent objective based on one or more of the subsequent input, the specific context and the one or more environment guidelines …" (see also 112(b) rejection to Claims 2 and 10);
in Claim 3, lines 1-3; and Claim 11, lines 1-3, "… wherein generate/generating a prompt further comprises … combine/combining the one or more prompt templates into a prompt" appears to be "… wherein generate/generating the prompt further comprises … combine/combining the one or more prompt templates into the prompt";
in Claim 4, lines 1-5; and Claim 12, lines 1-5, "… wherein associate/associating one or more prompt templates further comprises … identify/identifying one or more prompt templates that are semantically associated with the intent objective …" appears to be "… wherein associate/associating the one or more prompt templates further comprises … identify/identifying the one or more prompt templates that are semantically associated with the intent objective …";
in Claim 5, lines 1-5; and Claim 13, lines 1-5, "… wherein generate/generating a/an intent objective further comprises … an embedding for one or more of the input, specific context, and environment guidelines… is semantically associated with the input, specific context, and environment guidelines based on …" appears to be "… wherein generate/generating the intent objective further comprises … an embedding for one or more of the input, the specific context, and the one or more environment guidelines… is semantically associated with the input, the specific context, and the one or more environment guidelines based on …" (see also 112(b) rejection to Claims 5 and 13);
in Claim 6, lines 1-8; and Claim 14, lines 1-8, "… wherein evaluate/evaluating the model output for responsiveness further comprises … for evaluating model output; generate/generating one or more confidence scores for one or more components of the model output, wherein the confidence score measures responsiveness to the input and satisfying the environment guidelines … compare/comparing the one or more confidence scores for the one or more components of the output against …" appears to be "… wherein evaluate/evaluating the model output for the responsiveness further comprises … for evaluating the model output; generate/generating one or more confidence scores for one or more components of the model output, wherein the one or more confidence scores measure the responsiveness to the input and satisfying the one or more environment guidelines … compare/comparing the one or more confidence scores for the one or more components of the model output against …" (see also 112(b) rejection to Claims 6 and 14);
in Claim 15, line 2, "… storing one or more of the input, intent objective, the prompt, and the model output" appears to be "… storing one or more of the input, the intent objective, the prompt, and the model output";
in Claim 17, line 1, "… wherein an interactive element comprises a non-player character (NPC) …" appears to be "… wherein the interactive element comprises a non-player character (NPC) …";
in Claim 18, lines 1-2, "… wherein a gaming environment comprises a video game, online game …" appears to be "… wherein the gaming environment comprises a video game, an online game …".
Appropriate correction is required.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claims 1, 9, and 19 recite the limitation "… receive/receiving one or more environment guidelines; associate/associating (…) the input with one or more environment guidelines that provide systemic context about the interactive/gaming environment; determine/determining (…) a intent objective based on one or more of the input, specific context and one or more environment guidelines … evaluate/evaluating (…) the model output for responsiveness to the input and the environment guidelines …" in lines 9-19, 4-14, and 5-15, respectively, which renders these claims indefinite because it is unclear which instance of "one or more environment guidelines" ("one or more environment guidelines" received or "one or more environment guidelines that provide systemic context about the interactive/gaming environment"?) is referred to by "the one or more environment guidelines" and "the environment guidelines" (see also Claim Objections to Claims 1, 9, and 19). Clarification is required.
Claims 2-8, 10-18, and 20 are rejected for fully incorporating the deficiencies of their respective base claims.
Claims 2 and 10 recite the limitation "… monitor(ing) the interactive environment for a subsequent input based on … when there is a subsequent input, analyze/analyzing …" in lines 2-4, which rendering these claims indefinite because it is unclear whether these two instances of "a subsequent input" are the same or different. Clarification is required.
Claims 2 and 10 recite the limitation "… analyze/analyzing (…) the interactive/gaming environment for a specific context based on the subsequent input … determine/determining a second intent objective based on one or more of the subsequent input, specific context and …" in lines 4-9, which renders these claims indefinite because "… analyze/analyzing (…) the interactive/gaming environment for a specific context based on the input …" is also recited in their respective base claims, and (1) it is unclear whether "a specific context" recited here is the same as or different from "specific context" recited in their respective base claims; (2) if they are different, it is unclear which "specific context" (recited in their respective base claims or here?) is referred to by "the specific context" recited here. Clarification is required.
Claims 2 and 10 recite the limitation "… associate/associating the subsequent input with one or more environment guidelines that provide systemic context about the interactive/gaming environment; determine/determining a second intent objective based on one or more of the subsequent input, specific context and one or more environment guidelines …" in lines 6-9, which rendering these claims indefinite because "… receive/receiving one or more environment guidelines; associate/associating (…) the input with one or more environment guidelines that provide systemic context about the interactive/gaming environment …" is also recited in their respective based claim, and it is unclear (1) whether "system context" recited here is the same as or different to "system context" recited in their respective based claim; (2) whether "one or more environment guidelines that provide systemic context about the interactive/gaming environment" recited here is the same as or different to "one or more environment guidelines that provide systemic context about the interactive/gaming environment" recited in their respective based claim; and (3) which instance of "one or more environment guidelines" ("one or more environment guidelines" received, "one or more environment guidelines" associated with "the input", or "one or more environment guidelines" associated with "the subsequent input"?) is referred by "the one or more environment guidelines" recited here (see also Claim Objections to Claims 2 and 10). Clarification is required.
Claims 2 and 10 recite the limitation "… determine/determining a second intent objective based on one or more of the subsequent input … generate/generating a second prompt for the generative machine learning model based on the intent objective" in lines 8-11, which rendering these claims indefinite because "… determine/determining (…) a intent objective based on one or more of the input … generate/generating (…) a prompt for a generative machine learning model based on the intent objective …" is also recited in their respective based claim, and it is unclear (1) how different prompts ("a prompt" and "a second prompt") can be generated based on the same "intent objective"; and (2) why "a second intent objective" is determined without being utilized to generate "a second prompt". For examination purpose, "… generate/generating a second prompt for the generative machine learning model based on the second intent objective" is considered.
Claims 5 and 13 recite the limitation "… an embedding for one or more of the input, specific context, and environment guidelines … is semantically associated with the input, specific context, and environment guidelines …" in lines 2-5, which rendering these claims indefinite because "… receive/receiving one or more environment guidelines; associate/associating (…) the input with one or more environment guidelines that provide systemic context about the interactive/gaming environment …" is also recited in their respective based claim, and it is unclear which instance of "one or more environment guidelines" recited in their respective based claim is referred by "the environment guidelines" recited here (see also Claim Objections to Claims 5 and 13). Clarification is required.
Claims 5 and 13 recite the limitation "… identify/identifying a intent objective that is semantically associated with the input, specific context, and environment guidelines based on the intent objective" in lines 4-5, which rendering these claims indefinite because "… determine/determining (…) a intent objective based on one or more of the input, specific context and one or more environment guidelines …" is also recited in their respective based claim, and (1) it is unclear whether "a intent objective that is semantically associated with the input, specific context, and environment guidelines" recited here is the same as or different to "a intent objective" determined based on "one or more of the input, specific context and one or more environment guidelines" recited in their respective based claim; (2) if they are different, it is unclear which instance of "intent objective" ("a intent objective that is semantically associated with the input, specific context, and environment guidelines" recited here or "a intent objective" determined based on "one or more of the input, specific context and one or more environment guidelines" recited in their respective based claim?) is referred by "the intent objective" to be based upon when identifying "a intent objective that is semantically associated with the input, specific context, and environment guidelines"; and (3) if they are the same, how "a intent objective that is semantically associated with the input, specific context, and environment guidelines" can be identified based upon the same "intent objective" (i.e., how unknow data can be identified/determined based on the same unknown data). Clarification is required.
Claims 6 and 14 recite the limitation "... satisfying the environment guidelines based on one or more metrics ..." in lines 5-6, which renders these claims indefinite because "… determine/determining (…) a intent objective based on one or more of the input, specific context and one or more environment guidelines …" is also recited in their respective base claims, and it is unclear which instance of "one or more environment guidelines" recited in their respective base claims is referred to by "the environment guidelines" recited here (see also Claim Objections to Claims 6 and 14). Clarification is required.
Claim 7 recites the limitation "the one or more intent objectives" in line 2. There is insufficient antecedent basis for this limitation in the claim. Clarification is required.
Claim 10 recites the limitation "the interactive environment" in line 2. There is insufficient antecedent basis for this limitation in the claim. For examination purposes, "the gaming environment" is considered.
Claim 18 recites the limitation "… wherein a gaming environment comprises … MMORPG …" in lines 1-2, which renders the claim indefinite because it is unclear what the term "MMORPG" stands for; the term is also not defined in ¶¶ [0022] and [0116] of the specification, where it is cited (see also Specification Objections). Clarification is required.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Independent Claims 1, 9, and 19
Step 1: Claim 1 is a system claim, Claim 9 is a process claim, and Claim 19 is a computer storage media (excluding carrier wave or other propagated or modulated data signal described in ¶ [0104]) claim. These claims fall within at least one of the four categories of patent eligible subject matter.
Step 2A Prong 1: The claim(s) recite(s) "analyze/analyzing (…) the interactive/gaming environment for a specific context based on the input", "associate/associating (…) the input with one or more environment guidelines that provide systemic context about the interactive/gaming environment", "determine/determining (…) a intent objective based on one or more of the input, specific context and one or more environment guidelines", "generate/generating (…) a prompt for a generative model based on the intent objective", "evaluate/evaluating (…) the model output for responsiveness to the input and the environment guidelines", and "when the model output is responsive, modify(ing) the interactive element of the interactive/gaming environment based on the model output" which can be reasonably considered as mental processes (i.e., which "can be performed in the human mind, or by a human using a pen and paper") or mathematical concepts/algorithms/calculations.
Step 2A Prong 2: This judicial exception is not integrated into a practical application because the claim(s) recite(s) additional elements/limitations of "a system" (Claim 1), "at least one processor" (Claim 1), "memory" (Claim 1), "a computer storage media" (Claim 19), "a processor" (Claim 19), "receive/receiving an input (…) to modify an interactive element of an interactive/a gaming environment", "receive/receiving one or more environment guidelines", "machine learning", and "execute/executing the generative machine learning model with the prompt to produce a model output" which only amount to "apply it" with the use of generic computer components or insignificant extra solution activity. None of the additional elements/limitations, taken alone or in combination, integrate the abstract idea into a practical application.
Step 2B: The claim(s) does/do not include additional limitations/elements that are sufficient to amount to significantly more than the judicial exception because (a) the additional limitations/elements of "receive/receiving an input (…) to modify an interactive element of an interactive/a gaming environment" and "receive/receiving one or more environment guidelines" are well-understood, routine and conventional (WURC) activity similar to "receiving or transmitting data over a network" (see MPEP 2106.05(d), "Receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information); buySAFE, Inc. v. Google, Inc., 765 F.3d 1350, 1355, 112 USPQ2d 1093, 1096 (Fed. Cir. 2014) (computer receives and sends information over a network)"); and (b) the additional limitations/elements of "machine learning" and "execute/executing the generative machine learning model with the prompt to produce a model output" are also well-understood, routine and conventional (WURC) activity similar to "performing repetitive calculations" (see MPEP 2106.05(d), "Performing repetitive calculations, Flook, 437 U.S. at 594, 198 USPQ2d at 199 (recomputing or readjusting alarm limit values)"). Thus, none of the additional limitations/elements, taken either alone or combined, amount to significantly more than the abstract idea.
Claims 2 and 10
Step 1: Claim 2 is a system claim and Claim 10 is a process claim. These claims fall within at least one of the four categories of patent eligible subject matter.
Step 2A Prong 1: The claim(s) further recite(s) "when there is a subsequent input, analyze/analyzing (…) the interactive/gaming environment for a specific context based on the subsequent input", "associate/associating the subsequent input with one or more environment guidelines that provide systemic context about the interactive/gaming environment", "determine/determining a second intent objective based on one or more of the subsequent input, specific context and one or more environment guidelines", and "generate/generating a second prompt for the generative machine learning model based on the intent objective" which can be reasonably considered as mental processes (i.e., which "can be performed in the human mind, or by a human using a pen and paper") or mathematical concepts/algorithms/calculations.
Step 2A Prong 2: This judicial exception is not integrated into a practical application because the claim(s) further recite(s) additional element/limitation of "monitor/monitoring the interactive environment for a subsequent input based on the provided model output" which only amounts to "apply it" with the use of generic computer components or insignificant extra solution activity. None of the additional elements/limitations, taken alone or in combination, integrate the abstract idea into a practical application.
Step 2B: The claim(s) does/do not include additional limitations/elements that are sufficient to amount to significantly more than the judicial exception because the additional limitation/element of "monitor/monitoring the interactive environment for a subsequent input based on the provided model output" is also well-understood, routine and conventional (WURC) activity similar to "receiving or transmitting data over a network" (see MPEP 2106.05(d), "Receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information); buySAFE, Inc. v. Google, Inc., 765 F.3d 1350, 1355, 112 USPQ2d 1093, 1096 (Fed. Cir. 2014) (computer receives and sends information over a network)"). Thus, none of the additional limitations/elements, taken either alone or combined, amount to significantly more than the abstract idea.
Claims 3 and 11
Step 1: Claim 3 is a system claim and Claim 11 is a process claim. These claims fall within at least one of the four categories of patent eligible subject matter.
Step 2A Prong 1: The claim(s) further recite(s) "associate/associating one or more prompt templates with the intent objective" and "combine/combining the one or more prompt templates into a prompt" which can be reasonably considered as mental processes (i.e., which "can be performed in the human mind, or by a human using a pen and paper") or mathematical concepts/algorithms/calculations.
Step 2A Prong 2: This judicial exception is not integrated into a practical application because the claim(s) does/do not further recite additional elements/limitations.
Step 2B: The claim(s) does/do not further include additional limitations/elements that are sufficient to amount to significantly more than the judicial exception. Thus, none of the additional limitations/elements, taken either alone or combined, amount to significantly more than the abstract idea.
Claims 4 and 12
Step 1: Claim 4 is a system claim and Claim 12 is a process claim. These claims fall within at least one of the four categories of patent eligible subject matter.
Step 2A Prong 1: The claim(s) further recite(s) "generate/generating an embedding for the intent objective" and "identify(ing) one or more prompt templates that are semantically associated with the intent objective based on the embedding" which can be reasonably considered as mental processes (i.e., which "can be performed in the human mind, or by a human using a pen and paper") or mathematical concepts/algorithms/calculations.
Step 2A Prong 2: This judicial exception is not integrated into a practical application because the claim(s) does/do not further recite additional elements/limitations.
Step 2B: The claim(s) does/do not further include additional limitations/elements that are sufficient to amount to significantly more than the judicial exception. Thus, none of the additional limitations/elements, taken either alone or combined, amount to significantly more than the abstract idea.
Claims 5 and 13
Step 1: Claim 5 is a system claim and Claim 13 is a process claim. These claims fall within at least one of the four categories of patent eligible subject matter.
Step 2A Prong 1: The claim(s) further recite(s) "generate/generating an embedding for one or more of the input, specific context, and environment guidelines" and "identify(ing) a intent objective that is semantically associated with the input, specific context, and environment guidelines based on the intent objective" which can be reasonably considered as mental processes (i.e., which "can be performed in the human mind, or by a human using a pen and paper") or mathematical concepts/algorithms/calculations.
Step 2A Prong 2: This judicial exception is not integrated into a practical application because the claim(s) does/do not further recite additional elements/limitations.
Step 2B: The claim(s) does/do not further include additional limitations/elements that are sufficient to amount to significantly more than the judicial exception. Thus, none of the additional limitations/elements, taken either alone or combined, amount to significantly more than the abstract idea.
Claims 6 and 14
Step 1: Claim 6 is a system claim and Claim 14 is a process claim. These claims fall within at least one of the four categories of patent eligible subject matter.
Step 2A Prong 1: The claim(s) further recite(s) "generate/generating one or more confidence scores for one or more components of the model output, wherein the confidence score measures responsiveness to the input and satisfying the environment guidelines based on one or more metrics" and "compare/comparing the one or more confidence scores for the one or more components of the output against the confidence threshold value" which can be reasonably considered as mental processes (i.e., which "can be performed in the human mind, or by a human using a pen and paper") or mathematical concepts/algorithms/calculations.
Step 2A Prong 2: This judicial exception is not integrated into a practical application because the claim(s) further recite(s) additional element/limitation of "receive/receiving a confidence threshold value for evaluating model output" which only amounts to "apply it" with the use of generic computer components or insignificant extra solution activity. None of the additional elements/limitations, taken alone or in combination, integrate the abstract idea into a practical application.
Step 2B: The claim(s) does/do not include additional limitations/elements that are sufficient to amount to significantly more than the judicial exception because the additional limitation/element of "receive/receiving a confidence threshold value for evaluating model output" is also well-understood, routine and conventional (WURC) activity similar to "receiving or transmitting data over a network" (see MPEP 2106.05(d), "Receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information); buySAFE, Inc. v. Google, Inc., 765 F.3d 1350, 1355, 112 USPQ2d 1093, 1096 (Fed. Cir. 2014) (computer receives and sends information over a network)"). Thus, none of the additional limitations/elements, taken either alone or combined, amount to significantly more than the abstract idea.
Claims 7 and 15
Step 1: Claim 7 is a system claim and Claim 15 is a process claim. These claims fall within at least one of the four categories of patent eligible subject matter.
Step 2A Prong 1: The claim(s) does/do not further recite elements/limitations which can be reasonably considered as mental processes (i.e., which "can be performed in the human mind, or by a human using a pen and paper") or mathematical concepts/algorithms/calculations.
Step 2A Prong 2: This judicial exception is not integrated into a practical application because the claim(s) further recite(s) additional element/limitation of "store/storing one or more of the input, the one or more intent objectives/intent objective, the prompt, and the model output" which only amounts to "apply it" with the use of generic computer components or insignificant extra solution activity. None of the additional elements/limitations, taken alone or in combination, integrate the abstract idea into a practical application.
Step 2B: The claim(s) does/do not include additional limitations/elements that are sufficient to amount to significantly more than the judicial exception because the additional limitation/element of "store/storing one or more of the input, the one or more intent objectives/intent objective, the prompt, and the model output" is also well-understood, routine and conventional (WURC) activity similar to "storing and retrieving information in memory" (see MPEP 2106.05(d), "Storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092-93"). Thus, none of the additional limitations/elements, taken either alone or combined, amount to significantly more than the abstract idea.
Claims 8, 16, and 20
Step 1: Claim 8 is a system claim, Claim 16 is a process claim, and Claim 20 is a computer storage media (excluding carrier wave or other propagated or modulated data signal described in ¶ [0104]) claim. These claims fall within at least one of the four categories of patent eligible subject matter.
Step 2A Prong 1: The claim(s) further recite(s) "when the model output is not responsive, generate/generating a new prompt for the generative model based on the intent objective" which can be reasonably considered as mental processes (i.e., which "can be performed in the human mind, or by a human using a pen and paper") or mathematical concepts/algorithms/calculations.
Step 2A Prong 2: This judicial exception is not integrated into a practical application because the claim(s) further recite(s) additional elements/limitations of "machine learning" and "execute/executing the generative machine learning model with the new prompt to produce a new model output" which only amount to "apply it" with the use of generic computer components or insignificant extra solution activity. None of the additional elements/limitations, taken alone or in combination, integrate the abstract idea into a practical application.
Step 2B: The claim(s) does/do not include additional limitations/elements that are sufficient to amount to significantly more than the judicial exception because the additional limitations/elements of "machine learning" and "execute/executing the generative machine learning model with the new prompt to produce a new model output" are also well-understood, routine and conventional (WURC) activity similar to "performing repetitive calculations" (see MPEP 2106.05(d), "Performing repetitive calculations, Flook, 437 U.S. at 594, 198 USPQ2d at 199 (recomputing or readjusting alarm limit values)"). Thus, none of the additional limitations/elements, taken either alone or combined, amount to significantly more than the abstract idea.
Claim 17
Step 1: Claim 17 is a process claim which falls within at least one of the four categories of patent eligible subject matter.
Step 2A Prong 1: The claim(s) does/do not further recite elements/limitations which can be reasonably considered as mental processes (i.e., which "can be performed in the human mind, or by a human using a pen and paper") or mathematical concepts/algorithms/calculations.
Step 2A Prong 2: This judicial exception is not integrated into a practical application because the claim(s) further recite(s) additional element/limitation of "receiving an input to modify an interactive element of a gaming environment, wherein an interactive element comprises a non-player character (NPC), animated infographic, video, image, quiz, game object, and other aspects of the gaming environment which a user may be able to access and interact with" which only amounts to "apply it" with the use of generic computer components or insignificant extra solution activity. None of the additional elements/limitations, taken alone or in combination, integrate the abstract idea into a practical application.
Step 2B: The claim(s) does/do not include additional limitations/elements that are sufficient to amount to significantly more than the judicial exception because the additional limitation/element of "receiving an input to modify an interactive element of a gaming environment, wherein an interactive element comprises a non-player character (NPC), animated infographic, video, image, quiz, game object, and other aspects of the gaming environment which a user may be able to access and interact with" is also well-understood, routine and conventional (WURC) activity similar to "receiving or transmitting data over a network" (see MPEP 2106.05(d), "Receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information); buySAFE, Inc. v. Google, Inc., 765 F.3d 1350, 1355, 112 USPQ2d 1093, 1096 (Fed. Cir. 2014) (computer receives and sends information over a network)"). Thus, none of the additional limitations/elements, taken either alone or combined, amount to significantly more than the abstract idea.
Claim 18
Step 1: Claim 18 is a process claim which falls within at least one of the four categories of patent eligible subject matter.
Step 2A Prong 1: The claim(s) does/do not further recite elements/limitations which can be reasonably considered as mental processes (i.e., which "can be performed in the human mind, or by a human using a pen and paper") or mathematical concepts/algorithms/calculations.
Step 2A Prong 2: This judicial exception is not integrated into a practical application because the claim(s) further recite(s) additional element/limitation of "receiving an input to modify an interactive element of a gaming environment, wherein a gaming environment comprises a video game, online game, MMORPG, and a virtual reality environment" which only amounts to "apply it" with the use of generic computer components or insignificant extra solution activity. None of the additional elements/limitations, taken alone or in combination, integrate the abstract idea into a practical application.
Step 2B: The claim(s) does/do not include additional limitations/elements that are sufficient to amount to significantly more than the judicial exception because the additional limitation/element of "receiving an input to modify an interactive element of a gaming environment, wherein a gaming environment comprises a video game, online game, MMORPG, and a virtual reality environment" is also well-understood, routine and conventional (WURC) activity similar to "receiving or transmitting data over a network" (see MPEP 2106.05(d), "Receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information); buySAFE, Inc. v. Google, Inc., 765 F.3d 1350, 1355, 112 USPQ2d 1093, 1096 (Fed. Cir. 2014) (computer receives and sends information over a network)"). Thus, none of the additional limitations/elements, taken either alone or combined, amount to significantly more than the abstract idea.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1-3, 7-11, 15-16, and 19-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by GOSLIN et al. (US 2021/0081498 A1, pub. date: 03/18/2021), hereinafter GOSLIN.
Independent Claims 1, 9, and 19
GOSLIN discloses a system (GOSLIN, ¶ [0034] with 115 in FIG. 3: AI System 115) comprising: at least one processor (GOSLIN, ¶ [0034] with 310 in FIG. 3: Processor 310); and memory storing instructions (GOSLIN, ¶ [0034] with 315 in FIG. 3: programming instructions stored in Memory 315) that, when executed by the at least one processor, cause the system to perform a set of operations (GOSLIN, ¶ [0034]: Processor 310 retrieves and executes programming instructions stored in Memory 315 as well as stores and retrieves application data residing in Storage 320; ¶ [0062]: the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the block(s) of the flowchart illustrations or block diagrams), the set of operations comprising:
receive an input, by a director service, to modify an interactive element of an interactive environment; analyze, by the director service, the interactive environment for a specific context based on the input (GOSLIN, ¶¶ [0014]-[0015] and [0018]-[0019]: the AI systems that utilize machine learning to interact with users in a dynamic and immersive manner; the AI system utilizes a collection of machine learning (ML) models, each trained on specified context or scope; the AI system determines the context of a given input; dynamically select ML models as the context of the interaction shifts, in order to continue to provide deep conversation; the AI system acts as an intelligent character in a role-playing game; the AI system infers the context of the conversation based on input as the user interacts with the character; the AI system uses natural language processing (NLP) and/or natural language understanding (NLU) to attempt to identify role-playing scenario the user is partaking in; the AI system repeatedly determines the context for each input (which may include analyzing prior input), such that the AI system can respond to shifting contexts (e.g., if the user switches from a stealth methodology to a brute-force methodology); the input comprises natural language, and may include text and/or audio input; ¶¶ [0023]-[0025] and [0027] with FIG. 1: a User 105 provides Input 110 to the AI System 115; the Input 110 include natural language, and may include text (e.g., typed by the User 105) or audio speech data (e.g., recorded by a microphone); if the Input 110 is audio, the AI System 115 utilizes speech-to-text techniques, and processes the resulting text; the AI System 115 determines the context of the Input 110 based at least in part on prior user selection; the AI System 115 uses NLP and/or NLU (e.g., performed on audio, text, and/or combinations thereof) to determine some or all of the context of the Input 110; e.g., the user may select an objective, and the AI System 115 uses NLP to identify the means the user is pursuing, and/or to infer the character or role the user is playing; the User 105 may provide additional Input 110; the User 105 and AI System 115 can interact during the role-playing scenario until the User 105 quits, or until predefined criteria are met; ¶¶ [0028]-[0030] with FIG. 2: the user interacts with a scenario creator to select one or more aspects of the role-playing scenario they wish to use; the options available for a given selection can depend on one or more other selections; i.e., the selections for each of the Objectives 210, Roles 220, and/or Means 230 may have predefined relationships defining combinations that can be selected; ¶ [0042] with FIG. 3: the Context Component 340 may determine the context based at least in part on the original input provided by the user (e.g., the explicit selection(s) the user made in initiating the scenario); ¶ [0049] with 405 and 410 in FIG. 4 and FIG. 3: at block 405, where an Interactivity Application 330 receives user input; this input may include textual input, audio input, and the like; at block 410, the Interactivity Application 330 evaluates the input to determine the current context; ¶ [0057] with 605 and 610 in FIG. 6 and FIG. 3: at block 605, the Interactivity Application 330 receives a first input to an artificial intelligence (AI) system; at block 610, the Interactivity Application 330 determines a first context of the first input, wherein the first context indicates a first role-playing scenario);
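By way of illustration only (GOSLIN discloses no source code, and every identifier and keyword mapping below is hypothetical), the keyword-based context inference characterized above, with the user's explicit scenario selections as a fallback, might be sketched in Python along the following lines:

from dataclasses import dataclass

@dataclass(frozen=True)
class Context:
    objective: str
    role: str
    means: str

# Hypothetical keyword-to-scenario hints (not taken from GOSLIN).
KEYWORD_HINTS = {
    "sneak": Context("retrieve item", "thief", "stealth"),
    "fight": Context("retrieve item", "knight", "brute force"),
    "riddle": Context("retrieve item", "wizard", "intellect"),
}

def infer_context(user_text, prior=None):
    """Infer the role-playing context from keywords in the input; fall back
    to the prior context (e.g., the user's explicit scenario selections)."""
    text = user_text.lower()
    for keyword, ctx in KEYWORD_HINTS.items():
        if keyword in text:
            return ctx
    return prior

# e.g., infer_context("I sneak past the guards") yields the stealth context;
# unmatched input keeps whatever context the user selected up front.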
receive one or more environment guidelines; associate, by the director service, the input with one or more environment guidelines that provide systemic context about the interactive environment (GOSLIN, ¶¶ [0015]-[0018]: a roleplaying scenario is an interactive session that includes a role for the user to play, an objective for the user to pursue, and/or a means that the user should use to achieve the objective; the user plays a more specific role, such as a particular character from a movie or television show; the user is expected to understand the personality of the selected role, and must "stay in character" (e.g., by using an appropriate means) to achieve the objective; each role of the user is related or correlated to an expected or anticipated means or approach; e.g., an "intellect" means may correspond to a "wizard" character, while an "intimidation" means corresponds to a "knight" character; this system can also be used to evaluate input from the user; e.g., the fitness function can evaluate each line corresponding to the character from the movie, show, or other predefined script, and then evaluate the user input to determine a mathematical distance between the user's word choice and phrasing, as compared to the "canonical" phrases; based on this distance, the AI system can determine whether the user is accurately playing the role; the AI system may search for keywords in user input that have predefined associations with a role, means, and/or objective; some or all of the context may be known or provided to the AI system; e.g., the user can select one or more aspects of the scenario (e.g., the objective, the role, and/or the means), and the AI system can infer the remaining aspects; ¶¶ [0024] and [0036]-[0038]: each ML Model 120A-N was trained based on training data corresponding to a given context (e.g., a specific combination of role, means, and objective); the training data includes samples of input text and a corresponding output text for each sample of input text; e.g., the training data is collected from roleplaying sessions between users (e.g., as part of a development team), and/or from early test users; the input from each user can be used as either exemplar input or target output, depending on the particular model being trained ( e.g., depending on which role the AI system will be playing, and which role the user will be playing); a separate set of training data is used for each ML Model 120, where each set of training data is associated with a respective context (e.g., a role-playing scenario); each ML Model 120 is labeled or otherwise associated with the context that corresponds to the underlying training data);
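By way of illustration only, the "fitness function" behavior characterized above (scoring user input against a role's canonical phrases) might be sketched as follows; GOSLIN does not specify the distance measure, so word-overlap (Jaccard) similarity is assumed here, and all names are hypothetical:

def role_fitness(user_text, canonical_phrases):
    """Return the best word-overlap (Jaccard) similarity between the user's
    input and any canonical phrase for the role; a higher score suggests the
    user is staying closer to the character's expected phrasing."""
    words = set(user_text.lower().split())
    best = 0.0
    for phrase in canonical_phrases:
        ref = set(phrase.lower().split())
        union = words | ref
        if union:
            best = max(best, len(words & ref) / len(union))
    return best

# A hypothetical threshold on this score could implement the determination
# of whether the user is "accurately playing the role":
in_character = role_fitness("you shall not pass", ["you shall not pass"]) >= 0.5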
determine, by the director service, a intent objective based on one or more of the input, specific context and one or more environment guidelines; generate, by the director service, a prompt for a generative machine learning model based on the intent objective (GOSLIN, ¶¶ [0015]-[0019] and [0022]: the AI system can utilize techniques including keyword identification, sentiment analysis, intent evaluation, parsing to determine meaning, and the like; the AI system responds in part based on whether the determined means the user is relying on aligns with the expected means for the determined role the user is playing; each objective corresponds to a suggested or best means or role; in order to respond based on the alignment between the expected means and the actual means the user is utilizing, the AI system uses one or more fitness functions to ensure the output makes logical sense; the AI system similarly uses a fitness function to ensure that the output is authentic to a particular character (e.g., the character that the AI system is playing); the AI system utilizes NLP and/or NLU to determine the objective, role, and/or means the user is using for the role-playing scenario; once the context of a given input is determined, the AI system selects a corresponding ML model to process the input; the AI system includes a respective ML model trained for each roleplaying scenario (e.g., trained for each combination of objective, role, and means); once the scenario is identified, the AI system can select the corresponding ML model for evaluating the input; this can allow each model to be specialized with a constrained context, while allowing the AI system to dynamically shift between contexts by selecting other models; the AI system maintains context-specific weights for each ML model; given an input context, the AI system can probabilistically select a ML model to use based on the context-specific weight associated with each; the AI system can modify these context-specific weights based on user feedback, in order to better select ML models for future interactions; ¶¶ [0024]-[0025] with FIG. 1: the AI System 115 determines the context of the Input 110, in order to select an appropriate ML Model 120A-N; once the context is determined, the AI System 115 selects the corresponding ML Model 120; ¶¶ [0040]-[0046] with FIG. 3: the NLP Component 335 performs a variety of NLP processing such as sentiment analysis, keyword detection, parsing to determine intent or meaning, and the like; once the NLP Component 335 has determined the intent and/or sentiment, identified keywords, or performed any other NLP processing, some or all of the resulting data is passed as input to the ML Models 120; the Context Component 340 determines the current context of the user input in order to facilitate selection of an appropriate ML Model 120; the Context Component 340 infers the user's objective, role, and/or means based on identified keywords, or based on other results from the NLP Component 335; e.g., the Context Component 340 can determine that the objective is to retrieve an item, based on determining that predefined keywords relating to the item were included in one or more inputs received from the user; similarly, the Context Component 340 may determine that the user is roleplaying as a particular character or is using a particular means, based on keywords, and/or based on the sentiment or intent of the input(s), as identified by the NLP Component 335; the Context Component 340 can determine whether the user has shifted the role-playing scenario (such as by deciding to attempt a different means to achieve the objective); the Context Component 340 may identify this transition and select a different ML Model 120 (e.g., based on keywords in the current input, based on sentiment analysis on the input, based on intent analysis, and the like); in one embodiment, the Context Component 340 can allow this change and the role-playing scenario will be shifted accordingly (e.g., by selecting other ML Models 120); in another embodiment, the Context Component 340 can continue to output the previous context; e.g., the Context Component 340 can determine that the user is switching roles, but may nevertheless indicate, to the ML Component 345, that the role remains the same (e.g., because predefined rules indicate that the role cannot be changed mid-scenario); once the Context Component 340 has determined the current context (e.g., the objective, role, and means), the Context Component 340 provides an indication to the ML Component 345 reflecting these determinations; the ML Component 345 receives the current context from the Context Component 340, and selects one or more of the ML Models 120 based on this context; the ML Component 345 identifies and selects the ML Model 120 that is associated with a matching context; the ML Component 345 may select one or more other ML Models 120 (e.g., periodically, or in a probabilistic manner); e.g., the ML Models 120 are associated with context-specific weights, indicating a likelihood that each will be selected given a particular context; given a first context C, a first ML Model 120 (e.g., one trained on the same context C) may be associated with a relatively high weight, such that it will be selected frequently; similarly, a second ML Model 120 (trained on a different context) can be associated with a relatively lower weight, such that it is selected less frequently than the first; the weight of each ML Model 120 is determined based in part on the vector distance between the current context C and the respective context C' of each respective ML Model 120; the ML Component 345 can dynamically modify the context-specific weights of each ML Model 120 during use; ¶¶ [0050]-[0051] with 415 in FIG. 4 and FIG. 3: block 415, where the Interactivity Application 330 identifies and selects one or more ML Models 120 based on the determined context; the Interactivity Application 330 selects the ML Model 120 with a matching context; the Interactivity Application 330 probabilistically selects a model based on the determined context, the confidence in this determination, and the context-specific weights associated with each ML Model 120; the Interactivity Application 330 further refines the context-specific weights of each ML Model 120, and/or the internal weights of the previously-selected model, based on the current input; the Interactivity Application 330 performs one or more NLP operations on the input (e.g., keyword identification, sentiment analysis, intent determination, and the like), and processes the result with the ML Model 120; ¶ [0057] with 615 in FIG. 6 and FIG. 3: block 615, where the Interactivity Application 330 selects a first ML model of the plurality of ML models based on the determined first context, wherein the first ML model was trained based at least in part on the first role-playing scenario);
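By way of illustration only, the context-weighted, probabilistic model selection characterized above might be sketched as follows; GOSLIN states only that each weight is based in part on the vector distance between the current context and a model's context, so the exact formula below is an assumption, and all names are hypothetical:

import math
import random

def select_model(current_ctx, models):
    """Probabilistically select a model. 'models' is a list of
    (model, ctx_vector, weight) tuples; each model's chance of selection is
    proportional to its context-specific weight, scaled down by the Euclidean
    distance between the current context vector and the model's own."""
    scores = [w / (1.0 + math.dist(current_ctx, ctx)) for _, ctx, w in models]
    pick = random.uniform(0.0, sum(scores))
    acc = 0.0
    for (model, _, _), score in zip(models, scores):
        acc += score
        if pick <= acc:
            return model
    return models[-1][0]  # guard against floating-point rounding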
execute the generative machine learning model with the prompt to produce a model output (GOSLIN, ¶ [0014]: the AI system generates responses using one or more ML models that correspond to that context; ¶¶ [0023] and [0026] with FIG. 1: each ML Model 120 corresponds to a machine learning model that was trained to receive textual input and generate or select a corresponding output; this output includes dynamically generated text or audio; the output corresponds to text or audio that is selected from a predefined set of responses; once the AI System 115 has selected a ML Model 120, the Input 110 is provided as input in order to generate a corresponding Response 125; the selected ML Model 120 dynamically generates output text based on weights learned during the training phase; the ML Model 120 acts as a classifier to classify the input, and uses corresponding predefined text for that category as the output; ¶¶ [0038] and [0046]-[0047] with FIG. 3: use each ML Model 120 based on their specific contexts (e.g., based on matching the context of the current input with the ML Models 120) to generate deeper and more specific responses within each context; using a particular ML Model 120 to generate a response; in addition to receiving the current context from the Context Component 340, the ML Component 345 receives results of the NLP analysis from the NLP Component 335; once an ML Model 120 has been selected, the ML Component 345 provides this input to the selected model in order to generate an output; ¶ [0051] with 420 in FIG. 4 and FIG. 3: at block 420, the Interactivity Application 330 generates a response using the identified and selected ML Model 120; the ML Models 120 are trained to dynamically generate a response; ¶ [0057] with 620 in FIG. 6 and FIG. 3: at block 620, the Interactivity Application 330 generates a first output by processing the first input using the first ML model.);
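By way of illustration only, the two output modes characterized above (dynamic text generation, or classification of the input into a category with a predefined response) might be sketched as follows; generate() and classify() are hypothetical placeholders for the selected ML Model 120:

def respond(model, nlp_features, predefined_responses=None):
    """Produce output in either mode described above: dynamic generation,
    or classification into a category whose predefined response is returned."""
    if predefined_responses is None:
        return model.generate(nlp_features)    # dynamically generated text
    category = model.classify(nlp_features)    # classifier mode
    return predefined_responses[category]      # canned response for category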
evaluate, by the director service, the model output for responsiveness to the input and the environment guidelines (GOSLIN, ¶ [0020]: the AI system may periodically or randomly use a different ML model for a given context, and evaluate the user's response; i.e., for a given first context C, the AI system may determine to use an ML model trained on context C' to generate the response; the AI system can then analyze how the user responds, in order to determine whether the ML model corresponding to context C' should be used at least occasionally in the future, given context C; e.g., the system may determine whether the user liked the response, or appeared confused or frustrated; this evaluation can include receiving user feedback (e.g., a thumbs up or thumbs down, a score, and the like) and/or using NLP and/or NLU to analyze the user's response (e.g., to determine the sentiment); ¶ [0025]: if users rate the "flattery" response higher for entertainment value, the system can learn to use this model more often, given the context; ¶ [0033]: the system further utilizes techniques such as NLP to perform semantic analysis in order to determine whether the user enjoyed the scenario; ¶¶ [0040]-[0041] with FIG. 3: the NLP Component 335 may perform sentiment analysis to determine whether the user is enjoying the ongoing scenario; the ML Models 120 can be refined based on this analysis; evaluating results output by the NLP Component 335, which may include data relating to the most recent input, as well as from one or more prior inputs; ¶¶ [0046]-[0048] with FIG. 3: after using a particular ML Model 120 to generate a response, the ML Component 345 evaluates the user's next input in order to refine the context-specific weight(s) associated with the ML Model 120; e.g., if the user-response is positive (e.g., with a positive sentiment evaluation from the NLP Component 335), the ML Component 345 may increase the weight of the previously-selected ML Model 120, in order to increase the probability that it will be selected in the future, given the same context; similarly, if the user's subsequent response is negative, the ML Component 345 may reduce the weight of the previously-selected ML Model 120; the ML Model 120 is trained as a classifier that receives input (e.g., text, the results of one or more NLP processes, and the like) and selects from a large set of predefined responses; these responses may similarly include textual responses and/or pre-recorded audio; in order to determine the quality of a given output, the ML Component 345 can evaluate explicit ratings from the user, subsequent responses from the user, facial expressions or other non-verbal emotional cues (e.g., laughing) from the user, and the like; if the subsequent user input is positive, the ML Component 345 can refine the model to increase the probability that the previous output will be selected again, given the same input and/or context; similarly, if the subsequent input is negative, the ML Component 345 reduces the probability that the ML Model 120 will select the same response again, given the previous input/context; ¶ [0051]: the models are trained to classify the input in order to select from a predefined set of responses); and
when the model output is responsive, modify the interactive element of the interactive environment based on the model output (GOSLIN, ¶ [0021]: in order to improve responses and user-engagement, the learning system can provide differing responses to different users; the AI system can sometimes reach a better solution by randomly (or pseudo-randomly) changing the order of things, prioritizing something that was previously lower priority, and the like; this randomness enables the AI system to continue searching for better solutions; e.g., the system may be at a local maximum for a solution, but there can be one or more better global maxima available which can be discovered through occasional random selections; ¶ [0027] with FIG. 1: these Responses 125 are provided to the User 105; ¶¶ [0047]-[0048] with FIG. 3: the ML Model 120 dynamically generates an output, which may be textual, audio, text converted to audio using text-to-speech models, and the like; this output is then provided to the user; based on the subsequent user input, the ML Component 345 can modify or refine the selected ML Model 120; the Interactivity Application 330 can provide immersive role-playing to the user; ¶¶ [0051]-[0052] with 425 in FIG. 4 and FIG. 3: at block 425, the response is returned to the user (e.g., by displaying it on a screen, or by outputting audio); this process can then be repeated until the user exits the scenario, or the scenario otherwise terminates; each objective has one or more predefined termination points; these termination points are associated with particular responses that may be selected by the ML Model 120; e.g., if a particular response is generated by the model, the Interactivity Application 330 may determine that the scenario has ended in success or failure, and end the role-playing interaction after the response is output) (GOSLIN, ¶¶ [0031]-[0033] with FIG. 2: the GUI 200 further includes a Button 235 to generate suggested scenarios based on the user's previous interactions with the system; the suggested scenario is based in part on the number of times a given selection has been used by the user, and/or how recently the selection has occurred; the suggestion is based in part on the length of time the user spends in each scenario; as the user interacts with the system in the scenario, the system maintains data about the interaction, such as the length of time the scenario lasts (e.g., until the user succeeds, fails, or ends the scenario), the result of the scenario (e.g., whether the user succeeded, failed, or quit), and the like; ¶¶ [0053]-[0056] with FIGS. 
3 and 5: providing dynamic interactions using machine learning; at block 505, where an Interactivity Application 330 identifies the user for which the suggestion should be tailored (e.g., the user making the request); at block 510, the Interactivity Application 330 then retrieves a profile of the identified user; at block 515, the Interactivity Application 330 evaluates the data contained in the user profile to evaluate the roles the user previously played in prior interactions; at blocks 520 and 525, the Interactivity Application 330 performs similar evaluations of the user's prior objectives and means, respectively; the Interactivity Application 330 further determines, for each such prior scenario, a level of satisfaction the user experienced; this may be determined based on user selection (e.g., indicating a rating at the end of the interaction), and/or inferred based on sentiment analysis of the user input during and/or after the interaction; the Interactivity Application 330 evaluates other user profiles and/or recent interactions from other users in order to identify role(s), objective(s), and/or mean(s) that have been recently used by others or that are popular among other users; at block 530, the Interactivity Application 330 generates and suggests one or more new scenarios to the user, based on the above analysis; the generated suggestions include scenarios that closely match the user's preferences, as inferred by the Interactivity Application 330 based on the number of times a particular selection was used, the duration of interactivity with the selection, the average satisfaction when the selection was used, how recently the selection was used, and the like).
GOSLIN further discloses a method comprising the set of operations described above, wherein the interactive environment is the gaming environment (GOSLIN, ¶ [0015]: the AI system acts as an intelligent character in a role-playing game; a roleplaying scenario is an interactive session that includes a role for the user to play, an objective for the user to pursue, and/or a means that the user should use to achieve the objective; the role can include any character, such as a child, a strong warrior, a wise wizard, a stealthy archer, and the like; similarly, objectives can include any goal, such as infiltrating a secure area, questioning or interrogating a character for information, retrieving items, rescuing characters, and the like; further, the means can similarly include any methodology of achieving the objective, such as stealth, force, intimidation, flattery, diversion, distraction, and the like).
GOSLIN further discloses a computer storage media including instructions (GOSLIN, ¶¶ [0034]-[0035] with 315 and 320 in FIG. 3: programming instructions
stored in Memory 315; application data residing in Storage 320; the Storage 320
includes one or more ML Models 120, as well as User Profiles 350), which when executed by a processor (GOSLIN, ¶ [0034] with 310 in FIG. 3: Processor 310 retrieves and executes programming instructions stored in Memory 315 as well as stores and retrieves application data residing in Storage 320), cause the processor to perform the set of operations described above (GOSLIN, ¶ [0063]: these computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other device to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the block(s) of the flowchart illustrations or block diagrams).
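Examiner's note (illustrative only): the mechanism mapped above for claims 1, 9, and 19 — context-weighted, probabilistic selection of an ML model followed by feedback-driven refinement of its context-specific weight — can be sketched in a few lines of Python. This sketch is not drawn from GOSLIN's disclosure; the function names, toy context vectors, exponential distance weighting, and multiplicative update rule are all assumptions chosen for illustration.

```python
import math
import random

# Illustrative sketch only: each ML model is tagged with the context vector it
# was trained on. Selection weight decays with distance from the current
# context, so a matching model is chosen most often but not exclusively
# (cf. GOSLIN's periodic/probabilistic selection of non-matching models).

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def select_model(models, weights, current_context):
    """models: dict of name -> trained context vector; weights: name -> weight."""
    scores = {name: weights[name] * math.exp(-euclidean(ctx, current_context))
              for name, ctx in models.items()}
    names = list(scores)
    return random.choices(names, weights=[scores[n] for n in names], k=1)[0]

def refine_weight(weights, model_name, sentiment_score, lr=0.1):
    """Nudge a model's context-specific weight using the sentiment of the
    user's next input (+1.0 for positive, -1.0 for negative)."""
    weights[model_name] = max(0.01, weights[model_name] * (1.0 + lr * sentiment_score))

models = {"stealth_model": (1.0, 0.0), "force_model": (0.0, 1.0)}
weights = {"stealth_model": 1.0, "force_model": 1.0}
chosen = select_model(models, weights, current_context=(0.9, 0.1))
refine_weight(weights, chosen, sentiment_score=+1.0)  # positive user reaction
print(chosen, weights)
```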
Claims 2 and 10
GOSLIN discloses all the elements as stated in Claims 1 and 9 respectively and further discloses monitor the interactive environment for a subsequent input based on the provided model output; when there is a subsequent input, analyze, by the director service, the interactive environment for a specific context based on the subsequent input; associate the subsequent input with one or more environment guidelines that provide systemic context about the interactive environment; determine a second intent objective based on one or more of the subsequent input, specific context, and one or more environment guidelines; and generate a second prompt for the generative machine learning model based on the intent objective (GOSLIN, ¶¶ [0014]-[0022]: the AI system determines the context of a given input, and generates responses using one or more ML models that correspond to that context; the AI system can dynamically select ML models as the context of the interaction shifts, in order to continue to provide deep conversation; the AI system responds in part based on whether the determined means the user is relying on aligns with the expected means for the determined role the user is playing; in order to respond based on the alignment between the expected means and the actual means the user is utilizing, the AI system uses one or more fitness functions to ensure the output makes logical sense; the AI system similarly uses a fitness function to ensure that the output is authentic to a particular character (e.g., the character that the AI system is playing); the fitness function can evaluate each line corresponding to the character from the movie, show, or other predefined script, and then evaluate the user input to determine a mathematical distance between the user's word choice and phrasing, as compared to the "canonical" phrases; based on this distance, the AI system can determine whether the user is accurately playing the role; the AI system utilizes NLP and/or NLU to determine the objective, role, and/or means the user is using for the role-playing scenario; e.g., the AI system may search for keywords in user input that have predefined associations with a role, means, and/or objective; some or all of the context may be known or provided to the AI system; e.g., the user can select one or more aspects of the scenario (e.g., the objective, the role, and/or the means), and the AI system can infer the remaining aspects; the AI system repeatedly determines the context for each input (which may include analyzing prior input), such that the AI system can respond to shifting contexts (e.g., if the user switches from a stealth methodology to a brute-force methodology); once the context of a given input is determined, the AI system selects a corresponding ML model to process the input; the AI system includes a respective ML model trained for each roleplaying scenario (e.g., trained for each combination of objective, role, and means); the AI system uses a constant mapping from the determined context to the corresponding ML model; the AI system maintains context-specific weights for each ML model; given an input context, the AI system can probabilistically select an ML model to use based on the context-specific weight associated with each; the AI system can modify these context-specific weights based on user feedback, in order to better select ML models for future interactions; ¶ [0027] with FIG. 1: these Responses 125 are provided to the User 105. In turn, the User 105 may provide additional Input 110. 
In this way, the User 105 and AI System 115 can interact during the role-playing scenario until the User 105 quits, or until predefined criteria are met; ¶¶ [0046]-[0048]: after using a particular ML Model 120 to generate a response, the ML Component 345 evaluates the user's next input in order to refine the context-specific weight(s) associated with the ML Model 120; e.g., if the user-response is positive (e.g., with a positive sentiment evaluation from the NLP Component 335), the ML Component 345 may increase the weight of the previously-selected ML Model 120, in order to increase the probability that it will be selected in the future, given the same context; similarly, if the user's subsequent response is negative, the ML Component 345 may reduce the weight of the previously-selected ML Model 120; based on the subsequent user-input, the ML Component 345 can modify or refine the selected ML Model 120; in order to determine the quality of a given output, the ML Component 345 can evaluate explicit ratings from the user, subsequent responses from the user, facial expressions or other non-verbal emotional cues (e.g., laughing) from the user, and the like; if the subsequent user input is positive, the ML Component 345 can refine the model to increase the probability that the previous output will be selected again, given the same input and/or context. Similarly, if the subsequent input is negative, the ML Component 345 reduces the probability that the ML Model 120 will select the same response again, given the previous input/context; in this way, the Interactivity Application 330 can provide immersive role-playing to the user; ¶¶ [0049]-[0052] with FIGS. 3 and 4: at block 405, where an Interactivity Application 330 receives user input; at block 410, the Interactivity Application 330 evaluates the input to determine the current context; at block 415, where the Interactivity Application 330 identifies and selects one or more ML Models 120 based on the determined context; the Interactivity Application 330 further refines the context-specific weights of each ML Model 120, and/or the internal weights of the previously-selected model, based on the current input; at block 420, the Interactivity Application 330 generates a response using the identified and selected ML Model 120; at block 425, the response is returned to the user (e.g., by displaying it on a screen, or by outputting audio); this process can then be repeated until the user exits the scenario, or the scenario otherwise terminates).
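Examiner's note (illustrative only): the turn-by-turn loop mapped above for claims 2 and 10 — monitor for a subsequent input, re-determine context, refine the prior selection, and generate again until the session ends — can be outlined as follows. All callables are hypothetical placeholders, not elements of GOSLIN's disclosure.

```python
# Illustrative outline only: each turn re-determines the context, refines the
# previously used model from the new input, selects a model, and responds,
# repeating until the user quits or the scenario otherwise terminates.

def run_session(get_input, determine_context, select_model, generate, refine):
    previous_output = None
    while True:
        text = get_input()
        if text is None:                      # user exits / scenario ends
            break
        context = determine_context(text)     # context may shift between turns
        if previous_output is not None:
            refine(previous_output, text)     # feedback on the prior response
        model = select_model(context)
        previous_output = generate(model, text)
        print(previous_output)
```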
Claims 3 and 11
GOSLIN discloses all the elements as stated in Claims 1 and 9 respectively and further discloses wherein generate a prompt further comprises: associate one or more prompt templates with the intent objective; and combine the one or more prompt templates into a prompt (GOSLIN, ¶¶ [0015]-[0018]: the AI system responds in part based on whether the determined means the user is relying on aligns with the expected means for the determined role the user is playing; each objective corresponds to a suggested or best means or role; in order to respond based on the alignment between the expected means and the actual means the user is utilizing, the AI system uses one or more fitness functions to ensure the output makes logical sense; the AI system similarly uses a fitness function to ensure that the output is authentic to a particular character (e.g., the character that the AI system is playing); this system can also be used to evaluate input from the user; e.g., suppose the user is roleplaying as a particular character from a movie or series; the fitness function can evaluate each line corresponding to the character from the movie, show, or other predefined script, and then evaluate the user input to determine a mathematical distance between the user's word choice and phrasing, as compared to the "canonical" phrases; based on this distance, the AI system can determine whether the user is accurately playing the role; the AI system may search for keywords in user input that have predefined associations with a role, means, and/or objective; ¶¶ [0023] and [0026] with FIG. 1: each ML Model 120 corresponds to a machine learning model that was trained to receive textual input and generate or select a corresponding output; the output corresponds to text or audio that is selected from a predefined set of responses; the ML Model 120 acts as a classifier to classify the input, and uses corresponding predefined text for that category as the output; using a classifier as the ML Model 120, the AI System 115 may output pre-recorded phrases as the Response 125; ¶¶ [0028]-[0030] with FIG. 
2: the Objective 210 is selected from a predefined set; the Roles 220 and Means 230 are similarly selected from predefined sets; the options available for a given selection can depend on one or more other selections; i.e., the selections for each of the Objectives 210, Roles 220, and/or Means 230 may have predefined relationships defining combinations that can be selected; ¶¶ [0024] and [0036]-[0037]: the training data includes sample input phrases used as input, as well as corresponding target output phrases to train each model; the training data includes samples of input text and a corresponding output text for each sample of input text; e.g., the training data is collected from roleplaying sessions between users (e.g., as part of a development team), and/or from early test users; the input from each user can be used as either exemplar input or target output, depending on the particular model being trained (e.g., depending on which role the AI system will be playing, and which role the user will be playing); in one embodiment, the same set of training data can be used to train two ML Models 120, one for each side of the conversation; in this way, the AI System 115 can be trained to play two different roles based on a single set of training data; in some embodiments, a separate set of training data is used for each ML Model 120, where each set of training data is associated with a respective context (e.g., a role-playing scenario); the context of a set of training data is the role-playing scenario that the training data corresponds to; e.g., a first piece of training data may include one or more textual inputs, along with corresponding responses, that were recorded during a role-playing experience (e.g., between users, writers, actors, or other humans); i.e., the training data is collected by recording textual interactions between people (e.g., two writers role-playing as part of a scenario); the context of this first piece of training data can include identifiers for the objective, role, and/or means that the users were engaging in when the text was recorded; in this way, each ML Model 120 can be trained based on the particular underlying scenario, which can improve responses for the given scenario; each ML Model 120 is labeled or otherwise associated with the context that corresponds to the underlying training data; in this way, the AI System 115 can selectively use each ML Model 120 based on their specific contexts (e.g., based on matching the context of the current input with the ML Models 120) to generate deeper and more specific responses within each context; ¶¶ [0041] and [0047] with FIG. 
3: the Context Component 340 infers the user's objective, role, and/or means based on identified keywords, or based on other results from the NLP Component 335; e.g., the Context Component 340 can determine that the objective is to retrieve an item, based on determining that predefined keywords relating to the item were included in one or more inputs received from the user; similarly, the Context Component 340 may determine that the user is roleplaying as a particular character or is using a particular means, based on keywords, and/or based on the sentiment or intent of the input(s), as identified by the NLP Component 335; the ML Model 120 is trained as a classifier that receives input (e.g., text, the results of one or more NLP processes, and the like) and selects from a large set of predefined responses; these responses may similarly include textual responses and/or pre-recorded audio; ¶ [0051]: the models are trained to classify the input in order to select from a predefined set of responses).
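Examiner's note (illustrative only): GOSLIN's fitness check — a "mathematical distance" between the user's phrasing and a character's "canonical" script lines — can be illustrated with a short sketch. difflib's similarity ratio stands in for whatever distance metric GOSLIN actually contemplates; the canonical lines and the threshold below are invented for illustration.

```python
import difflib

# Illustrative stand-in for a fitness function: compare user phrasing against a
# character's "canonical" script lines and decide whether the user is staying
# in character. The lines and threshold are hypothetical.

CANONICAL_LINES = [
    "you shall not pass",
    "a wizard is never late, nor is he early",
]

def in_character(user_text: str, threshold: float = 0.6) -> bool:
    best = max(difflib.SequenceMatcher(None, user_text.lower(), line).ratio()
               for line in CANONICAL_LINES)
    return best >= threshold

print(in_character("A wizard is never late!"))  # close to a canonical line
print(in_character("lol whatever"))             # far from every canonical line
```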
Claims 7 and 15
GOSLIN discloses all the elements as stated in Claims 1 and 9 respectively and further discloses wherein the set of operations further comprises: store one or more of the input, the one or more intent objectives, the prompt, and the model output (GOSLIN, ¶¶ [0031]-[0033]: each user has a corresponding user profile that maintains interaction history for the user, such as the scenarios they have previously engaged in (e.g., the objective, role, and/or means they used), the length of time they have spent interacting during each scenario, and the like; as the user interacts with the system in the scenario, the system maintains data about the interaction, such as the length of time the scenario lasts (e.g., until the user succeeds, fails, or ends the scenario), the result of the scenario (e.g., whether the user succeeded, failed, or quit), and the like; the system further utilizes techniques such as NLP to perform semantic analysis in order to determine whether the user enjoyed the scenario; this data is maintained in the profile of the user; ¶¶ [0036]-[0038] with FIG. 3: the training data is collected from roleplaying sessions between users (e.g., as part of a development team), and/or from early test users; the input from each user can be used as either exemplar input or target output, depending on the particular model being trained (e.g., depending on which role the AI system will be playing, and which role the user will be playing); the context of a set of training data is the role-playing scenario that the training data corresponds to; e.g., a first piece of training data may include one or more textual inputs, along with corresponding responses, that were recorded during a role-playing experience (e.g., between users, writers, actors, or other humans); i.e., the training data is collected by recording textual interactions between people (e.g., two writers role-playing as part of a scenario); the context of this first piece of training data can include identifiers for the objective, role, and/or means that the users were engaging in when the text was recorded; in this way, each ML Model 120 can be trained based on the particular underlying scenario, which can improve responses for the given scenario; each ML Model 120 is labeled or otherwise associated with the context that corresponds to the underlying training data; in this way, the AI System 115 can selectively use each ML Model 120 based on their specific contexts (e.g., based on matching the context of the current input with the ML Models 120) to generate deeper and more specific responses within each context; ¶¶ [0053]-[0056]: maintain user profiles for each user, where the user profile specifies information about previous interactions, such as the previous scenarios they have participated in, the length of time they interacted with each, and the like; determine, for each such prior scenario, a level of satisfaction the user experienced; this may be determined based on user selection (e.g., indicating a rating at the end of the interaction), and/or inferred based on sentiment analysis of the user input during and/or after the interaction; evaluate other user profiles and/or recent interactions from other users in order to identify role(s), objective(s), and/or mean(s) that have been recently used by others or that are popular among other users; infer the user's preferences based on the number of times a particular selection was used, the duration of interactivity with the selection, the average satisfaction when the selection was used, 
how recently the selection was used, and the like; determine the user's preferred selections with respect to some factors of the scenario (e.g., the role, objective, and/or means), and select one of these factors to change).
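Examiner's note (illustrative only): the storage limitation of claims 7 and 15 amounts to per-turn record keeping. A minimal sketch of one possible record layout follows; the field names and the JSON serialization are assumptions, not elements of GOSLIN's disclosure.

```python
import json
from dataclasses import dataclass, asdict

# Illustrative sketch only: one record per turn, capturing the items recited in
# claims 7/15 (input, intent objective, prompt, model output).

@dataclass
class TurnRecord:
    user_input: str
    intent_objective: str
    prompt: str
    model_output: str

log = [TurnRecord("sneak past the guard", "infiltrate the keep",
                  "<assembled prompt text>", "<generated response>")]
print(json.dumps([asdict(r) for r in log], indent=2))
```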
Claims 8, 16, and 20
GOSLIN discloses all the elements as stated in Claims 1, 9, and 19 respectively and further discloses when the model output is not responsive, the set of operations further comprises: generate a new prompt for the generative machine learning model based on the intent objective; and execute the generative machine learning model with the new prompt to produce a new model output (GOSLIN, ¶¶ [0014] and [0018]-[0022]: dynamically select ML models as the context of the interaction shifts, in order to continue to provide deep conversation; the AI system repeatedly determines the context for each input (which may include analyzing prior input), such that the AI system can respond to shifting contexts (e.g., if the user switches from a stealth methodology to a brute-force methodology); the AI system includes a respective ML model trained for each roleplaying scenario (e.g., trained for each combination of objective, role, and means); the AI system can select the corresponding ML model for evaluating the input; this can allow each model to be specialized with a constrained context, while allowing the AI system to dynamically shift between contexts by selecting other models; the AI system can select different ML models for the same context; e.g., the AI system may periodically or randomly use a different ML model for a given context, and evaluate the user's response; i.e., for a given first context C, the AI system may determine to use an ML model trained on context C' to generate the response; the AI system can then analyze how the user responds, in order to determine whether the ML model corresponding to context C' should be used at least occasionally in the future, given context C; e.g., the system may determine whether the user liked the response, or appeared confused or frustrated; in order to improve responses and user-engagement, the learning system can provide differing responses to different users; the AI system can sometimes reach a better solution by randomly (or pseudo-randomly) changing the order of things, prioritizing something that was previously lower priority, and the like; this randomness enables the AI system to continue searching for better solutions; e.g., the system may be at a local maximum for a solution, but there can be one or more better global maxima available which can be discovered through occasional random selections; the AI system maintains context-specific weights for each ML model; given an input context, the AI system can probabilistically select an ML model to use based on the context-specific weight associated with each; the AI system can modify these context-specific weights based on user feedback, in order to better select ML models for future interactions; ¶¶ [0023] and [0025]-[0026] with FIG. 
1: each ML Model 120 corresponds to a machine learning model that was trained to receive textual input and generate or select a corresponding output; in one embodiment, this output includes dynamically generated text or audio; in another embodiment, the output corresponds to text or audio that is selected from a predefined set of responses; in one embodiment, the selected ML Model 120 dynamically generates output text based on weights learned during the training phase; in another embodiment, the ML Model 120 acts as a classifier to classify the input, and uses corresponding predefined text for that category as the output; i.e., when the output text or audio is not found in a predefined set of responses, the output text or audio can be dynamically generated; ¶¶ [0042]-[0048] with FIG. 3: the Context Component 340 continues to evaluate the context for each received input; the Context Component 340 can determine whether the user has shifted the role-playing scenario (such as by deciding to attempt a different means to achieve the objective); the Context Component 340 may identify this transition and select a different ML Model 120 (e.g., based on keywords in the current input, based on sentiment analysis on the input, based on intent analysis, and the like); in one embodiment, the Context Component 340 can allow this change and the role-playing scenario will be shifted accordingly (e.g., by selecting other ML Models 120);
dynamically modify the context-specific weights of each ML Model 120 during use; after using a particular ML Model 120 to generate a response, the ML Component 345 evaluates the user's next input in order to refine the context-specific weight(s) associated with the ML Model 120; e.g., if the user-response is positive (e.g., with a positive sentiment evaluation from the NLP Component 335), the ML Component 345 may increase the weight of the previously-selected ML Model 120, in order to increase the probability that it will be selected in the future, given the same context. Similarly, if the user's subsequent response is negative, the ML Component 345 may reduce the weight of the previously-selected ML Model 120; the ML Model 120 dynamically generates an output, which may be textual, audio, text converted to audio using text-to-speech models, and the like; based on the subsequent user-input, the ML Component 345 can modify or refine the selected ML Model 120; in order to determine the quality of a given output, the ML Component 345 can evaluate explicit ratings from the user, subsequent responses from the user, facial expressions or other non-verbal emotional cues (e.g., laughing) from the user, and the like; if the subsequent user input is positive, the ML Component 345 can refine the model to increase the probability that the previous output will be selected again, given the same input and/or context; similarly, if the subsequent input is negative, the ML Component 345 reduces the probability that the ML Model 120 will select the same response again, given the previous input/context; ¶ [0051]: the ML Models 120 are trained to dynamically generate a response).
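Examiner's note (illustrative only): the claims 8/16/20 behavior — regenerate with a new prompt when the output is judged non-responsive — can be sketched as a bounded retry loop. The retry cap and the callables below are hypothetical and are not taken from GOSLIN.

```python
# Illustrative sketch only: when an output fails the responsiveness check,
# rebuild the prompt from the same intent objective and try again, up to a cap.

def respond(objective, build_prompt, run_model, is_responsive, max_tries=3):
    for attempt in range(max_tries):
        prompt = build_prompt(objective, attempt)   # vary template or wording
        output = run_model(prompt)
        if is_responsive(output):
            return output
    return None  # caller may fall back to a predefined response
```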
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 4-5 and 12-13 are rejected under 35 U.S.C. 103 as being unpatentable over GOSLIN in view of Sharifi et al. (US 2022/0189474 A1, pub. date: 06/16/2022), hereinafter Sharifi.
Claims 4 and 12
GOSLIN discloses all the elements as stated in Claims 3 and 11 respectively and further discloses wherein associate one or more prompt templates further comprises: (GOSLIN, ¶¶ [0015]-[0018]: the AI system responds in part based on whether the determined means the user is relying on aligns with the expected means for the determined role the user is playing; each objective corresponds to a suggested or best means or role; in order to respond based on the alignment between the expected means and the actual means the user is utilizing, the AI system uses one or more fitness functions to ensure the output makes logical sense; the AI system similarly uses a fitness function to ensure that the output is authentic to a particular character (e.g., the character that the AI system is playing); this system can also be used to evaluate input from the user; e.g., suppose the user is roleplaying as a particular character from a movie or series; the fitness function can evaluate each line corresponding to the character from the movie, show, or other predefined script, and then evaluate the user input to determine a mathematical distance between the user's word choice and phrasing, as compared to the "canonical" phrases; based on this distance, the AI system can determine whether the user is accurately playing the role; the AI system may search for keywords in user input that have predefined associations with a role, means, and/or objective; ¶¶ [0023] and [0026] with FIG. 1: each ML Model 120 corresponds to a machine learning model that was trained to receive textual input and generate or select a corresponding output; the output corresponds to text or audio that is selected from a predefined set of responses; the ML Model 120 acts as a classifier to classify the input, and uses corresponding predefined text for that category as the output; using a classifier as the ML Model 120, the AI System 115 may output pre-recorded phrases as the Response 125; ¶¶ [0028]-[0030] with FIG. 
2: the Objective 210 is selected from a predefined set; the Roles 220 and Means 230 are similarly selected from predefined sets; the options available for a given selection can depend on one or more other selections; i.e., the selections for each of the Objectives 210, Roles 220, and/or Means 230 may have predefined relationships defining combinations that can be selected; ¶¶ [0024] and [0036]-[0037]: the training data includes sample input phrases used as input, as well as corresponding target output phrases to train each model; the training data includes samples of input text and a corresponding output text for each sample of input text; e.g., the training data is collected from roleplaying sessions between users (e.g., as part of a development team), and/or from early test users; the input from each user can be used as either exemplar input or target output, depending on the particular model being trained (e.g., depending on which role the AI system will be playing, and which role the user will be playing); in one embodiment, the same set of training data can be used to train two ML Models 120, one for each side of the conversation; in this way, the AI System 115 can be trained to play two different roles based on a single set of training data; in some embodiments, a separate set of training data is used for each ML Model 120, where each set of training data is associated with a respective context (e.g., a role-playing scenario); the context of a set of training data is the role-playing scenario that the training data corresponds to; e.g., a first piece of training data may include one or more textual inputs, along with corresponding responses, that were recorded during a role-playing experience (e.g., between users, writers, actors, or other humans); i.e., the training data is collected by recording textual interactions between people (e.g., two writers role-playing as part of a scenario); the context of this first piece of training data can include identifiers for the objective, role, and/or means that the users were engaging in when the text was recorded; in this way, each ML Model 120 can be trained based on the particular underlying scenario, which can improve responses for the given scenario; each ML Model 120 is labeled or otherwise associated with the context that corresponds to the underlying training data; in this way, the AI System 115 can selectively use each ML Model 120 based on their specific contexts (e.g., based on matching the context of the current input with the ML Models 120) to generate deeper and more specific responses within each context; ¶¶ [0041] and [0047] with FIG. 
3: the Context Component 340 infers the user's objective, role, and/or means based on identified keywords, or based on other results from the NLP Component 335; e.g., the Context Component 340 can determine that the objective is to retrieve an item, based on determining that predefined keywords relating to the item were included in one or more inputs received from the user; similarly, the Context Component 340 may determine that the user is roleplaying as a particular character or is using a particular means, based on keywords, and/or based on the sentiment or intent of the input(s), as identified by the NLP Component 335; the ML Model 120 is trained as a classifier that receives input (e.g., text, the results of one or more NLP processes, and the like) and selects from a large set of predefined responses; these responses may similarly include textual responses and/or pre-recorded audio; ¶ [0051]: the models are trained to classify the input in order to select from a predefined set of responses).
GOSLIN fails to explicitly disclose to generate an embedding for the intent objective; and identify one or more prompt templates that are semantically associated with the intent objective based on the embedding.
Sharifi teaches a system and a method relating to performing interactive actions using an intelligent assistant (Sharifi, ¶¶ [0001]-[0002]), wherein generate an embedding for the intent objective; and identify one or more prompt templates that are semantically associated with the intent objective based on the embedding (Sharifi, ¶ [0008]: if the NL only clarification prompt is "do you want news about the actor John Doe or the producer John Doe", it can be determined that "actor" and "producer" satisfy a semantic similarity threshold; e.g., embeddings can be generated for "actor" and "producer" using a trained encoder (e.g., a trained neural network model), a distance between the "actor" embedding and the "producer" embedding determined, and the distance determined to satisfy the semantic similarity threshold be determined to provide an enhanced clarification prompt instead, such as one that includes a first image of the actor and a second image of the producer; ¶¶ [0027]-[0029]: dialog manager 118 may be configured to map a representation of a user request to perform some action, e.g., using the annotations, to one or more "responsive actions" of a plurality of candidate responsive actions that are then performed by automated assistant 100; mappings may include mappings between entities and candidate responsive actions that are performable in association with those entities; dialog manager 118 may employ one or more trained machine learning models, alone or in combination with one or more grammars; these trained machine learning models may be trained to identify intents, e.g., by embedding data indicative of a user's utterance into a latent space, and then determining which other embeddings (and therefore, intents) are most proximate, e.g., using techniques such as Euclidean distance, cosine similarity, etc.; various contextual signals may be used to perform various aspects of the natural language processing and dialog managing features; entity or entity type recognition, entity or entity type ranking, identification of candidate responsive actions associated with entities or entity types, ranking of candidate responsive actions, and/or filtering of candidate responsive actions, may be performed based on contextual signals; ¶¶ [0037]-[0038]: the NL only clarification prompt can be generated based on an NL only clarification prompt template; the NL only clarification prompt template may be pre-generated, or may be generated responsive to identifying the two or more candidate responsive actions as corresponding to the user's spoken utterance; the NL only clarification prompts can include slots filled by the natural language characterizations of the candidate responsive actions to be rendered in the prompt; the system may generate such natural language characterizations of the candidate responsive actions based on data generated during the natural language processing; an NL only clarification prompt may be the default clarification prompt used when the automated assistant 100 determines it cannot select between two or more candidate responsive actions for user requests generally, or for certain types of user requests; however, the automated assistant 100 may instead select an enhanced clarification prompt based on a variety of factor(s)/condition(s); ¶ [0051]: the system can generate an embedding based on the identified semantic property (e.g., using a trained neural network encoder) and compare the embedding to a plurality of embeddings of respective semantic properties associated with the 
candidate responsive actions corresponding to the renderings of the clarification prompt; the plurality of embeddings of respective semantic properties associated with the candidate responsive actions may have been generated by the trained neural network encoder or another trained neural network encoder based on metadata that indicates semantic properties associated with the candidate responsive actions; the system can determine that the given semantic property matches, or most closely matches, a given embedding, of the plurality of embeddings of the respective semantic properties, based on the comparison; e.g., assume the embeddings are word2vec representations; in this example, a cosine distance between the word2vec representation of the semantic property and each of the word2vec representations of the respective semantic properties of the candidate responsive actions of the prompt can be determined, and a given semantic property of a candidate responsive action that is associated with a respective cosine distance that satisfies a distance threshold can be utilized to determine the semantic property of the spoken utterance matches, or most closely matches, the given semantic property that is associated with a given candidate responsive action (e.g., an exact match or "fuzzy" match); as a result, the given candidate responsive action that is associated with the given semantic property may be selected for performance; ¶ [0059]: the NL only clarification prompt can be generated based on an NL only clarification prompt template; the NL only clarification prompt template may be pre-generated, or may be generated responsive to identifying the two or more candidate responsive actions as corresponding to the user's spoken utterance; in the case of pre-generated NL only clarification prompt templates, the NL only clarification prompt template can be selected from among various NL only clarification templates based on the identified two or more candidate responsive actions that correspond to the user's spoken utterance; e.g., there may be NL only clarification prompt templates for online shopping, viewing or retrieving media content, interactions with a restaurant reservation application, booking flights, etc.; there may also be NL only clarification prompt templates for the various combinations of such actions, e.g., a clarification prompt for selecting between an online shopping action and a flight booking action; the NL only clarification prompt template may be selected from among various NL only clarification templates at least in part based on the natural language characterizations of the candidate responsive actions to be rendered in the prompt; e.g., if the clarification prompt is to include natural language characterizations of candidate responsive actions that are detailed and/or long-winded, then an NL only clarification prompt template that includes long pauses before and/or after the characterizations or that provides a summary at the end may be selected. 
In implementations in which the NL only clarification prompt template is generated after receiving the spoken utterance, it may likewise be tailored to the candidate responsive actions and/or their characterizations that are to be rendered in the clarification prompt; ¶¶ [0074]-[0076]: the system determines a similarity measure that reflects a textual and/or semantic similarity between the first term(s) and the second term(s); the system can embed the first term(s) as a first embedding in an embedding space and can embed the second term(s) as a second embedding in the embedding space using a trained encoder (e.g., a trained neural network embedding model); the system can use the embeddings to generate the similarity measure; the system can thus determine to provide the enhanced clarification prompt rather than the NL only clarification prompt based on the comparison(s) of the embeddings of the first and second terms and/or based on the similarity measure; determine to provide the enhanced clarification prompt rather than the NL only clarification prompt based on determining that the similarity measure and/or embeddings indicate threshold level(s) of similarity and/or dissimilarity; the NL only clarification prompt template(s) and/or natural language characterizations of the candidate responsive actions have previously been generated, for this user or for another user as indicated by the historical automated assistant interaction data).
GOSLIN and Sharifi are analogous art because they are from the same field of endeavor: systems and methods for performing interactive actions using an intelligent assistant. It is also well known in the art that embeddings are commonly used in language processing models. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the teaching of Sharifi to GOSLIN. The motivation for doing so would be to help the system process a variety of requests and commands.
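Examiner's note (illustrative only): Sharifi's embedding comparison — e.g., cosine distance between word2vec-style vectors checked against a threshold — can be illustrated with a minimal sketch. The toy three-dimensional vectors, template names, and threshold value below are invented stand-ins for real encoder outputs.

```python
import math

# Illustrative sketch only: select the prompt template(s) whose stored
# embedding is within a cosine-similarity threshold of the intent-objective
# embedding, per the technique Sharifi describes.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

TEMPLATE_EMBEDDINGS = {
    "retrieve_item_template": (0.9, 0.1, 0.0),
    "interrogation_template": (0.1, 0.8, 0.3),
}

def matching_templates(objective_embedding, threshold=0.7):
    return [name for name, emb in TEMPLATE_EMBEDDINGS.items()
            if cosine(objective_embedding, emb) >= threshold]

print(matching_templates((0.85, 0.2, 0.05)))  # ['retrieve_item_template']
```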
Claims 5 and 13
GOSLIN discloses all the elements as stated in Claims 1 and 9 respectively and further discloses wherein generate an intent objective further comprises: (GOSLIN, ¶¶ [0015]-[0018]: the AI system uses natural language processing (NLP) and/or natural language understanding (NLU) to attempt to identify the role-playing scenario the user is partaking in; the AI system can utilize techniques including keyword identification, sentiment analysis, intent evaluation, parsing to determine meaning, and the like; a roleplaying scenario is an interactive session that includes a role for the user to play, an objective for the user to pursue, and/or a means that the user should use to achieve the objective; the user plays a more specific role, such as a particular character from a movie or television show; the user is expected to understand the personality of the selected role, and must "stay in character" (e.g., by using an appropriate means) to achieve the objective; each role of the user is related or correlated to an expected or anticipated means or approach; e.g., an "intellect" means may correspond to a "wizard" character, while an "intimidation" means corresponds to a "knight" character; the AI system responds in part based on whether the determined means the user is relying on aligns with the expected means for the determined role the user is playing; similarly, each objective corresponds to a suggested or best means or role; in order to respond based on the alignment between the expected means and the actual means the user is utilizing, the AI system uses one or more fitness functions to ensure the output makes logical sense; the AI system similarly uses a fitness function to ensure that the output is authentic to a particular character (e.g., the character that the AI system is playing); this system can also be used to evaluate input from the user; e.g., suppose the user is roleplaying as a particular character from a movie or series; the fitness function can evaluate each line corresponding to the character from the movie, show, or other predefined script, and then evaluate the user input to determine a mathematical distance between the user's word choice and phrasing, as compared to the "canonical" phrases; based on this distance, the AI system can determine whether the user is accurately playing the role; the AI system utilizes NLP and/or NLU to determine the objective, role, and/or means the user is using for the role-playing scenario; e.g., the AI system may search for keywords in user input that have predefined associations with a role, means, and/or objective; some or all of the context may be known or provided to the AI system; e.g., the user can select one or more aspects of the scenario (e.g., the objective, the role, and/or the means), and the AI system can infer the remaining aspects; the AI system repeatedly determines the context for each input (which may include analyzing prior input), such that the AI system can respond to shifting contexts; ¶¶ [0040]-[0045] with FIG. 
3: the NLP Component 335 performs a variety of NLP processing such as sentiment analysis, keyword detection, parsing to determine intent or meaning, and the like; once the NLP Component 335 has determined the intent and/or sentiment, identified keywords, or performed any other NLP processing, some or all of the resulting data is passed as input to the ML Models 120; the Context Component 340 determines the current context of the user input in order to facilitate selection of an appropriate ML Model 120; the Context Component 340 infers the user's objective, role, and/or means based on identified keywords, or based on other results from the NLP Component 335; e.g., the Context Component 340 can determine that the objective is to retrieve an item, based on determining that predefined keywords relating to the item were included in one or more inputs received from the user; similarly, the Context Component 340 may determine that the user is roleplaying as a particular character or is using a particular means, based on keywords, and/or based on the sentiment or intent of the input(s), as identified by the NLP Component 335; the Context Component 340 may determine the context based at least in part on the original input provided by the user (e.g., the explicit selection(s) the user made in initiating the scenario); the Context Component 340 can determine whether the user has shifted the role-playing scenario (such as by deciding to attempt a different means to achieve the objective); the Context Component 340 may identify this transition and select a different ML Model 120 (e.g., based on keywords in the current input, based on sentiment analysis on the input, based on intent analysis, and the like); in some embodiments, some or all of the selections are not changeable during the interaction; e.g., the Context Component 340 may infer that the user is attempting to play as a different role, or pursue a different objective; in one embodiment, the Context Component 340 can allow this change and the role-playing scenario will be shifted accordingly (e.g., by selecting other ML Models 120); in another embodiment, the Context Component 340 can continue to output the previous context; e.g., the Context Component 340 can determine that the user is switching roles, but may nevertheless indicate, to the ML Component 345, that the role remains the same (e.g., because predefined rules indicate that the role cannot be changed mid-scenario); once the Context Component 340 has determined the current context (e.g., the objective, role, and means), the Context Component 340 provides an indication to the ML Component 345 reflecting these determinations).
GOSLIN fails to explicitly disclose to generate an embedding for one or more of the input, specific context, and environment guidelines.
Sharifi teaches a system and a method relating to performing interactive actions using an intelligent assistant (Sharifi, ¶¶ [0001]-[0002]), wherein generate an embedding for one or more of the input, specific context, and environment guidelines (Sharifi, ¶ [0008]: if the NL only clarification prompt is "do you want news about the actor John Doe or the producer John Doe", it can be determined that "actor" and "producer" satisfy a semantic similarity threshold; e.g., embeddings can be generated for "actor" and "producer" using a trained encoder (e.g., a trained neural network model), a distance between the "actor" embedding and the "producer" embedding determined, and the distance determined to satisfy the semantic similarity threshold be determined to provide an enhanced clarification prompt instead, such as one that includes a first image of the actor and a second image of the producer; ¶¶ [0027]-[0029]: dialog manager 118 may be configured to map a representation of a user request to perform some action, e.g., using the annotations, to one or more "responsive actions" of a plurality of candidate responsive actions that are then performed by automated assistant 100; mappings may include mappings between entities and candidate responsive actions that are performable in association with those entities; dialog manager 118 may employ one or more trained machine learning models, alone or in combination with one or more grammars; these trained machine learning models may be trained to identify intents, e.g., by embedding data indicative of a user's utterance into a latent space, and then determining which other embeddings (and therefore, intents) are most proximate, e.g., using techniques such as Euclidean distance, cosine similarity, etc.; various contextual signals may be used to perform various aspects of the natural language processing and dialog managing features; entity or entity type recognition, entity or entity type ranking, identification of candidate responsive actions associated with entities or entity types, ranking of candidate responsive actions, and/or filtering of candidate responsive actions, may be performed based on contextual signals; ¶¶ [0037]-[0038]: the NL only clarification prompt can be generated based on an NL only clarification prompt template; the NL only clarification prompt template may be pre-generated, or may be generated responsive to identifying the two or more candidate responsive actions as corresponding to the user's spoken utterance; the NL only clarification prompts can include slots filled by the natural language characterizations of the candidate responsive actions to be rendered in the prompt; the system may generate such natural language characterizations of the candidate responsive actions based on data generated during the natural language processing; an NL only clarification prompt may be the default clarification prompt used when the automated assistant 100 determines it cannot select between two or more candidate responsive actions for user requests generally, or for certain types of user requests; however, the automated assistant 100 may instead select an enhanced clarification prompt based on a variety of factor(s)/condition(s); ¶ [0051]: the system can generate an embedding based on the identified semantic property (e.g., using a trained neural network encoder) and compare the embedding to a plurality of embeddings of respective semantic properties associated with the candidate responsive actions corresponding to the renderings of the clarification 
prompt; the plurality of embeddings of respective semantic properties associated with the candidate responsive actions may have been generated by the trained neural network encoder or another trained neural network encoder based on metadata that indicates semantic properties associated with the candidate responsive actions; the system can determine that the given semantic property matches, or most closely matches, a given embedding, of the plurality of embeddings of the respective semantic properties, based on the comparison; e.g., assume the embeddings are word2vec representations; in this example, a cosine distance between the word2vec representation of the semantic property and each of the word2vec representations of the respective semantic properties of the candidate responsive actions of the prompt can be determined, and a given semantic property of a candidate responsive action that is associated with a respective cosine distance that satisfies a distance threshold can be utilized to determine the semantic property of the spoken utterance matches, or most closely matches, the given semantic property that is associated with a given candidate responsive action (e.g., an exact match or "fuzzy" match); as a result, the given candidate responsive action that is associated with the given semantic property may be selected for performance; ¶ [0059]: the NL only clarification prompt can be generated based on an NL only clarification prompt template; the NL only clarification prompt template may be pre-generated, or may be generated responsive to identifying the two or more candidate responsive actions as corresponding to the user's spoken utterance; in the case of pre-generated NL only clarification prompt templates, the NL only clarification prompt template can be selected from among various NL only clarification templates based on the identified two or more candidate responsive actions that correspond to the user's spoken utterance; e.g., there may be NL only clarification prompt templates for online shopping, viewing or retrieving media content, interactions with a restaurant reservation application, booking flights, etc.; there may also be NL only clarification prompt templates for the various combinations of such actions, e.g., a clarification prompt for selecting between an online shopping action and a flight booking action; the NL only clarification prompt template may be selected from among various NL only clarification templates at least in part based on the natural language characterizations of the candidate responsive actions to be rendered in the prompt; e.g., if the clarification prompt is to include natural language characterizations of candidate responsive actions that are detailed and/or long-winded, then an NL only clarification prompt template that includes long pauses before and/or after the characterizations or that provides a summary at the end may be selected. 
In implementations in which the NL only clarification prompt template is generated after receiving the spoken utterance, it may likewise be tailored to the candidate responsive actions and/or their characterizations that are to be rendered in the clarification prompt; ¶¶ [0074]-[0076]: the system determines a similarity measure that reflects a textual and/or semantic similarity between the first term(s) and the second term(s); the system can embed the first term(s) as a first embedding in an embedding space and can embed the second term(s) as a second embedding in the embedding space using a trained encoder (e.g., a trained neural network embedding model); the system can use the embeddings to generate the similarity measure; the system can thus determine to provide the enhanced clarification prompt rather than the NL only clarification prompt based on the comparison(s) of the embeddings of the first and second terms and/or based on the similarity measure; the system can determine to provide the enhanced clarification prompt rather than the NL only clarification prompt based on determining that the similarity measure and/or embeddings indicate threshold level(s) of similarity and/or dissimilarity; the NL only clarification prompt template(s) and/or natural language characterizations of the candidate responsive actions have previously been generated, for this user or for another user as indicated by the historical automated assistant interaction data).
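For illustration only, the following minimal Python sketch (with hypothetical toy embeddings and a hypothetical threshold value, not drawn from Sharifi) shows the kind of embedding-similarity check described above, in which semantically close terms trigger an enhanced clarification prompt rather than an NL only prompt:

```python
# Minimal illustrative sketch of an embedding-based similarity check of the
# kind Sharifi describes (¶¶ [0008], [0074]-[0076]): embed the distinguishing
# terms of two candidate responsive actions, measure their similarity, and
# fall back to an "enhanced" clarification prompt when the terms are too
# similar for a natural-language-only prompt to disambiguate. The toy
# embeddings and threshold below are hypothetical stand-ins for a trained
# neural network encoder.
import numpy as np

# Hypothetical stand-in for a trained encoder's output.
TOY_EMBEDDINGS = {
    "actor":    np.array([0.82, 0.55, 0.10]),
    "producer": np.array([0.78, 0.60, 0.15]),
    "news":     np.array([0.05, 0.10, 0.99]),
}

def embed(term: str) -> np.ndarray:
    return TOY_EMBEDDINGS[term]

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def choose_clarification_prompt(term_a, term_b, similarity_threshold=0.95):
    """Return which kind of clarification prompt to render."""
    sim = cosine_similarity(embed(term_a), embed(term_b))
    if sim >= similarity_threshold:
        # Terms like "actor"/"producer" are semantically close; an NL-only
        # prompt may not disambiguate, so render images alongside the text.
        return "enhanced"
    return "nl_only"

print(choose_clarification_prompt("actor", "producer"))  # -> enhanced
print(choose_clarification_prompt("actor", "news"))      # -> nl_only
```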
GOSLIN and Sharifi are analogous art because they are from the same field of endeavor, a system and a method relating to interacting with users using machine learning and natural language processing. It is also well known in the art that embeddings are commonly used in language processing models. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the teaching of Sharifi to GOSLIN. Motivation for doing so would be to help process a variety of requests and commands.
Claims 6 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over GOSLIN in view of Murdock, IV et al. (US 2022/0309246 A1, pub. date: 09/29/2022), hereinafter Murdock, IV.
Claims 6 and 14
GOSLIN discloses all the elements as stated in Claims 1 and 9 respectively and further discloses wherein evaluate the model output for responsiveness further comprises: environment guidelines based on one or more metrics; (GOSLIN, ¶¶ [0015]-[0022]: the AI system responds in part based on whether the determined means the user is relying on aligns with the expected means for the determined role the user is playing; in order to respond based on the alignment between the expected means and the actual means the user is utilizing, the AI system uses one or more fitness functions to ensure the output makes logical sense; the AI system similarly uses a fitness function to ensure that the output is authentic to a particular character (e.g., the character that the AI system is playing); this system can also be used to evaluate input from the user; e.g., suppose the user is roleplaying as a particular character from a movie or series; the fitness function can evaluate each line corresponding to the character from the movie, show, or other predefined script, and then evaluate the user input to determine a mathematical distance between the user's word choice and phrasing, as compared to the "canonical" phrases; based on this distance, the AI system can determine whether the user is accurately playing the role; the AI system repeatedly determines the context for each input (which may include analyzing prior input), such that the AI system can respond to shifting contexts; the AI system can select different ML models for the same context; e.g., the AI system may periodically or randomly use a different ML model for a given context, and evaluate the user's response; i.e., for a given first context C, the AI system may determine to use an ML model trained on context C' to generate the response; the AI system can then analyze how the user responds, in order to determine whether the ML model corresponding to context C' should be used at least occasionally in the future, given context C; e.g., the system may determine whether the user liked the response, or appeared confused or frustrated; this evaluation can include receiving user feedback (e.g., a thumbs up or thumbs down, a score, and the like) and/or using NLP and/or NLU to analyze the user's response (e.g., to determine the sentiment); in order to improve responses and user-engagement, the learning system can provide differing responses to different users; this randomness enables the AI system to continue searching for better solutions; e.g., the system may be at a local maxima for a solution, but there can be one or more better global maxima available which can be discovered through occasional random selections; the AI system maintains context-specific weights for each ML model; given an input context, the AI system can probabilistically select a ML model to use based on the context-specific weight associated with each; further, the AI system can modify these context-specific weights based on user feedback, in order to better select ML models for future interactions; ¶ [0025] with FIG. 1: the AI System 115 periodically or probabilistically selects an ML Model 120, which may not be the model associated with a perfectly matching context; e.g., the AI System 115 may rely on weights associated with each ML Model 120 in determining which model to select; ¶¶ [0040]-[0048] with FIG.
3: the NLP Component 335 performs a variety of NLP processing such as sentiment analysis, keyword detection, parsing to determine intent or meaning, and the like; e.g., the NLP Component 335 may perform sentiment analysis to determine whether the user is enjoying the ongoing scenario. In some embodiments, the ML Models 120 can be refined based on this analysis; the Context Component 340 determines the current context of the user input, in order to facilitate selection of an appropriate ML Model 120; the Context Component 340 does so at least in part by evaluating results output by the NLP Component 335; this may include data relating to the most recent input, as well as from one or more prior inputs; the Context Component 340 infers the user's objective, role, and/or means based on identified keywords, or based on other results from the NLP Component 335; e.g., the Context Component 340 can determine that the objective is to retrieve an item, based on determining that predefined keywords relating to the item were included in one or more inputs received from the user; similarly, the Context Component 340 may determine that the user is roleplaying as a particular character or is using a particular means, based on keywords, and/or based on the sentiment or intent of the input(s), as identified by the NLP Component 335; the Context Component 340 may determine the context based at least in part on the original input provided by the user (e.g., the explicit selection(s) the user made in initiating the scenario); the Context Component 340 performs this context identification for each received input in the scenario; i.e., even if the Context Component 340 has already inferred the context with a high degree of confidence (or determined it conclusively based on explicit user-selection), the Context Component 340 continues to evaluate the context for each received input; in this way, the Context Component 340 can determine whether the user has shifted the role-playing scenario (such as by deciding to attempt a different means to achieve the objective); the Context Component 340 may identify this transition and select a different ML Model 120 (e.g., based on keywords in the current input, based on sentiment analysis on the input, based on intent analysis, and the like); the ML Component 345 identifies and selects the ML Model 120 that is associated with a matching context; the ML Component 345 may select one or more other ML Models 120 (e.g., periodically, or in a probabilistic manner); e.g., the ML Models 120 are associated with context-specific weights, indicating a likelihood that each will be selected given a particular context; given a first context C, a first ML Model 120 (e.g., one trained on the same context C) may be associated with a relatively high weight, such that it will be selected frequently; similarly, a second ML Model 120 (trained on a different context) can be associated with a relatively lower weight, such that it is selected less frequently than the first; the weight of each ML Model 120 is determined based in part on the vector distance between the current context C and the respective context C' of each respective ML Model 120; the ML Component 345 can dynamically modify the context-specific weights of each ML Model 120 during use; after using a particular ML Model 120 to generate a response, the ML Component 345 evaluates the user's next input in order to refine the context-specific weight(s) associated with the ML Model 120; e.g., if the user-response is positive (e.g., with a 
positive sentiment evaluation from the NLP Component 335), the ML Component 345 may increase the weight of the previously-selected ML Model 120, in order to increase the probability that it will be selected in the future, given the same context; similarly, if the user's subsequent response is negative, the ML Component 345 may reduce the weight of the previously-selected ML Model 120; i.e., the weight of each of the different ML Models 120 indicates a confidence score for that ML Model 120; in addition to receiving the current context from the Context Component 340, the ML Component 345 receives results of the NLP analysis from the NLP Component 335; once an ML Model 120 has been selected, the ML Component 345 provides this input to the selected model in order to generate an output; based on the subsequent user-input, the ML Component 345 can modify or refine the selected ML Model 120; in order to determine the quality of a given output, the ML Component 345 can evaluate explicit ratings from the user, subsequent responses from the user, facial expressions or other non-verbal emotional cues (e.g., laughing) from the user, and the like; if the subsequent user input is positive, the ML Component 345 can refine the model to increase the probability that the previous output will be selected again, given the same input and/or context; similarly, if the subsequent input is negative, the ML Component 345 reduces the probability that the ML Model 120 will select the same response again, given the previous input/context; ¶¶ [0049]-[0051] with FIGS. 3-4: the Interactivity Application 330 evaluates the input to determine the current context; in addition to determining a context, the Interactivity Application 330 generates a confidence in this determination; e.g., if the Interactivity Application 330 is inferring the context, the Interactivity Application 330 may further generate a corresponding confidence in order to aid selection of an appropriate ML Model 120; the Interactivity Application 330 identifies and selects one or more ML Models 120 based on the determined context; in one embodiment, the Interactivity Application 330 selects the ML Model 120 with a matching context; in another embodiment, the Interactivity Application 330 probabilistically selects a model based on the determined context, the confidence in this determination, and the context-specific weights associated with each ML Model 120; the Interactivity Application 330 further refines the context-specific weights of each ML Model 120, and/or the internal weights of the previously-selected model, based on the current input; the Interactivity Application 330 performs one or more NLP operations on the input (e.g., keyword identification, sentiment analysis, intent determination, and the like), and processes the result with the ML Model 120).
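For illustration only, the following minimal Python sketch (with hypothetical names, weights, and an update rule not specified by GOSLIN) shows the kind of context-weighted, probabilistic ML-model selection and feedback-driven weight adjustment described above:

```python
# Minimal sketch, under assumed data structures, of the context-weighted model
# selection and feedback loop GOSLIN describes: each ML model carries a
# per-context weight, a model is sampled in proportion to those weights (so an
# "off-context" model is still tried occasionally), and the weight is nudged
# up or down based on the sentiment of the user's next input. The step size
# and weight floor are hypothetical; GOSLIN does not specify exact formulas.
import random

class ModelSelector:
    def __init__(self, context_weights):
        # context_weights: {context: {model_name: weight}}
        self.context_weights = context_weights

    def select(self, context: str) -> str:
        weights = self.context_weights[context]
        models, w = zip(*weights.items())
        # Probabilistic selection lets lower-weight models be chosen
        # occasionally, which GOSLIN frames as escaping local maxima.
        return random.choices(models, weights=w, k=1)[0]

    def update(self, context: str, model: str, feedback_positive: bool,
               step: float = 0.1):
        delta = step if feedback_positive else -step
        w = self.context_weights[context]
        # Keep a small floor so exploration of this model never dies out.
        w[model] = max(0.01, w[model] + delta)

selector = ModelSelector({"interrogation": {"model_C": 0.8, "model_Cprime": 0.2}})
chosen = selector.select("interrogation")
# Suppose NLP sentiment analysis of the user's next utterance is positive:
selector.update("interrogation", chosen, feedback_positive=True)
```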
GOSLIN fails to explicitly disclose receiving a confidence threshold value for evaluating model output; and comparing the one or more confidence score for the one or more components of the output against the confidence threshold value.
Murdock, IV teaches a system and a method relating to artificial intelligence applications (Murdock, IV, ¶ [0002]), wherein receiving a confidence threshold value for evaluating model output; and comparing the one or more confidence score for the one or more components of the output against the confidence threshold value (Murdock, IV, ¶¶ [0092]-[0096] and [0131]-[0137] with FIG. 6: the score of a particular answer is a confidence score that indicates a relative measure of confidence that the answer is correct; e.g., the score may be a value between 0.0 and 1.0, with 0.0 representing a lowest confidence and 1.0 representing a highest confidence, with the measure of confidence increasing linearly between 0.0 and 1.0; receive input defining parameter values (e.g., an initial value of a confidence threshold); at step 610, receive a question from a user device; at step 615, generate one or more answers to the question from step 610 using question answering system techniques; determine a confidence score for each of the one or more answers from step 615 using question answering system techniques; the system determines which of the answers (from step 615) to return to the user that asked the question (at step 610) by comparing the respective confidence scores (determined at step 620) to a confidence threshold; the system compares the confidence score of each answer to the confidence threshold, returns (to the user device) answers whose confidence score is greater than the confidence threshold, and does not return answers whose confidence score is less than the confidence threshold; the system adjusts the confidence threshold based on comparing a highest one of the confidence scores (from step 620) to the confidence threshold; either: (i) increasing the confidence threshold for a next question in response to the highest one of the confidence scores being greater than the confidence threshold, or (ii) decreasing the confidence threshold for the next question in response to the highest one of the confidence scores being less than the confidence threshold.).
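For illustration only, the following minimal Python sketch (with a hypothetical adjustment step size; Murdock, IV states only the direction of the adjustment) shows the confidence-threshold filtering and adjustment loop described above:

```python
# Minimal sketch of the confidence-threshold filtering and adjustment loop
# Murdock, IV describes with FIG. 6: answers scoring above the threshold are
# returned, and the threshold is then nudged toward the highest observed
# confidence score. The step size is a hypothetical parameter.
def answer_question(scored_answers, threshold, step=0.05):
    """scored_answers: list of (answer_text, confidence in [0.0, 1.0])."""
    returned = [a for a, score in scored_answers if score > threshold]
    best = max(score for _, score in scored_answers)
    # Raise the bar when the best answer clears it; lower it otherwise.
    new_threshold = threshold + step if best > threshold else threshold - step
    return returned, new_threshold

answers = [("Answer A", 0.92), ("Answer B", 0.64), ("Answer C", 0.31)]
returned, threshold = answer_question(answers, threshold=0.70)
print(returned)   # ['Answer A'] -- only scores above 0.70 are returned
print(threshold)  # 0.75 -- raised, since the top score exceeded the old threshold
```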
GOSLIN and Murdock, IV are analogous art because they are from the same field of endeavor, a system and a method relating to artificial intelligence applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the teaching of Murdock, IV to GOSLIN. Motivation for doing so would be to improve the reliability of the output returned to the user by filtering it against a confidence threshold.
Claim 17 is rejected under 35 U.S.C. 103 as being unpatentable over GOSLIN in view of Chappell, III et al. (US 2023/0047787 A1, filed on 10/11/2022), hereinafter Chappell, III.
Claim 17
GOSLIN discloses all the elements as stated in Claim 9 and further discloses wherein an interactive element comprises a non-player character (NPC), (GOSLIN, ¶ [0015]: the AI system acts as an intelligent character in a role-playing game; the AI system infers the context of the conversation based on input as the user interacts with the character; e.g., the AI system uses natural language processing (NLP) and/or natural language understanding (NLU) to attempt to identify the role-playing scenario the user is partaking in; a roleplaying scenario is an interactive session that includes a role for the user to play, an objective for the user to pursue, and/or a means that the user should use to achieve the objective; the role can include any character, such as a child, a strong warrior, a wise wizard, a stealthy archer, and the like; similarly, objectives can include any goal, such as infiltrating a secure area, questioning or interrogating a character for information, retrieving items, rescuing characters, and the like; further, the means can similarly include any methodology of achieving the objective, such as stealth, force, intimidation, flattery, diversion, distraction, and the like; the user plays a more specific role, such as a particular character from a movie or television show; ¶¶ [0028]-[0033] with FIG. 2: the user interacts with a scenario creator to select one or more aspects of the role-playing scenario they wish to use; the user can select, in a first Box 205, an Objective 210A-N; the Objectives 210 can include, without limitation, an "Infiltrate" Objective 210A, a "Question" Objective 210B, and a "Retrieve" Objective 210N; the Objective 210 is selected from a predefined set; further, using the Box 215, the user can select a Role 220A-N; the user can select a Wizard 220A, a Warrior 220B, and a Ninja 220N; additionally, using the Box 225, the user can select a Means 230A-N, including Stealth 230A, Force 230B, and Flattery 230N; the Roles 220 and Means 230 are similarly selected from predefined sets; the user can select one or more of the options, and use the Button 245 to launch the selected scenario; the user may select only a subset of the aspects, and leave the AI System 115 to infer the remaining; the options available for a given selection can depend on one or more other selections; i.e., the selections for each of the Objectives 210, Roles 220, and/or Means 230 may have predefined relationships defining combinations that can be selected; e.g., the choice of available Roles 220 may depend on the Objective 210 selected; similarly, the available Means 230 may depend on the Objective 210 and/or the Role 220; the GUI 200 further includes a Button 235 to generate suggested scenarios, a Button 240 to generate random scenarios, and a Button 245 to launch the selected or suggested scenario; the Button 235 is used to generate suggested scenarios based on the user's previous interactions with the system; the Button 245 is used to begin the interactive scenario that has been selected and/or generated).
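For illustration only, the following minimal Python sketch (with hypothetical objective/role/means combinations not drawn from GOSLIN) shows how the predefined relationships among Objectives 210, Roles 220, and Means 230 described above might constrain the selectable options:

```python
# Minimal sketch of the predefined objective/role/means relationships GOSLIN
# describes with FIG. 2: the available roles depend on the selected objective,
# and the available means depend on the objective and/or role. The specific
# combinations below are hypothetical.
SCENARIO_OPTIONS = {
    "Infiltrate": {"roles": ["Ninja", "Wizard"],  "means": ["Stealth", "Diversion"]},
    "Question":   {"roles": ["Warrior", "Wizard"], "means": ["Intimidation", "Flattery"]},
    "Retrieve":   {"roles": ["Ninja", "Warrior"],  "means": ["Stealth", "Force"]},
}

def available_choices(objective: str) -> dict:
    # A GUI like GOSLIN's would populate its Role and Means boxes from these sets.
    return SCENARIO_OPTIONS[objective]

print(available_choices("Infiltrate"))
```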
GOSLIN fails to explicitly disclose wherein an interactive element comprises an animated infographic.
Chappell, III teaches a system and a method relating to a game/interaction application with an immersive content experience using artificial intelligence (Chappell, III, ¶¶ [0002] and [0012]), wherein an interactive element comprises an animated infographic (Chappell, III, ¶ [0143]: an immersive content experience emphasizes immersion and interactivity with immersive content; the immersive content may be categorized as one of a spatial immersive content, a strategic immersive content, a narrative immersive content, or a tactical immersive content that supports multiple users; the immersive content may be one of a card game, a bluffing game, an action video game, an adventure video game, a role-playing video game, interactive polls and quizzes, animated data visualizations or infographics, 3D images and video, a simulation video game, a strategy video game, a sports video game and a party video game).
GOSLIN and Chappell, III are analogous art because they are from the same field of endeavor, a system and a method relating to a game/interaction application with an immersive content experience using artificial intelligence. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the teaching of Chappell, III to GOSLIN. Motivation for doing so would be to expand the capability of the interactive system to support additional types of immersive content (Chappell, III, ¶ [0005]).
Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over GOSLIN in view of ALLEN et al. (US 2016/0121218 A1, pub. date: 05/05/2016), hereinafter ALLEN.
Claim 18
GOSLIN discloses all the elements as stated in Claim 9 and further discloses wherein a gaming environment comprises a video game, (GOSLIN, ¶ [0014]: provide AI systems that utilize machine learning to interact with users in a dynamic and immersive manner; ¶ [0015]: the AI system acts as an intelligent character in a role-playing game; ¶ [0016]: the user plays a more specific role, such as a particular character from a movie or television show; ¶ [0048]: the Interactivity Application 330 can provide immersive role-playing to the user).
GOSLIN fails to explicitly disclose wherein a gaming environment comprises an online game and an MMORPG.
ALLEN teaches a system and a method relating to role-playing games using machine learning technologies (ALLEN, ¶¶ [0001]-[0002]), wherein a gaming environment comprises online game and MMORPG (ALLEN, ¶¶ [0006]-[0007]: Massively multiplayer online role-playing games (MMORPGs) mix the genres of role-playing games and massively multiplayer online games (e.g., in the form of web browser-based video games) in which a very large number of players interact with one another within a virtual world (VW); as in all role-playing games (RPGs), a player assumes the role of a character (often in a fantasy world) and takes control over many actions of the character; MMORPGs may be distinguished from single-player or small multi-player online RPGs by the number of players and by the persistence of the VW (which is usually hosted by a game publisher and continues to exist and evolve when individual players are offline and not playing the game); ¶ [0009]: a technique for dynamically generating game activities for a game (e.g., a role-playing game) includes that (a) context data (e.g., a question) is received from a client (e.g., a player of the game, a user of the game, another system, or a game engine); and (b) in response to receiving the context data, a game activity is dynamically generated based on the context data and the game information; the game activity is then initiated in the game and presented to the client; ¶¶ [0022]-[0035]: nearly all massively multiplayer online role-playing games (MMORPGs) feature a character progression system in which players earn experience points for their actions and use the experience points to reach character levels that make a character better at whatever the character does; traditionally, combat with other characters and completing quests for non-player characters (NPCs), either alone or in groups, have been the primary ways to earn experience points in most MMORPGs; MMORPGs may also require a player to work with other players in order to progress at an optimal rate; traditionally, when a story-line of a game is changed, a game designer has been required to add/modify events associated with the changed story-line; a game designer is not required to add/modify events as game design activities and scenarios are generated by loading game information into a data processing system (e.g., a question answering system or a cognitive system) and correlating game models to produce game activities within a game environment in real-time (e.g., during game play or at design time) responsive to received player (or designer) questions; a game engine associated with a game may be updated to create a representation of a game activity in response to a data processing system receiving context data from a client; analysis and extraction of key game characteristics is performed and actions and objects are correlated with locations to produce actionable quests from natural language questions and correlated structured data associated with the questions; game scenarios can be dynamically generated by loading game information (e.g., backstory, literature, design documents) into a data processing system and correlating game models (e.g., attributes and maps) to produce game activities (e.g., quests) within a game environment; natural language information associated with a game (e.g., backstory, story boards, a novel, rules, point systems, design documents, locations, and maps) is loaded into a data processing system; the natural language information provides a collection from which
certain quest types may be generated; quest types that can be generated by an evidence-based probabilistic system, and player context information, may then be generated; for each of the quest types, a player's current context (in game knowledge) and loaded game information may be used to generate a quest that includes a game activity, a location, and text describing an action that the player is to perform (e.g., based on player interaction with a non-player character (NPC)); based on generated answers that have high confidence and direct or indirect supporting evidence, actionable gaming activities (e.g., in the form of quests) are generated; when a question is received from a player, a focus and lexical answer type (LAT) of the question may be determined and compared to verify the focus and LAT match one or more categories for a quest; sentence examples may also be generated based on the LAT; associated game quest content may then be generated in game format for the generated sentence; the player is dynamically presented with a quest to accept with a location and a designated activity to perform; in the event the player accepts the quest and is able to perform one or more desired actions to complete the quest, rewards associated with the quest are awarded; a player is not restricted to canned dialogue when providing questions to a QA system; ¶¶ [0014]-[0044] with FIG. 4: question and analysis context block 402 receives a question in a natural language; an output of block 402 is provided to a question decomposition block 404, which further analyzes the different textual, grammatical, linguistic, punctuation and/or other components of the question; block 404 provides inputs to multiple hypothesis generation blocks 406, which perform parallel hypothesis generation; hypothesis generation blocks 406 each perform a primary search, collect reference data from different structured and unstructured sources, and generate candidate answers; hypothesis and evidence scoring for each hypothesis is initiated in hypothesis and evidence scoring blocks 408; i.e., the QA system is further configured to score all the candidate answers using hypothesis and evidence scoring techniques (e.g., providing 'M' scores for 'M' candidate answers); in synthesis block 410 the QA system evaluates the candidate answers with the highest scores and determines which hypotheses generated the highest scores; following block 410, the QA system initiates final confidence merging and ranking in block 412; finally, in block 412, the QA system provides an answer (and may provide a confidence score) to the question; assuming, e.g., the candidate answers 'j', 'k', and 'l' have the highest scores, a determination may then be made as to which of the hypotheses generated the best candidate answers; assume that hypotheses 'c' and 'd' generated the best candidate answers 'j', 'k', and 'l'; the QA system may then upload additional data required by hypotheses 'c' and 'd' into the cache and unload data used by other hypotheses from the cache; when a new question is received, the above-described process is repeated; if the hypotheses 'c' and 'd' again produce best candidate answers, the QA system loads more data that is relevant to the hypotheses 'c' and 'd' into the cache and unloads other data; if, on the other hand, hypotheses 'h' and 'g' produce the best candidate answers to the new question, the QA system loads more data relevant to the hypotheses 'h' and 'g' into the cache and unloads other data; at this point, hypotheses 'c' and 'd'
probably still have more data in the cache than other hypotheses, as more relevant data was previously loaded into the cache for the hypotheses 'c' and 'd'; the overall process repeats in the above-described manner by basically maintaining data in the cache that answer and evidence scoring indicates is most useful).
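For illustration only, the following minimal Python sketch (with hypothetical hypotheses, candidate answers, and scores; the scoring itself is a placeholder for ALLEN's evidence-based system) shows the hypothesis generation, confidence merging, and ranking flow described above with FIG. 4:

```python
# Minimal sketch of the hypothesis-and-scoring flow ALLEN describes: a
# question fans out to several hypotheses, each producing candidate answers
# with evidence scores, and a final merge ranks candidates so the
# top-confidence answer (here, a quest seed) can be returned.
from collections import defaultdict

def generate_candidates(question):
    # Placeholder for parallel hypothesis generation over game backstory,
    # design documents, maps, etc.; each hypothesis emits (answer, score).
    return {
        "hypothesis_c": [("retrieve the amulet from the ruins", 0.81)],
        "hypothesis_d": [("question the innkeeper NPC", 0.77)],
        "hypothesis_h": [("retrieve the amulet from the ruins", 0.40)],
    }

def merge_and_rank(candidates_by_hypothesis):
    # Final confidence merging: identical candidate answers produced by
    # different hypotheses pool their evidence (max-merged here, as one
    # simple illustrative choice).
    merged = defaultdict(float)
    for candidates in candidates_by_hypothesis.values():
        for answer, score in candidates:
            merged[answer] = max(merged[answer], score)
    return sorted(merged.items(), key=lambda kv: kv[1], reverse=True)

ranked = merge_and_rank(generate_candidates("Where should I go next?"))
answer, confidence = ranked[0]
print(answer, confidence)  # the highest-confidence quest activity
```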
GOSLIN and ALLEN are analogous art because they are from the same field of endeavor, a system and a method relating to role-playing games using machine learning technologies. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the teaching of ALLEN to GOSLIN. Motivation for doing so would be to expand the capability of the gaming system and improve engagement of players.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Chang ("Prompting Large Language Models With the Socratic Method", IEEE Computing and Communication Workshop and Conference (CCWC), Feb. 17, 2023, pp. 1-10) discloses in Abstract of Page 1 that (1) a systematic approach to using the Socratic method in developing prompt templates that effectively interact with large language models, including GPT-3; (2) identify those that yield precise answers and justifications while simultaneously fostering creativity and imagination to enhance creative writing; and (3) discuss how techniques such as definition, elenchus, dialectic, maieutics, generalization, and counterfactual reasoning can be applied in engineering prompt templates, and provide practical examples that demonstrate their effectiveness in performing inductive, deductive, and abductive reasoning. Chang further discloses in Section I of Pages 1-2 that (1) prompting is a technique used to guide the output generation of a pre-trained language model such as GPT-3; (2) this is achieved by providing input in the form of a question or template, which helps to generate specific responses such as Q&A, document summarization, and translations; (3) investigate the Socratic method to identify and evaluate potential prompting strategies, and use the findings to design effective prompt templates; (4) by utilizing prompt templates with large language models (LLMs), these sub-tasks can be delegated to the LLM, freeing the template to focus specifically on dialogue design; (5) in this regard, the Socratic method holds significant relevance, as it is well-known for using questioning (prompting) as a means of promoting critical thinking and delving into complex concepts; (6) list ten widely referenced methods under the Socratic method umbrella and use hypothesis elimination to identify the most relevant ones for our goal of prompt-template development; and (7) the selected methods are definition, hypothesis elimination, elenchus, dialectic, maieutics, generalization, and induction. 
Chang also discloses in Section II of Page 2 that (1) prompting is a recent innovation in the field, popularized by OpenAI, especially with the release of GPT-3 in 2020; (2) instead of fine-tuning the model for a specific task, the approach involves providing a specific input, or “prompt,” to guide the LLM’s output generation, resulting in greater flexibility and efficiency in generating a wide range of responses; (3) there are several factors that impact prompt template engineering, including the type of LLM used, manual vs automatic design, and static vs continuous prompts: (a) left-to-right vs masked LLMs; (b) manual vs automatic design; and (c) discrete vs continuous prompts; (4) more advanced templates can be constructed by combining basic templates with techniques like ensemble methods; (5) this involves forming a committee of basic templates that ask the same question using different phrasing; (6) in contrast, the Socratic method aims to employ deductive, inductive, and abductive reasoning to ensure consistency and accuracy of inference; (7) the Socratic method deals with all aspects of critical thinking, including definition clarification and cross-examination; (8) this comprehensive approach to template engineering can lead to improved output quality and consistency; (9) the primary objective of this study is to design continuous prompts that enhance response quality and foster guided creativity in generative tasks, such as verifying information, evaluating source credibility, proposing alternatives, recommending plot ideas in creative writing, and generating task-specific surprises; and (10) when designing a prompt, it is important to consider the category and utilize the most suitable strategies and techniques to achieve the best results. 
Chang further teaches in Section III of Pages 2-5 that (1) the Socratic method is a questioning technique used in teaching and philosophy to encourage critical thinking and self-discovery; (2) the method involves asking a series of questions to explore complex ideas and help individuals arrive at their own understanding of a concept; (3) it is based on the belief that knowledge cannot be simply imparted, but must be discovered through a process of questioning and dialogue; (4) some of the Socratic method’s key principles and guidelines to conduct critical thinking include: (a) posing open-ended questions; (b) clarifying key terms; (c) providing examples and evidence; (d) challenging reason-to-conclusion argument; (e) summarizing and drawing conclusions; and (f) reflecting on the process; (5) Socrates uses questioning to explore complex ideas and stimulate critical thinking in his interlocutors: (a) Definition; (b) Generalization; (c) Induction; (d) Elenchus; (e) Hypothesis Elimination; (f) Maieutics; (g) Dialectic; (h) Recollection; (i) Irony; and (j) Analogy; (6) critical reading is a crucial component of critical thinking, which involves evaluating the quality and credibility of written materials, from research papers to blog posts; (7) introduce a template called CRIT, which stands for Critical Reading Inquisitive Template; (8) CRIT starts in its step #1, asking GPT-3 to identify the conclusion of a document; (9) step #2 in Table I prompts GPT-3 to find a set of supporting reasons; (10) step #3 of the CRIT algorithm (in Table I) prompts GPT-3 to assess the validity of each reason as justification for the conclusion; (11) step #4 asks GPT-3 to provide missing rival reasons, and then pair rival reasons with the conclusion to conduct validation; (12) finally, in step #5, CRIT computes an aggregated score by performing a weighted sum on the validation multiplied by the credibility score, of both arguments and counterarguments, and then outputs the final assessment score; (13) CRIT can prompt GPT-3 for a report, and then readers and students can compare their notes; and (14) by incorporating counterfactual reasoning into prompt engineering, one can facilitate exploration of alternative possibilities and promote more nuanced and complex understanding of a given topic.
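For illustration only, the following minimal Python sketch (with hypothetical weights and 0-1 scales; Chang's exact parameterization is not reproduced here) shows an aggregated assessment computed, in the manner summarized for CRIT step #5, as a weighted combination of validation multiplied by credibility over arguments and counterarguments:

```python
# Minimal worked sketch of a CRIT-style step #5 aggregate: a weighted sum of
# (validation score x source credibility score) over the supporting reasons
# and the rival (counter) reasons. The weights, the subtraction of the rival
# term, and the 0-1 scales are illustrative assumptions.
def crit_aggregate(reasons, rivals, w_support=0.7, w_rival=0.3):
    """reasons/rivals: lists of (validation, credibility) pairs on a 0-1 scale."""
    support = sum(v * c for v, c in reasons) / max(len(reasons), 1)
    counter = sum(v * c for v, c in rivals) / max(len(rivals), 1)
    # Strong counterarguments pull the final assessment down.
    return w_support * support - w_rival * counter

score = crit_aggregate(
    reasons=[(0.9, 0.8), (0.7, 0.9)],   # (validation, source credibility)
    rivals=[(0.4, 0.6)],
)
print(round(score, 3))
```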
Chang also teaches in Section IV of Pages 5-8 with Tables II-IV that (1) prompt template engineering involves creating templates to provide input, or “prompts,” to a language model to guide its output generation; (2) there are three main design considerations when engineering a basic prompt: (a) Input style; (b) LLM capability; (c) Cost; (3) before submitting a template to an LLM, the application (e.g., a chatbot) that uses the template should check if all input slots are filled, and perform a sanity check; (4) regarding mapping a natural language input to a prompt template, existing techniques of knowledge representation and reasoning can be very helpful; (5) more specifically, ontology alignment and semantic parsing can help map an NL input to a structured representation of knowledge and infer implicit concepts and relationships; (6) these algorithms can be used to generate more precise and accurate prompts for LLMs, and to improve the effectiveness of the Socratic method in dialogue formulation; (7) the main purposes of conducting cross examination in a template are to validate the credibility of the information sources and to identify inconsistencies in the process; (8) in the context of template engineering, the goal is to formulate a productive dialogue that can be used to assess the reliability of an LLM’s output; (9) additionally, template engineering can be used to query an LLM for opposing views of its output, including sources and credibility, and then evaluate if a different perspective is strong; (10) if the original question is phrased in a positive tone, the prompt template can reformulate the question with a negative tone to elicit a contrasting viewpoint; (11) if the original answer came from a democratic left-leaning source, the prompt template may post the same question to a source of a republican right-leaning persuasion, and vice versa; (12) this approach allows for a more comprehensive examination of the topic by considering multiple perspectives; (13) the template to examine the semantic relation between two sentences S1 and S2 can be written as “<S1>, [R], <S2>,” where R is one of the three most important types of semantic relations: paraphrase, entailment, and contradiction; (14) two sentences that have the same meaning are called paraphrases of each other; (15) two sentences that have different meanings can be called disagreement or contradiction; and (16) the template can be trained to identify the degree of agreement (or disagreement) between two sentences.
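For illustration only, the following minimal Python sketch (with hypothetical template wording and a toy lexical stand-in for the LLM judgment) shows how the “<S1>, [R], <S2>” semantic-relation template described above could be filled and scored over the three relations:

```python
# Minimal sketch of the "<S1>, [R], <S2>" semantic-relation template, where R
# ranges over paraphrase, entailment, and contradiction. In Chang's setting an
# LLM would judge each filled template; here a toy word-overlap heuristic
# stands in so the sketch is self-contained.
RELATIONS = ("paraphrase", "entailment", "contradiction")

def fill_template(s1: str, relation: str, s2: str) -> str:
    return f'"{s1}" [{relation}] "{s2}"'

def classify_relation(s1: str, s2: str, llm) -> str:
    # Keep whichever relation the (stand-in) LLM rates highest.
    return max(RELATIONS, key=lambda r: llm(fill_template(s1, r, s2)))

def toy_llm(filled: str) -> float:
    # Toy stand-in: rate "paraphrase" highest when the sentences overlap most.
    s1, rest = filled.split('" [', 1)
    relation, s2 = rest.split('] "', 1)
    w1, w2 = set(s1.strip('"').split()), set(s2.strip('"').split())
    overlap = len(w1 & w2) / max(len(w1 | w2), 1)
    return overlap if relation == "paraphrase" else 1 - overlap

print(classify_relation("the movie was great", "the movie was great indeed", toy_llm))
```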
Song et al. ("A Unified Framework for Multi-intent Spoken Language Understanding with prompting", ARXIV ID: 2210.03337, Oct. 7, 2022, pp. 1-11) discloses in the Abstract that (1) jointly modeling Intent Detection (ID) and Slot Filling (SF) in spoken language understanding provides a channel to exploit the correlation between intents and slots; (2) however, current approaches are apt to formulate these two sub-tasks differently, which leads to two issues: a) it hinders models from effective extraction of shared features; b) pretty complicated structures are involved to enhance expression ability while causing damage to the interpretability of frameworks; (3) in this work, a Prompt-based Spoken Language Understanding (PromptSLU) framework is described to intuitively unify the two sub-tasks into the same form by offering a common pretrained Seq2Seq model; (4) in detail, ID and SF are completed by concisely filling the utterance into task-specific prompt templates as input, and sharing output formats of key-value pairs sequence; (5) furthermore, variable intents are predicted first, then naturally embedded into prompts to guide slot-value pairs inference from a semantic perspective; and (6) finally, inspired by prevalent multi-task learning, an auxiliary sub-task is introduced, which helps to learn relationships among provided labels.
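For illustration only, the following minimal Python sketch (with hypothetical template wording and a toy stand-in for the pretrained Seq2Seq model) shows the two-stage prompting flow described above, in which predicted intents are embedded into the slot-filling prompt and both sub-tasks share a key-value output format:

```python
# Minimal sketch of a PromptSLU-style flow per the Song et al. abstract: the
# utterance is filled into a task-specific template for intent detection, and
# the predicted intents are then embedded into the slot-filling prompt to
# guide slot-value inference. The template wording and fake_seq2seq are
# hypothetical stand-ins for the shared pretrained Seq2Seq model.
def fake_seq2seq(prompt: str) -> str:
    # Stand-in for the pretrained Seq2Seq model; emits key-value pair sequences.
    if prompt.startswith("List the intents"):
        return "intent: BookFlight; intent: GetWeather"
    return "departure: Boston; city: Denver"

def prompt_slu(utterance: str) -> dict:
    # Stage 1: intent detection via a task-specific template.
    id_prompt = f"List the intents of the utterance: {utterance}"
    intents = [kv.split(": ")[1] for kv in fake_seq2seq(id_prompt).split("; ")]
    # Stage 2: slot filling, with the predicted intents embedded in the prompt.
    sf_prompt = (f"Given intents {', '.join(intents)}, list the slot-value "
                 f"pairs of the utterance: {utterance}")
    slots = dict(kv.split(": ") for kv in fake_seq2seq(sf_prompt).split("; "))
    return {"intents": intents, "slots": slots}

print(prompt_slu("book a flight from Boston and tell me the weather in Denver"))
```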
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HWEI-MIN LU whose telephone number is (313)446-4913. The examiner can normally be reached Mon - Fri: 9:00 AM - 6:00 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Mariela D. Reyes can be reached at (571) 270-1006. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/HWEI-MIN LU/Primary Examiner, Art Unit 2142