Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1, 2, 4-7, 11, 12, 14, 16, 17, and 21-24 are pending.
Claims 1, 2, 4, 6, 11, 12, and 14 are amended.
Claims 3, 5, 8-10, 13, 15, and 18-20 are cancelled.
Claims 21-24 are added.
Response to Arguments
Applicant’s arguments with respect to Sections 102 and 103 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument. Inasmuch as the instant action continues to rely on U.S. Patent Application Publication No. 20230077338 to Mukund et al., that reliance is newly applied against new claim limitations not previously recited.
Applicant's arguments with respect to Section 101 have been fully considered but they are not persuasive. Applicant argues that the claims are directed to patent-eligible subject matter because the claims are directed to a method for operating an LLM that improves the operation of the LLM. Applicant argues that “the claim reflects the disclose[d] improvement”; however, as noted in the updated rejection, the claims do not recite clear improvements to the LLM. Rather, the claims recite steps, such as identifying unreliable content in the intermediate draft version using contextual analysis and removing the unreliable content from the intermediate draft version of the incident report to generate a reviewable draft version of the incident report, that are commonly performed by a human in the process of drafting and revising reports. This process is a human organized activity that is commonly performed by supervisors and subordinates as training or supervision before finalizing reports. In view of the foregoing, the arguments are not persuasive. The rejection has been updated to address the amended claims.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1, 2, 4-7, 11, 12, 14, 16, 17, and 21-24 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. Representative claim 1 recites “receiving one or more initial inputs from a user, the one or more initial inputs comprising unstructured text and structured data for an incident report of a specified event type; performing a semantic search over a reports data store based on the one or more initial inputs to identify one or more example incident reports of the specified event type, the performing comprising … selecting the one or more example incident reports …; constructing one or more prompts …, the one or more prompts comprising a general instruction that specifies a desired format and context, one or more few-shot exemplars that are generated based on the one or more example incident reports, and one or more target inputs that are generated based on the one or more initial inputs; … generate an intermediate draft version of the incident report; identifying hallucinatory content or unreliable content in the intermediate draft version using one or more of contextual analysis, heuristic based analysis, pattern-recognition, or blacklist checks; removing the hallucinatory content or unreliable content from the intermediate draft version of the incident report to generate a reviewable draft version of the incident report; presenting… the reviewable draft version of the incident report in a human-readable format; receiving, …, one or more user-provided modifications comprising at least one of free-form text, one or more revised structured inputs, or instructions for revisions; and sending, …, the one or more user-provided modifications and the reviewable draft version of the incident report … to generate an updated version of the incident report; …”. 
Therefore, the claim as a whole is directed to “Incident Report Writing”, which is an abstract idea because it is a method of organizing human activity, including legal interactions (including legal obligations and business relations); managing personal behavior or relationships or interactions between people, including following rules or instructions. “Incident Report Writing” is considered to be a method of organizing human activity because the writing and revising of incident reports is a legal requirement of police and insurance personnel for documenting interactions. The writing of such incident reports is generally based on formats specified by the organization and the training and knowledge of previously submitted incident reports in order to write and revise new incident reports based on best practices of the organization. As such, claim 1 is directed to organizing human activity.
This judicial exception is not integrated into a practical application. In particular, claim 1 recites the following additional element(s): the performing comprising generating one or more embedding representations of the one or more initial inputs and selecting the one or more example incident reports based on the one or more embedding representations; constructing one or more prompts for a large language model (LLM); sending the one or more prompts to the LLM to cause the LLM to generate an intermediate draft version of the incident report; presenting, via a graphical user interface (GUI), the reviewable draft version of the incident report in a human-readable format; and the method is performed by one or more computing devices. It is noted that these additional elements describe the use of technology (i.e., an LLM), but do not limit that technology beyond generic recitations of a class of technology or change how that technology is implemented. These additional elements individually or in combination do not integrate the exception into a practical application. The recitations of these additional elements amount to merely reciting the words “apply it” (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (see MPEP 2106.05(f)). That is, the claimed additional elements amount to limitations that require the use of commercially available LLM products such as ChatGPT. Nothing in the additional elements is understood to be addressing a technological problem or providing a technological solution. Rather, the additional elements recite using the general purpose LLM, which may include one of many commercially available off the shelf products, as a tool to perform the abstract idea.
That is, the claimed additional element limitations do no more than generally link the use of a judicial exception to a particular technological environment or field of use (see MPEP 2106.05(h)). Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. Claim 1 is directed to an abstract idea.
Claim 1 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements, individually and in combination, are merely being used to apply the abstract idea to a technological environment. As noted above, nothing in the additional elements is understood to be addressing a technological problem or providing a technological solution. Rather, the additional elements recite using the general purpose LLM, which may include one of many commercially available off the shelf products, as a tool to perform the abstract idea. Accordingly, claim 1 is ineligible.
Claim 11 recites substantially similar features to those recited in representative claim 1 and is ineligible for substantially the same reasons.
Dependent claims 2, 4-7, 12, 14, 16, 17, and 21-24 merely further limit the abstract idea and are thereby considered to be ineligible.
Dependent claims 2 and 12 further limit the abstract idea of “Incident Report Writing” by introducing the element of adjusting the one or more temperature hyperparameters of the LLM, which does not include an improvement to another technology or technical field, an improvement to the functioning of the computer itself, or meaningful limitations beyond generally linking the use of the abstract idea to a particular technological environment. Therefore, dependent claims 2 and 12 are also non-statutory subject matter.
Dependent claims 4 and 14 further limit the abstract idea of “Incident Report Writing” by introducing the element of receiving, via one or more input fields…., the structured data, which does not include an improvement to another technology or technical field, an improvement to the functioning of the computer itself, or meaningful limitations beyond generally linking the use of the abstract idea to a particular technological environment. Therefore, dependent claims 4 and 14 are also non-statutory subject matter.
Dependent claims 6 and 16 further limit the abstract idea of “Incident Report Writing” by introducing the element of selecting the one or more example incident reports based at least in part on a centroid or clustering analysis, which does not include an improvement to another technology or technical field, an improvement to the functioning of the computer itself, or meaningful limitations beyond generally linking the use of the abstract idea to a particular technological environment. Therefore, dependent claims 6 and 16 are also non-statutory subject matter.
Dependent claims 7 and 17 further limit the abstract idea of “Incident Report Writing” by introducing the element of the incident report comprises a police report, an inspection report, or an insurance report, which does not include an improvement to another technology or technical field, an improvement to the functioning of the computer itself, or meaningful limitations beyond generally linking the use of the abstract idea to a particular technological environment. Therefore, dependent claims 7 and 17 are also non-statutory subject matter.
Dependent claims 21 and 23 further limit the abstract idea of “Incident Report Writing” by introducing the element of adjusting the one or more temperature hyperparameters of the LLM causes the LLM to generate multiple responses, and employ self-voting to select the updated version of the incident report from the multiple responses based on the updated version being a most similar response of the multiple responses to other responses of the multiple responses, where the most similar response is based on a distance between the most similar response and the other responses, which does not include an improvement to another technology or technical field, an improvement to the functioning of the computer itself, or meaningful limitations beyond generally linking the use of the abstract idea to a particular technological environment. That is, the use of self-voting or self-assessment is a common tool built into commercially available LLM products. Therefore, dependent claims 21 and 23 are also non-statutory subject matter.
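For illustration, the distance-based self-voting recited in claims 21 and 23 can be sketched as follows. This is a minimal sketch under stated assumptions, not the Applicant's implementation: the function name `select_by_self_voting`, the choice of Euclidean distance, and the toy two-dimensional embeddings are hypothetical, and the claims do not specify a particular distance metric or embedding model.

```python
import math
from typing import List, Sequence


def select_by_self_voting(embeddings: Sequence[Sequence[float]]) -> int:
    """Return the index of the response closest, in total, to all other responses.

    Assumption: each response has already been converted to a numeric
    embedding vector by some unspecified model.
    """
    if not embeddings:
        raise ValueError("at least one response embedding is required")
    totals: List[float] = []
    for i, candidate in enumerate(embeddings):
        # Sum of Euclidean distances from this candidate to every other response.
        total = sum(
            math.dist(candidate, other)
            for j, other in enumerate(embeddings)
            if j != i
        )
        totals.append(total)
    # The "most similar" response is the one with the smallest summed distance.
    return min(range(len(totals)), key=totals.__getitem__)


# Hypothetical embeddings for four candidate responses; the last is an outlier.
responses = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3], [4.0, 4.0]]
best_index = select_by_self_voting(responses)
```

With these toy vectors the outlier accumulates the largest summed distance, so one of the three clustered responses is selected.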
Dependent claims 22 and 24 further limit the abstract idea of “Incident Report Writing” by introducing the element of causing the LLM to generate a subsequent updated version of the incident report based on additional modifications and one or more prior updated versions of the incident report; and adjusting the one or more temperature hyperparameters of the LLM comprises reducing the one or more temperature hyperparameters between the one or more prior updated versions and the subsequent updated version of the incident report, which does not include an improvement to another technology or technical field, an improvement to the functioning of the computer itself, or meaningful limitations beyond generally linking the use of the abstract idea to a particular technological environment. That is, the use of temperature hyperparameters to adjust the randomness or creativity in generated data is a common tool built into commercially available LLM products. Therefore, dependent claims 22 and 24 are also non-statutory subject matter.
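For illustration, reducing a temperature hyperparameter between successive revisions, as recited in claims 22 and 24, might look like the sketch below. The geometric decay schedule, the default values, the floor, and the name `temperature_for_revision` are assumptions introduced here; the claims require only that the temperature be reduced between the prior and subsequent versions.

```python
def temperature_for_revision(
    revision: int,
    initial: float = 0.8,
    decay: float = 0.5,
    floor: float = 0.1,
) -> float:
    """Return a sampling temperature that shrinks with each revision round.

    Assumption: revision 0 is the first updated version; every later
    revision multiplies the temperature by `decay`, never going below `floor`.
    """
    if revision < 0:
        raise ValueError("revision must be non-negative")
    return max(floor, initial * decay ** revision)


# Temperatures for the first four revision rounds under the assumed schedule.
schedule = [temperature_for_revision(r) for r in range(4)]
```

A lower temperature makes sampling less random, so later revisions under this assumed schedule would vary less from one generation to the next.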
Dependent claims 2, 4-7, 12, 14, 16, 17, and 21-24 also do not integrate the abstract idea into a practical application. The dependent claims recite adjusting one or more temperature hyperparameters of the LLM, receiving data via one or more input fields of the GUI, and sending and receiving data from an LLM. These additional elements merely generally link the abstract idea to a particular technological environment or field of use. MPEP 2106.04(d)(I) indicates that generally linking an abstract idea to a particular technological environment or field of use cannot provide a practical application. Accordingly, even in combination, these additional elements do not integrate the abstract idea into a practical application. This has been re-evaluated under the “significantly more” analysis and has also been found insufficient to provide significantly more. MPEP 2106.05(A) indicates that generally linking an abstract idea to a particular technological environment or field of use cannot provide significantly more. That is, the claims provide no practical limits or improvements to any technology, but rather indicate the use of commercially available technology to perform the abstract idea. Accordingly, dependent claims 2, 4-7, 12, 14, 16, 17, and 21-24 are also ineligible.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 3, 4, 7, 10, 11, 13, 14, 17, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Patent Application Publication No. 20230077338 to Mukund et al. in view of U.S. Patent Application Publication No. 20250307572 to Shtar et al.
With regards to claims 1 and 11, Mukund et al. teaches
receiving one or more initial inputs from a user, the one or more initial inputs comprising
unstructured text and structured data for an incident report of a specified event type, wherein each of the one or more initial inputs comprise multiple different initial inputs sections (paragraph [0029], “As the user provides inputs and makes selections as prompted by the presented (200) interface, the system may receive (202) structured inputs (e.g., menu selections, radio button selections, or selections of other pre-configured inputs), and may receive (204) unstructured inputs (e.g., free-form text descriptions and text strings).”);
separating the one or more initial inputs into multiple different initial input sections: for each initial input section (paragraph [0029], “As the user provides inputs and makes selections as prompted by the presented (200) interface, the system may receive (202) structured inputs (e.g., menu selections, radio button selections, or selections of other pre-configured inputs), and may receive (204) unstructured inputs (e.g., free-form text descriptions and text strings).”; paragraph [0030], “In some implementations, these inputs may be received (202, 204) by the system in real-time as they are entered into the form by the user, and prior to submission of the form (e.g., prior to the user clicking a submit button as illustrated in FIG. 6 ), while in other implementations these inputs may be received (202, 204) by the system after the form is completed and submitted by the user (e.g., by clicking a submit button as illustrated in FIG. 6 ).”), performing iterations of
performing a semantic search over a reports data store based on the one or more initial inputs to identify one or more example incident reports of the specified event type (paragraph [0032], “Factors considered by the analytical function (e.g., expert function, NLP function, AI function) will vary based on the one or more types of functions used during analysis (206) in a particular implementation. With respect to quality, factors may include for example ….similarity to other high quality descriptions (e.g., based upon an AI function analysis of the incident description, where the AI function has been configured or trained based upon manually curated and annotated datasets that exhibit positive and negative aspects of high quality descriptions), and other factors.”; paragraph [0051], “By reducing the several paragraph long description to a minimized set of core components, the system is able to rapidly analyze (206) those core components using additional NLP functions, or a machine learning or other AI function to identify patterns within core components, or similarities to incidents in historic incident datasets, which enables the system to rapidly provide feedback and suggestions, as has been described.”), …..;
constructing one or more prompts for a large language model (LLM), the one or more prompts comprising a general instruction that specifies a desired format and context (paragraph [0030], “In some implementations, these inputs may be received (202, 204) by the system in real-time as they are entered into the form by the user, and prior to submission of the form (e.g., prior to the user clicking a submit button as illustrated in FIG. 6 ), while in other implementations these inputs may be received (202, 204) by the system after the form is completed and submitted by the user (e.g., by clicking a submit button as illustrated in FIG. 6 ).”), one or more few-shot exemplars that are generated based on the one or more example incident reports (paragraph [0025], “The system may be configured to receive (120) historic incident datasets from one or several sources (102, 104), and may store those received (120) datasets for use in analysis of current incident datasets, which in varying implementations may include search and comparison of historic incidents, production of analytical models or other functions based on historic incidents, or both.”), and one or more target inputs that are generated based on the one or more initial inputs (paragraph [0034], “Based upon an analysis (206), the system may then determine and display (208) a quality score or other indication of incident description quality via the presented (200) interface, may display one or more suggestions related to the description's quality or completeness, or both. As an example, a displayed (208) quality score may be a scaled numerical description (e.g., between 1 and 100, between 1 and 6, etc.) that is based upon analysis (206) of the unstructured input's quality.”);
sending the one or more prompts to the LLM to cause the LLM to generate an intermediate draft version of the incident report (claim 1, “cause an incident submission interface to display on a display of a user device, and receive a set of partial inputs via the incident submission interface, wherein the set of partial inputs comprises a provisional incident description that is received as unstructured data; (ii) analyze the provisional incident description to determine a quality score that indicates a level of descriptiveness and, where the quality score is less than a maximal quality score, display one or more suggested changes via the incident submission interface;”; paragraph [0035], “When displaying (208) suggestions related to description quality, the system may provide pre-configured feedback based on the quality determination. Continuing the prior example, where the system determines that the description includes 30 meaningful words and falls within a “moderate quality” range for length, and also includes 10 noun modifiers and falls within a “moderate quality” range for descriptiveness, the system may display (208) suggestions such as “Your description is too short, please add 2-3 more sentences,” or “Your description isn't very detailed, please add adjectives and adverbs to modify the nouns.”);
….;
presenting via a graphical user interface (GUI), the reviewable draft version of the incident report in a human-readable format (paragraph [0028], “In some implementations, this may include providing feedback on the incident dataset while the user is composing the event description so that the user can see when their event description is low quality or missing key information and make any corrections.”; paragraph [0035], “When displaying (208) suggestions related to description quality, the system may provide pre-configured feedback based on the quality determination. Continuing the prior example, where the system determines that the description includes 30 meaningful words and falls within a “moderate quality” range for length, and also includes 10 noun modifiers and falls within a “moderate quality” range for descriptiveness, the system may display (208) suggestions such as “Your description is too short, please add 2-3 more sentences,” or “Your description isn't very detailed, please add adjectives and adverbs to modify the nouns.””);
receiving, via the GUI, one or more user-provided modifications comprising at least one of free-form text, one or more revised structured inputs, or instructions for revisions (paragraph [0028], “In some implementations, this may include providing feedback on the incident dataset while the user is composing the event description so that the user can see when their event description is low quality or missing key information and make any corrections.”; paragraph [0037], “In some implementations, the system may be configured to prevent submission of the incident dataset until requirements for quality and or completion (210) are met. Continuing the above example, the system may prevent submission of the incident dataset until the text description is revised to indicate the height from which the fall occurred. Where the system determines that a description is not complete (210), the system may prompt (212) the user for additional structured and/or unstructured inputs by providing suggestions, as described above, and/or additional structured input elements, such as by causing the presented (200) interface to display a new structured input element that the user may interact with to specify the height of the fall.”); and
sending, to the LLM, the one or more user-provided modifications and the reviewable draft version of the incident report to cause the LLM to generate an updated version of the incident report (paragraph [0037], “In some implementations, the system may be configured to prevent submission of the incident dataset until requirements for quality and or completion (210) are met. Continuing the above example, the system may prevent submission of the incident dataset until the text description is revised to indicate the height from which the fall occurred. Where the system determines that a description is not complete (210), the system may prompt (212) the user for additional structured and/or unstructured inputs by providing suggestions, as described above, and/or additional structured input elements, such as by causing the presented (200) interface to display a new structured input element that the user may interact with to specify the height of the fall.”);
wherein the method is performed by one or more computing devices (paragraphs [0018], “The incident server (100) may include one or more physical, virtual, cloud, or other servers or computing environments, with each server comprising one or more processors, memories, communication devices, user interface devices, and other components as may be useful in receiving, transmitting, storing, modifying, analyzing, and otherwise processing data. When described herein, other computing devices should be understood to include some or all of the preceding components described in the context of servers.”).
Mukund et al. teaches identifying missing content in reports, but fails to explicitly teach generating embedding representations for similarity searching and identifying hallucinatory content. However, Shtar et al. teaches
the performing comprising generating one or more embedding representations of the one or more initial inputs and selecting the one or more example incident reports based on the one or more embedding representations (paragraph [0005], “creating embedding representations of the responses; calculating, based on the embedding representations, a degree of semantic similarity between a response of the responses that is in the language associated with the user query and a different response of the responses that is in a different language than the language associated with the user query;”),
identifying hallucinatory content or unreliable content in the intermediate draft version using one or more of contextual analysis, heuristic based analysis, pattern-recognition, or blacklist checks (paragraph [0020], “Certain embodiments provide that the results of determining a degree of similarity between responses may be used to determine whether or not one or more of the responses contains a hallucination. As discussed above, accurate responses (i.e., responses that do not contain hallucinations) are generally semantically consistent across multiple languages. In other words, non-hallucinatory responses generally convey the same semantic meaning regardless of the language in which the responses are generated. Responses with hallucinations, however, generally exhibit a large amount of variance compared to responses to the same query generated in different languages”); and
removing the hallucinatory content or unreliable content from the intermediate draft version of the incident report to generate a reviewable draft version of the incident report (paragraph [0021], “In certain embodiments, the results of determining a degree of similarity between responses may be used to determine which languages to include in the set of languages. For example, a clustering algorithm may be used to determine that a particular language is more prone to causing hallucinations than other languages and/or that the particular language otherwise resulted in an outlier result in at least one case. A language may be prone to causing hallucinations because, for example, the corpus corresponding to the language is deficient. A languages that is prone to causing hallucinations may result in high degrees of variance between embeddings even when responses in other languages do not contain hallucinations. Thus, excluding the particular language from the set of languages may allow for more accurate hallucination detection (i.e., reduction of false positives)”).
This part of Shtar et al. is applicable to the system of Mukund et al. as they both share characteristics and capabilities, namely, they are directed to providing LLMs for processing user requests. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Mukund et al. to include the embedding representations and hyperparameter modifications as taught by Shtar et al. One of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to modify Mukund et al. in order to prevent hallucinations, which involves manually detecting hallucinations and then modifying and/or re-training language models to reduce and/or eliminate hallucinations (see paragraph [0003] of Shtar et al.).
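For illustration, selecting example incident reports based on embedding representations, as recited in claims 1 and 11, can be sketched as a nearest-neighbor lookup. This is a minimal sketch under stated assumptions and is not taken from either cited reference: the function names, the use of cosine similarity, the report identifiers, and the toy three-dimensional vectors are all hypothetical, and the embeddings are assumed to have been produced by some unspecified model.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_example_reports(
    query_embedding: Sequence[float],
    report_embeddings: Dict[str, Sequence[float]],
    k: int = 2,
) -> List[str]:
    """Return the ids of the k stored reports most similar to the query."""
    ranked = sorted(
        report_embeddings,
        key=lambda rid: cosine_similarity(query_embedding, report_embeddings[rid]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical reports data store: report id -> precomputed embedding.
store = {
    "report_a": [0.9, 0.1, 0.0],
    "report_b": [0.0, 1.0, 0.0],
    "report_c": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]  # hypothetical embedding of the user's initial inputs
examples = select_example_reports(query, store, k=2)
```

The selected identifiers would then be used to fetch the reports that serve as few-shot exemplars in the prompt.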
With regards to claims 2, 11, and 12, Mukund et al. teaches the use of large language models including pre-configured expert function, NLP function, other artificial intelligence function, or other analytical function (see paragraph [0031]), but fails to explicitly teach adjusting temperature hyperparameters. However, Shtar et al. teaches adjusting one or more temperature hyperparameters of the LLM, wherein one or more temperature hyperparameters of the LLM are adjusted across iterations to improve accuracy of subsequent versions (paragraph [0022], “According to certain embodiments, the language model may be re-trained or fine-tuned based on a detected hallucination. For example, the language model may be an LLM, and one or more parameters of the LLM may be adjusted based on the detected hallucination. For instance, the temperature (a parameter that determines how much risk the LLM takes in generating content) of the LLM may be adjusted. As another example, the language model may be retrained using other machine learning techniques, such as supervised or semi-supervised learning.”; paragraph [0038], “Hallucination response engine 140 may include indication engine 215. Indication engine 215 may comprise one or more processors configured to provide a user with an indication that a hallucination has occurred in a response 205 (e.g., via user interface 104 of FIG. 1 ). For example, a response that has been determined to contain a hallucination may be provided to the user, along with the indication that the response contains a hallucination. The indication may comprise, for example, a warning message displayed on a user interface.”; paragraph [0039], “Language model training engine 220 may also re-train language model 110, such as through a supervised, unsupervised, semi-supervised, and/or “few shot” learning process based on one or more detected hallucinations (e.g., using a detected hallucination as a negative training example and/or using updated training data generated and/or received based on a detected hallucination, such as based on input from a user).”).
This part of Shtar et al. is applicable to the system of Mukund et al. as they both share characteristics and capabilities, namely, they are directed to providing LLMs for processing user requests. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Mukund et al. to include the embedding representations and hyperparameter modifications as taught by Shtar et al. One of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to modify Mukund et al. in order to prevent hallucinations, which involves manually detecting hallucinations and then modifying and/or re-training language models to reduce and/or eliminate hallucinations (see paragraph [0003] of Shtar et al.).
With regards to claims 4 and 14, Mukund et al. teaches receiving, via one or more input fields of the GUI, the structured data (paragraph [0029], “As the user provides inputs and makes selections as prompted by the presented (200) interface, the system may receive (202) structured inputs (e.g., menu selections, radio button selections, or selections of other pre-configured inputs), and may receive (204) unstructured inputs (e.g., free-form text descriptions and text strings).”).
With regards to claims 6 and 16, Mukund et al. teaches performing natural language processing tasks including the similarity to historic incident descriptions to other high quality descriptions (paragraphs [0031]-[0032]), but fails to explicitly teach a centroid or clustering analysis. However, Shtar et al. teaches
selecting the one or more example incident reports based at least in part on a centroid or clustering analysis (paragraph [0019], “For example, the semantic similarity may be determined by calculating the average distance (e.g., Euclidean distance) between pairs of embeddings and/or the standard deviation of the distances between pairs of embeddings. In some embodiments, a clustering algorithm is applied to the embeddings to determine the semantic similarity. In one example, the embedding of the response in the target language (e.g., the language in which the user query was submitted) is compared to each of the other embeddings (e.g., corresponding to responses in other languages) in order to determine a set of distances (e.g., Euclidean distance), and the set of distances may be averaged and/or otherwise aggregated to determine a degree of similarity between the response in the target language and the responses in the other languages.”; paragraph [0023], “Certain embodiments provide that the indication includes a cluster map generated by applying a clustering algorithm to embeddings of responses. The indication may include suggestions for improving the query in order to reduce or eliminate the hallucinations.”; paragraph [0036], “For example, response embeddings 210 may be compared by evaluating the average distance (e.g., based on cosine similarity and/or other Euclidean distance determination) or standard deviation of the distance between pairs of embeddings within response embeddings 210 and/or an average or other aggregation of the respective distance or other similarity measure between the response embedding 210 corresponding to the target language (e.g., the language of user query 200) and each other embedding 210. In some embodiments, embedding comparison engine 130 may apply a clustering algorithm to the response embeddings 210. For example, k-means clustering may be applied to the response embeddings 210. Other techniques for comparing embedding similarity known in the art may be used as well.”).
This part of Shtar et al. is applicable to the system of Mukund et al. because both share characteristics and capabilities; namely, both are directed to providing LLMs for processing user requests. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Mukund et al. to include the cluster analysis for semantic searching as taught by Shtar et al. One of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to modify Mukund et al. in order to prevent hallucinations, which otherwise involves manually detecting hallucinations and then modifying and/or re-training language models to reduce and/or eliminate hallucinations (see paragraph [0003] of Shtar et al.).
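For illustration only, the kind of centroid analysis over embeddings discussed with respect to claims 6 and 16 may be sketched as follows. This is a minimal sketch under stated assumptions and is not code from Mukund et al., Shtar et al., or the claims: the function names (centroid, euclidean, select_examples), the two-dimensional sample embeddings, and the choice of selecting the k reports nearest the centroid are all hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_examples(embeddings, k):
    """Return the indices of the k embeddings closest to the centroid.

    Assumption: each example incident report has already been mapped to an
    embedding vector by some embedding model (not shown here).
    """
    center = centroid(embeddings)
    ranked = sorted(
        range(len(embeddings)),
        key=lambda i: euclidean(embeddings[i], center),
    )
    return ranked[:k]


# Hypothetical embeddings: three similar reports and one outlier.
report_embeddings = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [10.0, 10.0]]
selected = select_examples(report_embeddings, 2)
print(selected)
```

In this sketch the outlier at index 3 is ranked last, so it would be selected only if k equals the number of reports. A clustering algorithm such as k-means could be substituted for the single centroid without changing the overall idea.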
With regards to claims 7 and 17, Mukund et al. teaches the incident report comprises a police report, an inspection report, or an insurance report (paragraph [0027], “The system may also perform (128) one or several risk assessments based on the incident dataset, which may include providing information and context for comparable incidents contained in the historic incident datasets, and may include identifying undiscovered underlying risk factors that may contribute to future incidents.”; paragraph [0028], “Because of this strong correlation, it is advantageous for an incident management system to perform real-time analysis of the incoming event descriptions so that they may be corrected or updated by the submitting user prior to submission, or immediately after submission. In some implementations, this may include providing feedback on the incident dataset while the user is composing the event description so that the user can see when their event description is low quality or missing key information and make any corrections.”; where the incident reports are inspections of risks associated with safety events).
With regards to claims 21 and 23, Mukund et al. teaches the use of large language models including pre-configured expert function, NLP function, other artificial intelligence function, or other analytical function (see paragraph [0031]), but fails to explicitly teach adjusting temperature hyperparameters. However, Shtar et al. teaches adjusting the one or more temperature hyperparameters of the LLM causes the LLM to generate multiple responses (paragraph [0022], “According to certain embodiments, the language model may be re-trained or fine-tuned based on a detected hallucination. For example, the language model may be an LLM, and one or more parameters of the LLM may be adjusted based on the detected hallucination. For instance, the temperature (a parameter that determines how much risk the LLM takes in generating content) of the LLM may be adjusted. As another example, the language model may be retrained using other machine learning techniques, such as supervised or semi-supervised learning.”; paragraph [0038], “Hallucination response engine 140 may include indication engine 215. Indication engine 215 may comprise one or more processors configured to provide a user with an indication that a hallucination has occurred in a response 205 (e.g., via user interface 104 of FIG. 1). For example, a response that has been determined to contain a hallucination may be provided to the user, along with the indication that the response contains a hallucination. The indication may comprise, for example, a warning message displayed on a user interface.”; paragraph [0039], “Language model training engine 220 may also re-train language model 110, such as through a supervised, unsupervised, semi-supervised, and/or “few shot” learning process based on one or more detected hallucinations (e.g., using a detected hallucination as a negative training example and/or using updated training data generated and/or received based on a detected hallucination, such as based on input from a user).”), and
employ self-voting to select the updated version of the incident report from the multiple responses based on the updated version being a most similar response of the multiple responses to other responses of the multiple responses (paragraph [0035], “Responses 205 may be provided to embedding generator 120. Embedding generator 120 may comprise an embedding model, such as a neural network or other type of machine learning model that learns a representation (embedding) for an entity through a training process that trains the neural network based on a data set, such as a plurality of features of a plurality of entities. As discussed above, embeddings generally refer to a vector representation of an entity that represents the entity as a vector in n-dimensional space such that similar entities are represented by vectors that are close to one another in the n-dimensional space.”), where the most similar response is based on a distance between the most similar response and the other responses (paragraph [0036], “Embedding comparison engine 130 may comprise one or more processors that are configured to compare response embeddings 210 to determine the degree of semantic similarity between the responses 205. For example, response embeddings 210 may be compared by evaluating the average distance (e.g., based on cosine similarity and/or other Euclidean distance determination) or standard deviation of the distance between pairs of embeddings within response embeddings 210 and/or an average or other aggregation of the respective distance or other similarity measure between the response embedding 210 corresponding to the target language (e.g., the language of user query 200) and each other embedding 210. In some embodiments, embedding comparison engine 130 may apply a clustering algorithm to the response embeddings 210. For example, k-means clustering may be applied to the response embeddings 210. Other techniques for comparing embedding similarity known in the art may be used as well.”).
This part of Shtar et al. is applicable to the system of Mukund et al. because both share characteristics and capabilities; namely, both are directed to providing LLMs for processing user requests. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Mukund et al. to include the embedding representations and hyperparameter modifications as taught by Shtar et al. One of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to modify Mukund et al. in order to prevent hallucinations, which otherwise involves manually detecting hallucinations and then modifying and/or re-training language models to reduce and/or eliminate hallucinations (see paragraph [0003] of Shtar et al.).
With regards to claims 22 and 24, Mukund et al. teaches causing the LLM to generate a subsequent updated version of the incident report based on additional modifications and one or more prior updated versions of the incident report (claim 1, “cause an incident submission interface to display on a display of a user device, and receive a set of partial inputs via the incident submission interface, wherein the set of partial inputs comprises a provisional incident description that is received as unstructured data; (ii) analyze the provisional incident description to determine a quality score that indicates a level of descriptiveness and, where the quality score is less than a maximal quality score, display one or more suggested changes via the incident submission interface;”; paragraph [0035], “When displaying (208) suggestions related to description quality, the system may provide pre-configured feedback based on the quality determination. Continuing the prior example, where the system determines that the description includes 30 meaningful words and falls within a “moderate quality” range for length, and also includes 10 noun modifiers and falls within a “moderate quality” range for descriptiveness, the system may display (208) suggestions such as “Your description is too short, please add 2-3 more sentences,” or “Your description isn't very detailed, please add adjectives and adverbs to modify the nouns.”); but fails to explicitly teach adjusting temperature hyperparameters. However, Shtar et al. teaches
adjusting the one or more temperature hyperparameters of the LLM comprises reducing the one or more temperature hyperparameters between the one or more prior updated versions and the subsequent updated version of the incident report (paragraph [0022], “According to certain embodiments, the language model may be re-trained or fine-tuned based on a detected hallucination. For example, the language model may be an LLM, and one or more parameters of the LLM may be adjusted based on the detected hallucination. For instance, the temperature (a parameter that determines how much risk the LLM takes in generating content) of the LLM may be adjusted. As another example, the language model may be retrained using other machine learning techniques, such as supervised or semi-supervised learning.”; paragraph [0038], “Hallucination response engine 140 may include indication engine 215. Indication engine 215 may comprise one or more processors configured to provide a user with an indication that a hallucination has occurred in a response 205 (e.g., via user interface 104 of FIG. 1 ). For example, a response that has been determined to contain a hallucination may be provided to the user, along with the indication that the response contains a hallucination. The indication may comprise, for example, a warning message displayed on a user interface.”; paragraph [0039], “Language model training engine 220 may also re-train language model 110, such as through a supervised, unsupervised, semi-supervised, and/or “few shot” learning process based on one or more detected hallucinations (e.g., using a detected hallucination as a negative training example and/or using updated training data generated and/or received based on a detected hallucination, such as based on input from a user).”),
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
“Improving Diversity of Demographic Representation in Large Language Models via Collective-Critiques and Self-Voting” by Preethi Lahoti et al. discusses in-context reasoning, self-critique and revision used to improve model responses on a variety of tasks, including a new technique called collective-critique and self-voting.
“The potential and pitfalls of using a large language model such as ChatGPT, GPT-4, or LLaMA as a clinical assistant” by Zhang J, Sun, et al. discusses the use of large language models for developing narrative summaries including temperature settings and their effects on output.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Joshua D Schneider whose telephone number is (571)270-7120. The examiner can normally be reached on Monday - Friday, 9am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jessica Lemieux, can be reached on (571)270-3445. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/J.D.S./Examiner, Art Unit 3626
/ASFAND M SHEIKH/Primary Examiner, Art Unit 3626