Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1:
Claim 1 is directed to a process, which is a statutory category of invention. (Step 1: YES)
Step 2A, Prong One:
Claim 1 recites:
A computer-implemented method for generating a context-enriched response, the method comprising:
generating additional context for a prompt input based on a context input (generating additional context for a prompt (words) encompasses mental observations or evaluations that can practically be performed in the human mind by a human observing the prompt and the context input (e.g., an image) and mentally generating additional context);
combining the additional context with the prompt input to generate a context-enriched prompt (combining the additional context with the prompt encompasses mental observations or evaluations practically performed in the human mind by a human mentally combining the context and prompt or writing them with a pen and paper); and
executing one or more generative machine learning (ML) models on the context-enriched prompt to generate the context-enriched response (setting aside the recitation of generative machine learning models, generating a context-enriched response encompasses mental observations or evaluations practically performed in the human mind by a human mentally determining a response).
Thus, under its broadest reasonable interpretation, claim 1 as set forth above encompasses a series of mental steps that can practically be performed in the human mind. Claim 1 therefore recites an abstract idea (Step 2A, Prong One: YES).
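(For orientation only, the following minimal Python sketch illustrates the kind of three-step flow the claim recites; every function name and value is a hypothetical stand-in supplied for illustration and is not drawn from the claims, the specification, or the cited art.)

def generate_additional_context(context_input: str) -> str:
    # Stand-in for any context generator (e.g., a model that describes an image).
    return f"Context: {context_input}"

def combine(additional_context: str, prompt_input: str) -> str:
    # Combine the additional context with the prompt input into a context-enriched prompt.
    return additional_context + "\n" + prompt_input

def execute_generative_model(context_enriched_prompt: str) -> str:
    # Stand-in for executing one or more generative ML models on the enriched prompt.
    return f"[model response to: {context_enriched_prompt}]"

if __name__ == "__main__":
    prompt_input = "What breed is the dog?"
    context_input = "a photo of a golden retriever on a beach"
    enriched = combine(generate_additional_context(context_input), prompt_input)
    print(execute_generative_model(enriched))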
Step 2A, Prong Two:
Claim 1 recites two additional elements beyond the judicial exception: the “computer” recited in the preamble and the “one or more generative machine learning (ML) models”. The computer is recited in the preamble at a high level of generality and amounts to no more than mere instructions to apply the exception using a computer. The step of “executing one or more generative machine learning (ML) models” on the context-enriched prompt likewise provides nothing more than mere instructions to implement the abstract idea using a generic computer. MPEP 2106.05(f) provides the following considerations for determining whether a claim simply recites a judicial exception with the words “apply it” (or an equivalent), such as mere instructions to implement an abstract idea on a computer: (1) whether the claim recites only the idea of a solution or outcome, i.e., the claim fails to recite details of how a solution to a problem is accomplished; (2) whether the claim invokes computers or other machinery merely as a tool to perform an existing process; and (3) the particularity or generality of the application of the judicial exception. In this case, the claim merely indicates to “execute” machine learning models without any details as to how the models function. The machine learning models merely act as a generic tool to perform the otherwise abstract idea of generating a response to the context-enriched prompt.
Even when viewed in combination, the two additional elements discussed above do not integrate the judicial exception into a practical application because they amount to no more than mere instructions to implement the abstract idea on a generic computer using generic machine learning models (Step 2A, Prong Two: NO). Claim 1 is therefore directed to the judicial exception (Step 2A: YES).
Step 2B:
As discussed above, the additional elements amount to no more than mere instructions to implement the abstract idea on a generic computer using generic machine learning models, even when considered in combination. The additional elements therefore do not provide an inventive concept. (Step 2B: NO)
Claim 2 merely specifies that the context input is an image. The analysis above applies equally to claim 2.
Claim 3 requires “causing a generative ML model to generate a description of the first portion of the image”. This additional element amounts to no more than mere instructions to implement the abstract idea on a generic computer using generic ML models, for reasoning similar to that applied to claim 1 above. Even when considered in combination, this additional element does not integrate the judicial exception into a practical application or amount to significantly more than the judicial exception.
Claim 4 merely requires determining a first set of annotations for the image, which can practically be performed in the human mind.
Claim 5 merely requires identifying an object in an image and generating data corresponding to the object, which can practically be performed in the human mind.
Claim 6 merely requires that the additional context and the prompt be text and that the additional context and the prompt be concatenated together, which can practically be performed in the human mind or with pen and paper.
Claim 7 merely requires receiving a compound prompt, which can practically be performed in the human mind.
Claim 8 merely specifies that the compound prompt comprises multimodal input, which does not change the analysis above with respect to claim 7.
Claim 9 merely specifies that the context comes from a “domain catalog” corresponding to a “domain of knowledge”. The claim does not specify where these “domains” are located, and they could practically exist in the human mind. Claim 9 therefore does not include any additional elements.
Claim 10 merely requires the additional context to be associated with a prompt history. A prompt history could also practically exist in the human mind. Claim 10 therefore does not include any additional elements.
Claims 11-19 are directed to non-transitory computer-readable media comprising instructions to perform the steps of claims 1-9, respectively. The “one or more processors” recited in the preamble of claim 11 amounts to no more than mere instructions to implement the abstract idea using a generic computer. Claims 11-19 are therefore rejected for the same reasons as claims 1-9.
Claim 20 is directed to a system, comprising one or more memories, where one or more processors execute instructions to perform the method of claim 1. These generic additional computer elements do not change the analysis applied to claim 1. See also MPEP 2106.04(a)(2) (“both product claims (e.g., computer system, computer-readable medium, etc.) and process claims may recite mental processes.”)
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claim 19 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 19 recites the limitation "the first domain of knowledge" in lines 2-3 of the claim. There is insufficient antecedent basis for this limitation in the claim.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1-8 and 10-20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Siegenthaler et al. (U.S. Patent Application Pub. No. 2025/0061146, hereinafter “Siegenthaler”).
In regard to claim 1, Siegenthaler discloses a computer-implemented method for generating a context-enriched response (Fig. 4, 400), the method comprising:
generating additional context for a prompt input based on a context input (a query comprising an input text and an input image is received, paragraph [0076]);
combining the additional context with the prompt input to generate a context-enriched prompt (the system generates an input prompt for an LLM from the text query and natural language descriptors of the input image, paragraph [0086]); and
executing one or more generative machine learning (ML) models on the context-enriched prompt to generate the context-enriched response (the system generates a response to the query by providing the input prompt to a large language model (LLM), paragraph [0087]).
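(For illustration only, the following Python sketch loosely paraphrases the flow described in the cited passages, in which natural language descriptors of an input image are combined with a text query into an input prompt for an LLM; the function names, prompt template, and sample values are assumptions made for this sketch and are not Siegenthaler's implementation.)

def describe_image(image_bytes: bytes) -> list[str]:
    # Stand-in for an object/entity detection model producing natural language
    # descriptors of the input image.
    return ["a golden retriever", "a sandy beach"]

def build_input_prompt(text_query: str, descriptors: list[str]) -> str:
    # Generate an input prompt for an LLM from the text query and the image descriptors.
    return "Image contains: " + ", ".join(descriptors) + "\nQuestion: " + text_query

def answer_query(text_query: str, image_bytes: bytes) -> str:
    prompt = build_input_prompt(text_query, describe_image(image_bytes))
    # Stand-in for providing the input prompt to a large language model (LLM).
    return f"[LLM response to: {prompt}]"

print(answer_query("What breed is the dog?", b""))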
In regard to claim 2, Siegenthaler discloses the context input comprises a first portion of an image (objects/entities within the image, paragraph [0079]).
In regard to claim 3, Siegenthaler discloses generating the additional context comprises causing a generative ML model to generate a description of the first portion of the image (an object/entity detection model generates text identifying objects/entities in the image, paragraphs [0079-0080]).
In regard to claim 4, Siegenthaler discloses generating the additional context comprises determining a first set of annotations corresponding to the first portion of the image (one or more texts extracted from the image, paragraph [0079]).
In regard to claim 5, Siegenthaler discloses generating the additional context comprises:
identifying a first object within the first portion of the image (objects/entities within the image, paragraph [0079]); and
generating a first set of data corresponding to the first object (the contextual information is provided as structured data, see Abstract and paragraph [0055]).
In regard to claim 6, Siegenthaler discloses the additional context comprises a first portion of text (natural language descriptions of the image, paragraph [0045]), the prompt input comprises a second portion of text (input text query, paragraph [0045]), and combining the additional context with the prompt input comprises concatenating the first portion of text and the second portion of text (a natural language prompt comprising the contextual information and query text, paragraph [0046]).
In regard to claim 7, Siegenthaler discloses receiving a compound prompt that includes the prompt input and the context input (a combined text and image query, paragraph [0038]).
In regard to claim 8, Siegenthaler discloses the compound prompt comprises a multimodal prompt (a combined text and image query, paragraph [0038]).
In regard to claim 10, Siegenthaler discloses at least a portion of the additional context comprises a prompt history associated with the generative ML model (the prompt is enriched with conversation history, paragraph [0046]).
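(As a purely hypothetical Python illustration of enriching a prompt with conversation history; the names and values are not drawn from Siegenthaler's disclosure.)

conversation_history: list[str] = []

def enrich_with_history(prompt_input: str) -> str:
    # Prepend prior exchanges as additional context for the current prompt.
    history = "\n".join(conversation_history)
    return (history + "\n" if history else "") + prompt_input

conversation_history.append("User: Show me beaches on Kauai.")
print(enrich_with_history("Which one is best for surfing?"))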
In regard to claim 11, Siegenthaler discloses one or more non-transitory computer-readable media (paragraph [0121]) including instructions that, when executed by one or more processors, cause the one or more processors to generate a context-enriched response by performing the steps of:
generating additional context for a prompt input based on a context input (a query comprising an input text and an input image is received, paragraph [0076]);
combining the additional context with the prompt input to generate a context-enriched prompt (the system generates an input prompt for an LLM from the text query and natural language descriptors of the input image, paragraph [0086]); and
executing one or more generative machine learning (ML) models on the context-enriched prompt to generate the context-enriched response (the system generates a response to the query by providing the input prompt to a large language model (LLM), paragraph [0087]).
In regard to claim 12, Siegenthaler discloses the context input comprises a first portion of an image (objects/entities within the image, paragraph [0079]).
In regard to claim 13, Siegenthaler discloses the step of generating the additional context comprises causing a generative ML model to generate a description of the first portion of the image (an object/entity detection model generates text identifying objects/entities in the image, paragraphs [0079-0080]).
In regard to claim 14, Siegenthaler discloses the step of generating the additional context comprises determining a first set of annotations corresponding to the first portion of the image (one or more texts extracted from the image, paragraph [0079]).
In regard to claim 15, Siegenthaler discloses the step of generating the additional context comprises:
identifying a first object within the first portion of the image (objects/entities within the image, paragraph [0079]); and
generating a first set of data corresponding to the first object (the contextual information is provided as structured data, see Abstract and paragraph [0055]).
In regard to claim 16, Siegenthaler discloses the additional context comprises a first portion of text (natural language descriptions of the image, paragraph [0045]), the prompt input comprises a second portion of text (input text query, paragraph [0045]), and combining the additional context with the prompt input comprises concatenating the first portion of text and the second portion of text (a natural language prompt comprising the contextual information and query text, paragraph [0046]).
In regard to claim 17, Siegenthaler discloses the step of receiving a multimodal prompt that includes the prompt input and the context input, wherein the multimodal prompt includes data from at least two different modalities (a combined text and image query, paragraph [0038]).
In regard to claim 18, Siegenthaler discloses the context input comprises a portion of domain data corresponding to a first domain of knowledge (the context comprises knowledge domains associated with a user profile, paragraph [0077]).
In regard to claim 19, Siegenthaler discloses at least a portion of the additional context comprises a prompt history associated with the first domain of knowledge (the prompt is enriched with conversation history, paragraph [0046]).
In regard to claim 20, Siegenthaler discloses a system (Fig. 6, 610) comprising:
one or more memories storing instructions (memory subsystem 625); and
one or more processors coupled to the one or more memories (processors 614) that, when executing the instructions, perform the steps of:
generating additional context for a prompt input based on a context input (a query comprising an input text and an input image is received, paragraph [0076]);
combining the additional context with the prompt input to generate a context-enriched prompt (the system generates an input prompt for an LLM from the text query and natural language descriptors of the input image, paragraph [0086]); and
executing one or more generative machine learning (ML) models on the context-enriched prompt to generate the context-enriched response (the system generates a response to the query by providing the input prompt to a large language model (LLM), paragraph [0087]).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Siegenthaler in view of Aung et al. (U.S. Patent Application Pub. No. 2023/0289530, hereinafter “Aung”).
In regard to claim 9, Siegenthaler does not expressly disclose that the context input comprises a portion of domain data derived from a domain catalog, wherein the domain data corresponds to a first domain of knowledge and the domain catalog corresponds to a plurality of different domains of knowledge.
Aung discloses a method for determining intent based on context input, wherein the context input comprises a portion of domain data derived from a domain catalog, and the domain data corresponds to a first domain of knowledge, and the domain catalog corresponds to a plurality of different domains of knowledge (a catalog of items associated with a knowledge domain is used as contextual input to determine a user’s intent from natural language input, paragraphs [0015-0016], [0019], and [0023]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to utilize, as the context input, a portion of domain data derived from a domain catalog, wherein the domain data corresponds to a first domain of knowledge and the domain catalog corresponds to a plurality of different domains of knowledge, as taught by Aung, because doing so would allow the user to query information within the knowledge domain.
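(For illustration only, a small Python sketch of drawing context from a domain catalog; the catalog contents and names are hypothetical and only loosely mirror Aung's catalog of items associated with a knowledge domain.)

DOMAIN_CATALOG: dict[str, list[str]] = {
    "automotive": ["oil filter", "brake pad", "spark plug"],
    "gardening": ["pruning shears", "fertilizer", "trellis"],
}

def domain_context(domain: str) -> str:
    # Select the portion of domain data corresponding to one domain of knowledge.
    items = DOMAIN_CATALOG.get(domain, [])
    return f"Known {domain} items: " + ", ".join(items)

print(domain_context("automotive"))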
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Mcmorran et al., De Wynter et al., Mikhailiuk et al., Ye et al., Goswami et al., Kharbanda et al., and Kuan disclose additional systems and methods for adding context information to prompts and/or generating descriptions of images.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BRIAN LOUIS ALBERTALLI whose telephone number is (571)272-7616. The examiner can normally be reached M-F 8AM-3PM, 4PM-5PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta can be reached at 571-272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
BLA 3/2/26
/BRIAN L ALBERTALLI/ Primary Examiner, Art Unit 2656