DETAILED ACTION
This communication is in response to the Amendments and Arguments filed on January 20, 2026. Claims 1-12 are pending and have been examined. Hence, this action has been made FINAL.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.
Response to Arguments
The reply filed on January 20, 2026 has been entered. Applicant’s arguments with respect to claims 1-12 have been considered but are not persuasive.
35 U.S.C. § 101
With respect to the applicant’s arguments regarding the claim rejections under 35 U.S.C. § 101, Applicant asserts that “The claimed invention requires operations that cannot be practically performed in the human mind. For example, claim 1 recites "conducting OCR on the uploaded document to extract text content" and "transforming, using a large language model (LLM), the text content to create LLM generated key-value pairs." As described in the specification at paragraphs [0048] and [0058], the LLM is implemented as a transformer-based neural network model that processes large volumes of text data to generate structured outputs. These operations require substantial computational resources and specialized machine learning algorithms, which are far beyond human capability.” The examiner respectfully disagrees with these assertions. MPEP 2106.04(a)(2) states “In evaluating whether a claim that requires a computer recites a mental process, examiners should carefully consider the broadest reasonable interpretation of the claim in light of the specification. … An example of a case identifying a mental process performed on a generic computer as an abstract idea is Voter Verified, Inc. v. Election Systems & Software, LLC, 887 F.3d 1376, 1385, 126 USPQ2d 1498, 1504 (Fed. Cir. 2018). In this case, the Federal Circuit relied upon the specification in explaining that the claimed steps of voting, verifying the vote, and submitting the vote for tabulation are "human cognitive actions" that humans have performed for hundreds of years. The claims therefore recited an abstract idea, despite the fact that the claimed voting steps were performed on a computer.” Similarly, the claimed steps of “conducting [optical character recognition] to extract text content” and “transforming text content to create key-value pairs” are human cognitive actions that humans perform independently of computers.
The claimed LLM is recited at a high level of generality and merely equates to “apply it” or otherwise merely uses a generic computer as a tool to perform an abstract idea, which is not indicative of integration into a practical application as per MPEP 2106.05(f).
Applicant further asserts that “converting a natural language query into a database query using an LLM involves schema-aware encoding and query generation techniques that cannot be performed mentally.” The examiner respectfully disagrees with these assertions. A human can cognitively convert a natural language query into an SQL query using memorized knowledge of the SQL schema. The limitation of an LLM is recited at a high level of generality and merely equates to “apply it” or otherwise merely uses a generic computer as a tool to perform an abstract idea, which is not indicative of integration into a practical application as per MPEP 2106.05(f).
Applicant further asserts that “The 2025 AI Memo reinforces that claims reciting AI-based processing steps, such as natural language interpretation and structured data generation, should be evaluated as technological processes rather than mental steps.” The examiner respectfully disagrees with these assertions. The USPTO Memorandum on AI Patent Eligibility, dated August 4, 2025, makes no mention of natural language interpretation or structured data generation. Further, the applicant fails to cite portions of the August 2025 Memorandum which would support these assertions.
Applicant further asserts that “The invention provides a specific improvement to computer functionality by automating document ingestion, indexing, and search using advanced AI models. Paragraph [0048] explains that the LLM is not a generic component but a specialized model trained to interpret document content and generate structured key-value pairs. Paragraph [0058] further details how the LLM transforms natural language queries into optimized database queries, improving search accuracy and efficiency. These improvements address a technical problem-how to enable accurate and scalable search of heterogeneous document data-and provide a technical solution that enhances computer performance.” The examiner respectfully disagrees with these assertions. As per MPEP 2106.05(a), “it is important to keep in mind that an improvement in the abstract idea itself (e.g. a recited fundamental economic concept) is not an improvement in technology. For example, in Trading Technologies Int’l v. IBG, 921 F.3d 1084, 1093-94, 2019 USPQ2d 138290 (Fed. Cir. 2019), the court determined that the claimed user interface simply provided a trader with more information to facilitate market trades, which improved the business process of market trading but did not improve computers or technology.” Similarly, the steps of document ingestion, indexing, key-value pair generation, and database query generation are considered abstract ideas. Improvements in these activities are not considered to be improvements to computer technology, but instead improvements to the abstract idea itself, which is not patent eligible.
Applicant further asserts that “Paragraph [0048] explains that the LLM is not a generic component but a specialized model trained to interpret document content and generate structured key-value pairs” The examiner respectfully disagrees with these assertions. The applicant asserts that paragraph [0048] of the specification describes how an LLM is not a generic component, but such language is not found in paragraph [0048] of the specification. To the examiner’s understanding, paragraph [0048] of the specification describes how an LLM can infer key-value pairs by reading a document, but this activity can similarly be performed in the human mind. The examiner finds no language in the specification that directs the LLM towards a non-generic implementation.
Applicant further asserts that “This integration is analogous to, e.g., Example 39 in the USPTO's Subject Matter Eligibility Examples, which describes a method for training a neural network for facial detection. In that example, the USPTO found the claims patent-eligible because they improved the functioning of a computer system by training a neural network in a specific way that enhanced image recognition performance. Similarly, the present claims recite a specific implementation of a large language model to transform document content and natural language queries into structured data, thereby improving the functioning of document processing and search systems.” The examiner respectfully disagrees with these assertions. Training a neural network to better recognize facial features is considered a specific improvement to computer technology itself because the neural network itself is actively being trained to achieve greater performance. The limitation of an LLM transforming document content cannot be considered analogous to the cited example because the limitation merely uses an LLM to perform an abstract idea, failing to actually modify or update the LLM itself in any meaningful way.
Applicant further asserts that “As noted in the specification at paragraph [0048], the LLM employs transformer-based architectures (which can be, e.g., BERT or GPT models or the like) that are highly specialized and not conventional in prior art indexing systems. These elements provide an inventive concept because they enable the system to interpret unstructured text and natural language queries in a way that conventional systems cannot. The integration of these components into a unified workflow for document ingestion, indexing, and search constitutes a technical improvement over prior art approaches that rely on static taxonomies and keyword matching.” The examiner respectfully disagrees with these assertions. As noted above, improvements to an abstract idea (e.g. document ingestion, indexing, and search) are not equivalent to improvements to computers or to another field of technology according to MPEP 2106.05(a).
Applicant further asserts that “The 2025 AI Memo further supports this conclusion by clarifying that AI-driven improvements to data processing and query generation, when tied to a specific technological solution, generally satisfy the "significantly more" requirement under Step 2B.” The examiner respectfully disagrees with these assertions. “AI-driven improvements to data processing and query generation” are considered improvements to an abstract idea via usage of AI, but cannot be considered improvements to computer technology itself. As per USPTO Memorandum dated August 4, 2025, “Claims that are determined to improve computer capabilities or improve technology or a technical field support a finding that the claim integrates the judicial exception into a practical application or amounts to significantly more than the judicial exception itself.” The Memorandum states that an improvement to computer capabilities or an improvement to technology or a technical field itself could integrate a judicial exception into a practical application. Nowhere does the Memorandum explicitly state that “AI-driven improvements to data processing and query generation, when tied to a specific technological solution, generally satisfy the "significantly more" requirement under Step 2B.” The applicant is encouraged to cite the exact wording used in the USPTO Memorandum that could support this conclusion. As amended, there is no language in the independent claims that would prevent a human from performing these steps, as addressed in further detail below with respect to claim rejections under 35 USC § 101.
35 U.S.C. § 103
With respect to the applicant’s arguments regarding the claim rejections under 35 U.S.C. § 103, the applicant asserts with respect to independent claim 1 that “The claimed key-value pairs are not mere field mappings; they represent normalized associations between heterogeneous document labels and standardized taxonomy keys, enabling consistent indexing across diverse document types. Katzman teaches indexing using reserved fields, which are static and predefined for known document types, but does not teach dynamically mapping arbitrary field labels to taxonomy keys or generating additional normalized pairs using AI.” The examiner respectfully disagrees with these assertions. The applicant asserts that Katzman et al. does not teach dynamically mapping arbitrary field labels to taxonomy keys, but nowhere in the claim language of independent claim 1 is this feature recited. Rather, independent claim 1 merely recites “indexing the document content according to a taxonomy to create key-value pairs; wherein the taxonomy comprises a set of known keys that field labels can be mapped to”. If the applicant wishes to include in the claimed invention the ability to “dynamically [map] arbitrary field labels to taxonomy keys,” amendments must be made to the claim language specifying such a feature. Of similar note are the asserted “heterogeneous documents,” which are noticeably absent from the claim limitations of the currently filed application.
The Applicant further asserts with respect to claim 1 that “Yuan teaches semantic encoding for schema mapping, but its approach is limited to converting text into embeddings for NLP tasks and does not produce persistent, searchable key-value structures integrated with taxonomy-based indexing. Neither reference suggests combining taxonomy-driven normalization with LLM-generated key-value pairs in a unified workflow.” The examiner respectfully disagrees with these assertions. As explained in further detail below under Claim Rejections - 35 USC § 103, Katzman et al. in view of Yuan et al. sufficiently teach a unified workflow of taxonomy-driven key-value generation. The step of “normalization” is not present in the claim language as currently filed, and thus cannot be considered a feature of the claimed invention.
The Applicant further asserts with respect to independent claim 7 that “Katzman does not teach OCR or LLM transformation, and its reserved fields do not adapt to unknown document types.” The examiner respectfully disagrees with these assertions. The step of “adapting to unknown document types” is not present in the claim language as currently filed, and thus cannot be considered a feature of the claimed invention.
The Applicant further asserts with respect to independent claim 7 that “Yuan does not teach taxonomy-based indexing or unified storage of normalized key-value pairs.” The examiner respectfully disagrees with these assertions. As explained in further detail below under Claim Rejections - 35 USC § 103, Katzman et al. sufficiently teach taxonomy-based indexing and unified storage of key-value pairs.
The Applicant further asserts with respect to independent claim 7 that “Boskovic does not teach integration of query generation with document ingestion and indexing.” The examiner respectfully disagrees with these assertions. As explained in further detail below under Claim Rejections - 35 USC § 103, Katzman et al. in view of Yuan et al. sufficiently teach document ingestion and indexing. Boskovic et al. is merely relied upon for its generation of database search queries from plain language search queries.
The Applicant further asserts that “The Office's rationale relies on hindsight. … Integrating these disparate systems into a unified workflow as claimed would require substantial redesign and is not a simple substitution.” The examiner respectfully disagrees with these assertions. First, the office action does not rely upon any motivation related to “substitution” as the applicant claims. Second, the applicant fails to explain or justify why combining these separate references into a unified workflow would require substantial redesign. As per MPEP 2144.03, “To adequately traverse a finding based on official notice, an applicant must specifically point out the supposed errors in the examiner’s action, which would include stating why the noticed fact is not considered to be common knowledge or well-known in the art.” The applicant fails to specifically point out supposed errors in the previous office action, instead opting to merely assert that a “substantial redesign” would be required, and further fails to explain why the cited art is incompatible in view of the current claim limitations. Accordingly, such a traversal cannot be considered proper, and thus the applicant’s remarks cannot be considered valid justification for overcoming rejections under 35 USC § 103.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-12 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. Under Step 1, all of the claims are method claims (1-12), apparatus/machine claims, or manufacture claims, but under Step 2A all of these claims recite abstract ideas, specifically mental processes. These mental processes are more particularly recited in claims 1 and 7 as:
extracting document content from the uploaded document…
indexing the document content according to a taxonomy to create key-value pairs…
conducting OCR on the uploaded document to extract text content…
transforming, using a large language model (LLM), the text content to create LLM generated key-value pairs…
storing the key-value pairs and the LLM generated key-value pairs to a database…
passing a plain language search query to the LLM to create a database search query…
conducting a search of the database using the database search query…
Under Step 2A Prong One, claims 1 and 7 are directed to an abstract idea and specifically a mental process. As detailed above, the steps of extracting, indexing, conducting, transforming, storing, etc. may be practically performed in the human mind with the use of a physical aid such as a pen and paper. For example, a human could receive a document from their boss, extract key-value pairs from the document content, index the document in a filing cabinet with a pre-defined organization hierarchy, manually read and understand the document, create their own key-value pairs based on their understanding, and then store both the key-value pairs from the document content and their own personally generated key-value pairs in a filing cabinet. The human can then receive a natural language (NL) request from their boss, convert the NL request into a schema-based query, and then search their filing cabinet using the schema-based query.
Under Step 2A Prong Two, this judicial exception is not integrated into a practical application because claims 1-12 do not recite additional elements that integrate the exception into a practical application. In particular, claims 1 and 7 recite the additional elements of a large language model (¶ [0048], [0058]). These additional elements are recited at a high level of generality and merely equate to “apply it” or otherwise merely use a generic computer as a tool to perform an abstract idea, which is not indicative of integration into a practical application as per MPEP 2106.05(f). Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Under Step 2B, the claims do not recite additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the additional element of a large language model (¶ [0048], [0058]) amounts to no more than a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claims are not patent eligible.
With respect to claims 2 and 8, the claim relates to using a user-defined taxonomy. This relates to a human identifying key-value pairs from document content according to their own definitions of key and value fields. No additional limitations are present. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
With respect to claims 3 and 9, the claim relates to using a built-in taxonomy. This relates to a human identifying key-value pairs from document content according to their organization’s definitions of key and value fields. No additional limitations are present. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
With respect to claim 4, the claim relates to performing database searches using natural language user queries. This relates to a human receiving a natural language (NL) request from their boss, converting the NL request into a schema-based query, and then searching their filing cabinet using the schema-based query. No additional limitations are present. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
With respect to claims 5 and 10, the claim relates to editing database queries before using them to conduct a search. This relates to a human modifying their schema-based query before searching their filing cabinet. No additional limitations are present. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
With respect to claims 6 and 11, the claim relates to editing the NL query before converting it to a database query. This relates to a human modifying their boss’ NL query before creating their own schema-based query. No additional limitations are present. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
With respect to claim 12, the claim relates to the database query containing a key from the taxonomy. This relates to a human’s schema-based query containing a label from their own pre-defined schema specific to their filing cabinet. No additional limitations are present. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
For all of the above reasons, taken alone or in combination, claims 1-12 recite a non-statutory mental process.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-3 are rejected under 35 U.S.C. 103 as obvious over US Patent Publication 20210157861 A1, (Katzman et al.) in view of US Patent Publication 20240220727 A1 (Yuan et al.).
Claim 1
Regarding claim 1, Katzman et al. disclose extracting document content from the uploaded document (Katzman et al. ¶ [0025], "Documents are acquired, text from the documents extracted and indexed, etc. to make them searchable using term-based or question-based queries."), the document content comprising field labels having associated field content (Katzman et al. ¶ [0035], "each document is a collection of fields, which are the key-value pairs that contain data."); indexing the document content according to a taxonomy (Katzman et al. ¶ [0034], "The indexing service 140 indexes documents that have been acquired by the ingestion service 130 into one or more indexes 107.") to create key-value pairs (Katzman et al. ¶ [0035], "An index can be thought of as an optimized collection of documents and each document is a collection of fields, which are the key-value pairs that contain data."); wherein the taxonomy comprises a set of known keys that field labels can be mapped to (Katzman et al. ¶ [0088], "The generation of the index includes mapping labels of the document into fields for the index entry. In some embodiments, this mapping utilizes reserved fields." Reserved fields are considered analogous to known keys); wherein each key-value pair comprises a field label matched to a key and field content matched to a value (Katzman et al. ¶ [0090], "Documents can use labels to indicate what the text is. For example, a label indicating the text is the title. As discussed above, index entries include fields for the text content and these fields correspond to labels."); and
storing the key-value pairs [and the LLM generated key-value pairs] to a database (Katzman et al. ¶ [0034], "The ingestion service 130 ... calls an indexing service to generate index entries for text, and causes the documents (or subset thereof) to be stored.").
Katzman et al. do not explicitly disclose all of conducting OCR on documents to extract text content or inputting the OCR text content into an LLM to generate key-value pairs.
However, Yuan et al. disclose conducting OCR on the uploaded document to extract text content (Yuan et al. ¶ [0047], "The portions of text are illustratively extracted from document 218 using optical character recognition (OCR) engine 224"); and
transforming, using a large language model (LLM) (Yuan et al. ¶ [0048], "In one or more embodiments, NLP model 204 is implemented as a pre-trained, transformer-based machine learning model, such as the Bidirectional Encoder Representations from Transformers (BERT) model."), the text content to create LLM generated key-value pairs (Yuan et al. ¶ [0053]-[0054], "In block 308, the semantic encoder of enhanced NLP model 206 is used to re-encode each of the discrete portions of text 220 of document 218. The portion of text identified as the schema key is re-encoded as a key vector. Each of the other portions of text are also re-encoded by the semantic encoder of enhanced NLP model 206. ... In block 310, enhanced NLP model 206 generates key-value pairs from the key vector and other re-encoded vector.").
It would have been obvious to a person having ordinary skill in the art before the time of the effective filing date of the claimed invention of the instant application to modify Katzman et al.’s document indexing method to include Yuan et al.’s LLM-generated key-value pairs because such a modification is the result of combining prior art elements according to known methods to yield predictable results. More specifically, Katzman et al.’s document indexing method as modified by Yuan et al.’s LLM-generated key-value pairs can yield a predictable result of improving the user query experience since storing additional LLM-generated key-value pairs would increase the number of results a given user query could match and return to the user. Thus, a person of ordinary skill would have appreciated including in Katzman et al.’s document indexing method the ability to generate key-value pairs using an LLM as taught by Yuan et al., since the claimed invention is merely a combination of old elements, and in the combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
Claim 2
Regarding claim 2, the rejection of claim 1 is incorporated. Katzman et al. in view of Yuan et al. disclose all the elements of the claimed invention as stated above. Katzman et al. further disclose wherein the taxonomy is user-defined (Katzman et al. ¶ [0092], "a GUI 1000 allows for a user to adjust reserved fields including adding a field, removing a field, and updating a field.").
Claim 3
Regarding claim 3, the rejection of claim 1 is incorporated. Katzman et al. in view of Yuan et al. disclose all the elements of the claimed invention as stated above. Katzman et al. further disclose wherein the taxonomy is built-in (Katzman et al. ¶ [0090]-[0091], "In the enterprise search service 102 described herein a set of “reserved” labels may be utilized in index entry fields. ... In some embodiments, body and title are default.").
Claims 4-12 are rejected under 35 U.S.C. 103 as obvious over US Patent Publication 20210157861 A1 (Katzman et al.) in view of US Patent Publication 20240220727 A1 (Yuan et al.), and further in view of US Patent Publication 20240248896 A1 (Boskovic et al.).
Claim 4
Regarding claim 4, the rejection of claim 1 is incorporated. Katzman et al. in view of Yuan et al. disclose all the elements of the claimed invention as stated above.
Katzman et al. in view of Yuan et al. do not explicitly disclose all of generating a database search query from a natural language query using an LLM.
However, Boskovic et al. disclose receiving a plain language search query (Boskovic et al. ¶ [0030], "User(s) 114 may enter database queries in natural language (NL), e.g., NL question 108.");
passing the plain language search query to the LLM to create a database search query (Boskovic et al. ¶ [0019], "A schema-aware input encoding may be used, for example, in combination with a generative pre-trained transformer (GPT) model, such as a third generation GPT model (GPT-3 model) to generate an SQL query from an underlying NL input.");
receiving, from the LLM, the database search query that is based on the plain language search query (Boskovic et al. ¶ [0019], "A schema-aware input encoding may be used, for example, in combination with a generative pre-trained transformer (GPT) model, such as a third generation GPT model (GPT-3 model) to generate an SQL query from an underlying NL input."); and
conducting a search of the database using the database search query (Boskovic et al. ¶ [0031], "Query server(s) 120 may be configured to send, or not send, query 126 to database server(s) 130, for example, based on indication 112 provided by user(s) 114.").
It would have been obvious to a person having ordinary skill in the art before the time of the effective filing date of the claimed invention of the instant application to modify Katzman et al. in view of Yuan et al. to incorporate LLM-generated database queries as taught by Boskovic et al.
The suggestion/motivation for doing so would have been that, “[queries] generated by a database query generator 122 (e.g., based on an SQL code generator model) may more accurately reflect NL question 108 relative to tables 134, make execution of query 126 by DBMS 132 more efficient, and/or reduce duplicative NL questions, question encoding, query generation, and query execution in search of accurate query results,” as noted by the Boskovic et al. disclosure in paragraph [0045].
Claim 5
Regarding claim 5, the rejection of claim 4 is incorporated. Katzman et al. in view of Yuan et al. in view of Boskovic et al. disclose all the elements of the claimed invention as stated above. Boskovic et al. further disclose wherein the database search query is subject to editing by a user before it is used to conduct the search (Boskovic et al. ¶ [0029]-[0031], "client computing device(s) 116 may execute NL interface 118, which may provide a user interface (e.g., a graphical user interface (GUI)) for user(s) 114 to interact with, such as to ... receive, review, edit, and/or submit query 126 ... Query server(s) 120 may generate query 126 in a Structured Query Language (SQL) or SQL-like dialect (e.g., SCOPE)." Reviewing or editing an SQL query is considered analogous to editing a database search query).
Claim 6
Regarding claim 6, the rejection of claim 4 is incorporated. Katzman et al. in view of Yuan et al. in view of Boskovic et al. disclose all the elements of the claimed invention as stated above. Boskovic et al. further disclose wherein the plain language search query is subject to editing by a user before it is used to create a database search query (Boskovic et al. ¶ [0029], "client computing device(s) 116 may execute NL interface 118, which may provide a user interface (e.g., a graphical user interface (GUI)) for user(s) 114 to interact with, such as to enter, review and edit NL questions 108").
Claim 7
Regarding claim 7, Katzman et al. disclose extracting document content from the uploaded document (Katzman et al. ¶ [0025], "Documents are acquired, text from the documents extracted and indexed, etc. to make them searchable using term-based or question-based queries."), the document content comprising field labels having associated field content (Katzman et al. ¶ [0035], "each document is a collection of fields, which are the key-value pairs that contain data."); indexing the document content according to [a taxonomy (Katzman et al. ¶ [0034], "The indexing service 140 indexes documents that have been acquired by the ingestion service 130 into one or more indexes 107.") to create key-value pairs] (Katzman et al. ¶ [0035], "An index can be thought of as an optimized collection of documents and each document is a collection of fields, which are the key-value pairs that contain data."); wherein the taxonomy comprises a set of known keys that field labels can be mapped to (Katzman et al. ¶ [0088], "The generation of the index includes mapping labels of the document into fields for the index entry. In some embodiments, this mapping utilizes reserved fields." Reserved fields are considered analogous to known keys); wherein each key-value pair comprises a field label matched to a key and field content matched to a value (Katzman et al. ¶ [0090], "Documents can use labels to indicate what the text is. For example, a label indicating the text is the title. As discussed above, index entries include fields for the text content and these fields correspond to labels."); and
storing the key-value pairs [and the LLM generated key-value pairs] to a database (Katzman et al. ¶ [0034], "The ingestion service 130 ... calls an indexing service to generate index entries for text, and causes the documents (or subset thereof) to be stored.").
Katzman et al. do not explicitly disclose all of conducting OCR on documents to extract text content or inputting the OCR text content into an LLM to generate key-value pairs.
However, Yuan et al. disclose conducting OCR on the uploaded document to extract text content (Yuan et al. ¶ [0047], "The portions of text are illustratively extracted from document 218 using optical character recognition (OCR) engine 224"); and
transforming, using a large language model (LLM) (Yuan et al. ¶ [0048], "In one or more embodiments, NLP model 204 is implemented as a pre-trained, transformer-based machine learning model, such as the Bidirectional Encoder Representations from Transformers (BERT) model."), the text content to create LLM generated key-value pairs (Yuan et al. ¶ [0053]-[0054], "In block 308, the semantic encoder of enhanced NLP model 206 is used to re-encode each of the discrete portions of text 220 of document 218. The portion of text identified as the schema key is re-encoded as a key vector. Each of the other portions of text are also re-encoded by the semantic encoder of enhanced NLP model 206. ... In block 310, enhanced NLP model 206 generates key-value pairs from the key vector and other re-encoded vector.").
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention of the instant application to modify Katzman et al.’s document indexing method to incorporate Yuan et al.’s LLM-generated key-value pairs.
The suggestion/motivation for doing so is similar to the suggestion/motivation described above with respect to claim 1.
Katzman et al. in view of Yuan et al. do not explicitly disclose all of generating a database search query from a natural language query using an LLM.
However, Boskovic et al. disclose receiving a plain language search query (Boskovic et al. ¶ [0030], "User(s) 114 may enter database queries in natural language (NL), e.g., NL question 108.");
passing the plain language search query to the LLM to create a database search query (Boskovic et al. ¶ [0019], "A schema-aware input encoding may be used, for example, in combination with a generative pre-trained transformer (GPT) model, such as a third generation GPT model (GPT-3 model) to generate an SQL query from an underlying NL input.");
receiving, from the LLM, the database search query that is based on the plain language search query (Boskovic et al. ¶ [0019], "A schema-aware input encoding may be used, for example, in combination with a generative pre-trained transformer (GPT) model, such as a third generation GPT model (GPT-3 model) to generate an SQL query from an underlying NL input."); and
conducting a search of the database using the database search query (Boskovic et al. ¶ [0031], "Query server(s) 120 may be configured to send, or not send, query 126 to database server(s) 130, for example, based on indication 112 provided by user(s) 114.").
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention of the instant application to modify Katzman et al. in view of Yuan et al. to incorporate Boskovic et al.’s LLM-generated database queries.
The suggestion/motivation for doing so is similar to the suggestion/motivation described above with respect to claim 4.
Claim 8
Regarding claim 8, the rejection of claim 7 is incorporated. The limitations of claim 8 are similar in scope to those of claim 2, and claim 8 is therefore rejected for similar reasons as described above.
Claim 9
Regarding claim 9, the rejection of claim 7 is incorporated. The limitations of claim 9 are similar in scope to those of claim 3, and claim 9 is therefore rejected for similar reasons as described above.
Claim 10
Regarding claim 10, the rejection of claim 7 is incorporated. The limitations of claim 10 are similar in scope to those of claim 5, and claim 10 is therefore rejected for similar reasons as described above.
Claim 11
Regarding claim 11, the rejection of claim 7 is incorporated. The limitations of claim 11 are similar in scope to those of claim 6, and claim 11 is therefore rejected for similar reasons as described above.
Claim 12
Regarding claim 12, the rejection of claim 7 is incorporated. Katzman et al. in view of Yuan et al. in view of Boskovic et al. disclose all the elements of the claimed invention as stated above. Boskovic et al. further disclose wherein the database search query comprises at least one key from the taxonomy (Boskovic et al. ¶ [0036], "Schema aware input encoder 104 may generate encoded question 106, e.g., based on the one or more databases indicated by NL question 108 and schema information in DB schema 124."; ¶ [0040], "DB query generator 122 may receive encoded question 106, generate query 126, receive query reject/approve indication 112, and send query 126 to database server(s) 130". Generating SQL queries based on a specific DB schema is considered analogous to a database search query including at least one key from a taxonomy).
Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JACOB B VOGT whose telephone number is (571)272-7028. The examiner can normally be reached Monday - Friday 9:30am - 7pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Paras D Shah, can be reached at (571)270-1650. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/JACOB B VOGT/Examiner, Art Unit 2653
/Paras D Shah/Supervisory Patent Examiner, Art Unit 2653
05/05/2026