DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
Applicant’s amendments, filed January 12, 2026, have been entered. No claims have been amended, and claims 1-20 remain pending.
Response to Arguments
Applicant's arguments filed January 12, 2026, have been fully considered, but they are not persuasive. Applicant argues that Lanka et al. (Pub. No. US 2024/030362 A1, hereinafter “Lanka”) does not teach the limitation “grouping and reordering the related data chunks…” because Lanka teaches a relevance-based sorting step rather than a reconstruction of fragmented content into a structure-preserving context (Remarks p. 2). In response, the examiner respectfully submits that Lanka teaches, at Fig. 5, step 550, that the method proceeds with the late interaction model sorting, using the particular output vector, candidate records with respective similarity scores that satisfy a threshold score. The late interaction model may sort (i.e., group and reorder), using output vector 125, candidate records 135 with respective similarity scores that satisfy a threshold score (Lanka [0065]). Lanka also teaches that each record is made up of chunks, where a vector representation of each chunk is created so that there is a strong match between the particular chunk and a given output vector (Lanka [0050]). Further, the sorting includes re-ranking candidate records (Lanka [0055]). In other words, when the records are sorted, the chunks of which each record is composed (i.e., structure aware) are also sorted (i.e., grouped and reordered).
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1, 2, 9-12, 19 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Lanka in view of Graham (GB 2637721 A, hereinafter “Graham”).
Regarding claim 1, Lanka teaches:
training a large language model to obtain a domain-specific large language model with one or more domain-specific training data by: obtaining one or more data structures from the domain-specific data respectively; and creating a vector database containing context-aware data chunks generated from the one or more data structures (Lanka – see [0050], where, in response to receiving a particular record (i.e. data structure) to place in the datastore, the trained retrieval model generates a plurality of vector representations for the received record. Vector representations 438aa-438ac in Fig. 4 correspond to different portions of the received record, e.g., chunks. Records may correspond to documents, each potentially having a plurality of paragraphs, figures, charts, etc. By dividing records into respective sets of chunks, a finer granularity towards the characteristics included in each of records 435 may be maintained.)
conducting semantic search on the vector database to obtain one or more data chunks related to the query (Lanka – see [0062], where at Fig. 5, 520 the trained retrieval model generates a particular output vector representing a received semantic search query. Trained retrieval model receives search query 145, generates output vector 125 as a numeric representation of search query 145. Trained retrieval model is trained to perform semantic searches, and considers a contextual meaning of a search string included in the search query.)
grouping and reordering the related data chunks based on corresponding data structures from which the related data chunks are generated to obtain one or more structure-aware contexts (Lanka – see [0065], where in Fig. 5, 550 the method proceeds with late interaction model sorting, using the particular output vector, candidate records with respective similarity scores that satisfy a threshold score. Late interaction model may sort (i.e. grouping and reordering), using output vector 125, one of candidate records 135 with respective similarity scores that satisfy a threshold score.)
Lanka does not appear to teach:
performing semantic compression on the one or more structure-aware contexts based on the query to obtain a query-aware compressed prompt; and using the trained domain-specific large language model to generate the answer in response to the query-aware compressed prompt
However, Graham teaches:
performing semantic compression on the one or more structure-aware contexts based on the query to obtain a query-aware compressed prompt; and using the trained domain-specific large language model to generate the answer in response to the query-aware compressed prompt (Graham – see p. 17, lines 4-30, where a large language model generates, for each of the plurality of input chunks, a reduced summary of the input chunk. Where the method comprises receiving one or more questions and determining contextual information associated with the questions, it may be that the generating of the one or more prompts is performed on the basis of the retrieved information. A ninth step, represented by item 217 of method 200 comprises collating the plurality of reduced summaries to generate the compressed input data (i.e. semantic compression). An optional eleventh step, represented by item 221, of the method 200, comprises operating an artificial neural network to generate, on the basis of the compressed input data, a response to each of the one or more questions.)
Accordingly, it would have been obvious to a person of ordinary skill in the art at the time the invention was effectively filed, having the teachings of Lanka and Graham before them, to modify the system of Lanka with the teachings of Graham as shown above. One would have been motivated to make such a modification to improve the output of a generic model in respect of specialist subject matter by providing the model with further information on the specialist field as prompts when operating the model (Graham - [p. 1 lines 26-29]).
Claim 11 corresponds to claim 1 and is rejected accordingly.
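For illustration only, and not drawn from either cited reference, the mapped sequence of semantic search followed by grouping and reordering of the related chunks by source document can be sketched as follows (all names, data layouts, and the 0.5 threshold are hypothetical):

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length numeric vectors.
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def retrieve_and_group(query_vec, chunks, threshold=0.5):
    """Semantic search over chunk vectors, then grouping the related
    chunks by source document and reordering them into document order,
    yielding structure-aware contexts."""
    related = [c for c in chunks if cosine(query_vec, c["vec"]) >= threshold]
    contexts = {}
    for c in related:
        contexts.setdefault(c["doc"], []).append(c)
    # Within each group, restore the chunk's original position in its document.
    return {doc: [c["text"] for c in sorted(group, key=lambda c: c["pos"])]
            for doc, group in contexts.items()}

chunks = [
    {"doc": "A", "pos": 2, "vec": [1.0, 0.0], "text": "A2"},
    {"doc": "A", "pos": 1, "vec": [0.9, 0.1], "text": "A1"},
    {"doc": "B", "pos": 1, "vec": [0.0, 1.0], "text": "B1"},
]
contexts = retrieve_and_group([1.0, 0.0], chunks)
```

Under this sketch, both chunks of document A are retrieved and reordered into document order, while the unrelated chunk of document B is excluded.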
Regarding claim 2, Lanka teaches:
wherein the one or more domain-specific training data are one or more domain-specific documents respectively (Lanka – see [0047], where the trained retrieval model generates output vector 425 representing search query 445. Output vector 425 may include a plurality of numeric elements, each element representative of a characteristic that may be found in one or more records in data store 130. How many characteristics are represented in output vector 425 may depend, for example, on how varied the topics of the records stored in data store 130 are. For example, if data store 130 only stores records associated with electronic exchanges (i.e. domain-specific), then output vector 425 may only include tens to hundreds of characteristics to sufficiently describe the stored records.)
and the one or more data structures are one or more document trees of the one or more domain-specific documents respectively; and the one or more context-aware data chunks are one or more context-aware document chunks respectively (Lanka – see [0050], where, in response to receiving a particular record (i.e. data structure) to place in the datastore, the trained retrieval model generates a plurality of vector representations for the received record. Vector representations 438aa-438ac in Fig. 4 correspond to different portions of the received record, e.g., chunks. Records may correspond to documents, each potentially having a plurality of paragraphs, figures, charts, etc. By dividing records into respective sets of chunks, a finer granularity towards the characteristics included in each of records 435 may be maintained.)
Claim 12 corresponds to claim 2 and is rejected accordingly.
Regarding claim 9, Lanka teaches:
wherein conducting semantic search on the vector database to obtain one or more data chunks related to the query comprises: calculating a similarity value between the query and each document chunk in the vector database; and identifying the document chunk as a related document chunk if the calculated similarity value is greater than or equal to a similarity threshold (Lanka – see [0065], where in Fig. 5, 550 the method proceeds with late interaction model sorting, using the particular output vector, candidate records with respective similarity scores that satisfy a threshold score. Late interaction model may sort (i.e. grouping and reordering), using output vector 125, one of candidate records 135 with respective similarity scores that satisfy a threshold score. Only records with a particular degree of similarity to the search query are included in the candidate record list, which may be presented to the user.)
Claim 19 corresponds to claim 9 and is rejected accordingly.
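As an illustrative sketch only (not taken from Lanka), the claim-9-style selection can be expressed as computing a similarity value between the query and each chunk and keeping a chunk when that value meets the threshold. The function and chunk names are hypothetical, and the vectors are assumed to be pre-normalized so that a dot product serves as the similarity value:

```python
def related_chunk_ids(query_vec, chunk_vecs, threshold):
    """Identify the chunks whose similarity to the query is greater
    than or equal to the similarity threshold."""
    def dot(u, v):
        # Similarity value; equals cosine similarity for unit vectors.
        return sum(a * b for a, b in zip(u, v))
    return [cid for cid, vec in chunk_vecs.items() if dot(query_vec, vec) >= threshold]

hits = related_chunk_ids(
    [1.0, 0.0],
    {"c1": [0.8, 0.6], "c2": [0.0, 1.0], "c3": [1.0, 0.0]},
    threshold=0.7,
)
```

Here chunks c1 and c3 meet the 0.7 threshold and are identified as related document chunks, while c2 is not.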
Regarding claim 10, Lanka teaches:
wherein grouping and reordering the related data chunks based on corresponding data structures from which the related data chunks are generated to obtain one or more structure-aware contexts comprises grouping and reordering the related document chunks based on corresponding document trees from which the related document chunks are generated to obtain the one or more structure-aware contexts (Lanka – see [0065], where in Fig. 5, 550 the method proceeds with late interaction model sorting, using the particular output vector, candidate records with respective similarity scores that satisfy a threshold score. Late interaction model may sort (i.e. grouping and reordering), using output vector 125, one of candidate records 135 with respective similarity scores that satisfy a threshold score.)
Claim 20 corresponds to claim 10 and is rejected accordingly.
Claims 3 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Lanka in view of Graham in view of Larson et al. (Pub. No. US 2025/0131289 A1, hereinafter “Larson”).
Regarding claim 3, Lanka modified by Graham does not appear to teach:
wherein each document tree is obtained by parsing a corresponding document and the document tree is expressed as a hierarchy of nodes, each node storing text contents of the corresponding document
However, Larson teaches:
wherein each document tree is obtained by parsing a corresponding document and the document tree is expressed as a hierarchy of nodes, each node storing text contents of the corresponding document (Larson – see [0021-0023], where metadata extraction 208 identifies entities in the dataset 102 (e.g., in the data chunks) and relationships between the entities. An entity content aggregator 320 can combine entity content from the semantic search database 304 with the results of the graph induction 306 to produce a semantic summary 322. The semantic summary 322 can involve a hierarchical set partition structure including one or more selected root communities 324 and various sub-communities 326(1)-326(N). The root communities 324 and the sub-communities 326(1)-326(N) can progress in depth all the way down to the node level if desired.)
Accordingly, it would have been obvious to a person of ordinary skill in the art at the time the invention was effectively filed, having the teachings of Lanka, Graham and Larson before them, to modify the system of Lanka and Graham with the teachings of Larson as shown above. One would have been motivated to make such a modification to provide meaningful information related to the dataset (Larson - [0003]).
Claim 13 corresponds to claim 3 and is rejected accordingly.
Claims 4-8 and 14-18 are rejected under 35 U.S.C. 103 as being unpatentable over Lanka in view of Graham in view of Larson further in view of Zhang (CN 202410053415 A, hereinafter “Zhang”).
Regarding claim 4, Lanka modified by Graham and Larson does not appear to teach:
wherein the hierarchy of nodes include: a root node storing the entire document; one or more branch nodes storing the title and sections of the document respectively; and one or more leave nodes storing text contents under the title and sections of the document respectively
However, Zhang teaches:
wherein the hierarchy of nodes include: a root node storing the entire document; one or more branch nodes storing the title and sections of the document respectively; and one or more leave nodes storing text contents under the title and sections of the document respectively (Zhang – see p. 6 lines 1-13, where the process of creating the vector database comprises analyzing the semantic relationships among the pre-input texts to be retrieved, generating a corresponding document semantic tree, invoking a preset semantic question and answer model to perform a traversal analysis of the document semantic tree, obtaining semantic summary information corresponding to each text in the text to be retrieved, and adding the semantic summary information to the nodes of the document semantic tree. Also see p. 6 lines 15-37, where it is necessary to segment the text to generate a chunk of text. The main attributes of the chunks include: text block ID, text block content, text block type, where the text block type comprises: title, paragraph, title abstract, and chapter abstract; and the ID of the node to which the text block belongs.)
Accordingly, it would have been obvious to a person of ordinary skill in the art at the time the invention was effectively filed, having the teachings of Lanka, Graham, Larson and Zhang before them, to modify the system of Lanka, Graham and Larson with the teachings of Zhang as shown above. One would have been motivated to make such a modification to quickly generate accurate answer information, which effectively improves the efficiency and accuracy of question-answer processing so as to greatly improve the use experience of the user (Zhang - [Abstract]).
Claim 14 corresponds to claim 4 and is rejected accordingly.
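For illustration only, the claim-4-style hierarchy of nodes can be sketched as follows. This is a minimal toy construction under assumed names (Node, build_document_tree), not an implementation drawn from Zhang or the application:

```python
class Node:
    """One node of a document tree: a label and the text content it stores."""
    def __init__(self, label, text):
        self.label = label
        self.text = text
        self.children = []

def build_document_tree(title, sections):
    """Build a hierarchy with a root node storing the entire document,
    branch nodes storing the title and section headings, and leaf nodes
    storing the text under them. `sections` is a list of (heading, body)."""
    whole = title + "\n" + "\n".join(h + "\n" + b for h, b in sections)
    root = Node("root", whole)
    root.children.append(Node("branch:title", title))
    for heading, body in sections:
        branch = Node("branch:section", heading)
        branch.children.append(Node("leaf", body))
        root.children.append(branch)
    return root

tree = build_document_tree("Guide", [("Intro", "Hello."), ("Usage", "Run it.")])
```

The root node holds the whole document's text, each branch node holds a title or section heading, and each leaf node holds the text content under that heading.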
Regarding claim 5, Lanka modified by Graham and Larson does not appear to teach:
initializing a new chunk as a current chunk; b) initializing a new sub-chunk as a current sub-chunk
c) searching the document tree to acquire a new leaf node as a current leaf node; d) extracting a current text embedding from text content of the current leaf node
e) checking whether a size of the current sub-chunk exceeds a sub-chunk size limit; f) if the size of the current sub-chunk does not exceed the sub-chunk size limit, adding the current text embedding into the current sub-chunk, and repeating the steps c) to e); g) if the size of the current sub-chunk exceeds the sub-chunk size limit, determining that the current sub-chunk is full, initializing a new sub-chunk as the current sub-chunk and adding the current text embedding into the current sub-chunk
h) checking whether a size of a current chunk exceeds a chunk size limit; i) if the size of the current chunk does not exceed the chunk size limit, adding the full sub-chunk into the current chunk, and repeating the steps c) to h); j) if the size of the current chunk exceeds the chunk size limit, determining that the current chunk is full, initializing a new chunk as the current chunk and adding the full sub-chunk into the current chunk; k) checking whether all of the leaf nodes of the document tree are acquired; l) if not all leaf nodes are acquired, returning to step c); and m) if all of the leaf nodes are acquired, outputting the context-aware document chunks
However, Zhang teaches:
initializing a new chunk as a current chunk; b) initializing a new sub-chunk as a current sub-chunk (Zhang – see p. 6 lines 15-37, where the Emdata model has the size of the context window. If the length of the text exceeds the window limit, it is necessary to segment the text to generate a chunk of text. The text to be retrieved may refer to a document that is not publicly disclosed by the user, and the hierarchical structure of the document is generally composed of chapters, sub-chapters and paragraphs. A chapter may include a plurality of paragraphs and sub-chapters (i.e. sub-chunks).)
c) searching the document tree to acquire a new leaf node as a current leaf node; d) extracting a current text embedding from text content of the current leaf node (Zhang – see p. 7 lines 5-34, where as shown in Fig. 2, the text can be divided into a plurality of chapters, sub-chapters, paragraphs according to the content framework, and so are segmented to obtain the text to be retrieved, the document semantic tree is constructed based on the association relation between the texts to be retrieved.)
e) checking whether a size of the current sub-chunk exceeds a sub-chunk size limit; f) if the size of the current sub-chunk does not exceed the sub-chunk size limit, adding the current text embedding into the current sub-chunk, and repeating the steps c) to e); g) if the size of the current sub-chunk exceeds the sub-chunk size limit, determining that the current sub-chunk is full, initializing a new sub-chunk as the current sub-chunk and adding the current text embedding into the current sub-chunk (Zhang – see p. 7 lines 5-34, where a node in the document semantic tree represents text having semantic relationships, such as chapter 1, chapter 2, chapter 1.1, chapter 1.2, chapter 1.2.1. Specifically, this step is a text recall flow, and may include: searching for the 10 chunks with the highest matching degree from the vector database, searching for the nodes of these text chunks in the semantic tree of the document, and selecting the text in the semantic tree according to the strategy; if the length of the text chunks is greater than the size of the context window of the large model, discarding the ranked text chunks as needed; if there are a plurality of text chunks belonging to the same paragraph, the text of the largest interval covered by the chunks in the paragraph is taken.)
h) checking whether a size of a current chunk exceeds a chunk size limit; i) if the size of the current chunk does not exceed the chunk size limit, adding the full sub-chunk into the current chunk, and repeating the steps c) to h); j) if the size of the current chunk exceeds the chunk size limit, determining that the current chunk is full, initializing a new chunk as the current chunk and adding the full sub-chunk into the current chunk; k) checking whether all of the leaf nodes of the document tree are acquired; l) if not all leaf nodes are acquired, returning to step c); and m) if all of the leaf nodes are acquired, outputting the context-aware document chunks (Zhang – see p. 6 lines 15-37, where the Emdata model has the size of the context window. If the length of the text exceeds the window limit, it is necessary to segment the text to generate a chunk of text. Also see p. 6 lines 15-37, where, to generate a document semantic tree, the method traverses the document semantic tree and invokes the large model (namely the semantic question and answer model) interface to generate abstracts of the paragraphs and chapters from the bottom up. The generated abstracts are added into the document semantic tree.)
Accordingly, it would have been obvious to a person of ordinary skill in the art at the time the invention was effectively filed, having the teachings of Lanka, Graham, Larson and Zhang before them, to modify the system of Lanka, Graham and Larson with the teachings of Zhang as shown above. One would have been motivated to make such a modification to quickly generate accurate answer information, which effectively improves the efficiency and accuracy of question-answer processing so as to greatly improve the use experience of the user (Zhang - [Abstract]).
Claim 15 corresponds to claim 5 and is rejected accordingly.
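The recited steps a)-m) describe a two-level packing loop over leaf nodes. As an illustrative sketch only (not from Zhang or the application), the loop can be simplified as follows; for brevity, sizes are counted in items rather than embedding or token sizes, and all names are hypothetical:

```python
def pack_chunks(leaf_texts, sub_limit, chunk_limit):
    """Simplified sketch of steps a)-m): visit leaf-node contents in
    document order, fill sub-chunks with up to sub_limit entries, and
    fill chunks with up to chunk_limit full sub-chunks."""
    chunks, chunk, sub = [], [], []
    for text in leaf_texts:                    # c)-d) next leaf node's content
        if len(sub) >= sub_limit:              # e)/g) current sub-chunk is full
            if len(chunk) >= chunk_limit:      # h)/j) current chunk is full
                chunks.append(chunk)
                chunk = []
            chunk.append(sub)                  # i) add the full sub-chunk
            sub = []
        sub.append(text)                       # f) add content to the sub-chunk
    if sub:                                    # k)-m) flush the remainders
        if len(chunk) >= chunk_limit:
            chunks.append(chunk)
            chunk = []
        chunk.append(sub)
    if chunk:
        chunks.append(chunk)                   # m) output context-aware chunks
    return chunks

result = pack_chunks(["a", "b", "c", "d", "e", "f"], sub_limit=2, chunk_limit=2)
```

With six leaf texts, a sub-chunk limit of 2, and a chunk limit of 2 sub-chunks, the sketch produces one full chunk of two sub-chunks and a second chunk holding the remainder, all in document order.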
Regarding claim 6, Lanka modified by Graham and Larson does not appear to teach:
wherein the document tree is searched using a depth-first search algorithm
However, Zhang teaches:
wherein the document tree is searched using a depth-first search algorithm (Zhang – see p. 6 lines 15-37, where the Emdata model has the size of the context window. If the length of the text exceeds the window limit, it is necessary to segment the text to generate a chunk of text. Also see p. 6 lines 15-37, where, to generate a document semantic tree, the method traverses the document semantic tree and invokes the large model (namely the semantic question and answer model) interface to generate abstracts of the paragraphs and chapters from the bottom up. The generated abstracts are added into the document semantic tree.)
Accordingly, it would have been obvious to a person of ordinary skill in the art at the time the invention was effectively filed, having the teachings of Lanka, Graham, Larson and Zhang before them, to modify the system of Lanka, Graham and Larson with the teachings of Zhang as shown above. One would have been motivated to make such a modification to quickly generate accurate answer information, which effectively improves the efficiency and accuracy of question-answer processing so as to greatly improve the use experience of the user (Zhang - [Abstract]).
Claim 16 corresponds to claim 6 and is rejected accordingly.
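For illustration only, a depth-first search over a document tree that yields leaf-node contents in document order, as claim 6 recites, might look like the following. The (text, children) tuple representation is a hypothetical simplification, not a structure taken from the references:

```python
def dfs_leaves(node):
    """Depth-first traversal of a document tree, yielding leaf-node
    text in document order. A node is a (text, children) pair; a node
    with no children is a leaf."""
    text, children = node
    if not children:
        yield text
    for child in children:
        yield from dfs_leaves(child)

tree = ("root", [
    ("section 1", [("para 1a", []), ("para 1b", [])]),
    ("section 2", [("para 2a", [])]),
])
leaves = list(dfs_leaves(tree))
```

Depth-first order here visits all paragraphs of section 1 before moving to section 2, preserving the reading order of the document.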
Regarding claim 7, Lanka modified by Graham and Larson does not appear to teach:
wherein the chunk size limit is adaptively determined on basis of a context length limit of the domain-specific large language model
However, Zhang teaches:
wherein the chunk size limit is adaptively determined on basis of a context length limit of the domain-specific large language model (Zhang – see p. 6 lines 15-37, where the Emdata model has the size of the context window. If the length of the text exceeds the window limit, it is necessary to segment the text to generate a chunk of text.)
Accordingly, it would have been obvious to a person of ordinary skill in the art at the time the invention was effectively filed, having the teachings of Lanka, Graham, Larson and Zhang before them, to modify the system of Lanka, Graham and Larson with the teachings of Zhang as shown above. One would have been motivated to make such a modification to quickly generate accurate answer information, which effectively improves the efficiency and accuracy of question-answer processing so as to greatly improve the use experience of the user (Zhang - [Abstract]).
Claim 17 corresponds to claim 7 and is rejected accordingly.
Regarding claim 8, Lanka modified by Graham and Larson does not appear to teach:
wherein the sub-chunk size limit is adaptively determined on basis of an embedding size of a text embedding layer of the large language model
However, Zhang teaches:
wherein the sub-chunk size limit is adaptively determined on basis of an embedding size of a text embedding layer of the large language model (Zhang – see p. 6 lines 15-37, where the Emdata model has the size of the context window. If the length of the text exceeds the window limit, it is necessary to segment the text to generate a chunk of text.)
Accordingly, it would have been obvious to a person of ordinary skill in the art at the time the invention was effectively filed, having the teachings of Lanka, Graham, Larson and Zhang before them, to modify the system of Lanka, Graham and Larson with the teachings of Zhang as shown above. One would have been motivated to make such a modification to quickly generate accurate answer information, which effectively improves the efficiency and accuracy of question-answer processing so as to greatly improve the use experience of the user (Zhang - [Abstract]).
Claim 18 corresponds to claim 8 and is rejected accordingly.
Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RANJIT P DORAISWAMY whose telephone number is (571)270-5759. The examiner can normally be reached Monday-Friday 9:00 AM - 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sanjiv Shah can be reached at (571) 272-4098. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/RANJIT P DORAISWAMY/ Examiner, Art Unit 2166
/SANJIV SHAH/ Supervisory Patent Examiner, Art Unit 2166