Prosecution Insights
Last updated: April 19, 2026
Application No. 18/477,209

INDEXING SPLIT DOCUMENTS FOR DATA RETRIEVAL AUGMENTING GENERATIVE MACHINE LEARNING RESULTS

Final Rejection: §101, §103
Filed
Sep 28, 2023
Examiner
SOLAIMAN, FOUZIA HYE
Art Unit
2653
Tech Center
2600 — Communications
Assignee
Amazon Technologies, Inc.
OA Round
2 (Final)
Grant Probability: 67% (Favorable)
OA Rounds: 3-4
To Grant: 3y 0m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 67% (above average; 42 granted / 63 resolved; +4.7% vs TC avg)
Interview Lift: +55.5% (strong; resolved cases with vs. without interview)
Avg Prosecution: 3y 0m (typical timeline; 16 currently pending)
Total Applications: 79 (across all art units)

Statute-Specific Performance

§101: 28.5% (-11.5% vs TC avg)
§103: 47.1% (+7.1% vs TC avg)
§102: 16.0% (-24.0% vs TC avg)
§112: 2.7% (-37.3% vs TC avg)
Based on career data from 63 resolved cases; Tech Center averages are estimates.

Office Action

Rejections: §101, §103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement

The IDS is dated 12/04/2025. All IDS have been considered.

Response to Amendments and Arguments

Applicant's arguments filed 11/13/2025 have been fully considered but they are not persuasive. With regard to the 35 U.S.C. 101 rejection, applicant argues that the pending claims are eligible under 35 U.S.C. 101 because the system provides “significant technical advantages including improved computing efficiency and resource utilization by indexing documents …” (Remarks, page 14, last paragraph). Examiner respectfully disagrees with this assertion. The claims do not recite training of the model or any training steps. Document processing can be performed by a human; it is a human activity. The claimed generative model is recited at a very high level of generality, and the processor is a generic computer component, which is not enough to make the claims patent eligible. Automating an abstract idea using a machine does not transform the abstract idea into eligible subject matter. Therefore, examiner maintains the 101 rejection. With regard to the 35 U.S.C. 102 rejection, applicant's newly amended claims necessitate a new ground of rejection. After an extensive search, examiner found prior art and made a 103 rejection; please see the office action below.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Regarding claims 1, 5, and 14, the limitations of “receiving”, “generating”, “performing”, “ranking”, “including”, and “returning”, as drafted, are processes that, under the broadest reasonable interpretation, cover performance of the limitations in the mind but for the recitation of generic computer components. More specifically, these steps read on mental processes: a human can perform a task upon receiving it; a human knows what keyword or search representation is needed for a specific task, the task being to find data within a plurality of documents; a human can perform a search of an index and return a result based on that search; a template or index can be created and its fields filled out with parsed or tokenized document content; a human can segment a portion of a document as long text, a sentence, or a paragraph, with the length predetermined by some number; a human can rank or score relevant document portions, assign a number as the rank or score, and consider the higher-ranked chunked documents for prompting; and a human can return a result as the query requested. The claims recite a “generative machine learning system”; additional claim limitation elements are present in this part of claim 1. Liu (Akide Liu, External Reasoning: Towards Multi-Large-Language-Models Interchangeable Assistance with Human Feedback; abstract, page 1, figure 1, section "2.3 Prompt Engineering"; provided via IDS) shows a “trained generative machine learning model” in figure 1. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. The claims recite both a mental process and human activity (a human analyzing data). Accordingly, the claims recite an abstract idea.
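For context on the flow the rejection groups together, the receive / generate / search / rank / include / return steps correspond to a retrieval-augmented generation pipeline. A minimal sketch follows; every name is a hypothetical illustration (not the applicant's implementation), the "search representation" is a toy letter-frequency vector, and the generative model call is stubbed:

```python
import math

def cosine(a, b):
    # similarity between two vectors (the "respective similarity" measure)
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def embed(text):
    # toy search representation: letter-frequency vector (illustration only)
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - 97] += 1.0
    return vec

def answer_request(request, index, generate, top_k=2):
    """Receive a request, generate a search representation, search the
    index, rank candidate portions, include the top portions as context
    for prompting a generative model, and return its response."""
    qvec = embed(request)
    ranked = sorted(index, key=lambda entry: cosine(qvec, entry["vec"]),
                    reverse=True)
    context = "\n".join(entry["text"] for entry in ranked[:top_k])
    prompt = f"Context:\n{context}\n\nTask: {request}"
    return generate(prompt)
```

A production service would use a learned embedding model and a vector index rather than this toy representation; the sketch only shows the shape of the claimed sequence of steps.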
This judicial exception is not integrated into a practical application because the recitations of “system”, “processor”, and “memory” in claims 1 and 5, and of “non-transitory computer-readable memory storing instructions” and “processor” in claim 14, read as generalized computer components, based upon the claim interpretation wherein the structure is interpreted using paragraphs [0104]-[0106] of the filed specification. Further, the additional limitation “natural language generative application service” in claims 1, 3, 4, 12, 13, and 20 is directed towards insignificant solution activity. The claims are not patent eligible. Further, regarding the recitation of a “generative machine learning model” in claims 1, 3, 5, 8, 11, 14, and 17, Liu (cited above; abstract, page 1, figure 1, section "2.3 Prompt Engineering"; provided via IDS) teaches ChatGPT (i.e., a generative machine learning model) in figure 1. These additional elements are in common use in this area. The claims recite both a mental process and human activity (a human analyzing data). Accordingly, the claims recite an abstract idea. The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, performing the “receiving”, “generating”, “performing”, “ranking”, “including”, and “returning” steps on generalized computer components amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claims are not patent eligible.
With respect to claims 2, 6, and 15, the claims recite “wherein the search representation and the search is performed according to a hybrid sparse and density-based retrieval technique and wherein the ranking of the candidate document portions according to the respective relevance analysis with the natural language request is performed according to a density-based ranking.”, which reads on a human analyzing document portions and ranking document chunks by relevance using a specific search and ranking technique. No additional limitations are present. With respect to claims 3, 8, and 17, the claims recite “receiving”, “splitting”, “parsing”, …, “starting at the beginning of individual ones of the documents”, and “storing”, which relate to a human analyzing data mentally and using pen and paper. A human can recognize each individual document, and a sentence can be split into tokens using delimiters or padded with special tokens, with the token size fixed or not. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claims are directed to an abstract idea. With respect to claims 4, 12, 13, and 20, the claims recite “wherein the natural language generative application service is implemented as part of a provider network and wherein the natural language request to perform the natural language task is received from a natural language generative application created and hosted at the natural language generative application service.”, which reads on extra-solution and post-solution activity at the same time. The claims are directed to an abstract idea. With respect to claims 7 and 16, the claims recite “wherein the different document portions are non-overlapping.”, which reads on a human understanding whether documents overlap. This is both human mental activity and human activity.
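For reference, the "hybrid sparse and density-based retrieval technique" recited in claims 2, 6, and 15 conventionally blends keyword matching (sparse) with embedding similarity (dense). A hedged sketch under simplifying assumptions — the overlap count stands in for a real BM25/TF-IDF scorer, and the blend weight `alpha` is illustrative, not claimed:

```python
import math

def sparse_score(query_terms, doc_terms):
    # sparse signal: keyword overlap (a stand-in for BM25/TF-IDF scoring)
    return float(len(set(query_terms) & set(doc_terms)))

def dense_score(qvec, dvec):
    # dense signal: cosine similarity between embedding vectors
    dot = sum(a * b for a, b in zip(qvec, dvec))
    nq = math.sqrt(sum(a * a for a in qvec))
    nd = math.sqrt(sum(b * b for b in dvec))
    return dot / (nq * nd) if nq and nd else 0.0

def hybrid_rank(query_terms, qvec, portions, alpha=0.5):
    """Blend both signals for retrieval, then order the candidates;
    with alpha=0 the final ordering is purely density-based."""
    scored = sorted(
        portions,
        key=lambda p: (alpha * sparse_score(query_terms, p["terms"])
                       + (1 - alpha) * dense_score(qvec, p["vec"])),
        reverse=True,
    )
    return [p["id"] for p in scored]
```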
The claims are directed to an abstract idea. With respect to claims 9 and 18, the claims recite “wherein storing the split individual ones of the plurality of documents includes storing document-wide metadata obtained from the plurality of documents.”, which reads on a human performing document analysis and document indexing with pen and paper and saving those documents. This judicial exception is not integrated into a practical application because the recitation of “one or more non-transitory, computer-readable storage media” in claim 18 reads as generalized computer components, based upon the claim interpretation wherein the structure is interpreted using paragraph [0100] of the filed specification. Further, the additional limitation “storing” in the claims is directed towards post-solution activity. The claims are not patent eligible. With respect to claims 10 and 19, the claims recite “wherein, in generating the ranking the candidate document portions according to a respective relevance analysis with the natural language request to perform the natural language task, the program instructions cause the one or more computing devices to implement distributing individual ones of the candidate document portions into respective buckets associated with different relevance confidences.”, which reads on a human assigning relevant documents and irrelevant documents to buckets. The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the additional element of program instructions causing one or more computing devices to implement the steps is a generalized computer component, based upon the claim interpretation wherein the structure is interpreted using paragraph [0100] of the filed specification.
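For context, the relevance-confidence bucketing recited in claims 10 and 19, and the minimum-count gate of claim 11, can be sketched as follows. The bucket edges, bucket names, and the minimum are illustrative assumptions, not values taken from the claims:

```python
def bucket_by_confidence(scored_portions, edges=(0.33, 0.66)):
    """Distribute candidate document portions into buckets associated
    with different relevance confidences (claims 10 and 19)."""
    buckets = {"low": [], "mid": [], "high": []}
    for portion, score in scored_portions:
        if score < edges[0]:
            buckets["low"].append(portion)
        elif score < edges[1]:
            buckets["mid"].append(portion)
        else:
            buckets["high"].append(portion)
    return buckets

def ok_to_prompt(buckets, minimum=2):
    """Claim 11's gate: require a minimum number of candidates outside
    the lowest relevance-confidence bucket before prompting the model."""
    return len(buckets["mid"]) + len(buckets["high"]) >= minimum
```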
Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Further, the additional limitations in the claims noted above are directed towards insignificant solution activity. The claims are not patent eligible. With respect to claim 11, the claim recites “…comprising determining that a minimum number of the candidate document portions are not in a lowest relevance confidence one of the respective buckets before prompting the generative machine learning model.”, which reads on a human comparing scores and deciding numerical values. This claim relates to a human analyzing data and a mathematical algorithm. The claim is directed to an abstract idea. These claims further do not remedy the judicial exception being integrated into a practical application and further fail to include additional elements that are sufficient to amount to significantly more than the judicial exception.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3, 4, 5, 8, 9, 10, 11, 12, 13, 14, 17, 18, 19, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Shih et al. (US 11321329 B1) in view of Mohanty et al. (US 9898457 B1).

Regarding Claim 1, Shih teaches:
A system, comprising: a plurality of computing devices, respectively implementing at least one processor and a memory, that implement a natural language generative application service, configured to:

Shih teaches (“(147) System memory 2220 may store instructions and data accessible by processor(s) 2210. …”, col. 24, lines 9-10; FIGS. 16-19). Shih teaches a question generation model (“(41) … such as through use of application programming interface (API) calls, via a console implemented as a website or application, etc. The interface(s) may be part of, or serve as a front-end to, a control plane (e.g., control plane 170) of the provider network 100 that includes “backend” services supporting and enabling the services that may be more directly offered to customers.”, col. 5, lines 20-29) and (“(125) FIG. 16 illustrates embodiments of a method for training and use of a question generation model. In certain embodiments, the training data to be generated is a set of one or more candidate questions from a third party's data (e.g., the third party's documents, passages, etc.) containing answers. Depicted method in FIG. 16 includes training of a question generation model at 1601. In one embodiment, the training of question generation model at 1601 includes training a (e.g., language) machine learning model with known question and answer pairs to predict a question from an answer at 1603. The known question and answer pairs may be data that does not include the third party's data (e.g., data not from a third party), e.g., this data may be public data, however, in some embodiments the pairs are provided by a third party. Examples of “known question and answer pairs” are within a MAchine Reading COmprehension (MARCO) dataset. The known question and answer pairs may be public data, e.g., in contrast to the user's private data (e.g., that is hosted in storage 109 in FIG. 1).”, col. 19, lines 38-55) (Shih et al., US 11321329 B1).

receive a natural language request to perform a natural language task;

Shih teaches that a request to provide an answer to a question using the question-answer model is received (“(107) A model may include an input of a search query for a search of ingested data (e.g., the user's documents) …”, col. 16, lines 13-25) and (“(83) FIG. 7 illustrates embodiments of the enterprise search service 102 used for providing ingestion functionality. The frontend 104 takes in intake requests, index creation requests, etc. and passes those requests to the ingestion service 130. The ingestion service 130 performs document validation on documents retrieved from data sources 105/106. In some embodiments, the ingestion service 130 co-ordinates various services to perform index creation, index updating, and other ingestion tasks detailed below. In other embodiments, the ingestion service places documents to be processed in a queue 735 which includes an extraction and pre-preprocess pipeline. intake requests ask that a set of documents be in taken such that the documents are acquired, indexed, pre-processed, stored, etc.”, col. 12, lines 30-42).

generate a search representation for the natural language request to perform the natural language task to obtain data from one or more data sets comprising a plurality of documents to perform the natural language task;

Shih teaches that queries can be more fine-tuned to perform better searches (“(48) FIG. 2 illustrates embodiments of the enterprise search service 102 used for providing inference functionality. In particular, the aspects shown may be used to respond to a search query on a set of documents. The frontend 104 takes in a search request (or query) and provides that request to an inference orchestrator 220 of the inference service 120.
In some embodiments, the inference orchestrator 220 utilizes one or more machine learning models to provide one or more of query suggestion, query normalization, and spelling correction. This may be performed using one or more ML models. As such, queries can be more fine-tuned to perform better searches.”) and (“(79) The search query is performed to generate one or more results at 603. …”) (col. 3, lines 40-52).

generate a ranking of the candidate document portions according to a respective relevance analysis with the natural language request to perform the natural language task;

Shih teaches (“(39) Examples of models utilized by the inference service 120 that are run in the model hosting service 110 include, but are limited to a question/answer (e.g., reading comprehension) which extracts answers from passages, a document/passage ranking model which sorts documents in an order of relevance with respect to the query, and a FAQ matching model which attempts to identify a correct the right answer for a given question from a given FAQ document.”, col. 4, lines 48-55) and (“(67) In some embodiments, a proper subset of the identified (fetched) set of documents are determined using a first machine learning model at 413. For example, in some embodiments, the fetched documents are reranked using a first model (e.g., DCN model) according to relevance scores and then a second model (e.g., BERT-based) looks at some top number of those reranked documents (e.g., 100) and uses top passages from the retrieved passages for those top documents to determine a relevance score per document. The relevance scores from the first and second models are combined to generate a set of top ranked documents. In other embodiments, only the reranking using the first model is performed.”).

select one or more of the candidate document portions according to the ranking as context for prompting a generative machine learning model trained to perform natural language tasks; and

Shih teaches (“(105) A model (e.g., at or after initial use) may have improved functionality with further training. The training may be based at least in part on feedback input 1106, e.g., feedback provided by a user. In certain embodiments, a model management service 150 is to pull the model (e.g., from model storage service 1104) and refresh it (e.g., based on feedback input 1106). Refreshing the model may include utilizing the feedback (e.g., from feedback input 1106 or other feedback) in a next training iteration of the model. A next version of the model formed from the next training iteration may then be used by saving the model to model hosting service 110, e.g., where the updated model is one or any combination of: document/passage ranking model(s) 212A or 212D, question/answer model 212B or 212E, and FAQ model 212C or 212F. Note that the 3rd party models 212D-F are trained using 3rd party data, or are provided by a 3rd party having already been trained.
A model refresh (or displaying of a proper subset of the data for labeling by the user in active learning) may be triggered when a confidence value (e.g., score) of a proper subset of the data (e.g., answers and/or documents) the model returns for a search query falls below a confidence threshold. Additionally or alternatively, a model refresh (or displaying of a proper subset of the data for labeling by the user in active learning) may be triggered in response to exceeding a confidence difference threshold for a difference between a first confidence score for a first section (e.g., a first, highest scored candidate answer or candidate document) of the proper subset of the data with respect to its relevance to the search query and a second confidence score for a second section (e.g., a second, next highest scored candidate answer or candidate document) of the proper subset of the data with respect to its relevance to the search query. A proper subset of the data (e.g., answers and/or documents) for presentation to the user (e.g., to be used for labeling by the user in active learning) may be selected based on a confidence value of the proper subset of the data the model returns for a search query.”, col. 15, lines 39-67 and col. 16, lines 1-5), (“(114) … the passage of surrounding text also included (e.g., to provide context to the user for their reading comprehension of the answer and its possible relevancy to the query). A link 1314 may also be included to the source document.”), and (“(132) FIG. 18 illustrates a second set of example candidate questions generated by a question generation model trained on known question and answer pairs. In FIG. 18, example synthetic question generation 1800 includes a known question 1802 (e.g., with an end of question token), a passage with a known answer 1804 to the known question 1802, and illustrates the candidate questions (e.g., and their end of question token <eoq>) generated by the model 1806 trained to generate synthetic questions from those two inputs 1802 and 1804. The model trained to do so may thus be used on a user's data to generate specific training data based on the user's (e.g., customer's) documents, e.g., instead of only being trained on other's documents. In one embodiment, a same model is used to generate questions 1806 in FIG. 18 and questions 1706 in FIG. 17. Trained model may then be used to generate training data that is subsequently used to train a second machine learning model (e.g., document/passage ranking model(s) 212A or 212D, question/answer model 212B or 212E, and FAQ model 212C or 212F), e.g., and the trained, second machine learning model used as discussed herein.”, col. 20, lines 58-67 and col. 21, lines 1-10).

return a response to the natural language request to perform the natural language task according to a result obtained from prompting the generative machine learning model.

Shih teaches (“(107) A model may include an input of a search query for a search of ingested data (e.g., the user's documents) and an output of a best answer from a plurality of answers from the data and/or an output of a best document from a plurality of documents from the data. Active learning may be utilized to train the model where a user is to request that the user indicate the desired output (e.g., answer(s) or document(s)) for an input (e.g., search query). …”, col. 16, lines 13-24) and (“(126) One example of a language machine learning (ML) model is a transformer-based language model that predicts the next word of a string of text based on the previous words within the string of text. …”) by Shih et al.
US 11321329 B1.

access an index generated for the one or more data sets and perform a search to return a number of candidate document portions based on respective similarity to the search representation,

Shih teaches (“(46) The processing of the request includes accessing one or more indexes 107 via the indexing service 140 at circle 3 to get identifiers of sets of documents to analyze, accessing the identified sets of documents (or text thereof) from document storage 109, and providing the documents (or text thereof) and the query to one or more machine learning models in the model hosting service 110 at circle 5 to determine one or more of top documents, a top passage, and/or a top FAQ.”, col. 6, lines 18-25) and paragraph (105) (quoted above; col. 15, lines 60-67 and col. 16, lines 1-8).

wherein the index comprises a plurality of entries corresponding to different document portions determined based on a number of tokens for splitting individual ones of the plurality of documents,

Shih teaches (“(33) The ingestion service 130 also extracts text from documents, pre-processes the extracted text (e.g., tokenize, normalize, and/or remove noise), and calls an indexing service to generate index entries for text, and causes the documents (or subset thereof) to be stored. The indexing service 140 indexes documents that have been acquired by the ingestion service 130 into one or more indexes 107. An index is a data structure of organized data that maps the data to a plurality of fields. Each document or subset of a document (e.g., passage) is identified with a unique identifier. In some embodiments, the index is comprised a plurality of JSON documents.”, col. 3, lines 40-52) (Shih et al., US 11321329 B1).

Shih does not explicitly teach a sliding window size or boundary and sentence tokens. Mohanty teaches:

splitting documents up to a threshold number of tokens without splitting a sentence of the plurality of documents into the different document portions

Mohanty teaches (“(21) The tokenization component 120, when executed, causes the processor 106 to segment the document into a plurality of sequential terms (e.g., clauses, phrases, sentences, and paragraphs). For each of the plurality of sentences, the tokenization component 120 parses the sentences into tokens based on a plurality of delimeters. Delimiters are generally divided into two groups: “white space” delimiters and “punctuation/special character” delimiters. White space delimiters include, for example, spaces, tabs, newlines, and carriage returns.
Punctuation/special character delimiters include, for example, non-alphanumeric characters such as a comma, a period, an exclamation mark, a percent sign, a plus sign, a parenthesis, a slash, an asterisk, an ampersand, a dollar sign, a number sign, a hyphen-minus, and the like. In one example, the list of delimiters is configurable and may be adjusted by a user/administrator to add or remove particular punctuation and/or special characters from the list of punctuations and special characters listed as delimiters. For example, an underscore and/or quotations may be removed given they are more closely related to/a part of many natural language text. Example 1 provided below illustrates a segmented sentence before and after it is tokenized with an underscore being removed from the list of delimeters.” Col. 5, Lines 60-67 and col. 6, lines 1-16) (“(42) To generate feature vectors with a predefined number of encoded tokens, the feature vector component 124 utilizes a sliding window with a boundary size equal to the predefined number used in the feature vectors. Thus, in Example 3 below, the boundary size of the sliding window used by the feature vector 124 is 5. Starting with the first encoded token in the segmented sentence, the sliding window “slides” (logically) over the encoded tokens until the number of encoded tokens equals the boundary size (e.g., 5) of the sliding window. For example, as shown in Example 3 below, the encoded token “KEYWORD-10” is the first encoded token in the segmented sentence. The sliding window “slides” (logically) over the encoded tokens until the number of encoded tokens equals the boundary size of the sliding window (e.g., 5). As shown in Example 3, Feature Vector 1 includes 5 encoded tokens, the first encoded token in Feature Vector 1 being “<KEYWORD-10>” and the last encoded token (the fifth encoded token in the segmented sentence) in Feature Vector 1 being “<TEXT>”. 
To generate another feature vector from the segmented sentence (e.g., Feature Vector 2 in Example 3), the sliding window “slides” (logically) over by one encoded token resulting in Feature Vector 2 shown in Example 3 below. This process is repeated until the sliding window encompasses the last encoded token in the segmented sentence (e.g., <NUM> in Example 3) creating a feature vector each time the sliding window “slides”.”, col. 9, lines 9-35) (Mohanty et al., US 9898457 B1).

Mohanty is considered to be analogous to the claimed invention because it relates to a computer-implemented method for detecting and removing non-natural language within natural language to enhance analysis of the natural language. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Shih to incorporate the teachings of Mohanty in order to include a sliding window size and token number. One could have been motivated to do so because the system can improve the accuracy at which artificial language can be separated from natural language text in a document, as well as improve a user's experience when evaluating the document, by providing appropriate formatting and visual delineation of the non-natural languages and allowing visual distinction to be added between the artificial language and the natural language text through text styling, spacing, and the like (“(7) … As such, smaller portions of artificial language (e.g., code fragments) are identifiable, thus improving the results of the content analysis. Further, by analyzing and identifying each pivot term in a sliding window as either artificial language or not artificial language, a defined boundary between natural language text and artificial language in the document can be created, improving the accuracy at which artificial language can be separated from the natural language text in the document as well as improving a user's experience when evaluating the document by providing appropriate formatting and visual delineation of the non-natural languages and allowing visual distinction to be added between the artificial language and the natural language text through text styling, spacing, and the like.”, col. 2, lines 25-50) (Mohanty et al., US 9898457 B1).

Claim 5 is a method claim with limitations similar to those of system Claim 1 and is rejected under a similar rationale. Additionally, regarding Claim 5, Shih teaches:

A method, comprising: including, by the generative machine learning system, one or more of the candidate document portions according to the ranking as context for prompting a generative machine learning model trained to perform natural language tasks; and

Shih teaches (“(122) In certain embodiment, the models discussed herein (e.g., document/passage ranking model(s) 212A or 212D, question/answer model 212B or 212E, and FAQ model 212C or 212F) are trained with a set of training data. Training data may include a question and a corresponding answer from a user's data (e.g., in contrast to public data or data from other enterprises). …”, col. 19, lines 1-7) and (“(126) One example of a language machine learning (ML) model is a transformer-based language model that predicts the next word of a string of text based on the previous words within the string of text.
Certain embodiments herein modify a language ML model used to predict the next word for each successive word of a string of text to predict each next (e.g., successive) word of a question for a given answer, e.g., to predict each successive word of a known question (e.g., a multiple word question) from its known answer (e.g., a multiple word answer). One example language ML model is a transformer model (e.g., a GPT-2 model) that is first trained on (e.g., a very large amount of) data in an unsupervised manner using language modeling as a training signal, and is second fine-tuned on much smaller supervised datasets (e.g., known questions and their corresponding known answers) to help it solve specific tasks.” Col. 19, lines 57-67 and col. 20, lines 1-5) by Shih et al. US 11321329 B1 Claim 14 is a non-transitory, computer-readable storage media claim with a limitation similar to the limitation of system Claim 1 and is rejected under similar rationale. Additionally, regarding Claim 14, Shih teaches: 14. One or more non-transitory, computer-readable storage media, storing program instructions that when executed on or across one or more computing devices cause the one or more computing devices to implement: Shih teaches (“(60) FIG. 4 illustrates embodiments of a method for document querying. Some or all of the operations (or other processes described herein, or variations, and/or combinations thereof) are performed under the control of one or more computer systems configured with executable instructions and are implemented as code (e.g., executable instructions, one or more computer programs, or one or more applications) executing collectively on one or more processors, by hardware or combinations thereof. The code is stored on a computer-readable storage medium, for example, in the form of a computer program comprising instructions executable by one or more processors. The computer-readable storage medium is non-transitory. …” Col.
8, lines 50-62) (“(77) … The code is stored on a computer-readable storage medium, for example, in the form of a computer program comprising instructions executable by one or more processors. The computer-readable storage medium is non-transitory. …”) by Shih et al. US 11321329 B1 Regarding Claim 3, the combination teaches the system of claim 1 as identified above. Shih further teaches: 3. The system of claim 1, wherein the natural language generative application service is further configured to: receive a request to add the one or more data sets for data retrieval to perform natural language requests using the generative machine learning model; Shih teaches (“(53) The inference orchestrator 220 retrieves the identified documents (e.g., an entire document, passage, or FAQ) from text/document storage 109 using document storage service 208. The retrieved documents are then supplied, along with aspects of the query, to one or more of the models 212A-F of the model hosting service 110 to identify one or more of: one or more top ranked documents, one or more top ranked passages, and/or one or more top ranked FAQs. Note that the models 212A-F provide confidence scores of their outputs. Note too that the document storage service 208 stores document artifacts that will be used at the time of inference to extract the answer for a given query.” Col. 7, lines 40-51) (“(83) … The ingestion service 130 performs document validation on documents retrieved from data sources 105/106. In some embodiments, the ingestion service 130 co-ordinates various services to perform index creation, index updating, and other ingestion tasks detailed below. … … … intake requests ask that a set of documents be in taken such that the documents are acquired, indexed, pre-processed, stored, etc. col. 7, lines 40-50”) by Shih et al. US 11321329 B1 store the split individual ones of the plurality of documents as the different portions of the plurality of documents in the index.
Shih teaches (“(86) When an index has been created, it can be updated. This updating may be in the form of a single update or batch update which causes an ingestion of text and unstructured text into an index, the addition of custom attributes to the documents (should it be desired), an attachment of an access control list to the documents added to the index, a storage of the text, a pre-processing of the text (and storage thereof), etc. In some embodiments, a update request includes one or more fields to inform the behavior of the indexing service 124, the document storage service 208, the queue 735 include extraction and pre-processing pipeline, and the connector service 180 and includes one or more fields such as a field for a location of one or more documents, a field for the documents themselves, a field for the index's name, a field for the role that gives permission for logs and metrics, etc. The extraction and pre-processing pipeline extracts text from documents and pre-processes them (e.g., tokenize, etc.). In some embodiments, the extracted text (e.g., tokens) is broken down into overlapping passages using sliding windows by the extraction and pre-process pipeline. The overlapping passages are then indexed and/or stored.”) by Shih et al. US 11321329 B1 split individual ones of the plurality of documents, wherein to split the documents, the natural language generative application service is configured to: parse the plurality of documents into tokens; and Shih teaches (“(86) … the queue 735 include extraction and pre-processing pipeline, and the connector service 180 and includes one or more fields such as a field for a location of one or more documents, a field for the documents themselves, a field for the index's name, a field for the role that gives permission for logs and metrics, etc. The extraction and pre-processing pipeline extracts text from documents and pre-processes them (e.g., tokenize, etc.). 
In some embodiments, the extracted text (e.g., tokens) is broken down into overlapping passages using sliding windows by the extraction and pre-process pipeline. The overlapping passages are then indexed and/or stored.” Col. 13, lines 6-17) by Shih et al. US 11321329 B1 starting at the beginning of individual ones of the documents and using a sliding window that specifies the threshold number of tokens, include tokens in a document portion up to the document; and Shih teaches the extracted text (e.g., tokens) is broken down into overlapping passages using sliding windows by the extraction and pre-process pipeline. The overlapping passages are then indexed and/or stored. (“(86) When an index has been created, it can be updated. This updating may be in the form of a single update or batch update which causes an ingestion of text and unstructured text into an index, the addition of custom attributes to the documents (should it be desired), an attachment of an access control list to the documents added to the index, a storage of the text, a pre-processing of the text (and storage thereof), etc. In some embodiments, a update request includes one or more fields to inform the behavior of the indexing service 124, the document storage service 208, the queue 735 include extraction and pre-processing pipeline, and the connector service 180 and includes one or more fields such as a field for a location of one or more documents, a field for the documents themselves, a field for the index's name, a field for the role that gives permission for logs and metrics, etc. The extraction and pre-processing pipeline extracts text from documents and pre-processes them (e.g., tokenize, etc.). In some embodiments, the extracted text (e.g., tokens) is broken down into overlapping passages using sliding windows by the extraction and pre-process pipeline. 
The overlapping passages are then indexed and/or stored.”) (“(130) In certain embodiments, the language ML model is also trained to detect an end of question (EOQ) token. In certain embodiments, the input to a language model to generate training data (e.g., generate questions for known answers) includes a passage having the answer and the question, for example, in the format of a beginning of service indicator (e.g., <bos>) followed by beginning of question indicator (e.g., <boq>), followed by the question, and then followed by an end of question indicator (e.g., <eoq>).” Col. 20, lines 36-44) (“(132) FIG. 18 illustrates a second set of example candidate questions generated by a question generation model trained on known question and answer pairs. In FIG. 18, example synthetic question generation 1800 includes a known question 1802 (e.g., with an end of question token), a passage with a known answer 1804 to the known question 1802, and illustrates the candidate questions (e.g., and their end of question token <eoq>) generated by the model 1806 trained to generate synthetic questions from those two inputs 1802 and 1804. The model trained to do so may thus be used on a user's data to generate specific training data based on the user's (e.g., customer's) documents, e.g., instead of only being trained on other's documents. In one embodiment, a same model is used to generate questions 1806 in FIG. 18 and questions 1706 in FIG. 17. Trained model may then be used to generate training data that is subsequently used to train a second machine learning model (e.g., document/passage ranking model(s) 212A or 212D, question/answer model 212B or 212E, and FAQ model 212C or 212F), e.g., and the trained, second machine learning model used as discussed herein.” col. 20, lines 58-67 and col. 21, lines 1-11) by Shih et al. US 11321329 B1 Shih does not explicitly teach parsing each sentence into tokens based on delimiters.
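The sliding-window passage extraction described in the Shih paragraphs quoted above can be illustrated with a short sketch. This is an editorial illustration only, not code from either cited patent; the function name, window size, and stride are hypothetical choices.

```python
# Illustrative only: break a token sequence into overlapping passages
# with a sliding window, in the manner the quoted Shih paragraph (86)
# describes. Window and stride values are hypothetical.
def overlapping_passages(tokens, window=5, stride=2):
    """Return windows of `window` tokens, advancing `stride` tokens each time."""
    if len(tokens) <= window:
        return [tokens]
    passages = []
    for start in range(0, len(tokens) - window + 1, stride):
        passages.append(tokens[start:start + window])
    # ensure the final tokens are covered even if the stride overshoots
    if passages[-1][-1] != tokens[-1]:
        passages.append(tokens[-window:])
    return passages

tokens = "the extraction pipeline breaks text into overlapping passages".split()
print(overlapping_passages(tokens, window=5, stride=2))
```

Because consecutive passages share tokens, a query term falling near a passage boundary still appears with surrounding context in at least one indexed passage, which is the benefit the quoted pipeline description implies.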
Mohanty teaches: threshold number of tokens without splitting sentence of the individual ones of the document. Mohanty teaches (“(21) The tokenization component 120, when executed, causes the processor 106 to segment the document into a plurality of sequential terms (e.g., clauses, phrases, sentences, and paragraphs). For each of the plurality of sentences, the tokenization component 120 parses the sentences into tokens based on a plurality of delimiters. Delimiters are generally divided into two groups: “white space” delimiters and “punctuation/special character” delimiters. White space delimiters include, for example, spaces, tabs, newlines, and carriage returns. Punctuation/special character delimiters include, for example, non-alphanumeric characters such as a comma, a period, an exclamation mark, a percent sign, a plus sign, a parenthesis, a slash, an asterisk, an ampersand, a dollar sign, a number sign, a hyphen-minus, and the like. In one example, the list of delimiters is configurable and may be adjusted by a user/administrator to add or remove particular punctuation and/or special characters from the list of punctuations and special characters listed as delimiters. For example, an underscore and/or quotations may be removed given they are more closely related to/a part of many natural language text. Example 1 provided below illustrates a segmented sentence before and after it is tokenized with an underscore being removed from the list of delimiters.” Col. 5, lines 60-67 and col. 6, lines 1-16) (“(42) To generate feature vectors with a predefined number of encoded tokens, the feature vector component 124 utilizes a sliding window with a boundary size equal to the predefined number used in the feature vectors. Thus, in Example 3 below, the boundary size of the sliding window used by the feature vector 124 is 5.
Starting with the first encoded token in the segmented sentence, the sliding window “slides” (logically) over the encoded tokens until the number of encoded tokens equals the boundary size (e.g., 5) of the sliding window. For example, as shown in Example 3 below, the encoded token “KEYWORD-10” is the first encoded token in the segmented sentence. The sliding window “slides” (logically) over the encoded tokens until the number of encoded tokens equals the boundary size of the sliding window (e.g., 5). As shown in Example 3, Feature Vector 1 includes 5 encoded tokens, the first encoded token in Feature Vector 1 being “<KEYWORD-10>” and the last encoded token (the fifth encoded token in the segmented sentence) in Feature Vector 1 being “<TEXT>”. To generate another feature vector from the segmented sentence (e.g., Feature Vector 2 in Example 3), the sliding window “slides” (logically) over by one encoded token resulting in Feature Vector 2 shown in Example 3 below. This process is repeated until the sliding window encompasses the last encoded token in the segmented sentence (e.g., <NUM> in Example 3) creating a feature vector each time the sliding window “slides”.” Col. 9, lines 9-35) by Mohanty et al. US 9898457 B1 Mohanty is considered to be analogous to the claimed invention because it relates to a computer-implemented method for detecting and removing non-natural language within natural language to enhance analysis of the natural language. Therefore, it would have been obvious for someone of ordinary skill in the art before the effective filing date of the claimed invention to modify Shih to incorporate the teachings of Mohanty in order to include a sliding window size and token number.
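The feature-vector generation quoted from Mohanty above (a window of fixed boundary size sliding one encoded token at a time) can be sketched as follows. This is an editorial illustration, not the patent's code; the token strings are placeholders rather than Mohanty's actual Example 3.

```python
# Illustrative only: slide a fixed-size window one encoded token at a
# time across a segmented sentence, emitting one feature vector per
# window position (boundary size 5, as in the quoted Example 3).
def feature_vectors(encoded_tokens, boundary_size=5):
    return [encoded_tokens[i:i + boundary_size]
            for i in range(len(encoded_tokens) - boundary_size + 1)]

# Placeholder encoded tokens, not taken from the patent's Example 3.
sentence = ["<KEYWORD>", "<TEXT>", "<NUM>", "<TEXT>", "<TEXT>", "<NUM>", "<TEXT>"]
for vector in feature_vectors(sentence):
    print(vector)  # 7 tokens with boundary size 5 yield 3 feature vectors
```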
One could have been motivated to do so because the system can improve the accuracy at which artificial language can be separated from the natural language text in the document as well as improve a user's experience when evaluating the document by providing appropriate formatting and visual delineation of the non-natural languages and allowing visual distinction to be added between the artificial language and the natural language text through text styling, spacing, and the like. (“(7) … As such, smaller portions of artificial language (e.g., code fragments) are identifiable, thus improving the results of the content analysis. Further, by analyzing and identifying each pivot term in a sliding window as either artificial language or not artificial language, a defined boundary between natural language text and artificial language in the document can be created, improving the accuracy at which artificial language can be separated from the natural language text in the document as well as improving a user's experience when evaluating the document by providing appropriate formatting and visual delineation of the non-natural languages and allowing visual distinction to be added between the artificial language and the natural language text through text styling, spacing, and the like.” Col. 2, lines 25-50) by Mohanty et al. US 9898457 B1 Claim 8 is a method claim with a limitation similar to the limitation of system Claim 3 and is rejected under similar rationale. Claim 17 is a non-transitory, computer-readable storage media claim with a limitation similar to the limitation of system Claim 3 and is rejected under similar rationale. Regarding Claim 9, Shih teaches the method of claim 8 as identified above. Shih further teaches: 9. The method of claim 8, wherein storing the split individual ones of the plurality of documents includes storing document-wide metadata obtained from the plurality of documents.
Shih teaches the metadata 210 may provide the physical location of those indexes 107 (i.e., storing document-wide metadata, location of indexed document) and the document storage service 208 also pulls documents from the queue 735 to store documents and chunks thereof in text/document storage 109. (“(84) As shown, the ingestion service 130 is coupled to a plurality of services. The connector service 180 receives (either as a push or a pull) documents from data sources 105/106 where the physical location may be provided by metadata 210. The indexing service 124 pulls documents and/or text from the queue 735 (which may be pre-processed) to create or update indexes 107 associated with documents (including passages and FAQs). The metadata 210 may provide the physical location of those indexes 107. The document storage service 208 also pulls documents from the queue 735 to store documents and chunks thereof in text/document storage 109.” col. 12, lines 43-54) (“(86) … The extraction and pre-processing pipeline extracts text from documents and pre-processes them (e.g., tokenize, etc.). The extracted text (e.g., tokens) is broken down into overlapping passages using sliding windows by the extraction and pre-process pipeline. The overlapping passages are then indexed and/or stored.” col. 13, lines 10-17) (“(92) Text is extracted from the acquired one or more documents and pre-processed at 809. Metadata may also be extracted. For example, text may be extracted from a document that includes non-text such as images. The pre-processing of the extracted text includes one or more of tokenizing, normalizing, and/or removing noise. This is performed by the extraction and pre-processing pipeline 735 per document acquired. Note that extracted text may include passages.” Col. 13, lines 50-59) by Shih et al.
US 11321329 B1 Claim 18 is a non-transitory, computer-readable storage media claim with a limitation similar to the limitation of method Claim 9 and is rejected under similar rationale. Regarding Claim 4, Shih teaches the system of claim 1 as identified above. Shih further teaches: 4. The system of claim 1, wherein the natural language generative application service is implemented as part of a provider network and Shih teaches (“(41) … Users may interact with a provider network 100 across one or more intermediate networks 101 (e.g., the internet) via one or more interface(s), such as through use of application programming interface (API) calls, via a console implemented as a website or application, etc. The interface(s) may be part of, or serve as a front-end to, a control plane (e.g., control plane 170) of the provider network 100 that includes “backend” services supporting and enabling the services that may be more directly offered to customers.” Col. 5, lines 20-29) wherein the natural language request to perform the natural language task is received from a natural language generative application created and Shih teaches a create index API call is received by the frontend 104 which calls the indexing service. (“(85) Prior to updating an index, it needs to be created. In some embodiments, a create index API call is received by the frontend 104 which calls the indexing service 124 to generate an index of indexes 107. The create index request includes one or more fields to inform the behavior of the indexing service 124 such as a field for a description of the index, field for an index name, a field for the role that gives permission for logs and metrics, a field identifying an encryption key, etc.” col. 12, lines 55-63) by Shih et al. US 11321329 B1 hosted at the natural language generative application service.
Shih teaches (“(52) The result (e.g., document identifiers) of various index queries are received by the inference orchestrator 220 to use to retrieve one or more documents for use by one or more machine learning models (e.g., FAQ model 212C, question/answer model 212B, document/passage ranking model(s) 212A, third-party FAQ model 212F, third-party question/answer model 212E, third-party document/passage ranking model(s) 212D, and/or other language models 212G) hosted by the model hosting service 110. …” col. 7, lines 22-30) by Shih et al. US 11321329 B1 Regarding Claim 10, Shih teaches the method of claim 5 as identified above. Shih further teaches: 10. The method of claim 5, wherein ranking the candidate document portions according to a respective relevance analysis with the natural language request to perform the natural language task comprises distributing individual ones of the candidate document portions into respective buckets associated with different relevance confidences. Shih teaches a proper subset (i.e., respective buckets), a question/answer (e.g., reading comprehension) model which extracts answers from passages, and a document/passage ranking model which sorts documents in an order of relevance with respect to the query. (“Abstract … returning one or more of the top ranked passage and the proper subset of documents”). (“(39) Examples of models utilized by the inference service 120 that are run in the model hosting service 110 include, but are not limited to a question/answer (e.g., reading comprehension) which extracts answers from passages, a document/passage ranking model which sorts documents in an order of relevance with respect to the query, …” col. 4, lines 48-53) (“(67) … The relevance scores from the first and second models are combined to generate a set of top ranked documents. In other embodiments, only the reranking using the first model is performed.” col.
10, lines 12-15 ) (“(105) … A model refresh (or displaying of a proper subset of the data for labeling by the user in active learning) may be triggered when a confidence value (e.g., score) of a proper subset of the data (e.g., answers and/or documents) the model returns for a search query falls below a confidence threshold. Additionally or alternatively, a model refresh (or displaying of a proper subset of the data for labeling by the user in active learning) may be triggered in response to exceeding a confidence difference threshold for a difference between a first confidence score for a first section (e.g., a first, highest scored candidate answer or candidate document) of the proper subset of the data with respect to its relevance to the search query and a second confidence score for a second section (e.g., a second, next highest scored candidate answer or candidate document) of the proper subset of the data with respect to its relevance to the search query. A proper subset of the data (e.g., answers and/or documents) for presentation to the user (e.g., to be used for labeling by the user in active learning) may be selected based on a confidence value of the proper subset of the data the model returns for a search query.” col. 15, Lines 55-67 & col. 16, Lines 1-8) (“(112) Active learning at 1207 includes generating a confidence score (e.g., by the machine learning model) based on the result of the search at 1209, selecting a proper subset of the data based at least in part on a confidence score of the proper subset of the data at 1211, displaying the proper subset of the data to the user at 1213, receiving an indication from the user of one or more sections of the proper subset of the data for use in a next training iteration of the machine learning model for the search query at 1215, performing the next training iteration of the machine learning model with the one or more sections of the proper subset of the data at 1217, …”col. 
17, lines 15-25) by Shih et al. US 11321329 B1 Claim 19 is a non-transitory, computer-readable storage media claim with a limitation similar to the limitation of method Claim 10 and is rejected under similar rationale. Regarding Claim 11, the combination teaches the method of claim 10 as identified above. Shih further teaches: 11. The method of claim 10, further comprising determining that a minimum number of the candidate document portions are not in a lowest relevance confidence one of the respective buckets before prompting the generative machine learning model. Shih teaches (“(67) In some embodiments, a proper subset of the identified (fetched) set of documents are determined using a first machine learning model at 413. For example, in some embodiments, the fetched documents are reranked using a first model (e.g., DCN model) according to relevance scores and then a second model (e.g., BERT-based) looks at some top number of those reranked documents (e.g., 100) and uses top passages from the retrieved passages for those top documents to determine a relevance score per document. The relevance scores from the first and second models are combined to generate a set of top ranked documents. In other embodiments, only the reranking using the first model is performed.” entire col. 10) (“(68) In some embodiments, a proper subset of the identified (and fetched) set of passages is identified using a second machine learning model based upon the query and fetched passages at 417. This proper subset is a reranking of the passages. This may be the same model as the BERT-based model detailed as being used at 413. This reranked subset is provided to a third model (along with aspects of the query) which determines a top passage from the reranked subset at 419.
The third model is a BERT-based model in some embodiments.” entire col. 10) (“(69) In some embodiments, a proper subset of the identified (and fetched) set of FAQS is determined using a fourth machine learning model on the fetched FAQs and the query at 421. This proper subset includes the top ranked FAQ. In some embodiments, the fourth machine learning model is BERT-based.” entire col. 10) by Shih et al. US 11321329 B1 Regarding Claim 12, the combination teaches the method of claim 5 as identified above. Shih further teaches: 12. The method of claim 5, wherein the generative machine learning system is a natural language generative application service and Shih teaches (“(52) The result (e.g., document identifiers) of various index queries are received by the inference orchestrator 220 to use to retrieve one or more documents for use by one or more machine learning models (e.g., FAQ model 212C, question/answer model 212B, document/passage ranking model(s) 212A, third-party FAQ model 212F, third-party question/answer model 212E, third-party document/passage ranking model(s) 212D, and/or other language models 212G) hosted by the model hosting service 110. As noted above, the third-party FAQ model 212F, the third-party question/answer model 212E, and the third-party document/passage ranking model(s) 212D may be provided by a third-party or be versions of the FAQ model 212C, question/answer model 212B, document/passage ranking model(s) 212A trained using data provided by the third-party. Which model to use may be dictated by who is making the request, where the request is coming from (e.g., which webpage), what documents are being searched, etc.”) by Shih et al. US 11321329 B1 wherein the natural language request to perform the natural language task is received from a natural language generative application created at the natural language generative application service and Shih teaches that the request includes accessing documents and providing documents (i.e.
natural language request to perform the natural language task) (“(46) The processing of the request includes accessing one or more indexes 107 via the indexing service 140 at circle 3 to get identifiers of sets of documents to analyze, accessing the identified sets of documents (or text thereof) from document storage 109, and providing the documents (or text thereof) and the query to one or more machine learning models in the model hosting service 110 at circle 5 to determine one or more of top documents, a top passage, and/or a top FAQ.” Col. 6, lines 18-25) (“(85) Prior to updating an index, it needs to be created. In some embodiments, a create index API call is received by the frontend 104 which calls the indexing service 124 to generate an index of indexes 107. The create index request includes one or more fields to inform the behavior of the indexing service 124 such as a field for a description of the index, field for an index name, a field for the role that gives permission for logs and metrics, a field identifying an encryption key, etc.” col. 12, lines 55-63) by Shih et al. US 11321329 B1 Claim 20 is a non-transitory, computer-readable storage media claim with a limitation similar to the limitation of method Claim 12 and is rejected under similar rationale. Regarding Claim 13, the combination teaches the method of claim 12 as identified above. Shih further teaches: 13. The method of claim 12, wherein the index is created in response to a request received at the natural language generative application service and associated with the natural language generative application. Shih teaches a create index API call (i.e., natural language generative application service) is received by the frontend 104 which calls the indexing service. (“(85) Prior to updating an index, it needs to be created. In some embodiments, a create index API call is received by the frontend 104 which calls the indexing service 124 to generate an index of indexes 107.
The create index request includes one or more fields to inform the behavior of the indexing service 124 such as a field for a description of the index, field for an index name, a field for the role that gives permission for logs and metrics, a field identifying an encryption key, etc.” col. 12, lines 55-63) by Shih et al. US 11321329 B1 Claims 2, 6 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Shih and Mohanty in view of Jiao et al. US 12111837 B1. Regarding Claim 2, the combination teaches the system of claim 1 as identified above. The combination does not explicitly teach wherein the search representation and the search is performed according to a sparse retrieval technique and wherein the ranking of the candidate document portions according to the respective relevance analysis with the natural language request is performed according to a density-based ranking. Jiao teaches: 2. The system of claim 1, wherein the search representation and the search is performed according to a sparse retrieval technique and Jiao teaches (“(23) In other examples, the dense retriever 212 can be configured to assign similarity scores to only a subset of the potential results in the results pool 210. For instance, the dense retrieval component can further include a sparse retriever 221 (e.g., that identifies potential search results according to any of various techniques) that is configured to identify a subset of the potential results 210 based upon the query received by the server computing system 200. In some embodiments, the dense retrieval component 202 can be configured to assign similarity scores only to the subset of the potential results 210 identified by the sparse retriever 221.” Col. 6, lines 58-68, col. 7, lines 1-10) by Jiao et al. US 12111837 B1. Jiao teaches: wherein the ranking of the candidate document portions according to the respective relevance analysis with the natural language request is performed according to a density-based ranking.
Jiao teaches (“(35) It is to be understood that the ranked list of search results returned by the dense retrieval component 202 can be a subset of the candidate search results identified by the dense retriever 212. In other words, although the ranker 214 may be configured to generate a ranking of all of the candidate search results provided to the ranker 214 by the dense retriever 212, the dense retrieval component 202 need not return an indication of the complete ranking to the originator of the query. In some instances, for example, the dense retrieval component 202 can return a single search result (e.g., from among the candidate search results identified by the dense retriever 212 and ranked by the ranker 214) based upon that search result having a higher rank than any other search results. In another example, the dense retrieval component 202 can return a single search result based upon that search result having a score, generated by the cross-encoder 226, that is a threshold greater than the score of any other candidate search result scored by the cross-encoder 226.”) by Jiao et al. US 12111837 B1. Jiao is considered to be analogous to the claimed invention because it relates to computer-based search and retrieval, which is employed to assist with a range of tasks including automated open-domain question answering, retrieval and display of additional content related to content already being viewed by a user, or retrieval of documents related to a user-specified query. Therefore, it would have been obvious for someone of ordinary skill in the art before the effective filing date of the claimed invention to modify Shih and Mohanty to incorporate the teachings of Jiao in order to include a hybrid sparse and density-based retrieval technique. One could have been motivated to do so because the system can achieve the benefits of computer-based search and retrieval over large datasets.
(“(14) … Technologies described herein can therefore retain the benefits of computer-based search and retrieval over large datasets while mitigating the disadvantages of potentially lower relevance of search results than could be achieved by a human reviewer. …” col. 3, lines 47-42) by Jiao et al., US 12111837 B1.

Claim 6 is a method claim with a limitation similar to the limitation of system Claim 2 and is rejected under a similar rationale.

Claim 15 is a non-transitory computer-readable storage media claim with a limitation similar to the limitation of system Claim 2 and is rejected under a similar rationale.

Additionally, regarding Claim 15, Jiao teaches: 15. The one or more non-transitory, computer-readable storage media of claim 14, wherein the search representation and the search is performed according to a hybrid sparse and density-based retrieval technique. Jiao teaches (“(29) The dense retriever 212 provides the selected candidate search results to the ranker 214. The ranker 214 is configured to generate a ranking of the candidate search results selected by the dense retriever 212. Thus, the dense retriever 212 identifies a subset of the potential search results in the search results pool 210 as being candidate search results, whereas the ranker 214 generates a ranking of the previously-identified subset of candidate search results.” col. 8, lines 28-35) (“(42) …. In various embodiments, the training component 308 can include a sparse retriever 316 that is configured to retrieve a pool of candidate hard negative results based upon a training query in the training queries 312.
In an exemplary embodiment, the sparse retriever 316 can be configured to retrieve, from the results pool 310, results that include one or more features in common with a query (e.g., results that include the word “dog” where the query includes the word “dog” or results that are labeled as pertaining to a demographic category where the representation of the query includes a feature value indicating that a user belongs to the demographic category). The candidate hard negative results retrieved by the sparse retriever 316 can then be employed by the training component 308 as hard negative results with respect to the training query when warming up the query encoder 216 and the result encoder 218. In some embodiments, the set of results {r.sub.1, r.sub.2, . . . , r.sub.i} employed to warm up the query encoder 216 and the result encoder 218 can be composed of the positive result r.sub.1 and the remainder of the results r.sub.2, . . . , r.sub.i can consist of hard negative results (e.g., as selected by the sparse retriever).” Col. 11, lines 5-26) by Jiao et al., US 12111837 B1.

Claims 7 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Shih and Mohanty in view of Alayrac et al., US 20230350936 A1.

Regarding Claim 7, the combination teaches the system of claim 5 as identified above. The combination does not explicitly teach that the different document portions are non-overlapping.

Alayrac teaches: 7. The method of claim 5, wherein the different document portions are non-overlapping. Alayrac teaches (“[0122] … the interaction term is generated based on compressed representations of only a (proper) subset of the data items in query input and only a (proper) sub-set of the prompt input. For example, the query input may comprise (or consist of) a plurality of (non-overlapping) portions which each include exactly one of data items and may also include a plurality of the tokens of the input tokens string. …”) by Alayrac et al.
US 20230350936 A1.

Alayrac is considered to be analogous to the claimed invention because it relates to a neural network configured to process a multi-mode query input (e.g., a mixture of text and sound/image(s)) to generate an output which is a response to the query input.

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Shih and Mohanty to incorporate the teachings of Alayrac in order to include the non-overlapping text feature. One would have been motivated to do so because the system can provide an accurate way of processing multi-mode query inputs (i.e., ones composed both of tokens and data items of another, different modality). (“[0057] A technical effect of the present disclosure is to make possible the provision of a query processing system which has been demonstrated experimentally to provide an accurate way of processing multi-mode query inputs (i.e. ones composed both of tokens and data items of another, different modality). The accuracy has been demonstrated to be high even compared to known computer systems which have been designed for specific multi-mode query processing tasks. It is high even for zero-shot query inputs, and becomes higher for few-shot query inputs.”) by Alayrac et al., US 20230350936 A1.

Claim 16 is a non-transitory computer-readable storage media claim with a limitation similar to the limitation of method Claim 7 and is rejected under a similar rationale.

Conclusion

THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.
In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to FOUZIA HYE SOLAIMAN whose telephone number is (571) 270-5656. The examiner can normally be reached M-F (8-5).

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Paras D. Shah, can be reached at (571) 270-1650. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/F.H.S./
Examiner, Art Unit 2653

/Paras D Shah/
Supervisory Patent Examiner, Art Unit 2653

02/21/2026
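For context on the claim 7 dispute above, "non-overlapping document portions" corresponds to splitting an indexed document into consecutive chunks in which no word appears in more than one portion, in contrast to sliding-window chunking where adjacent portions share content. A minimal sketch, assuming a simple fixed-size word window (the function name and window size are illustrative, not taken from the Alayrac reference or the application):

```python
def split_non_overlapping(text, max_words=8):
    """Split a document into consecutive, non-overlapping word-window portions.

    Each word lands in exactly one portion; a sliding-window (overlapping)
    splitter would instead step by less than max_words so neighbors share words.
    """
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]

doc = " ".join(f"w{i}" for i in range(20))  # 20-word toy document
portions = split_non_overlapping(doc, max_words=8)
print(len(portions))  # → 3 portions of 8, 8, and 4 words
```

Because the portions partition the document, rejoining them reproduces the original word sequence exactly, which is one way to check that a chunker is in fact non-overlapping.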

Prosecution Timeline

Sep 28, 2023
Application Filed
Aug 07, 2025
Non-Final Rejection — §101, §103
Nov 13, 2025
Response Filed
Feb 21, 2026
Final Rejection — §101, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12592217
SYSTEM AND METHOD FOR SPEECH PROCESSING
2y 5m to grant Granted Mar 31, 2026
Patent 12579976
USER TERMINAL, DIALOGUE MANAGEMENT SYSTEM, CONTROL METHOD OF USER TERMINAL, AND DIALOGUE MANAGEMENT METHOD
2y 5m to grant Granted Mar 17, 2026
Patent 12555563
SYSTEMS AND METHODS FOR CHARACTER-TO-PHONE CONVERSION
2y 5m to grant Granted Feb 17, 2026
Patent 12542149
METHOD AND APPARATUS FOR IMPROVING SPEECH INTELLIGIBILITY IN A ROOM
2y 5m to grant Granted Feb 03, 2026
Patent 12537017
COMPUTERIZED SCORING METHOD OF FEATURE EXTRACTION-BASED FOR COVERTNESS OF IMITATED MARINE MAMMAL SOUND SIGNAL
2y 5m to grant Granted Jan 27, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

3-4
Expected OA Rounds
67%
Grant Probability
99%
With Interview (+55.5%)
3y 0m
Median Time to Grant
Moderate
PTA Risk
Based on 63 resolved cases by this examiner. Grant probability derived from career allow rate.
