Last updated: May 29, 2026
Application No. 18/666,627
DYNAMIC PARALLEL NESTED LLM PROMPTS WITH STREAMING ACTIONS

Non-Final OA §101 §102 §103 §112
Filed: May 16, 2024
Examiner: SCHMIEDER, NICOLE A K
Art Unit: 2659
Tech Center: 2600 (Communications)
Assignee: Soundhound AI Ip LLC
OA Round: 1 (Non-Final)
Interview Optional

Examiner has a relatively high allowance rate (68%) and a +34.4% interview lift. A written response may suffice.
Based on 169 resolved cases, 2023–2026
Examiner Intelligence

SCHMIEDER, NICOLE A K
Career Allowance Rate: 68% (above average), 115 granted / 169 resolved, +6.0% vs TC avg
Interview Lift: +34.4% (strong)
Avg Prosecution (typical timeline): 2y 8m
Currently pending: 14
Total Applications (career history): 193, across all art units
Statute-Specific Performance

§101: 7.0% (-33.0% vs TC avg)
§103: 88.6% (+48.6% vs TC avg)
§102: 2.2% (-37.8% vs TC avg)
§112: 1.8% (-38.2% vs TC avg)
Based on career data from 169 resolved cases
Office Action

DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim(s) 1-24 is/are pending and has/have been examined.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 09/19/2025 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.
Drawings
The drawings are objected to because of the following informalities: Fig. 3 - element 210 is not in the spec.  Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.
Claim Objections
Claim 1 is objected to because of the following informalities: claim 1 recites “have not start condition to be satisfied”, which the Examiner believes to be a typo. For the sake of compact prosecution, the limitation will be interpreted as --have no start condition to be satisfied--. Appropriate correction is required.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: a parallel processing engine in claim 1, and a nested loop engine in claims 1 and 15.
Regarding the terms parallel processing engine and nested loop engine, the terms are generic placeholders. There is no evidence that one of ordinary skill in the art would understand the structure by looking at the terms. Further, the terms are modified by the functional language “configured to”, but are not modified by sufficient structure for performing the claimed function.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
The parallel processing engine and nested loop engine are embodied as processors, per the specification [0017-19].
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.
	
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claim 15 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
Claim 15 recites “the nested loop engine”. There is insufficient antecedent basis for this limitation in the claim.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-24 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Regarding claim(s) 1, 13, and 20, the limitations of receive, divide, send, and send, as drafted, are processes that, under the broadest reasonable interpretation, cover performance of the limitation in the mind and/or with pen and paper but for the recitation of generic computer components, as well as following rules or instructions. More specifically, the limitations read on the mental process and interaction of a human receiving a query, writing out the query in sub-sections, and handing the different sub-sections to different people, where a first group determines its responses concurrently and then hands its results to a second group, whose response requires input from the answer to a query sub-section in the first group. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind and/or with pen and paper but for the recitation of generic computer components, or can include following rules or instructions, then it falls within the --Mental Processes-- and --Methods of Organizing Human Activity-- groupings of abstract ideas. Accordingly, the claim(s) recite(s) an abstract idea.
This judicial exception is not integrated into a practical application because the recitation of a system, parallel processing engine, and nested loop engine in claim 1, and system, memory, and processors in claim 13, reads on generalized computer components, based upon the claim interpretation wherein the structure is interpreted using [0016-20] in the specification. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim(s) is/are directed to an abstract idea.
The claim(s) do(es) not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the additional element of using generalized computer components to receive, divide, send, and send, amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claim(s) is/are not patent eligible.
	
With respect to claim(s) 2, the claim(s) recite(s) the engines are integrated together in a single application program, which reads on multiple humans working together to answer a query. No additional limitations are present.

With respect to claim(s) 3, 15, and 21, the claim(s) recite(s) processing token groups recursively, which reads on the humans repeating their work to determine a potentially updated response to their sub-section based on their work on the sub-section or someone else’s work on a different sub-section. No additional limitations are present.

With respect to claim(s) 4 and 16, the claim(s) recite(s) one of the token groups has dynamic states, which reads on a human having a sub-section of the query whose processing they will need to change based on the results of processing their own or the other sub-sections. No additional limitations are present.

With respect to claim(s) 5-9, the claim(s) recite(s) features of the token groups, which reads on a query being divided into sub-sections having specific characteristics. No additional limitations are present.

With respect to claim(s) 10 and 18, the claim(s) recite(s) information present in a token group, which reads on a human writing out a sub-section that includes information on what to do when the response is determined. No additional limitations are present.

With respect to claim(s) 11, 17, 19, 22, 23, and 24, the claim(s) recite(s) ((claims 11, 19, 23, and 24) performing the action) ((claims 11, 17, 19 and 23) upon satisfaction of a condition) ((claims 11, 17, 19, and 24) prior to completion of processing) the token group, which reads on a human stopping their work on their sub-section when a particular piece of information is retrieved and performing the required action using that information. No additional limitations are present.

With respect to claim(s) 12, the claim(s) recite(s) not processing a token group, which reads on a human not working on their sub-section until they have a piece of information they need. No additional limitations are present.
With respect to claim(s) 14, the claim(s) recite(s) send a third token group for processing in parallel, which reads on a human handing an additional sub-section to a third human to review and determine a response while the first human is reviewing the first sub-section. No additional limitations are present.

These claims further do not remedy the judicial exception being integrated into a practical application and further fail to include additional elements that are sufficient to amount to significantly more than the judicial exception.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA  to pre-AIA ) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claim(s) 1, 2, 4-10, 13, 14, 16, 18, 20, and 23 is/are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Pathak et al. (US PG Pub No. 2025/0139136), hereinafter Pathak.

Regarding claims 1, 13, and 20, Pathak teaches
(claim 1) A system for processing prompts to a large language model (LLM) (a computing system [0010],[0022]), comprising:
(claim 13) A system for processing prompts to a large language model (LLM) (a computing system [0010],[0022]), comprising:
(claim 13) a memory for storing software code (the system includes storage media retaining instructions [0118-9]);
(claim 13) one or more processors configured to execute the software code to (the system includes processors for executing the instructions [0118-22]):
(claim 20) A method of processing prompts to a large language model, comprising the steps of (method [0008]):

((claim 1) a parallel processing engine configured to) receive a prompt, divide the prompt into two or more token groups, and send first ((claims 1 and 20) and second token groups) of the two or more token groups for processing by the LLM ((claims 1 and 20) in parallel where the first and second token groups have not start condition to be satisfied) (a user inputs a query into the system, i.e. parallel processing engine configured to receive a prompt, where the system partitions a single long query into plural component queries, each of which expresses a part of the original query, and where the queries are tokenized such that a prompt is a sequence of tokens submitted to a model, i.e. divide the prompt into two or more token groups, where the language model processes the sub-streams of tokens associated with the component queries in parallel, i.e. send first and second token groups of the two or more token groups for processing by the LLM in parallel, where first-stage queries are processed, i.e. the first and second token groups have not start condition to be satisfied, followed by second-stage queries based on the response for the first-stage queries, i.e. (claim 13) send a first token group of the two or more token groups to the LLM for processing Figs. 1 and 2,[0022-3],[0025-7],[0029],[0041],[0050],[0086], [0089],[0091],[0104]); and
((claim 1) a nested loop engine configured to perform nested processing of the two or more token groups,) wherein the second token group is sent for processing by the LLM upon satisfaction of a condition during processing of the first token group by the LLM (the system, i.e. nested loop engine, partitions a single long query into plural component queries, each of which expresses a part of the original query, and where the queries are tokenized such that a prompt is a sequence of tokens submitted to a model, i.e. two or more token groups, where in a first phase, the language model processes the first-stage component query guided by the user’s original query, i.e. processing of the first token group by the LLM, and in the second phase, the language model processes a second-stage component query guided by the original query and the data collected in the first phase, i.e. perform nested processing…wherein the second token group is sent for processing by the LLM upon satisfaction of a condition during processing of the first token group Figs. 1 and 2,[0022-3],[0025-7],[0029],[0041],[0050], [0060],[0086],[0089],[0091],[0104]).  
(claim 20) sending a third token group to the LLM for processing after the first token group and upon satisfaction of a start condition in the third token group (the system partitions a single long query into plural component queries, each of which expresses a part of the original query, and where the queries are tokenized such that a prompt is a sequence of tokens submitted to a model, i.e. two or more token groups, where in a first phase, the language model processes the first-stage component query guided by the user’s original query, i.e. processing…the first token group, and in the second phase, the language model processes a second-stage component query guided by the original query and the data collected in the first phase, i.e. sending a third token group to the LLM for processing after the first token group and upon satisfaction of a start condition in the third token group Figs. 1 and 2,[0022-3],[0025-7],[0029],[0041],[0050], [0060],[0086],[0089],[0091],[0104]).
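
The flow that the mapping above reads onto Pathak (divide a prompt into token groups, send the groups that have no start condition in parallel, then send a dependent group once its condition is met) can be pictured with a minimal sketch. Everything in it is an illustrative assumption: `TokenGroup`, `fake_llm`, `run_prompt`, and the example texts are hypothetical and come from neither the application nor Pathak.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TokenGroup:
    name: str
    text: str
    # Hypothetical start condition: receives the results gathered so far.
    # None means the group has no start condition and can be sent at once.
    start_condition: Optional[Callable[[dict], bool]] = None


def fake_llm(text: str) -> str:
    """Stand-in for an LLM call (assumption: a real client would go here)."""
    return f"answer({text})"


def run_prompt(groups: list[TokenGroup]) -> dict:
    results: dict = {}
    ready = [g for g in groups if g.start_condition is None]
    waiting = [g for g in groups if g.start_condition is not None]

    # Stage 1: groups with no start condition are dispatched in parallel.
    with ThreadPoolExecutor() as pool:
        answers = pool.map(fake_llm, [g.text for g in ready])
        for group, answer in zip(ready, answers):
            results[group.name] = answer

    # Stage 2: a dependent group is sent only if its condition holds.
    for group in waiting:
        if group.start_condition(results):
            results[group.name] = fake_llm(group.text)
    return results


groups = [
    TokenGroup("flights", "find flights to Rome"),
    TokenGroup("hotels", "find hotels in Rome"),
    TokenGroup(
        "summary",
        "summarize the trip",
        start_condition=lambda r: "flights" in r and "hotels" in r,
    ),
]
results = run_prompt(groups)
```

This staged version resembles the first-stage / second-stage reading applied to Pathak; it does not model a condition that is satisfied while the first group is still being processed.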

Regarding claim 2, Pathak teaches claim 1, and further teaches
the parallel processing engine and the nested loop engine are integrated together in a single application program (the system as a whole is integrated into a particular application, such as a chat engine, question-answering engine, or search engine, i.e. integrated together in a single application program, where the system processes queries in parallel and with a hierarchical structure, i.e. parallel processing engine and the nested loop engine Figs. 1 and 2,[0022-3],[0029],[0041],[0060],[0086], [0089]). 

Regarding claims 4 and 16, Pathak teaches claims 1 and 13, and further teaches
 at least one of the two or more token groups has dynamic states that change as the two or more token groups are processed by the LLM (the system partitions a single long query into plural component queries, each of which expresses a part of the original query, and where the queries are tokenized such that a prompt is a sequence of tokens submitted to a model, i.e. two or more token groups, where in a first phase, the language model processes the first-stage component query guided by the user’s original query, i.e. as the two or more token groups are processed by the LLM, and in the second phase, the language model processes a second-stage component query guided by the original query and the data collected in the first phase, i.e. at least one of the two or more token groups has dynamic states that change as the two or more token groups are processed, which can include follow-up queries or processing just the common part of the query first, and then begin processing each component query at the instance-specific part, i.e.  at least one of the two or more token groups has dynamic states that change as the two or more token groups are processed by the LLM Figs. 1 and 2,[0022-3],[0025-7],[0029],[0041],[0050], [0060],[0081-4],[0086],[0089],[0091], [0104]).  

Regarding claim 5, Pathak teaches claim 1, and further teaches
the prompt processed by the LLM comprises a plurality of system prompts and a user prompt (the original query, such as "Show me information regarding different aspects of a 2024 vacation to Italy.", i.e. user prompt, is expanded upon to include instance-specific parts pertaining to the original query, where the instance-specific parts are included in component queries, such as parts pertaining to airfare, hotel arrangements, package tours, and restaurants, contains or makes reference to a list of function definitions associated with functions that are capable of being invoked to answer the user’s question, where the function definition enables the language model to specify invocation information to perform the particular function, i.e. a plurality of system prompts [0030],[0034-37],[0053-4],[0056],[0071-2]).

Regarding claim 6, Pathak teaches claim 5, and further teaches
the two or more token groups comprise one token group for each system prompt (the original query, such as "Show me information regarding different aspects of a 2024 vacation to Italy.", is expanded upon to include instance-specific parts pertaining to the original query, such as parts pertaining to airfare, hotel arrangements, package tours, and restaurants, contains or makes reference to a list of function definitions associated with functions that are capable of being invoked to answer the user’s question, where the function definition enables the language model to specify invocation information to perform the particular function, i.e. one token group for each system prompt [0030],[0034-37],[0053-4]).

Regarding claim 7, Pathak teaches claim 5, and further teaches
the two or more token groups comprise one token group for the whole user prompt (a system prompt can be sent by the prompt-compiling component to the language model that describes the specific task that the language model will be subsequently asked to perform, and then the original query is sent to the language model for analysis, where the query may not be partitioned, i.e. one token group for the whole user prompt [0056],[0071-3]).  

Regarding claim 8, Pathak teaches claim 5, and further teaches
the user prompt is broken into two or more token groups (the original query is partitioned into plural component queries, i.e. broken into two or more token groups, where there is a common part and one or more instance-specific parts, and each component query includes the common part and an associated instance-specific part [0026],[0041]).

Regarding claim 9, Pathak teaches claim 5, and further teaches
at least one system prompt and at least a portion of the user prompt share a single token group of the two or more token groups (the original query, such as "Show me information regarding different aspects of a 2024 vacation to Italy.", i.e. user prompt, is expanded upon to include instance-specific parts pertaining to the original query, where the instance-specific parts are included in component queries, such as parts pertaining to airfare, hotel arrangements, package tours, and restaurants, contains or makes reference to a list of function definitions associated with functions that are capable of being invoked to answer the user’s question, where the function definition enables the language model to specify invocation information to perform the particular function, i.e. a plurality of system prompts, and where each component query includes the common part and an associated instance-specific part [0026],[0041],[0030],[0034-37],[0053-4],[0056],[0071-2]).
Regarding claims 10, 18, and 23, Pathak teaches claims 1, 13, and 20, and further teaches
a token group comprises a key pair having a condition and an action upon satisfaction of the condition (a predetermined keyword, symbol, or flag may be present in the original query associated with different types of post-solution actions to be performed on the component-query responses, i.e. a token group comprises a key pair having a condition and an action upon satisfaction of the condition, such as concatenate, summary, extract, or rank the results of processing the queries, i.e. a condition and an action upon satisfaction of the condition [0077]).  
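
As a rough picture of what "a key pair having a condition and an action upon satisfaction of the condition" could look like in code, the sketch below attaches one such pair to a response. The names `ConditionAction` and `apply_pair`, and the price example, are hypothetical illustrations; they are not taken from the claims, the specification, or Pathak.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ConditionAction:
    condition: Callable[[str], bool]  # checked against the response text
    action: Callable[[str], str]      # run only when the condition holds


def apply_pair(pair: ConditionAction, response: str) -> str:
    """Run the action if the condition is satisfied; otherwise pass through."""
    return pair.action(response) if pair.condition(response) else response


# Hypothetical pair: if the response mentions a price, upper-case it.
pair = ConditionAction(
    condition=lambda text: "price" in text,
    action=lambda text: text.upper(),
)

with_price = apply_pair(pair, "price: 120 EUR")
without_price = apply_pair(pair, "no availability")
```

Here the action fires only for the first response; the second is returned unchanged because its condition is not met.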

Regarding claim 14, Pathak teaches claim 13, and further teaches
send a third token group of the two or more token groups to the LLM for processing in parallel with the first token group (a user inputs a query into the system, where the system partitions a single long query into plural component queries, each of which expresses a part of the original query, and where the queries are tokenized such that a prompt is a sequence of tokens submitted to a model, i.e. a third token group of the two or more token groups, where the language model processes the sub-streams of tokens associated with the component queries in parallel, i.e. send a third token group of the two or more token groups to the LLM for processing in parallel with the first token group, where first-stage queries are processed, i.e. first token group…third token group, followed by second-stage queries based on the response for the first-stage queries Figs. 1 and 2,[0022-3],[0025-7],[0029],[0041],[0050],[0086], [0089],[0091],[0104]).  
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 3, 15, 17, and 21 is/are rejected under 35 U.S.C. 103 as being unpatentable over Pathak, in view of Zhang (U.S. Patent No. 12,547,840), hereinafter Zhang.

Regarding claim 3, Pathak teaches claim 1.
While Pathak provides processing series of component queries with parent-child relationships, Pathak does not specifically teach recursively processing the prompts based on the satisfaction of conditions, and thus does not teach
the nested loop engine is configured to process two or more of the two or more token groups recursively based on satisfaction of conditions within the two or more token groups.  
Zhang, however, teaches the nested loop engine is configured to process two or more of the two or more token groups recursively based on satisfaction of conditions within the two or more token groups (in the first phase, sub-questions are generated from the original question and provided to the LLM to receive contextual answers, i.e. nested loop engine is configured to process two or more of the two or more token groups, and once the sub-question generator has reached a terminal state, i.e. based on satisfaction of conditions within the two or more token groups, the original question is reprocessed using the contextual answers to provide a final updated question to the LLM and receive a final answer, i.e. process two or more of the two or more token groups recursively based on…(6:39-8:28)).  
Pathak, in turn, teaches that the query may be partitioned into queries to be processed in parallel and/or in stages (Figs. 1 and 2,[0022-3],[0025-7],[0029],[0041],[0050],[0086],[0089],[0091],[0104]).
Pathak and Zhang are analogous art because they are from a similar field of endeavor in optimizing processing user queries with LLMs. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Pathak's teaching of processing a series of component queries with parent-child relationships with the reprocessing of an original question based on the contextual answers from sub-questions, as taught by Zhang. It would have been obvious to combine the references to improve the accuracy of answers to problems using context and expand the scope of natural language prompts that can be answered correctly by an LLM (Zhang (3:62-4:7)).

Regarding claims 15 and 21, Pathak teaches claims 13 and 20.
While Pathak provides processing series of component queries with parent-child relationships, Pathak does not specifically teach recursively processing the prompts based on the satisfaction of conditions in processing token groups, and thus does not teach
the nested loop engine is configured to process the first token group to completion, then process the second token group, then process the first token group again based on satisfaction of a condition in the second token group directing that the first token group be processed again.  
Zhang, however, teaches the nested loop engine is configured to process the first token group to completion, then process the second token group, then process the first token group again based on satisfaction of a condition in the second token group directing that the first token group be processed again (in the first phase, sub-questions are generated from the original question and provided to the LLM to receive contextual answers, i.e. the nested loop engine is configured to process the first token group to completion, and once the sub-question generator has reached a terminal state, i.e. then process the second token group, the original question is reprocessed using the contextual answers to provide a final updated question to the LLM and receive a final answer, i.e. then process the first token group again based on satisfaction of a condition in the second token group directing that the first token group be processed again (6:39-8:28)).  
Pathak, in turn, teaches that the query may be partitioned into queries to be processed in parallel and/or in stages (Figs. 1 and 2; [0022-3], [0025-7], [0029], [0041], [0050], [0086], [0089], [0091], [0104]).
Pathak and Zhang are analogous art because they are from a similar field of endeavor, namely optimizing the processing of user queries with LLMs. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Pathak's teachings of processing a series of component queries with parent-child relationships with the reprocessing of an original question based on the contextual answers from sub-questions, as taught by Zhang. It would have been obvious to combine the references to improve the accuracy of answers to problems using context and to expand the scope of natural language prompts that can be answered correctly by an LLM (Zhang (3:62-4:7)).
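The two-phase flow attributed to Zhang above can be sketched as follows. This is a minimal illustration only, assuming a hypothetical `llm` callable and a hypothetical `make_subquestions` helper; neither name, nor the prompt layout, is taken from Zhang or Pathak.

```python
from typing import Callable, List


def answer_with_subquestions(
    question: str,
    llm: Callable[[str], str],
    make_subquestions: Callable[[str], List[str]],
) -> str:
    """Answer sub-questions first, then reprocess the original question."""
    # Phase 1: derive sub-questions from the original question and
    # collect a contextual answer for each one.
    context: List[str] = []
    for sub in make_subquestions(question):
        context.append(f"Q: {sub}\nA: {llm(sub)}")

    # Terminal state: no sub-questions remain. Phase 2: reprocess the
    # original question together with the contextual answers.
    # (The prompt layout here is an assumption for illustration.)
    updated_prompt = "\n".join(context + [f"Final question: {question}"])
    return llm(updated_prompt)
```

In this sketch the original question is touched twice: once to produce the sub-questions and once more, after the sub-question phase ends, to produce the final answer.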

Regarding claim 17, Pathak teaches claim 13, and further teaches
processing of the first token group terminates before completion of processing by the LLM based on an end condition contained in the second token group (processors operate independently and may generate query-component responses at different times, i.e. processing of the first token group…second token group, where, before each calculation, a processor instance checks for already-calculated KV information, i.e. an end condition contained in the second token group, and the processor does not perform the calculation if the information is already there, i.e. processing of the first token group terminates before completion of processing by the LLM ([0082-4])).
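The check-before-calculate behavior cited from Pathak can be illustrated with a minimal sketch. The shared dictionary, the `compute_once` name, and the key format are assumptions for illustration and do not reproduce Pathak's implementation.

```python
from typing import Callable, Dict, Hashable


def compute_once(
    cache: Dict[Hashable, object],
    key: Hashable,
    compute: Callable[[], object],
) -> bool:
    """Run `compute` only if `key` has no cached result.

    Returns True if the calculation ran, False if it was skipped.
    """
    # A processor instance first checks for already-calculated information.
    if key in cache:
        return False  # skip: another instance already produced this entry
    cache[key] = compute()
    return True
```

Two independent callers using the same key would therefore perform the calculation only once, with the second caller ending its work early.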

Claim(s) 11, 19, 22, and 24 is/are rejected under 35 U.S.C. 103 as being unpatentable over Pathak, in view of Zhu et al. (“Efficient Tuning and Inference for Large Language Models on Textual Graphs”, arXiv:2401.15569v1, 28 Jan 2024), hereinafter Zhu.

Regarding claims 11, 19, 22, and 24, Pathak teaches claims 10, 18, 20, and 23.
While Pathak provides performing actions after generating responses, Pathak does not specifically teach performing actions before completing the processing of the token group, and thus does not teach
the action is performed upon satisfaction of the condition prior to completion of processing the token group. 
Zhu, however, teaches the action is performed upon satisfaction of the condition prior to completion of processing the token group (dynamic early exit determines whether to exit the LLM layers early and return the prediction when certain criteria are met, such as when consecutive early exit classifiers predict the same result, in which case the model stops inference early, i.e. upon satisfaction of the condition prior to completion of processing the token group (Fig. 2; Abstract, Sec. 1, 3.3)).
Pathak, in turn, teaches that the action is performed on the responses to the queries ([0077]).
Pathak and Zhu are analogous art because they are from a similar field of endeavor, namely optimizing language model performance. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Pathak's teachings of performing actions after generating responses with the use of dynamic early exiting as taught by Zhu. It would have been obvious to combine the references to achieve faster inference with a negligible performance drop (Zhu Abstract).
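The early-exit criterion described for Zhu, in which inference stops once consecutive early exit classifiers agree, can be sketched as below. The `patience` parameter and the list-of-predictions input are assumptions for illustration; Zhu's exact criteria are not reproduced here.

```python
from typing import List, Tuple


def early_exit(layer_predictions: List[int], patience: int = 2) -> Tuple[int, int]:
    """Return (prediction, layers_used) under a consecutive-agreement rule.

    `layer_predictions[i]` stands in for the label predicted by the
    early exit classifier attached to layer i + 1 (hypothetical input).
    """
    if not layer_predictions:
        raise ValueError("at least one layer prediction is required")

    streak = 0
    previous = None
    for depth, pred in enumerate(layer_predictions, start=1):
        # Count how many consecutive classifiers produced the same label.
        streak = streak + 1 if pred == previous else 1
        previous = pred
        if streak >= patience:
            return pred, depth  # condition met before the final layer
    # No early exit: fall back to the last layer's prediction.
    return layer_predictions[-1], len(layer_predictions)
```

For example, with predictions `[3, 1, 1, 2, 2]` and `patience=2`, the sketch returns after the third layer because layers two and three agree.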

Claim(s) 12 is/are rejected under 35 U.S.C. 103 as being unpatentable over Pathak, in view of Kim et al. (“An LLM Compiler for Parallel Function Calling”, Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria. PMLR 235, Published 01 May 2024.), hereinafter Kim.

Regarding claim 12, Pathak teaches claim 1. 
While Pathak provides sequential processing of component queries, Pathak does not specifically teach not processing a token group based on a start condition not being satisfied, and thus does not teach
a token group of the two or more token groups is not processed based on its start condition not being satisfied.  
Kim, however, teaches a token group of the two or more token groups is not processed based on its start condition not being satisfied (dependent tasks are blocked, i.e. a token group of the two or more token groups is not processed, until the task fetching unit uses the results of a completed task to replace placeholder variables with actual values (Fig. 2; Sec. 3.1-3.3)).
Pathak teaches that a dependent task may be a second-set of component queries associated with sub-streams of tokens [0023],[0086-89].
Pathak and Kim are analogous art because they are from a similar field of endeavor, namely optimizing language model performance with parallel processing. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Pathak's teachings of sequential processing of component queries with the blocking of dependent tasks until placeholder variables can be replaced by actual values upon completion of preceding tasks, as taught by Kim. It would have been obvious to combine the references to enable arriving at a final answer with latency speedup, cost savings, and accuracy improvement compared to other function calling methods (Kim Abstract).
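The blocking of dependent tasks described for Kim can be sketched as a small scheduler. The `$name` placeholder syntax, the task tuple layout, and the `run_tasks` name are assumptions for illustration and are not taken from Kim or Pathak.

```python
from typing import Callable, Dict, List, Tuple

# (task name, function, argument templates); "$x" refers to task x's result.
Task = Tuple[str, Callable[..., str], List[str]]


def run_tasks(tasks: List[Task]) -> Dict[str, str]:
    """Run each task once all of its placeholder arguments are resolvable."""
    results: Dict[str, str] = {}
    pending = list(tasks)
    while pending:
        progressed = False
        for task in list(pending):
            name, fn, args = task
            deps = [a[1:] for a in args if a.startswith("$")]
            if all(d in results for d in deps):
                # Replace placeholder variables with actual values.
                resolved = [results[a[1:]] if a.startswith("$") else a for a in args]
                results[name] = fn(*resolved)
                pending.remove(task)
                progressed = True
        if not progressed:
            # Remaining tasks stay blocked: their start condition
            # (all dependencies completed) was never satisfied.
            break
    return results
```

A task whose placeholder never resolves is simply left unprocessed in this sketch, which mirrors the claim language of a token group not being processed when its start condition is not satisfied.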

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to NICOLE A K SCHMIEDER whose telephone number is (571)270-1474. The examiner can normally be reached 8:00 - 5:00 M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre-Louis Desir, can be reached at (571) 272-7799. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/NICOLE A K SCHMIEDER/Primary Examiner, Art Unit 2659
Prosecution Timeline

May 16, 2024: Application Filed
Apr 02, 2026: Non-Final Rejection mailed (§101, §102, §103; current)
Precedent Cases

Applications granted by this same examiner with similar technology

18/371,233 (Patent 12633284): PATCHED MULTI-CONDITION TRAINING FOR ROBUST SPEECH RECOGNITION. 2y 8m to grant; granted May 19, 2026.
18/640,026 (Patent 12614027): GENERATIVE ARTIFICIAL INTELLIGENCE BASED (GEN AI-BASED) CONTENT GENERATION SYSTEM AND METHOD FOR GENERATING PERSONA-BASED QUESTION ANSWERS TO OPTIMIZE USER EXPERIENCES. 2y 0m to grant; granted Apr 28, 2026.
18/219,339 (Patent 12572751): ELECTRONIC DEVICE AND CONTROLLING METHOD OF ELECTRONIC DEVICE. 2y 8m to grant; granted Mar 10, 2026.
17/626,617 (Patent 12567408): MULTI-MODAL SMART AUDIO DEVICE SYSTEM ATTENTIVENESS EXPRESSION. 4y 1m to grant; granted Mar 03, 2026.
17/938,173 (Patent 12554930): TRANSFORMER-BASED TEXT ENCODER FOR PASSAGE RETRIEVAL. 3y 4m to grant; granted Feb 17, 2026.
Based on 5 most recent grants.
Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 68%
With Interview (+34.4%): 99%
Median Time to Grant: 2y 8m (~8m remaining)
PTA Risk: Low
Based on 169 resolved cases by this examiner. Grant probability derived from career allowance rate.