Last updated: May 29, 2026
Application No. 18/649,000
LEARNING MONOTONIC ALIGNMENT FOR LANGUAGE MODELS IN AI SYSTEMS AND APPLICATIONS

Non-Final OA §101§103
Filed
Apr 29, 2024
Examiner
YAMAMOTO, JOSEPH JEREMY
Art Unit
2656
Tech Center
2600 — Communications
Assignee
Nvidia Corporation
OA Round
1 (Non-Final)
Interview Optional

— +28.3% interview lift. Interview already conducted in this application's prosecution history. This examiner has a 71% grant rate with +28.3% interview lift. Since an interview has already been tried, recommend written response with narrowed claims based on precedent claim evolution patterns.
Based on 45 resolved cases, 2023–2026
Examiner Intelligence

YAMAMOTO, JOSEPH JEREMY View full profile →
Grants 71% — above average
Career Allowance Rate
32 granted / 45 resolved
+9.1% vs TC avg
Strong +28% interview lift
Without
With
+28.3%
Interview Lift
resolved cases with interview
Typical timeline
2y 8m
Avg Prosecution
8 currently pending
Career history
Total Applications
across all art units
Statute-Specific Performance

§101
13.3%
-26.7% vs TC avg
§103
80.6%
+40.6% vs TC avg
§112
6.1%
-33.9% vs TC avg
Black line = Tech Center average estimate • Based on career data from 45 resolved cases
Office Action

§101 §103
Notice of Pre-AIA  or AIA  Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA .

DETAILED ACTION
Claims 1-20 are pending. Claims 1, 8, and 18 are independent.  
Claims 2-7 depend from Claim 1.
Claims 9-17 depend from Claim 8.
Claims 19-20 depend from Claim 18.
This Application was published as U.S. 2025/0336389.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 25 Nov 2024 are in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Drawings
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(5) because they do not include the following reference sign(s) mentioned in the description:
Par [0049] references process 600 that is not mentioned in the drawings.
Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-7 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more. The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception.

Independent claim 1 recite various limitations that, but for generic computer components (i.e. one or more language models) can be performed in the human mind or with pen and paper, and are considered abstract ideas. The claims under the broadest reasonable interpretation cover the method of generating audio data representative of speech associated with the first text, further limited based on one or more trained language models processing the first text. (See MPEP 2106.04(a)(2) III)

This judicial exception is not integrated into a practical application because the claims only recite elements in the form of “language models” which are passively involved in the claimed method. These elements are passively used to perform the claimed methods and steps, and are recited at a high-level of generality such that it amounts no more than mere instructions to apply the exception based on trained language models that process text without explaining the output of the processing of text and linkage to the method, while the method generates audio data representative of speech associated with the text. Accordingly, these additional elements do not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. 

The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because they do not include subject matter that could not be performed by a human, as discussed above with respect to integration of the abstract idea into a practical application. The additional elements of using the generic computing elements to perform the claimed elements amount to no more than mere instructions to apply the exception using a generic computer component or can be considered insignificant extra solution activity. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept, and mere data gathering in conjunction with an abstract idea cannot provide an inventive concept. For the all the reasons stated above, the claims are not patent eligible.

	With regards to claim 2, the claim further limits the elements of claim 1; however, these limitations do not preclude the limitations from being performed in the human mind by pen or paper because the additional limitations about the cross attention scores and how to update the parameters does not resolve how generating the audio text representative of speech can be performed as an abstract idea. Similar to claim 1, no additional elements beyond the use of generic computing elements are claimed. Therefore, the judicial exception is not integrated into a practical application nor are the elements sufficient to amount to significantly more than the judicial exception. 

	With regards to claim 3, the claim further limits the elements of claim 1; however, these limitations do not preclude the limitations from being performed in the human mind by pen or paper because the additional limitations about the cross attention scores and how to update the parameters does not resolve how generating the audio text representative of speech can be performed as an abstract idea. Similar to claim 1, no additional elements beyond the use of generic computing elements are claimed. Therefore, the judicial exception is not integrated into a practical application nor are the elements sufficient to amount to significantly more than the judicial exception. 

	With regards to claim 4, the claim further limits the elements of claim 1; however, these limitations do not preclude the limitations from being performed in the human mind by pen or paper because the additional limitations about the monotonic sequences, determining of losses, and how to update the parameters does not resolve how generating the audio text representative of speech can be performed as an abstract idea. Similar to claim 1, no additional elements beyond the use of generic computing elements are claimed. Therefore, the judicial exception is not integrated into a practical application nor are the elements sufficient to amount to significantly more than the judicial exception. 

	With regards to claim 5, the claim further limits the elements of claim 1; however, these limitations do not preclude the limitations from being performed in the human mind by pen or paper because the additional limitations about the cross attention scores and how to update the parameters does not resolve how generating the audio text representative of speech can be performed as an abstract idea. Similar to claim 1, no additional elements beyond the use of generic computing elements are claimed. Therefore, the judicial exception is not integrated into a practical application nor are the elements sufficient to amount to significantly more than the judicial exception. 

	With regards to claim 6, the claim further limits the elements of claim 1; however, these limitations do not preclude the limitations from being performed in the human mind by pen or paper because the additional limitations about the determining losses does not resolve how generating the audio text representative of speech can be performed as an abstract idea. Similar to claim 1, no additional elements beyond the use of generic computing elements are claimed. Therefore, the judicial exception is not integrated into a practical application nor are the elements sufficient to amount to significantly more than the judicial exception. 

	With regards to claim 7, the claim further limits the elements of claim 1; however, these limitations do not preclude the limitations from being performed in the human mind by pen or paper because the additional limitations about the attention priors and text tokens does not resolve how generating the audio text representative of speech can be performed as an abstract idea. Similar to claim 1, no additional elements beyond the use of generic computing elements are claimed. Therefore, the judicial exception is not integrated into a practical application nor are the elements sufficient to amount to significantly more than the judicial exception. 

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-4, 8, 12-14, and 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over Lee (US2021/0358493 hereinafter Lee) in view Cayet et al. (US2025/0217683 hereinafter Cayet) 

	With regards to claim 1, Lee teaches:
	A method comprising: generating, based at least on one or more language models processing first text data representative of first text, audio data representative of speech associated with the first text, [Lee Fig 1 teaches first text data (130, Par [0058]) is an input sequence that is processed by language model (120, Par [0062]) which generates output (140, Par [0058])]
	
	With regards to claim 1, Lee fails to teach:
	wherein the one or more language models are trained at least by:
		determining, based at least on the one or more language models processing second text data representative of second text, one or more cross-attention scores associated with one or more layers of a decoder of the one or more language models; 
	determining, based at least on one or more text tokens associated with the second text data and one or more time durations associated with the second text data, one or more attention priors associated with the second text data; and
	updating one or more parameters associated with the one or more language models based at least on the one or more cross-attention scores and the one or more attention priors.
		
	With regards to claim 1, Cayet teaches:
	wherein the one or more language models are trained at least by:
		determining, based at least on the one or more language models processing second text data representative of second text, one or more cross-attention scores associated with one or more layers of a decoder of the one or more language models; [Cayet  Fig 2 teaches “Input text is encoded as tokens” (Par [0029]) where the tokens are representative of the text. The tokens are processed by the transformer and each decoder layer of the transformer contains “cross-attention (Enc2Enc Attention) for incorporating the output of encoder (contextualized input token representations)” (Par [0032]) which generates attention scores. (see Fig 3, Par [0039-40])]
	determining, based at least on one or more text tokens associated with the second text data and one or more time durations associated with the second text data, one or more attention priors associated with the second text data; and  [Cayet  Fig 2 teaches “Input text is encoded as tokens” (Par [0029]) which are sent to the transformers where mixing of information occurs “among the input tokens to the decoder (i.e., the decoded output tokens generated so far during inference time)” (Par [0032]) where inference time is one or more time durations associated with the input or second text data. The attention priors associated with the input text are generated by averaging the “attention scores generated by the last decoder layer and provides the average attention score to padding and softmax layer 251” (Par [0035])]
		updating one or more parameters associated with the one or more language models based at least on the one or more cross-attention scores and the one or more attention priors. [Cayet teaches updating the parameters of the copy distribution using the attention priors and associated attention scores.  (Par [0039])
	It would be obvious to one of ordinary skill at the time of applicant’s filing to combine the sequence model as taught by Lee with the transformer using attention scores as taught by Cayet. The motivation to combine the teachings of Lee with Cayet is because Cayet teaches performing “accurate text generation from prompts with a small but powerful model architecture. The accuracy improvement of the model architecture comes from additional blocks in the architecture helping the model to more easily learn things that are complex for smaller models, such as using variable-length input values” (Par [0020]) which improves the capabilities of the invention of Lee to improve the model’s accuracy]
	
	With regards to claim 2, Lee in view of Cayet teaches:
	All the limitations of claim 1
	wherein: the determining the one or more cross-attention scores associated with the one or more layers comprises determining, based at least on the one or more language models processing the second text data, at least one or more first cross-attention scores associated with one or more first layers of the one or more layers and one or more second cross-attention scores associated with one or more second layers of the one or more layers; and [Cayet  Fig 2 teaches “Input text is encoded as tokens” (Par [0029]) where the tokens are representative of the text. The tokens are processed by the transformer and each decoder layer of the transformer contains “cross-attention (Enc2Enc Attention) for incorporating the output of encoder (contextualized input token representations)” (Par [0032]) which generates attention scores. (see Fig 3, Par [0039-40])]
	the updating the one or more parameters associated with the one or more language models is based at least on applying the one or more attention priors to the one or more first cross-attention scores and applying the one or more attention priors to the one or more second cross-attention scores. [Cayet teaches updating the parameters of the copy distribution using the attention priors and associated attention scores.  (Par [0039])

	With regards to claim 3, Lee in view of Cayet teaches:
	All the limitations of claim 1
	wherein: the determining the one or more cross-attention scores associated with the one or more layers comprises determining, based at least on the one or more language models processing the second text data, at least one or more first cross-attention scores associated with a first head of a layer of the one or more layers and one or more second cross-attention scores associated with a second head of the layer of the one or more layers; and [Cayet  Fig 2 teaches “Input text is encoded as tokens” (Par [0029]) where the tokens are representative of the text. The tokens are processed by the transformer and each decoder layer of the transformer contains “cross-attention (Enc2Enc Attention) for incorporating the output of encoder (contextualized input token representations)” (Par [0032]) which generates attention scores. (see Fig 3, Par [0039-40])]
	the updating the one or more parameters associated with the one or more language models is based at least on applying the one or more attention priors to the one or more first cross-attention scores and applying the one or more attention priors to the one or more second cross-attention scores. [Cayet teaches updating the parameters of the copy distribution using the attention priors and associated attention scores.  (Par [0039])

	With regards to claim 4, Lee in view of Cayet teaches:
	All the limitations of claim 1
	wherein the one or more language models are further trained by: determining one or more monotonic sequences associated with the one or more text tokens; and [Lee Fig 9B teaches monotonic attention guide module (950) to “train the attention-based sequence-to-sequence model by guiding the attention weight matrix 900 to have monotonic properties from a point of time of the training” (Par [0132])]
	determining one or more losses based at least on the one or more monotonic sequences [Lee teaches “select an attention weight matrix having a smallest error and monotonic properties based on an analysis result from the plurality of attention weight matric” (Par [0115]) while error and loss are not the same, Lee also teaches “path through which all parameters of a model are simultaneously trained for one loss function” (Par [0069]) which would be imply teaching a loss function for the monotonic sequence because “it will be apparent after an understanding of the disclosure of this application that various changes in form and details may be made in these examples without departing from the spirit and scope of the claims and their equivalents” (Par [0138])]
	and the one or more cross-attention scores, [Cayet teaches “training loss is simply the sum of the generation loss (from the averaged generation+copy distribution, negative log-likelihood loss) and the copy weight loss (binary cross entropy between the copy weight and the copy labels)” (Par [0054]) where training loss is based on the attention scores]
	wherein the updating of the one or more parameters associated with the one or more language models is further based at least on the one or more losses. [Lee teaches “select an attention weight matrix having a smallest error and monotonic properties based on an analysis result from the plurality of attention weight matric” (Par [0115]) which is based on a loss]

	With regards to claim 8, Lee teaches:
	A system comprising: one or more processors to: [Lee Par [0135]]
		determine one or more monotonic sequences associated with one or more text tokens corresponding to text data; [Lee Fig 9B teaches monotonic attention guide module (950) to “train the attention-based sequence-to-sequence model by guiding the attention weight matrix 900 to have monotonic properties from a point of time of the training” (Par [0132])]
		determine one or more losses based at least on the one or more monotonic sequences and; and [Lee teaches “select an attention weight matrix having a smallest error and monotonic properties based on an analysis result from the plurality of attention weight matric” (Par [0115]) while error and loss are not the same, Lee also teaches “path through which all parameters of a model are simultaneously trained for one loss function” (Par [0069]) which would be imply teaching a loss function for the monotonic sequence because “it will be apparent after an understanding of the disclosure of this application that various changes in form and details may be made in these examples without departing from the spirit and scope of the claims and their equivalents” (Par [0138])]
		update, based at least on the one or more losses, one or more parameters associated with the one or more language models. [Lee teaches “select an attention weight matrix having a smallest error and monotonic properties based on an analysis result from the plurality of attention weight matric” (Par [0115]) which is based on a loss]

	With regards to claim 8, Lee fails to teach:
	determine, based at least on one or more language models processing the text data, one or more cross-attention scores associated with one or more layers of a decoder of the one or more language models;	
	determine one or more losses based at least on the one or more cross-attention scores; and
	
	With regards to claim 8, Cayet teaches:
determine, based at least on one or more language models processing the text data, one or more cross-attention scores associated with one or more layers of a decoder of the one or more language models; [Cayet  Fig 2 teaches “Input text is encoded as tokens” (Par [0029]) where the tokens are representative of the text. The tokens are processed by the transformer and each decoder layer of the transformer contains “cross-attention (Enc2Enc Attention) for incorporating the output of encoder (contextualized input token representations)” (Par [0032]) which generates attention scores. (see Fig 3, Par [0039-40])]
	determine one or more losses based at least on the one or more cross-attention scores; and [Cayet teaches “training loss is simply the sum of the generation loss (from the averaged generation+copy distribution, negative log-likelihood loss) and the copy weight loss (binary cross entropy between the copy weight and the copy labels)” (Par [0054]) where training loss is based on the attention scores.
	It would be obvious to one of ordinary skill at the time of applicant’s filing to combine the sequence model as taught by Lee with the transformer using attention scores as taught by Cayet. The motivation to combine the teachings of Lee with Cayet is because Cayet teaches performing “accurate text generation from prompts with a small but powerful model architecture. The accuracy improvement of the model architecture comes from additional blocks in the architecture helping the model to more easily learn things that are complex for smaller models, such as using variable-length input values” (Par [0020]) which improves the capabilities of the invention of Lee to improve the model’s accuracy]

	With regards to claim 12, Lee in view of Cayet teaches:
	All the limitations of claim 8
	wherein the one or more processors are further to: determine one or more probabilities associated with the one or more monotonic sequences, [Lee teaches training the attention matrix (900) to “have hard monotonic properties, or may train the attention weight matrix 900 to have soft monotonic properties” (Par [0133]) where the weight matrix consists of probabilities]
	wherein the determination of the one or more losses is based at least on the one or more probabilities and the one or more cross-attention scores. [Lee teaches “select an attention weight matrix having a smallest error and monotonic properties based on an analysis result from the plurality of attention weight matric” (Par [0115]) while error and loss are not the same, Lee also teaches “path through which all parameters of a model are simultaneously trained for one loss function” (Par [0069]) which would be imply teaching a loss function for the monotonic sequence because “it will be apparent after an understanding of the disclosure of this application that various changes in form and details may be made in these examples without departing from the spirit and scope of the claims and their equivalents” (Par [0138])]

	With regards to claim 13, Lee in view of Cayet teaches:
	All the limitations of claim 8
	wherein the one or more processors are further to: determine, based at least on the one or more text tokens associated with the text data and one or more time durations associated with the text data, one or more attention priors associated with the text data; [Cayet  Fig 2 teaches “Input text is encoded as tokens” (Par [0029]) which are sent to the transformers where mixing of information occurs “among the input tokens to the decoder (i.e., the decoded output tokens generated so far during inference time)” (Par [0032]) where inference time is one or more time durations associated with the input or second text data. The attention priors associated with the input text are generated by averaging the “attention scores generated by the last decoder layer and provides the average attention score to padding and softmax layer 251” (Par [0035])]
	wherein the one or more parameters associated with the one or more language models are further updated based at least on applying the one or more attention priors to the one or more cross-attention scores. [Cayet teaches updating the parameters of the copy decoder using the attention priors used to attentions the copy distribution (Par [0039])]

	With regards to claim 14, Lee in view of Cayet teaches:
	All the limitations of claim 13
	wherein: the one or more cross-attention scores include at least one or more first cross-attention scores associated with one or more first layers of the one or more layers and one or more second cross-attention scores associated with one or more second layers of the one or more layers; and [Cayet  Fig 2 teaches “Input text is encoded as tokens” (Par [0029]) where the tokens are representative of the text. The tokens are processed by the transformer and each decoder layer of the transformer contains “cross-attention (Enc2Enc Attention) for incorporating the output of encoder (contextualized input token representations)” (Par [0032]) which generates attention scores. (see Fig 3, Par [0039-40])]
	the one or more parameters associated with the one or more language models are updated based at least on the one or more losses, [Lee teaches “select an attention weight matrix having a smallest error and monotonic properties based on an analysis result from the plurality of attention weight matric” (Par [0115]) while error and loss are not the same, Lee also teaches “path through which all parameters of a model are simultaneously trained for one loss function” (Par [0069]) which would be imply teaching a loss function for the monotonic sequence because “it will be apparent after an understanding of the disclosure of this application that various changes in form and details may be made in these examples without departing from the spirit and scope of the claims and their equivalents” (Par [0138])]
	applying the one or more attention priors to the one or more first cross-attention scores, and applying the one or more attention priors to the one or more second cross-attention scores. [Cayet teaches updating the parameters of the copy distribution using the attention priors and associated attention scores.  (Par [0039])]

	With regards to claim 17, Lee in view of Cayet teaches:
	All the limitations of claim 8
	wherein the system is comprised in at least one of:
	a control system for an autonomous or semi-autonomous machine;
	a perception system for an autonomous or semi-autonomous machine;
	a system for performing one or more simulation operations;
	a system for performing one or more digital twin operations;
	a system for performing light transport simulation;
	a system for performing collaborative content creation for 3D assets;
	a system for performing one or more deep learning operations; [Lee Fig 9B teaches deep learning operation]
	a system implemented using an edge device;
	a system implemented using a robot;
	a system for performing one or more generative AI operations; [Cayet teaches text generation (Par [0020])]
	a system for performing operations using one or more large language models (LLMs);
	a system for performing operations using one or more vision language models (VLMs);
	a system for performing one or more conversational AI operations;
	a system for generating synthetic data;
	a system for presenting at least one of virtual reality content, augmented reality content, or mixed reality content;
	a system incorporating one or more virtual machines (VMs);
	a system implemented at least partially in a data center; or
	a system implemented at least partially using cloud computing resources.

	With regards to claim 18, Lee teaches:
	One or more processors comprising: [Lee Par [0135]]
	and at least one of one or more attention priors or one or more monotonic sequences, wherein at least one of the one or more attention priors or the one or more monotonic sequences are determined using one or more text tokens associated with text data processed using the one or more language models. [Lee Fig 9B teaches monotonic attention guide module (950) to “train the attention-based sequence-to-sequence model by guiding the attention weight matrix 900 to have monotonic properties from a point of time of the training” (Par [0132])]

	With regards to claim 18, Lee fails to teach:
	processing circuitry to perform one or more operations using one or more language models, wherein the one or more language models are trained based at least on one more cross-attention scores determined using one or more layers of a decoder associated with the one or more language models
		
	With regards to claim 18, Cayet teaches:
	processing circuitry to perform one or more operations using one or more language models, wherein the one or more language models are trained based at least on one more cross-attention scores determined using one or more layers of a decoder associated with the one or more language models [Cayet  Fig 2 teaches “Input text is encoded as tokens” (Par [0029]) where the tokens are representative of the text. The tokens are processed by the transformer and each decoder layer of the transformer contains “cross-attention (Enc2Enc Attention) for incorporating the output of encoder (contextualized input token representations)” (Par [0032]) which generates attention scores. (see Fig 3, Par [0039-40])]
	It would be obvious to one of ordinary skill at the time of applicant’s filing to combine the sequence model as taught by Lee with the transformer using attention scores as taught by Cayet. The motivation to combine the teachings of Lee with Cayet is because Cayet teaches performing “accurate text generation from prompts with a small but powerful model architecture. The accuracy improvement of the model architecture comes from additional blocks in the architecture helping the model to more easily learn things that are complex for smaller models, such as using variable-length input values” (Par [0020]) which improves the capabilities of the invention of Lee to improve the model’s accuracy]


	With regards to claim 19, Lee in view of Cayet teaches:
	All the limitations of claim 18
	wherein the one or more cross-attention scores include at least one of:
		first cross-attention scores associated with layers of the one or more layers; or [Cayet  Fig 2 teaches “Input text is encoded as tokens” (Par [0029]) where the tokens are representative of the text. The tokens are processed by the transformer and each decoder layer of the transformer contains “cross-attention (Enc2Enc Attention) for incorporating the output of encoder (contextualized input token representations)” (Par [0032]) which generates attention scores. (see Fig 3, Par [0039-40])
		second cross-attention scores associated with heads of a layer of the one or more layers.

	With regards to claim 20, Lee in view of Cayet teaches:
	All the limitations of claim 18
	wherein the one or more processors are comprised in at least one of:
	a control system for an autonomous or semi-autonomous machine;
	a perception system for an autonomous or semi-autonomous machine;
	a system for performing one or more simulation operations;
	a system for performing one or more digital twin operations;
	a system for performing light transport simulation;
	a system for performing collaborative content creation for 3D assets;
	a system for performing one or more deep learning operations; [Lee Fig 9B teaches deep learning operation]
	a system implemented using an edge device;
	a system implemented using a robot;
	a system for performing one or more generative AI operations; [Cayet teaches text generation (Par [0020])]
	a system for performing operations using one or more large language models (LLMs);
	a system for performing operations using one or more vision language models (VLMs);
	a system for performing one or more conversational AI operations;
	a system for generating synthetic data;
	a system for presenting at least one of virtual reality content, augmented reality content, or mixed reality content;
	a system incorporating one or more virtual machines (VMs);
	a system implemented at least partially in a data center; or
	a system implemented at least partially using cloud computing resources.

Potentially Allowable Subject Matter
Claims 5-7 objected to as being dependent upon a rejected base claim, but would be allowable if rewritten to overcome the rejection(s) under 35 U.S.C. 101, set forth in this Office action and to include all of the limitations of the base claim and any intervening claims.

Allowable Subject Matter
Claims 9-11 and 15-16 objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Joseph J Yamamoto whose telephone number is (571)272-4020. The examiner can normally be reached M-F 1000-1800 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta can be reached at 571-272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

JOSEPH J. YAMAMOTO
Examiner
Art Unit 2656



/BHAVESH M MEHTA/Supervisory Patent Examiner, Art Unit 2656
Read full office action
Prosecution Timeline

Apr 29, 2024
Application Filed
Feb 25, 2026
Non-Final Rejection mailed — §101, §103
Apr 29, 2026
Applicant Interview (Telephonic)
Apr 29, 2026
Examiner Interview Summary
Apr 30, 2026
Response Filed
Precedent Cases

Applications granted by this same examiner with similar technology

18/195,297
Patent 12619823
COMPUTER-IMPLEMENTED SYSTEM AND METHOD TO PERFORM NATURAL LANGUAGE PROCESSING ENTITY RESEARCH AND RESOLUTION
2y 12m to grant Granted May 05, 2026
18/494,874
Patent 12614559
NEAR-END SPEECH INTELLIGIBILITY ENHANCEMENT WITH MINIMAL ARTIFACTS
2y 6m to grant Granted Apr 28, 2026
18/081,410
Patent 12602546
KEY POINTS EXTRACTION FOR UNIFORM RESOURCE LOCATORS
3y 4m to grant Granted Apr 14, 2026
18/423,836
Patent 12602377
SYSTEMS AND METHODS FOR QUESTION ANSWERING WITH DIVERSE KNOWLEDGE SOURCES
2y 2m to grant Granted Apr 14, 2026
18/388,447
Patent 12592220
DEEPFAKE DETECTION
2y 4m to grant Granted Mar 31, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.
Strategy Recommendation AI-generated — please review before filing

Get a prosecution strategy drawn from examiner precedents, rejection analysis, and claim mapping.
Typically takes 5-10 seconds — AI-generated, attorney review required before filing
Prosecution Projections

1-2
Expected OA Rounds
71%
Grant Probability
99%
With Interview (+28.3%)
2y 8m (~7m remaining)
Median Time to Grant
Low
PTA Risk
Based on 45 resolved cases by this examiner. Grant probability derived from career allowance rate.