Prosecution Insights
Last updated: April 19, 2026
Application No. 17/109,824

FACTORIZED NEURAL NETWORK

Non-Final OA §101
Filed: Dec 02, 2020
Examiner: MENGISTU, TEWODROS E
Art Unit: 2127
Tech Center: 2100 — Computer Architecture & Software
Assignee: Microsoft Technology Licensing, LLC
OA Round: 5 (Non-Final)

Grant Probability: 49% (Moderate)
OA Rounds: 5-6
To Grant: 4y 5m
With Interview: 77%

Examiner Intelligence

Career Allow Rate: 49% (62 granted / 127 resolved; -6.2% vs TC avg)
Interview Lift: +28.2% for resolved cases with an interview
Avg Prosecution: 4y 5m (typical timeline)
Currently Pending: 34
Total Applications: 161 across all art units (career history)

Statute-Specific Performance

§101: 27.9% (-12.1% vs TC avg)
§103: 44.5% (+4.5% vs TC avg)
§102: 9.6% (-30.4% vs TC avg)
§112: 14.7% (-25.3% vs TC avg)
Tech Center average figures are estimates. Based on career data from 127 resolved cases.

Office Action

§101
Detailed Action

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. Claims 1, 4-7, 14, 16-17, and 21-29 are pending. Claims 1, 14, and 21 are independent.

Response to Amendment

The office action is responsive to the amendments filed on 02/23/2026. As directed by the amendments, claims 1, 4-5, 7, and 14 are amended. Claims 21-29 are new.

Response to Arguments

Applicant's arguments filed 02/23/2026 have been fully considered but they are not persuasive.

Applicant's arguments regarding 35 U.S.C. § 101: Turning now to the standards for patent eligibility under Section 101, the Federal Circuit and the USPTO have repeatedly emphasized that the "mental process" category is limited to steps that can practically be performed in the human mind. A human cannot practically perform the recited spectral initialization and Frobenius decay of the factorized layer as claimed, particularly on a model having millions of parameters, as nearly all modern neural networks do. The Examiner's analysis collapses the distinction between abstract reasoning and concrete algorithmic computation, contrary to MPEP § 2106.04(a)(2)(III). As recognized in Ex parte Desjardins (Appeal 2024-000567), a machine learning model that improves its own operation is not a generic computer element, even if it is executed on standard hardware. The Desjardins panel vacated a 35 U.S.C. 101 rejection where the claim recited improvements to model operation (specifically, continual learning that preserved prior task performance), holding that such technical improvements constituted "significantly more" than an abstract idea. The Federal Circuit decision in Recentive Analytics, Inc. v. Fox Corp., No. 23-2437 (Fed. Cir. 2025) also confirms that claims directed to specific novel machine-learning architectures and improvements in model operation are not ineligible merely because they involve data analytics. Rather, as the Court noted, in contrast to existing models applied to new data in a routine way, new model architectures for machine learning models can be subject matter eligible. Further, as set forth in Example 39 of the USPTO 2019 Patent Eligibility Guidance, a claim to a neural network for facial detection that is trained using a novel two-stage training process does not recite a judicial exception under Step 2A, Prong 1 of the Section 101 analysis framework set forth in the USPTO guidance. The claims recite a novel method for factorizing a neural network, and thus are similar in subject matter to Example 39, and should qualify as eligible subject matter. […]

Examiner's response: The Examiner respectfully disagrees. Under the broadest reasonable interpretation, the recited spectral initialization and Frobenius decay of the factorized layer, as claimed, do not require millions of model parameters. Further, the recited claim limitation is addressed as a mathematical calculation, as detailed in Applicant's specification, paragraphs 0021-0027. Applicant's application describes a different invention and improvement compared to Desjardins; it is unclear how Applicant's claims relate to Desjardins. Similarly, Applicant mentions Example 39 of the USPTO 2019 Patent Eligibility Guidance. Example 39 never reaches the point of determining how to handle the training limitation at Step 2A, Prong Two or Step 2B because the claim itself did not include any abstract ideas. Applicant's claims, however, recite abstract ideas at Step 2A, Prong 1.
Overall, the claim limitations are a combination of mental steps and math under Step 2A, Prong 1, and additional elements under Step 2A, Prong 2 and Step 2B, as detailed in the § 101 rejection below.

Applicant's arguments regarding 35 U.S.C. § 103: Examiner's response: Applicant's arguments with respect to 35 U.S.C. § 103 have been fully considered and are persuasive. The 35 U.S.C. § 103 rejection has been withdrawn.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1, 4-7, 14, 16-17, and 21-29 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Step 1: According to the first part of the analysis, in the instant case, each of the claims falls within one of the four statutory categories (i.e., process, machine, manufacture, or composition of matter).

Regarding Claim 1:

2A Prong 1: processing a layer of an initial machine learning model by factoring a matrix associated with the layer of the initial machine learning model into a set of factorization matrices to generate a factorized machine learning model having a factorized layer parameterized by the factorization matrices, and by initializing the factorization matrices using spectral initialization, wherein the initial machine learning model has an associated initial optimizer adapted to train the initial machine learning model by evaluating a non-factorized layer thereof; (This step for generating a factorized machine learning model is understood to be a recitation of a mathematical concept (i.e., mathematical calculations). Para 0017-0021 of the specification further describes the limitation, which is a mathematical operation/calculation.)

generating, based at least in part on the initial optimizer of the initial machine learning model and on the factorized machine learning model, a processed optimizer by replacing a weight decay function of a regularizer of the initial optimizer with a Frobenius decay function, the processed optimizer being for training the factorized machine learning model and therein being adapted to evaluate the factorized layer; (This step for generating an optimizer and evaluating a factorized layer is understood to be a recitation of a mathematical concept (i.e., mathematical calculations). Para 0021-0027 of the specification further describes the limitation as performing a mathematical calculation.)

2A Prong 2: This judicial exception is not integrated into a practical application. Additional elements:

A system comprising: at least one processor; and memory storing instructions that, when executed by the at least one processor, causes the system to perform a set of operations, the set of operations comprising: (The system having a processor and memory storing instructions to perform operations is understood to be generic computer equipment or merely using a computer as a tool to perform an abstract idea - See MPEP 2106.05(f).)

training the factorized machine learning model using the processed optimizer instead of the initial optimizer, wherein the trained factorized machine learning model is operable to generate inferences for a client application. (Training a machine learning model is understood as adding the words "apply it" (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea (e.g., generate inferences) on a computer - See MPEP 2106.05(f).)

The additional elements as disclosed above, alone or in combination, do not integrate the judicial exception into a practical application, as they are generic computer functions that are implemented to perform the disclosed abstract idea above.

2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. Additional elements:

A system comprising: at least one processor; and memory storing instructions that, when executed by the at least one processor, causes the system to perform a set of operations, the set of operations comprising: (The system having a processor and memory storing instructions to perform operations is understood to be generic computer equipment or merely using a computer as a tool to perform an abstract idea - See MPEP 2106.05(f).)

training the factorized machine learning model using the processed optimizer instead of the initial optimizer, wherein the trained factorized machine learning model is operable to generate inferences for a client application. (Training a machine learning model is understood as adding the words "apply it" (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea (e.g., generate inferences) on a computer - See MPEP 2106.05(f).)

The additional elements as disclosed above, in combination with the abstract idea, are not sufficient to amount to significantly more than the judicial exception, as they are generic computer functions that are implemented to perform the disclosed abstract idea above.

Regarding Claim 14:

2A Prong 1: A method of processing a machine learning model, the method comprising: factoring the initial matrix of the untrained machine learning model layer into a set of factorization matrices to generate a factorized machine learning model having a factorized layer parameterized by the factorization matrices; (This step for factoring a matrix is understood to be a recitation of a mathematical concept (i.e., mathematical calculations).)

initializing the factorization matrices using spectral initialization, whereby the factorized machine learning model comprises, in place of the initialization matrix, the set of initialized factorization matrices; (This step for initializing is understood to be a recitation of a mathematical concept (i.e., mathematical calculations). Para 0021 of the specification further describes the formula used to perform spectral initialization, which is a mathematical calculation.)

processing the initial optimizer to replace a weight decay function of a regularizer of the initial optimizer with a Frobenius decay function associated with the factorized layer, thereby generating as a replacement for the initial optimizer a processed optimizer for training the factorized machine learning model; (This step is understood to be a recitation of a mathematical concept (i.e., mathematical calculations). Para 0021-0026 of the specification further describe this limitation.)

2A Prong 2: This judicial exception is not integrated into a practical application.
Additional elements:

receiving, from a client device, an indication of an untrained machine learning model and an initial optimizer associated with the untrained machine learning model, wherein the untrained machine learning model comprises a layer that is parameterized by an initial matrix; (This step is directed to transmitting or receiving information, which is understood to be insignificant extra-solution activity and data gathering. See MPEP 2106.05(g).)

providing, to the client device, the factorized machine learning model and the processed optimizer. (This step is directed to transmitting or receiving information, which is understood to be insignificant extra-solution activity and data gathering. See MPEP 2106.05(g).)

The additional elements as disclosed above, alone or in combination, do not integrate the judicial exception into a practical application, as they are insignificant extra-solution activity implemented to perform the disclosed abstract idea above.

2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. Additional elements:

receiving, from a client device, an indication of an untrained machine learning model and an initial optimizer associated with the untrained machine learning model, wherein the untrained machine learning model comprises a layer that is parameterized by an initial matrix; (This step is directed to transmitting or receiving information, which is understood to be insignificant extra-solution activity and is well-understood, routine, and conventional activity of transmitting and receiving data as identified by the court. See MPEP 2106.05(d)(II)(i).)

providing, to the client device, the factorized machine learning model and the processed optimizer. (This step is directed to transmitting or receiving information, which is understood to be insignificant extra-solution activity and is well-understood, routine, and conventional activity of transmitting and receiving data as identified by the court. See MPEP 2106.05(d)(II)(i).)

The additional elements as disclosed above, in combination with the abstract idea, are not sufficient to amount to significantly more than the judicial exception, as they are well-understood, routine, and conventional activity implemented to perform the disclosed abstract idea above.

Regarding Claim 21:

2A Prong 1: at least one of the plurality of layers is a factorized neural network layer that is decomposed into a plurality of factorization matrices that have been initialized using spectral initialization and have been trained using gradient descent optimization with Frobenius decay based regularization; (This step for factoring a neural network is understood to be a recitation of a mathematical concept (i.e., mathematical calculations).)

2A Prong 2: This judicial exception is not integrated into a practical application. Additional elements:

A computing system comprising: at least one processor and memory storing instructions that, when executed by the processor, causes the processor to implement: (The system having a processor and memory storing instructions to perform operations is understood to be generic computer equipment or merely using a computer as a tool to perform an abstract idea - See MPEP 2106.05(f).)

a machine learning model including a neural network, the neural network including a plurality of layers (The specification of data to be stored is understood to be a field of use limitation - See MPEP 2106.05(h). This limitation further specifies the machine learning model.), wherein the factorized neural network layer is a fully connected layer, a convolutional layer, or a multiheaded attention layer (The specification of data to be stored is understood to be a field of use limitation - See MPEP 2106.05(h). This limitation further specifies the layer.); and

during training or inference, the machine learning model is configured to receive an input, process the input using the plurality of layers including the factorized neural network layer, and generate an output result. (This step is directed to transmitting or receiving information, which is understood to be insignificant extra-solution activity and data gathering. See MPEP 2106.05(g).)

The additional elements as disclosed above, alone or in combination, do not integrate the judicial exception into a practical application, as they are insignificant extra-solution activity in combination with generic computer functions and field of use limitations implemented to perform the disclosed abstract idea above.

2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. Additional elements:

A computing system comprising: at least one processor and memory storing instructions that, when executed by the processor, causes the processor to implement: (The system having a processor and memory storing instructions to perform operations is understood to be generic computer equipment or merely using a computer as a tool to perform an abstract idea - See MPEP 2106.05(f).)

a machine learning model including a neural network, the neural network including a plurality of layers (The specification of data to be stored is understood to be a field of use limitation - See MPEP 2106.05(h). This limitation further specifies the machine learning model.), wherein the factorized neural network layer is a fully connected layer, a convolutional layer, or a multiheaded attention layer (The specification of data to be stored is understood to be a field of use limitation - See MPEP 2106.05(h). This limitation further specifies the layer.); and

during training or inference, the machine learning model is configured to receive an input, process the input using the plurality of layers including the factorized neural network layer, and generate an output result. (This step is directed to transmitting or receiving information, which is understood to be insignificant extra-solution activity and is well-understood, routine, and conventional activity of transmitting and receiving data as identified by the court. See MPEP 2106.05(d)(II)(i).)

The additional elements as disclosed above, in combination with the abstract idea, are not sufficient to amount to significantly more than the judicial exception, as they are well-understood, routine, and conventional activity, in combination with generic computer functions and field of use limitations, implemented to perform the disclosed abstract idea above.

Regarding Claims 4 and 16:

2A Prong 1: wherein the processed optimizer further comprises a weight decay function associated with a non-factorized layer of the factorized machine learning model. (This step is understood to be a recitation of a mathematical concept (i.e., mathematical formulas or equations).)

2A Prong 2 & 2B: The claim does not recite any additional elements.

Regarding Claims 5 and 17:

2A Prong 1: The claim does not recite any abstract idea.
2A Prong 2 & 2B: wherein the layer of the initial machine learning model is one of: a convolutional layer; a fully connected layer; or a multi-head attention layer. (The specification of data to be stored is understood to be a field of use limitation - See MPEP 2106.05(h). This limitation further specifies a layer.)

Regarding Claim 6:

2A Prong 1: wherein the initial machine learning model is processed to generate the factorized machine learning model based at least in part on a set of model processing rules. (This step for generating a factorized machine learning model based on a processing rule is understood to be a recitation of a mental process (i.e., evaluation).)

2A Prong 2 & 2B: The claim does not recite any additional elements.

Regarding Claim 7:

2A Prong 1: the set of factorization matrices are based at least in part on the matrix-parameterized layer. (This step is understood to be a recitation of a mathematical concept (i.e., mathematical relationships).)

2A Prong 2 & 2B: wherein: the layer of the initial machine learning model is a matrix-parameterized layer; (The specification of data to be stored is understood to be a field of use limitation - See MPEP 2106.05(h). This limitation further specifies the layer of the machine learning model.)

Regarding Claim 22:

2A Prong 1: wherein the spectral initialization is performed using an optimizer that applies an update rule to the factorization matrices during stochastic gradient descent, and according to the update rule, the factorization matrices are updated such that a weight direction of the factorized layer is approximately updated with a step-size that is multiplied by a linear transformation of a gradient of the weight direction. (These steps are understood to be a recitation of a mathematical concept (i.e., mathematical calculations).)

2A Prong 2 & 2B: The claim does not recite any additional elements.

Regarding Claim 23:

2A Prong 1: wherein the factorized neural network layer has been factorized using a full factorization setting, deep factorization setting, or wide factorization setting. (This step is understood to be a recitation of a mathematical concept (i.e., mathematical calculations).)

2A Prong 2 & 2B: The claim does not recite any additional elements.

Regarding Claim 24:

2A Prong 1: The claim does not recite any abstract idea.

2A Prong 2 & 2B: wherein the machine learning model is a deep learning model and each of the plurality of neural network layers is a factorized neural network layer except a first and a last layer of the neural network. (The specification of data to be stored is understood to be a field of use limitation - See MPEP 2106.05(h). This limitation further specifies the layer of the machine learning model.)

Regarding Claim 25:

2A Prong 1: The claim does not recite any abstract idea.

2A Prong 2 & 2B: wherein the machine learning model has a transformer architecture. (The specification of data to be stored is understood to be a field of use limitation - See MPEP 2106.05(h). This limitation further specifies the machine learning model.)

Regarding Claim 26:

2A Prong 1: The computing system of claim 21, wherein the factorized layer is generated based on an initial unfactorized model layer that has been generated using Gaussian initialization and L2-regularization. (This step is understood to be a recitation of a mathematical concept (i.e., mathematical calculations).)

2A Prong 2 & 2B: The claim does not recite any additional elements.
Regarding Claim 27:

2A Prong 1: The claim does not recite any abstract idea.

2A Prong 2 & 2B: wherein the machine learning model is an embeddings model. (The specification of data to be stored is understood to be a field of use limitation - See MPEP 2106.05(h). This limitation further specifies the machine learning model.)

Regarding Claim 28:

2A Prong 1: The claim does not recite any abstract idea.

2A Prong 2 & 2B: wherein the machine learning model is an embeddings model that includes a multi-headed attention block and a scaled down embedding dimension as compared to an initial model from which the factorized layer was generated. (The specification of data to be stored is understood to be a field of use limitation - See MPEP 2106.05(h). This limitation further specifies the machine learning model.)

Regarding Claim 29:

2A Prong 1: wherein the Frobenius decay function penalizes the squared Frobenius norm of the factorization matrices. (This step is understood to be a recitation of a mathematical concept (i.e., mathematical calculations).)

2A Prong 2 & 2B: The claim does not recite any additional elements.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Nixon et al. (US 20200104678 A1) describes training an optimizer for a neural network.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to TEWODROS E MENGISTU, whose telephone number is (571) 270-7714. The examiner can normally be reached Mon-Fri 9:30-5:30. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, ABDULLAH KAWSAR, can be reached at (571) 270-3169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/TEWODROS E MENGISTU/
Examiner, Art Unit 2127
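For readers outside the art unit, the two claim terms at the center of the Step 2A, Prong 1 dispute can be made concrete. The sketch below is a minimal, hypothetical Python illustration rather than the applicant's actual method: it assumes "spectral initialization" means initializing the two factors of a low-rank layer from the truncated SVD of the original weight matrix, and it reads claim 29's Frobenius decay (penalizing the squared Frobenius norm of the factorization matrices) as a penalty on the product of the factors, one common interpretation in the factorized-layer literature, rather than ordinary weight decay on each factor separately.

```python
import numpy as np

def spectral_init(W, rank):
    """Hypothetical spectral initialization: factor an m x n weight matrix W
    into U (m x rank) and V (rank x n) from its truncated SVD, splitting the
    singular values evenly so that U @ V is the best rank-`rank` approximation
    of W."""
    u, s, vt = np.linalg.svd(W, full_matrices=False)
    root_s = np.sqrt(s[:rank])
    U = u[:, :rank] * root_s           # scale columns by sqrt(singular values)
    V = root_s[:, None] * vt[:rank]    # scale rows by sqrt(singular values)
    return U, V

def frobenius_decay(U, V, lam):
    """Assumed Frobenius decay penalty: lam * ||U @ V||_F^2, regularizing the
    effective (product) weight instead of each factor on its own."""
    return lam * np.sum((U @ V) ** 2)

# Toy usage: replace a dense layer's weight with two factors and compute the penalty.
rng = np.random.default_rng(0)
W = rng.standard_normal((256, 128))
U, V = spectral_init(W, rank=32)
penalty = frobenius_decay(U, V, lam=1e-4)
```

Under this reading, the claimed "processed optimizer" would simply add the Frobenius decay term to the training loss in place of the standard weight decay term; whether that matches the formulas the examiner cites in paragraphs 0021-0027 of the specification should be checked against the specification itself.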

Prosecution Timeline

Dec 02, 2020
Application Filed
Jan 02, 2024
Non-Final Rejection — §101
Jun 10, 2024
Response Filed
Jul 11, 2024
Final Rejection — §101
Nov 18, 2024
Request for Continued Examination
Nov 19, 2024
Response after Non-Final Action
Dec 13, 2024
Non-Final Rejection — §101
May 07, 2025
Applicant Interview (Telephonic)
May 07, 2025
Examiner Interview Summary
May 19, 2025
Response Filed
Aug 08, 2025
Final Rejection — §101
Nov 12, 2025
Notice of Allowance
Nov 12, 2025
Response after Non-Final Action
Dec 19, 2025
Response after Non-Final Action
Feb 23, 2026
Request for Continued Examination
Mar 04, 2026
Response after Non-Final Action
Mar 16, 2026
Non-Final Rejection — §101 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12566817: AUTOMATIC MACHINE LEARNING MODEL EVALUATION (2y 5m to grant; granted Mar 03, 2026)
Patent 12482032: Selective Data Rejection for Computationally Efficient Distributed Analytics Platform (2y 5m to grant; granted Nov 25, 2025)
Patent 12450465: NEURAL NETWORK SYSTEM, NEURAL NETWORK METHOD, AND PROGRAM (2y 5m to grant; granted Oct 21, 2025)
Patent 12400252: ARTIFICIAL INTELLIGENCE BASED TRANSACTIONS CONTEXTUALIZATION PLATFORM (2y 5m to grant; granted Aug 26, 2025)
Patent 12380369: HYPERPARAMETER TUNING IN AUTOREGRESSIVE INTEGRATED MOVING AVERAGE (ARIMA) MODELS (2y 5m to grant; granted Aug 05, 2025)
Study what changed in each case to get past this examiner. Based on the 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 5-6
Grant Probability: 49% (77% with interview, +28.2%)
Median Time to Grant: 4y 5m
PTA Risk: High
Based on 127 resolved cases by this examiner. Grant probability is derived from the career allow rate.
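The headline figures appear to compose simply: the grant probability is the examiner's career allow rate (62 granted of 127 resolved), and the with-interview figure adds the observed interview lift in percentage points. A minimal sketch of that arithmetic, assuming this is how the tool derives its projection (the variable names are illustrative, not the tool's API):

```python
granted, resolved = 62, 127
base_rate = granted / resolved                 # ~0.488, displayed as 49%
interview_lift = 0.282                         # +28.2 percentage points with an interview
with_interview = base_rate + interview_lift    # ~0.770, displayed as 77%
print(f"base {base_rate:.0%}, with interview {with_interview:.0%}")
```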
