Prosecution Insights
Last updated: April 19, 2026
Application No. 17/109,824

FACTORIZED NEURAL NETWORK

Non-Final OA §101
Filed: Dec 02, 2020
Examiner: MENGISTU, TEWODROS E
Art Unit: 2127
Tech Center: 2100 — Computer Architecture & Software
Assignee: Microsoft Technology Licensing, LLC
OA Round: 5 (Non-Final)

Grant Probability: 49% (Moderate)
OA Rounds: 5-6
To Grant: 4y 5m
With Interview: 77%

Examiner Intelligence

Career Allow Rate: 49% (62 granted / 127 resolved; -6.2% vs TC avg)
Interview Lift: +28.2% for resolved cases with an interview
Avg Prosecution: 4y 5m (typical timeline)
Currently Pending: 34
Total Applications: 161 across all art units (career history)

Statute-Specific Performance

§101: 27.9% (-12.1% vs TC avg)
§103: 44.5% (+4.5% vs TC avg)
§102: 9.6% (-30.4% vs TC avg)
§112: 14.7% (-25.3% vs TC avg)
Tech Center average figures are estimates. Based on career data from 127 resolved cases.

Office Action

§101
Detailed Action

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. Claims 1, 4-7, 14, 16-17, and 21-29 are pending. Claims 1, 14, and 21 are independent.

Response to Amendment

The office action is responsive to the amendments filed on 02/23/2026. As directed by the amendments, claims 1, 4-5, 7, and 14 are amended. Claims 21-29 are new.

Response to Arguments

Applicant's arguments filed 02/23/2026 have been fully considered but they are not persuasive.

Applicant's arguments regarding 35 U.S.C. § 101: Turning now to the standards for patent eligibility under Section 101, the Federal Circuit and the USPTO have repeatedly emphasized that the "mental process" category is limited to steps that can practically be performed in the human mind. A human cannot practically perform the recited spectral initialization and Frobenius decay of the factorized layer as claimed, particularly on a model having millions of parameters, as nearly all modern neural networks do. The Examiner's analysis collapses the distinction between abstract reasoning and concrete algorithmic computation, contrary to MPEP § 2106.04(a)(2)(III). As recognized in Ex parte Desjardins (Appeal 2024-000567), a machine learning model that improves its own operation is not a generic computer element, even if it is executed on standard hardware. The Desjardins panel vacated a 35 U.S.C. 101 rejection where the claim recited improvements to model operation (specifically, continual learning that preserved prior task performance), holding that such technical improvements constituted "significantly more" than an abstract idea. The Federal Circuit decision in Recentive Analytics, Inc. v. Fox Corp., No. 23-2437 (Fed. Cir. 2025) also confirms that claims directed to specific novel machine-learning architectures and improvements in model operation are not ineligible merely because they involve data analytics. Rather, as the Court noted, in contrast to existing models applied to new data in a routine way, new model architectures for machine learning models can be subject matter eligible. Further, as set forth in Example 39 of the USPTO 2019 Patent Eligibility Guidance, a claim to a neural network for facial detection that is trained using a novel two-stage training process does not recite a judicial exception under Step 2A, Prong 1 of the Section 101 analysis framework set forth in the USPTO guidance. The claims recite a novel method for factorizing a neural network, and thus are similar in subject matter to Example 39, and should qualify as eligible subject matter. […]

Examiner's response: The Examiner respectfully disagrees. Under the broadest reasonable interpretation, the recited spectral initialization and Frobenius decay of the factorized layer, as claimed, do not require millions of model parameters. Further, the recited claim limitation is addressed as a mathematical calculation, as detailed in Applicant's specification, paragraphs 0021-0027. Applicant's application describes a different invention and improvement compared to Desjardins; it is unclear how Applicant's claims relate to Desjardins. Similarly, Applicant mentions Example 39 of the USPTO 2019 Patent Eligibility Guidance. Example 39 never reaches the point of determining how to handle the training limitation at Step 2A, Prong Two or Step 2B because the claim itself did not include any abstract ideas. Applicant's claims, however, recite abstract ideas at Step 2A, Prong 1.
Overall, the claim limitations are a combination of mental steps and math under Step 2A, Prong 1, and additional elements under Step 2A, Prong 2 and Step 2B, as detailed in the § 101 rejection below.

Applicant's arguments regarding 35 U.S.C. § 103: Examiner's response: Applicant's arguments with respect to 35 U.S.C. § 103 have been fully considered and are persuasive. The 35 U.S.C. § 103 rejection has been withdrawn.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1, 4-7, 14, 16-17, and 21-29 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Step 1: According to the first part of the analysis, in the instant case, each of the claims falls within one of the four statutory categories (i.e., process, machine, manufacture, or composition of matter).

Regarding Claim 1:

2A Prong 1: processing a layer of an initial machine learning model by factoring a matrix associated with the layer of the initial machine learning model into a set of factorization matrices to generate a factorized machine learning model having a factorized layer parameterized by the factorization matrices, and by initializing the factorization matrices using spectral initialization, wherein the initial machine learning model has an associated initial optimizer adapted to train the initial machine learning model by evaluating a non-factorized layer thereof; (This step for generating a factorized machine learning model is understood to be a recitation of a mathematical concept (i.e., mathematical calculations). Para 0017-0021 of the specification further describes the limitation, which is a mathematical operation/calculation.)

generating, based at least in part on the initial optimizer of the initial machine learning model and on the factorized machine learning model, a processed optimizer by replacing a weight decay function of a regularizer of the initial optimizer with a Frobenius decay function, the processed optimizer being for training the factorized machine learning model and therein being adapted to evaluate the factorized layer; (This step for generating an optimizer and evaluating a factorized layer is understood to be a recitation of a mathematical concept (i.e., mathematical calculations). Para 0021-0027 of the specification further describes the limitation as performing a mathematical calculation.)

2A Prong 2: This judicial exception is not integrated into a practical application. Additional elements:

A system comprising: at least one processor; and memory storing instructions that, when executed by the at least one processor, causes the system to perform a set of operations, the set of operations comprising: (The system having a processor and memory storing instructions to perform operations is understood to be generic computer equipment or merely using a computer as a tool to perform an abstract idea - See MPEP 2106.05(f).)

training the factorized machine learning model using the processed optimizer instead of the initial optimizer, wherein the trained factorized machine learning model is operable to generate inferences for a client application. (Training a machine learning model is understood as adding the words "apply it" (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea (e.g., generate inferences) on a computer - See MPEP 2106.05(f).)

The additional elements as disclosed above, alone or in combination, do not integrate the judicial exception into a practical application, as they are generic computer functions that are implemented to perform the disclosed abstract idea above.

2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. Additional elements:

A system comprising: at least one processor; and memory storing instructions that, when executed by the at least one processor, causes the system to perform a set of operations, the set of operations comprising: (The system having a processor and memory storing instructions to perform operations is understood to be generic computer equipment or merely using a computer as a tool to perform an abstract idea - See MPEP 2106.05(f).)

training the factorized machine learning model using the processed optimizer instead of the initial optimizer, wherein the trained factorized machine learning model is operable to generate inferences for a client application. (Training a machine learning model is understood as adding the words "apply it" (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea (e.g., generate inferences) on a computer - See MPEP 2106.05(f).)

The additional elements as disclosed above, in combination with the abstract idea, are not sufficient to amount to significantly more than the judicial exception, as they are generic computer functions that are implemented to perform the disclosed abstract idea above.

Regarding Claim 14:

2A Prong 1: A method of processing a machine learning model, the method comprising: factoring the initial matrix of the untrained machine learning model layer into a set of factorization matrices to generate a factorized machine learning model having a factorized layer parameterized by the factorization matrices; (This step for factoring a matrix is understood to be a recitation of a mathematical concept (i.e., mathematical calculations).)

initializing the factorization matrices using spectral initialization, whereby the factorized machine learning model comprises, in place of the initialization matrix, the set of initialized factorization matrices; (This step for initializing is understood to be a recitation of a mathematical concept (i.e., mathematical calculations). Para 0021 of the specification further describes the formula used to perform spectral initialization, which is a mathematical calculation.)

processing the initial optimizer to replace a weight decay function of a regularizer of the initial optimizer with a Frobenius decay function associated with the factorized layer, thereby generating as a replacement for the initial optimizer a processed optimizer for training the factorized machine learning model; (This step is understood to be a recitation of a mathematical concept (i.e., mathematical calculations). Para 0021-0026 of the specification further describe this limitation.)

2A Prong 2: This judicial exception is not integrated into a practical application.
Additional elements:

receiving, from a client device, an indication of an untrained machine learning model and an initial optimizer associated with the untrained machine learning model, wherein the untrained machine learning model comprises a layer that is parameterized by an initial matrix; (This step is directed to transmitting or receiving information, which is understood to be insignificant extra-solution activity and data gathering. See MPEP 2106.05(g).)

providing, to the client device, the factorized machine learning model and the processed optimizer. (This step is directed to transmitting or receiving information, which is understood to be insignificant extra-solution activity and data gathering. See MPEP 2106.05(g).)

The additional elements as disclosed above, alone or in combination, do not integrate the judicial exception into a practical application, as they are insignificant extra-solution activity implemented to perform the disclosed abstract idea above.

2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. Additional elements:

receiving, from a client device, an indication of an untrained machine learning model and an initial optimizer associated with the untrained machine learning model, wherein the untrained machine learning model comprises a layer that is parameterized by an initial matrix; (This step is directed to transmitting or receiving information, which is understood to be insignificant extra-solution activity and is well-understood, routine, and conventional activity of transmitting and receiving data as identified by the court. See MPEP 2106.05(d)(II)(i).)

providing, to the client device, the factorized machine learning model and the processed optimizer. (This step is directed to transmitting or receiving information, which is understood to be insignificant extra-solution activity and is well-understood, routine, and conventional activity of transmitting and receiving data as identified by the court. See MPEP 2106.05(d)(II)(i).)

The additional elements as disclosed above, in combination with the abstract idea, are not sufficient to amount to significantly more than the judicial exception, as they are well-understood, routine, and conventional activity implemented to perform the disclosed abstract idea above.

Regarding Claim 21:

2A Prong 1: at least one of the plurality of layers is a factorized neural network layer that is decomposed into a plurality of factorization matrices that have been initialized using spectral initialization and have been trained using gradient descent optimization with Frobenius decay based regularization; (This step for factoring a neural network is understood to be a recitation of a mathematical concept (i.e., mathematical calculations).)

2A Prong 2: This judicial exception is not integrated into a practical application. Additional elements:

A computing system comprising: at least one processor and memory storing instructions that, when executed by the processor, causes the processor to implement: (The system having a processor and memory storing instructions to perform operations is understood to be generic computer equipment or merely using a computer as a tool to perform an abstract idea - See MPEP 2106.05(f).)

a machine learning model including a neural network, the neural network including a plurality of layers (The specification of data to be stored is understood to be a field of use limitation - See MPEP 2106.05(h). This limitation further specifies the machine learning model.), wherein the factorized neural network layer is a fully connected layer, a convolutional layer, or a multiheaded attention layer (The specification of data to be stored is understood to be a field of use limitation - See MPEP 2106.05(h). This limitation further specifies the layer.); and

during training or inference, the machine learning model is configured to receive an input, process the input using the plurality of layers including the factorized neural network layer, and generate an output result. (This step is directed to transmitting or receiving information, which is understood to be insignificant extra-solution activity and data gathering. See MPEP 2106.05(g).)

The additional elements as disclosed above, alone or in combination, do not integrate the judicial exception into a practical application, as they are insignificant extra-solution activity in combination with generic computer functions and field of use limitations implemented to perform the disclosed abstract idea above.

2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. Additional elements:

A computing system comprising: at least one processor and memory storing instructions that, when executed by the processor, causes the processor to implement: (The system having a processor and memory storing instructions to perform operations is understood to be generic computer equipment or merely using a computer as a tool to perform an abstract idea - See MPEP 2106.05(f).)

a machine learning model including a neural network, the neural network including a plurality of layers (The specification of data to be stored is understood to be a field of use limitation - See MPEP 2106.05(h). This limitation further specifies the machine learning model.), wherein the factorized neural network layer is a fully connected layer, a convolutional layer, or a multiheaded attention layer (The specification of data to be stored is understood to be a field of use limitation - See MPEP 2106.05(h). This limitation further specifies the layer.); and

during training or inference, the machine learning model is configured to receive an input, process the input using the plurality of layers including the factorized neural network layer, and generate an output result. (This step is directed to transmitting or receiving information, which is understood to be insignificant extra-solution activity and is well-understood, routine, and conventional activity of transmitting and receiving data as identified by the court. See MPEP 2106.05(d)(II)(i).)

The additional elements as disclosed above, in combination with the abstract idea, are not sufficient to amount to significantly more than the judicial exception, as they are well-understood, routine, and conventional activity, in combination with generic computer functions and field of use limitations, implemented to perform the disclosed abstract idea above.

Regarding Claims 4 and 16:

2A Prong 1: wherein the processed optimizer further comprises a weight decay function associated with a non-factorized layer of the factorized machine learning model. (This step is understood to be a recitation of a mathematical concept (i.e., mathematical formulas or equations).)

2A Prong 2 & 2B: The claim does not recite any additional elements.

Regarding Claims 5 and 17:

2A Prong 1: The claim does not recite any abstract idea.
2A Prong 2 & 2B: wherein the layer of the initial machine learning model is one of: a convolutional layer; a fully connected layer; or a multi-head attention layer. (The specification of data to be stored is understood to be a field of use limitation - See MPEP 2106.05(h). This limitation further specifies a layer.)

Regarding Claim 6:

2A Prong 1: wherein the initial machine learning model is processed to generate the factorized machine learning model based at least in part on a set of model processing rules. (This step for generating a factorized machine learning model based on a processing rule is understood to be a recitation of a mental process (i.e., evaluation).)

2A Prong 2 & 2B: The claim does not recite any additional elements.

Regarding Claim 7:

2A Prong 1: the set of factorization matrices are based at least in part on the matrix-parameterized layer. (This step is understood to be a recitation of a mathematical concept (i.e., mathematical relationships).)

2A Prong 2 & 2B: wherein: the layer of the initial machine learning model is a matrix-parameterized layer; (The specification of data to be stored is understood to be a field of use limitation - See MPEP 2106.05(h). This limitation further specifies the layer of the machine learning model.)

Regarding Claim 22:

2A Prong 1: wherein the spectral initialization is performed using an optimizer that applies an update rule to the factorization matrices during stochastic gradient descent, and according to the update rule, the factorization matrices are updated such that a weight direction of the factorized layer is approximately updated with a step-size that is multiplied by a linear transformation of a gradient of the weight direction. (These steps are understood to be a recitation of a mathematical concept (i.e., mathematical calculations).)

2A Prong 2 & 2B: The claim does not recite any additional elements.

Regarding Claim 23:

2A Prong 1: wherein the factorized neural network layer has been factorized using a full factorization setting, deep factorization setting, or wide factorization setting. (This step is understood to be a recitation of a mathematical concept (i.e., mathematical calculations).)

2A Prong 2 & 2B: The claim does not recite any additional elements.

Regarding Claim 24:

2A Prong 1: The claim does not recite any abstract idea.

2A Prong 2 & 2B: wherein the machine learning model is a deep learning model and each of the plurality of neural network layers is a factorized neural network layer except a first and a last layer of the neural network. (The specification of data to be stored is understood to be a field of use limitation - See MPEP 2106.05(h). This limitation further specifies the layer of the machine learning model.)

Regarding Claim 25:

2A Prong 1: The claim does not recite any abstract idea.

2A Prong 2 & 2B: wherein the machine learning model has a transformer architecture. (The specification of data to be stored is understood to be a field of use limitation - See MPEP 2106.05(h). This limitation further specifies the machine learning model.)

Regarding Claim 26:

2A Prong 1: The computing system of claim 21, wherein the factorized layer is generated based on an initial unfactorized model layer that has been generated using Gaussian initialization and L2-regularization. (This step is understood to be a recitation of a mathematical concept (i.e., mathematical calculations).)

2A Prong 2 & 2B: The claim does not recite any additional elements.
Regarding Claim 27:

2A Prong 1: The claim does not recite any abstract idea.

2A Prong 2 & 2B: wherein the machine learning model is an embeddings model. (The specification of data to be stored is understood to be a field of use limitation - See MPEP 2106.05(h). This limitation further specifies the machine learning model.)

Regarding Claim 28:

2A Prong 1: The claim does not recite any abstract idea.

2A Prong 2 & 2B: wherein the machine learning model is an embeddings model that includes a multi-headed attention block and a scaled down embedding dimension as compared to an initial model from which the factorized layer was generated. (The specification of data to be stored is understood to be a field of use limitation - See MPEP 2106.05(h). This limitation further specifies the machine learning model.)

Regarding Claim 29:

2A Prong 1: wherein the Frobenius decay function penalizes the squared Frobenius norm of the factorization matrices. (This step is understood to be a recitation of a mathematical concept (i.e., mathematical calculations).)

2A Prong 2 & 2B: The claim does not recite any additional elements.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Nixon et al. (US 20200104678 A1) describes training an optimizer for a neural network.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to TEWODROS E MENGISTU, whose telephone number is (571) 270-7714. The examiner can normally be reached Mon-Fri 9:30-5:30. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, ABDULLAH KAWSAR, can be reached at (571) 270-3169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/TEWODROS E MENGISTU/
Examiner, Art Unit 2127
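For readers outside the art unit, the two claim terms at the center of the Step 2A, Prong 1 dispute can be made concrete. The sketch below is a minimal, hypothetical Python illustration rather than the applicant's actual method: it assumes "spectral initialization" means initializing the two factors of a low-rank layer from the truncated SVD of the original weight matrix, and it reads claim 29's Frobenius decay (penalizing the squared Frobenius norm of the factorization matrices) as a penalty on the product of the factors, one common interpretation in the factorized-layer literature, rather than ordinary weight decay on each factor separately.

```python
import numpy as np

def spectral_init(W, rank):
    """Hypothetical spectral initialization: factor an m x n weight matrix W
    into U (m x rank) and V (rank x n) from its truncated SVD, splitting the
    singular values evenly so that U @ V is the best rank-`rank` approximation
    of W."""
    u, s, vt = np.linalg.svd(W, full_matrices=False)
    root_s = np.sqrt(s[:rank])
    U = u[:, :rank] * root_s           # scale columns by sqrt(singular values)
    V = root_s[:, None] * vt[:rank]    # scale rows by sqrt(singular values)
    return U, V

def frobenius_decay(U, V, lam):
    """Assumed Frobenius decay penalty: lam * ||U @ V||_F^2, regularizing the
    effective (product) weight instead of each factor on its own."""
    return lam * np.sum((U @ V) ** 2)

# Toy usage: replace a dense layer's weight with two factors and compute the penalty.
rng = np.random.default_rng(0)
W = rng.standard_normal((256, 128))
U, V = spectral_init(W, rank=32)
penalty = frobenius_decay(U, V, lam=1e-4)
```

Under this reading, the claimed "processed optimizer" would simply add the Frobenius decay term to the training loss in place of the standard weight decay term; whether that matches the formulas the examiner cites in paragraphs 0021-0027 of the specification should be checked against the specification itself.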

Prosecution Timeline

Dec 02, 2020
Application Filed
Jan 02, 2024
Non-Final Rejection — §101
Jun 10, 2024
Response Filed
Jul 11, 2024
Final Rejection — §101
Nov 18, 2024
Request for Continued Examination
Nov 19, 2024
Response after Non-Final Action
Dec 13, 2024
Non-Final Rejection — §101
May 07, 2025
Applicant Interview (Telephonic)
May 07, 2025
Examiner Interview Summary
May 19, 2025
Response Filed
Aug 08, 2025
Final Rejection — §101
Nov 12, 2025
Notice of Allowance
Nov 12, 2025
Response after Non-Final Action
Dec 19, 2025
Response after Non-Final Action
Feb 23, 2026
Request for Continued Examination
Mar 04, 2026
Response after Non-Final Action
Mar 16, 2026
Non-Final Rejection — §101 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12566817: AUTOMATIC MACHINE LEARNING MODEL EVALUATION (2y 5m to grant; granted Mar 03, 2026)
Patent 12482032: Selective Data Rejection for Computationally Efficient Distributed Analytics Platform (2y 5m to grant; granted Nov 25, 2025)
Patent 12450465: NEURAL NETWORK SYSTEM, NEURAL NETWORK METHOD, AND PROGRAM (2y 5m to grant; granted Oct 21, 2025)
Patent 12400252: ARTIFICIAL INTELLIGENCE BASED TRANSACTIONS CONTEXTUALIZATION PLATFORM (2y 5m to grant; granted Aug 26, 2025)
Patent 12380369: HYPERPARAMETER TUNING IN AUTOREGRESSIVE INTEGRATED MOVING AVERAGE (ARIMA) MODELS (2y 5m to grant; granted Aug 05, 2025)
Study what changed in each case to get past this examiner. Based on the 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 5-6
Grant Probability: 49% (77% with interview, +28.2%)
Median Time to Grant: 4y 5m
PTA Risk: High
Based on 127 resolved cases by this examiner. Grant probability is derived from the career allow rate.
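The headline figures appear to compose simply: the grant probability is the examiner's career allow rate (62 granted of 127 resolved), and the with-interview figure adds the observed interview lift in percentage points. A minimal sketch of that arithmetic, assuming this is how the tool derives its projection (the variable names are illustrative, not the tool's API):

```python
granted, resolved = 62, 127
base_rate = granted / resolved                 # ~0.488, displayed as 49%
interview_lift = 0.282                         # +28.2 percentage points with an interview
with_interview = base_rate + interview_lift    # ~0.770, displayed as 77%
print(f"base {base_rate:.0%}, with interview {with_interview:.0%}")
```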
