Last updated: April 19, 2026

Application No. 18/328,629

NEURAL NETWORKS PROCESSING UNITS WEIGHT SPARSITY REMOVAL

Non-Final OA §101 §102 §103

Filed

Jun 02, 2023

Examiner

HOANG, MICHAEL H

Art Unit

2122

Tech Center

2100 — Computer Architecture & Software

Assignee

Neuronix AI Labs Inc.

OA Round

1 (Non-Final)

This examiner grants 52% of resolved cases, with a +25.9% interview lift. A telephonic interview to clarify the technical implementation could significantly improve the outcome.

Based on 136 resolved cases, 2023–2026

Examiner Intelligence

HOANG, MICHAEL H

Grants 52% of resolved cases

Career Allow Rate

70 granted / 136 resolved

-3.5% vs TC avg

Strong interview lift: +25.9%

Typical timeline

4y 1m

Avg Prosecution

26 currently pending

Career history

162

Total Applications

across all art units

Statute-Specific Performance

§101: 30.3% (-9.7% vs TC avg)

§103: 45.3% (+5.3% vs TC avg)

§102: 9.1% (-30.9% vs TC avg)

§112: 12.3% (-27.7% vs TC avg)

Based on career data from 136 resolved cases

Office Action

§101 §102 §103

DETAILED ACTION
This action is in response to the claims filed 06/02/2023 for Application number 18/328,629. Claims 1-8 are currently pending. 

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Objections
Claim 1 is objected to because of the following informalities: The claim recites "to be accumulated in a different vector multiplication calculations"; however, this appears to be grammatically incorrect. Appropriate correction is required.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-8 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Regarding claim 1, 
Step 1 Analysis: Claim 1 is directed to a process, which falls within one of the four statutory categories. 
Step 2A Prong 1 Analysis: Claim 1 recites, in part, the limitations of:
“generating a list of activation memory matrixes (AMM) addresses, wherein each address in the list of addresses points to an activations row that each one of its activation components needs to be multiplied by a given non-zero weight and to be accumulated in a different vector multiplication calculations”, which can be considered to be a mathematical calculation; and
“implementing vector multiplication on the rows of activations and non-zero weights, including removing weight sparsity from the vector multiplications”, which can be considered to be a mathematical calculation.
These limitations, as drafted, are processes that, under the broadest reasonable interpretation, cover the recitation of mathematical calculations, which fall within the “Mathematical concepts” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2 Analysis: This judicial exception is not integrated into a practical application. In particular, the claim recites the additional element “scalable deep neural networks”. This element is recited at a high level of generality such that it amounts to no more than mere instructions to apply the exception using a generic computer component. Please see MPEP 2106.05(f). Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea.
The claim further recites: storing in the AMM rows of activations, each row of activations including a corresponding plurality of activations to be multiplied with a same non-zero weight. This limitation is an insignificant extra-solution activity. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim as a whole is directed to an abstract idea.
Step 2B Analysis: The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of utilizing scalable deep neural networks to perform the steps of the claimed process amount to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Furthermore, the limitation of storing in the AMM rows of activations, each row of activations including a corresponding plurality of activations to be multiplied with a same non-zero weight is well-understood, routine, and conventional, as evidenced by MPEP §2106.05(d)(II)(iv), “Storing and retrieving information in memory”. This limitation therefore remains insignificant extra-solution activity even upon reconsideration and does not amount to significantly more. Even when considered in combination, these additional elements amount to mere instructions to apply the exception using generic computer components and insignificant extra-solution activity, which cannot provide an inventive concept. The claim is not patent eligible.

Regarding claim 2, the rejection of claim 1 is further incorporated, and further, the claim recites: generating different combinations of vector multiplication tensors for machine learning models or algorithms. This claim recites additional mathematical steps in addition to the judicial exception identified in the rejection of claim 1, and thus recites a judicial exception.
The claim does not include any additional elements that amount to an integration of the judicial exceptions into a practical application, nor to significantly more than the judicial exceptions. The claim is not patent eligible.

Regarding claim 3, the rejection of claim 1 is further incorporated, and further, the claim recites: supporting at least one of multiple different parallel modes including at least one of: a multiple points (pixels) parallel scheme, a lines parallel scheme, a multiple input channels parallel scheme, or a multiple output channels parallel scheme. This limitation amounts to generally linking the judicial exception to a field of use or technological environment. Please see MPEP 2106.05(h).
The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible. 

Regarding claim 4, the rejection of claim 1 is further incorporated, and further, the claim recites: implementing a sequential execution NPU, a concurrent execution NPU, or a combination of a sequential execution NPU and a concurrent execution NPU to implement the vector multiplication. This limitation amounts to more specifics of the judicial exception identified in the rejection of claim 1 above.
The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible. 

Regarding claim 5, the rejection of claim 1 is further incorporated, and further, the claim recites: implementing a sequential execution NPU comprises storing back (feedback) an output of each neural network layer to a current AMM layer. This limitation amounts to an insignificant extra-solution activity.
The claim further recites: implementing a concurrent execution NPU comprises allocating different hardware resources to different DNN layers to process the DNN layers in parallel (concurrently). This limitation amounts to mere instructions to apply the judicial exception using a generic computer component or using a generic computer to perform routine tasks. Please see MPEP 2106.05(f).
The claim does not include any additional elements that amount to significantly more than the judicial exception. The limitation of “storing back (feedback) an output of each neural network layer to a current AMM layer” is just a nominal or tangential addition to the claim, and is also well-understood, routine and conventional as evidenced by MPEP §2106.05(d)(II)(iv), “Storing and retrieving information in memory”. This limitation therefore remains insignificant extra-solution activity even upon reconsideration, and does not amount to significantly more. Even when considered in combination, this additional element represents an insignificant extra-solution activity which cannot provide an inventive concept. The claim is not patent eligible. 
Regarding claim 6, the rejection of claim 1 is further incorporated, and further, the claim recites: implementing a sequential execution NPU comprises reusing hardware resources to calculate different layers of a same neural network; and 
implementing a concurrent execution NPU comprises providing results of each DNN layer to another hardware logic that executes a next DNN layer. These limitations amount to mere instructions to apply the judicial exception using a generic computer component or using a generic computer to perform routine tasks. Please see MPEP 2106.05(f).
The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible. 

Regarding claim 7, the rejection of claim 1 is further incorporated, and further, the claim recites: further comprising supporting different size convolution operations. This limitation amounts to more specifics of the judicial exception identified in the rejection of claim 1 above.
The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible. 

Regarding claim 8, the rejection of claim 1 is further incorporated, and further, the claim recites: wherein supporting different size convolution operations comprises supporting two different n*n convolution operations, and wherein n in a first of the convolution operations has a first value that is different than a second value of n in a second of the convolution operations. This limitation amounts to more specifics of the judicial exception identified in the rejection of claim 1 above.
The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible. 

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-3 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Woo et al. ("US 20200012608 A1", hereinafter "Woo").

Regarding claim 1, Woo teaches A method of reducing scalable deep neural networks (DNN) accelerator (sDNA) power consumption and silicon area, the method comprising:
generating a list of activation memory matrixes (AMM) addresses (“The operations further comprise, storing, in a memory bank of the computing device, at least one of the input activations, wherein storing the at least one input activation includes generating an index comprising one or more memory address locations having input activation values that are non-zero values.” [¶0011]), wherein each address in the list of addresses points to an activations row that each one of its activation components needs to be multiplied by a given non-zero weight and to be accumulated in a different vector multiplication calculations (“As noted above, controller 302 maintains one or more address registers in memory. So, to mitigate or prevent any potential misalignment of operands (input activation and weight), upon detection of the zero value input activation, controller 302 can disable the corresponding compute unit, skip loading a particular weight, and retrieve the appropriate corresponding weight (and memory address) for the next non-zero input activation to resume computing output activations for a given neural network layer.” [¶0052; note: It is inherent that the weight values may be non-zero values because Woo’s system is designed to improve computational efficiency by eliminating multiplication by zeros.]); 
storing in the AMM rows of activations (“At block 506, controller 302 stores, in memory bank 108, received input activations. Storing the input activation can include controller 302 generating an index of one or more memory address locations having input activations that include non-zero values.” [¶0061]), each row of activations including a corresponding plurality of activations to be multiplied with a same non-zero weight (“In this implementation, controller 302 can reference the above mentioned registers to determine a corresponding weight (and memory address) for the first input activation and to determine a corresponding weight (and memory address) for the second input activation.” [¶0051]); and 
implementing vector multiplication on the rows of activations and non-zero weights, including removing weight sparsity from the vector multiplications. (“Computation processes performed within a neural network layer (e.g., a convolutional layer) can include multiplying an input activation (i.e., a first operand) with a weight (i.e., a second operand) on one or more cycles and performing an accumulation of products over many cycles. An output activation is generated based on multiply and accumulation operations performed on the two operands.” [¶0026; note: See further: “Moreover, when the compute system uses a communication scheme that includes primarily non-zero activation values, computational efficiency can be enhanced or accelerated by eliminating multiplication by zeros.” [¶0012]])
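
For readers unfamiliar with the operation recited in claim 1, the following Python sketch illustrates one possible reading of the three limitations: rows of activations are stored in a memory, a list of row addresses is generated for the non-zero weights only, and each component of an addressed row is multiplied by its weight and accumulated into a separate running sum. This is a minimal sketch under stated assumptions, not the applicant's or Woo's implementation. The names `amm`, `build_nonzero_schedule`, and `sparse_accumulate` are hypothetical and do not appear in the application or in the cited reference.

```python
from typing import List, Tuple


def build_nonzero_schedule(weights: List[float]) -> List[Tuple[int, float]]:
    """Return (row address, weight) pairs for non-zero weights only.

    Assumption: weight k applies to the activation row stored at address k.
    Zero weights are dropped here, so they never reach the multiplier.
    """
    return [(addr, w) for addr, w in enumerate(weights) if w != 0]


def sparse_accumulate(amm: List[List[float]],
                      schedule: List[Tuple[int, float]]) -> List[float]:
    """Multiply each addressed row by its weight and accumulate per column.

    Each column j has its own accumulator, i.e. each activation component
    contributes to a different vector multiplication result.
    """
    acc = [0.0] * len(amm[0])
    for addr, w in schedule:
        row = amm[addr]              # fetch the activation row by address
        for j, a in enumerate(row):
            acc[j] += w * a          # one multiply-accumulate per component
    return acc


# Hypothetical data: four activation rows, two of the four weights are zero.
amm = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [1, 1, 1]]
weights = [2, 0, 0, -1]
schedule = build_nonzero_schedule(weights)
result = sparse_accumulate(amm, schedule)
```

In this sketch only two of the four rows are ever fetched, which is the sense in which the zero weights are removed from the multiplications rather than multiplied and discarded.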

Regarding claim 2, Woo teaches The method of claim 1, further comprising generating different combinations of vector multiplication tensors for machine learning models or algorithms. (“As described in more detail below, various layers of a neural network process machine learning inferences by performing large quantities of computations that include matrix multiplications. Computation processes performed within a neural network layer (e.g., a convolutional layer) can include multiplying an input activation (i.e., a first operand) with a weight (i.e., a second operand) on one or more cycles and performing an accumulation of products over many cycles. An output activation is generated based on multiply and accumulation operations performed on the two operands.” [¶0026])

Regarding claim 3, Woo teaches The method of claim 1, further comprising supporting at least one of multiple different parallel modes including at least one of: a multiple points (pixels) parallel scheme, a lines parallel scheme, a multiple input channels parallel scheme, or a multiple output channels parallel scheme. (“Each data element (a.sub.0, a.sub.1, b.sub.0, b.sub.1, c.sub.0, d.sub.0 and etc.) of the data structure 102a/b is an input activation value and each input depth corresponds to a depth of an input to a neural network layer. In some implementations, a neural network layer can have an input depth of one while in other implementations a neural network layer can have an input depth of more than one.” [¶0023; corresponds to “a multiple input channels parallel scheme”])


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 4-8 are rejected under 35 U.S.C. 103 as being unpatentable over Woo in view of Hill et al. ("US 20200012608 A1", hereinafter "Hill").

Regarding claim 4, Woo teaches The method of claim 1, however fails to explicitly teach further comprising implementing a sequential execution NPU, a concurrent execution NPU, or a combination of a sequential execution NPU and a concurrent execution NPU to implement the vector multiplication.
Hill teaches further comprising implementing a sequential execution NPU, a concurrent execution NPU, or a combination of a sequential execution NPU and a concurrent execution NPU to implement the vector multiplication. (“Neural networks may also have recurrent or feedback (also called top-down) connections. In a recurrent connection, the output from a neuron in a given layer may be communicated to another neuron in the same layer. A recurrent architecture may be helpful in recognizing patterns that span more than one of the input data chunks that are delivered to the neural network in a sequence” [¶0035; corresponds to a “sequential execution NPU”])
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Woo’s teachings by using a sequential NPU as taught by Hill. One would have been motivated to make this modification as a network with many feedback connections may be helpful when the recognition of a high-level concept may aid in discriminating the particular low-level features of an input. [Hill, ¶0035]

Regarding claim 5, Woo/Hill teaches The method of claim 4, where Hill teaches wherein: 
implementing a sequential execution NPU comprises storing back (feedback) an output of each neural network layer to a current AMM layer (“Neural networks may also have recurrent or feedback (also called top-down) connections. In a recurrent connection, the output from a neuron in a given layer may be communicated to another neuron in the same layer. A recurrent architecture may be helpful in recognizing patterns that span more than one of the input data chunks that are delivered to the neural network in a sequence. A connection from a neuron in a given layer to a neuron in a lower layer is called a feedback (or top-down) connection.” [¶0035]); and 
implementing a concurrent execution NPU comprises allocating different hardware resources to different DNN layers to process the DNN layers in parallel (concurrently). (“Nevertheless, instead of allowing half of the MAC hardware 540 to remain idle, artificial sparsity is introduced to spread the four channels of the first multilane segment 812 across the MAC hardware 540 to maximize resource utilization.” [¶0078])
Same motivation to combine the teachings of Woo/Hill as claim 4.

Regarding claim 6, Woo/Hill teaches The method of claim 4, where Hill teaches wherein:
implementing a sequential execution NPU comprises reusing hardware resources to calculate different layers of a same neural network (“While some resources of the MAC hardware 540 may go unused during certain clock cycles, using artificial sparsity and packing of the FIFO buffers (e.g., 860 and 862) and the TCM memory (e.g., 870 and 872) according to the source-column activation constraint reduces an amount of wasted resources when encountering empty activation channels. A process for exploiting activation sparsity is shown in FIG. 9.” [¶0080]); and
implementing a concurrent execution NPU comprises providing results of each DNN layer to another hardware logic that executes a next DNN layer. (“The connections between layers of a neural network may be fully connected or locally connected. FIG. 2A illustrates an example of a fully connected neural network 202. In a fully connected neural network 202, a neuron in a first layer may communicate its output to every neuron in a second layer, so that each neuron in the second layer will receive input from every neuron in the first layer.” [¶0036])
Same motivation to combine the teachings of Woo/Hill as claim 4.
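
The distinction drawn in claims 5 and 6 between sequential and concurrent execution can be sketched in software as follows. This is only an illustrative model under assumptions: a "layer" is any Python callable, the sequential mode reuses one buffer that each layer's output is written back to, and the concurrent mode is modeled as a pipeline in which each layer has its own stage and hands its result to the next stage on every tick. The names `run_sequential` and `run_concurrent` are hypothetical and are not taken from the application, Woo, or Hill.

```python
from collections import deque
from typing import Callable, List, Sequence

Layer = Callable[[List[float]], List[float]]


def run_sequential(layers: Sequence[Layer], x: List[float]) -> List[float]:
    """One shared resource computes every layer in turn.

    The output of each layer is stored back into the same buffer that the
    next layer reads from (the "feedback" of claim 5, as assumed here).
    """
    buffer = list(x)
    for layer in layers:
        buffer = layer(buffer)
    return buffer


def run_concurrent(layers: Sequence[Layer],
                   inputs: Sequence[List[float]]) -> List[List[float]]:
    """Each layer has its own stage; stages work on different inputs per tick.

    Assumes at least one layer. regs[i] holds the data waiting for stage i.
    """
    n = len(layers)
    regs = [None] * n
    pending = deque(inputs)
    outputs = []
    while pending or any(r is not None for r in regs):
        # All stages compute "at the same time" within one tick.
        results = [layers[i](regs[i]) if regs[i] is not None else None
                   for i in range(n)]
        if results[-1] is not None:
            outputs.append(results[-1])
        # Each stage passes its result to the stage running the next layer.
        regs = [pending.popleft() if pending else None] + results[:-1]
    return outputs


# Hypothetical two-layer network: double every value, then add one.
layers = [lambda v: [2 * a for a in v], lambda v: [a + 1 for a in v]]
seq_out = run_sequential(layers, [1, 2])
con_out = run_concurrent(layers, [[1, 2], [0, 0]])
```

Both modes compute the same function; they differ only in whether one resource is reused across layers or several resources are kept busy on different inputs.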

Regarding claim 7, Woo teaches The method of claim 1, however fails to explicitly teach further comprising supporting different size convolution operations.
Hill teaches further comprising supporting different size convolution operations (“Upon receiving the image 226, a convolutional layer 232 may apply convolutional kernels (not shown) to the image 226 to generate a first set of feature maps 218. As an example, the convolutional kernel for the convolutional layer 232 may be a 5×5 kernel that generates 28×28 feature maps. In the present example, because four different convolutional kernels were applied to the image 226 at the convolutional layer 232, four different feature maps are generated in the first set of feature maps 218. The convolutional kernels may also be referred to as filters or convolutional filters.” [¶0039])
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Woo’s teachings by supporting different size convolution operations as taught by Hill. One would have been motivated to make this modification because convolutional neural networks may be well suited to problems in which the spatial location of inputs is meaningful. [Hill, ¶0037]

Regarding claim 8, Woo/Hill teaches The method of claim 7, where Hill teaches wherein supporting different size convolution operations comprises supporting two different n*n convolution operations, and wherein n in a first of the convolution operations has a first value that is different than a second value of n in a second of the convolution operations. (“As an example, the convolutional kernel for the convolutional layer 232 may be a 5×5 kernel that generates 28×28 feature maps. In the present example, because four different convolutional kernels were applied to the image 226 at the convolutional layer 232, four different feature maps are generated in the first set of feature maps 218. The convolutional kernels may also be referred to as filters or convolutional filters.” [¶0039])
Same motivation to combine the teachings of Woo/Hill as claim 7.
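
The kernel and feature-map sizes quoted from Hill can be related by the standard output-size formula for a convolution. The sketch below is an illustration only: the 32×32 input size, the stride of 1, and the absence of padding are assumptions made for the example and are not stated in the quoted passage, and `conv_output_size` is a hypothetical helper name.

```python
def conv_output_size(input_size: int, kernel_size: int,
                     stride: int = 1, padding: int = 0) -> int:
    """Output width (or height) of an n*n convolution along one dimension."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


# Assumed 32x32 input (not stated in the quoted passage).
size_5x5 = conv_output_size(32, 5)   # n = 5, matching the quoted 5x5 kernel
size_3x3 = conv_output_size(32, 3)   # a second, different value of n
```

Under these assumptions a 5×5 kernel yields a 28×28 feature map, matching the figures in the quotation, while a 3×3 kernel on the same input would yield 30×30.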


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAEL H HOANG whose telephone number is (571) 272-8491. The examiner can normally be reached Mon-Fri 8:30AM-4:30PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki, can be reached at (571) 272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/MICHAEL H HOANG/Examiner, Art Unit 2122

Prosecution Timeline

Jun 02, 2023

Application Filed

Mar 07, 2026

Non-Final Rejection — §101, §102, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

17/235,074

Patent 12518156

Training a Neural Network using Graph-Based Temporal Classification

2y 5m to grant Granted Jan 06, 2026

17/316,998

Patent 12468934

SYSTEMS AND METHODS FOR GENERATING DYNAMIC CONVERSATIONAL RESPONSES USING DEEP CONDITIONAL LEARNING

2y 5m to grant Granted Nov 11, 2025

18/982,085

Patent 12456115

METHODS, ARCHITECTURES AND SYSTEMS FOR PROGRAM DEFINED SYSTEMS

2y 5m to grant Granted Oct 28, 2025

18/313,050

Patent 12437211

System and Method for Predicting Fine-Grained Adversarial Multi-Agent Motion

2y 5m to grant Granted Oct 07, 2025

16/879,775

Patent 12430543

Structured Sparsity Guided Training In An Artificial Neural Network

2y 5m to grant Granted Sep 30, 2025

Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

1-2

Expected OA Rounds

52%

Grant Probability

77%

With Interview (+25.9%)

4y 1m

Median Time to Grant

Low

PTA Risk

Based on 136 resolved cases by this examiner. Grant probability derived from career allow rate.