Last updated: May 29, 2026
Application No. 17/134,095
PATTERN-BASED NEURAL NETWORK PRUNING

Final Rejection §101 §103 §112
Filed
Dec 24, 2020
Examiner
GERMICK, JOHNATHAN R
Art Unit
2122
Tech Center
2100 — Computer Architecture & Software
Assignee
Cypress Semiconductor Corporation
OA Round
6 (Final)
Interview Optional

Interview already conducted in this application's prosecution history. This examiner has a 46% grant rate with a +27.9% interview lift. Since an interview has already been tried, a written response with narrowed claims, based on precedent claim evolution patterns, is recommended.
Based on 96 resolved cases, 2023–2026
Examiner Intelligence

GERMICK, JOHNATHAN R
Grants 46% of resolved cases
Career Allowance Rate
44 granted / 96 resolved
-9.2% vs TC avg
Strong +28% interview lift
+27.9% Interview Lift (resolved cases with interview)
Typical timeline: 4y 6m avg prosecution, 20 currently pending
Career history: 120 total applications across all art units
Statute-Specific Performance

§101: 13.2% (-26.8% vs TC avg)
§103: 76.6% (+36.6% vs TC avg)
§102: 8.5% (-31.5% vs TC avg)
§112: 1.7% (-38.3% vs TC avg)
Based on career data from 96 resolved cases
Office Action

§101 §103 §112
DETAILED ACTION
This action is responsive to the Application filed on 03/23/2026. Claims 1, 3, 5-9, 11, 13-17 and 20 are pending in the case.  Claims 1, 9, and 17 are independent claims. 
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
Applicant's arguments filed 03/23/2026 have been fully considered but they are not persuasive. 
With respect to the rejection under 35 U.S.C. 101
	Applicant argues that, like Enfish, the claims recite a specific technical process for pruning which is a concrete improvement in neural network technology. Applicant further notes that the specific implementation reduces computational complexity through sparsification, which provides an improvement to computer functionality itself, like Enfish.
	Examiner disagrees. The claims recite selecting and pruning, which correspond to the recited judicial exception, i.e., a mental process, identified in the rejection. The pruning is performed by applying a feature mask selected based on values being maximized. This is a mental evaluation, because selecting a mask and applying it to remove feature maps amounts to a selection of data which can be performed in the mind. As noted in the MPEP, the supposed improvement to technology cannot be a result of the abstract idea alone. Examiner notes that the improved abstract idea may very well result in a reduction of computational complexity; that reduction, however, is not reflected in the additional elements of the claim, and therefore the claims are not similar to Enfish.
	Further, in response to the above, Applicant appears to suggest that because selecting and pruning involve 1) “maximizing a sum of values of the feature map, wherein no pruning mask is assigned to more than one feature map” and 2) “applying to each feature map…. a respective selected pruning mask”, the pruning masks are improved.
	Examiner disagrees. Steps 1) and 2) do not describe any particular technological functions or additional elements. Rather, they characterize either the result of the selection (i.e., maximizing a sum) or how the pruning or selection is performed. These steps further describe the abstract idea and as such are not indicative of improvements to the functionality of a computer as in Enfish.
	Applicant argues that training on constrained devices clearly describes a practical application and that the ability to run a trained neural network on a constrained device represents an improvement.
	Examiner disagrees. Training as claimed describes an application at a high degree of generality (see MPEP 2106.05(f)). The claims do not describe any details of the training except the intended result and use. The claims describe that the neural network is for performing a voice recognition task, but the claims do not actually recite performing that task. The specification does not describe any technical improvement to the training process, except that the training is improved as a result of performing the pruning (i.e., the abstract idea). MPEP 2106.05(a) notes that the judicial exception alone cannot provide the improvement.
	Applicant argues the deployment of neural networks on a resource constrained device describes a practical application.
	Examiner disagrees. Examiner notes that additional elements can provide an improvement; however, the recited additional elements do not describe improvements to the functioning of resource constrained devices. Rather, they merely link an improved neural network to the technology of constrained devices generally. The neural network is improved by the selection and pruning claimed, which are identified as reciting an abstract idea. As noted in MPEP 2106.05(a), “It is important to note, the judicial exception alone cannot provide the improvement. The improvement can be provided by one or more additional elements”.
Applicant argues further that the claims are not well-understood routine and conventional (WURC). Applicant appears to suggest that other limitations which do not correspond to mere data gathering under 2106.05(g) are not WURC.
Examiner notes that this analysis overlaps with the consideration in step 2A prong 2 with respect to additional elements which are identified as insignificant extra-solution activity. These particular additional elements are explained as WURC under step 2B because receiving as claimed in an unspecified manner, i.e., mere data gathering, is recognized in the MPEP as WURC.
Applicant’s suggestion that other intervening limitations, such as selecting and pruning as claimed, are not WURC is not relevant to the rejection. As noted previously, the selecting and pruning amount to the judicial exception; as such, they are not considered additional elements according to the flow chart.
Further, the training limitation is not considered significantly more under MPEP 2106.05(f) and 2106.05(h), because this limitation does not describe any specific functioning for how the training is performed.

With respect to the art rejection:
	As best understood by the Examiner, Applicant appears to argue that Ding does not teach the claimed “wherein no pruning mask is assigned to more than one feature map” because, as stated in the art, “the based mask does not change between the current move and the next move”. Applicant further disputes the statement that “the based mask is not changed until the return or rest condition is met”, contending that it does not align with the art. Further, Applicant appears to highlight that the base mask is assigned to multiple conv layers and as such cannot teach the claim requirement of not being assigned to more than one feature map.
	Examiner disagrees. Principally, the fact that a “base mask” is assigned to multiple conv layers or feature maps, does not indicate that the very same mask is assigned to each feature map. In fact, equation (9) appears to clearly specify a plurality of base masks “u” are assigned to a plurality of corresponding feature maps. A plurality is clear because of the subscript indicating multiple base masks for multiple feature maps. (see equation 9)
[Image: media_image1.png, Ding equation (9)]

	As evidenced by the superscript (i) and subscript k, multiple “versions” of the base mask are applied to the feature map, M, depending on the values of k and i. The scripts serve to identify different masks corresponding to different feature maps.
	The art’s statement that the “base mask does not change” between moves is only a statement about the mask’s value not changing across moves and says nothing about the assignment of the masks as claimed. The base mask is updated and changed only according to the corresponding lines of algorithm 1 (lines 2, 19, and 25), which include initialization to 1 or reset to 0.
	Examiner notes that the mask is understood to be a binary 1 or 0 value; as such, many masks assigned to a plurality of feature maps necessarily share the same value. However, the claim does not preclude the assigned masks from having unchanging or common values.
	The claim at most specifies that a particular mask is assigned to a particular feature map, which is clearly covered by the art because the base mask shares the same identifying subscripts or superscripts as the corresponding feature maps.
	Therefore, the rejection is maintained.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1, 3, 5-9, 11, 13-17 and 20 are rejected under 35 U.S.C. 101 because the claims are directed to an abstract idea without significantly more.

Regarding Claim 1/9/17
Under step 1, claim 1 is directed to A method, which is directed to a process, one of the statutory categories. Under step 1, claim 9 is directed to A system, which is directed to a machine, one of the statutory categories. Under step 1, claim 17 is directed to A non-transitory computer-readable storage medium storing executable instructions, which is directed to a product of manufacture, one of the statutory categories. The limitations of these claims are evaluated according to the steps in the flow chart in MPEP 2106. 
Under Step 2A Prong 1, the claim recites the following limitations which are considered mental evaluations: “for each feature map of the plurality of feature maps, selecting [selecting], from a predetermined set of pruning masks, a corresponding pruning mask that, when applied to the feature map, maximizes a sum of values of the feature map, wherein no pruning mask is assigned to more than one feature map; pruning the neural network by applying, to each feature map of the plurality of feature maps, a respective selected pruning mask;”.
	Examiner notes that selecting from a set of existing data (i.e., masks) is a decision which can be performed in the human mind. Pruning the network by applying the pruning mask is described in specification para 0033 by “multiplying each feature map element f_ij^k by the corresponding mask element.”, which may be performed in the mind. However, pruning also broadly includes a decision to remove or prune feature maps according to a “mask”, which similarly describes a selection or decision about data, which is a mental step.
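For illustration only, the selecting and pruning limitations quoted above can be sketched in a few lines of Python. The sketch assumes 2-D feature maps and binary masks stored as nested lists, and a greedy, in-order assignment in which each selected mask is removed from the available set. The names `apply_mask`, `masked_sum`, `select_masks` and `prune` are hypothetical, and nothing in the claims or in the cited specification paragraph prescribes this particular procedure.

```python
def apply_mask(feature_map, mask):
    """Element-wise product of a feature map and a mask of the same shape."""
    return [[f * m for f, m in zip(f_row, m_row)]
            for f_row, m_row in zip(feature_map, mask)]


def masked_sum(feature_map, mask):
    """Sum of the feature map values that survive the mask."""
    return sum(sum(row) for row in apply_mask(feature_map, mask))


def select_masks(feature_maps, mask_set):
    """Greedy assumption: visit feature maps in order; each one takes the
    still-available mask with the largest masked sum, and that mask is then
    removed so that no mask is assigned to more than one feature map."""
    available = list(range(len(mask_set)))
    assignment = []
    for fmap in feature_maps:
        if not available:
            raise ValueError("fewer masks than feature maps")
        best = max(available, key=lambda idx: masked_sum(fmap, mask_set[idx]))
        assignment.append(best)
        available.remove(best)
    return assignment


def prune(feature_maps, mask_set):
    """Apply to each feature map its respective selected mask."""
    assignment = select_masks(feature_maps, mask_set)
    return [apply_mask(f, mask_set[i]) for f, i in zip(feature_maps, assignment)]


# Toy data (illustrative values only).
masks = [[[1, 0], [0, 1]], [[0, 1], [1, 0]], [[1, 1], [0, 0]]]
fmaps = [[[5, 1], [1, 5]], [[4, 1], [1, 4]]]
print(select_masks(fmaps, masks))
print(prune(fmaps, masks))
```

A greedy pass is only one way to satisfy the uniqueness constraint; a globally optimal assignment would be a different design choice and is not shown here.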
	Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In particular, the claims recite the additional elements “by the processing device…a memory; and a processing device, coupled to the memory, the processing device configured to:… when executed by a processing device, cause the processing device to:…”, which amount to mere instructions to apply a computer technology to an abstract idea, see MPEP 2106.05(f). Further, the claims recite the additional element “and training… the pruned neural network.”, which describes an application at a high degree of generality which makes use of the recited exception, see MPEP 2106.05(f). The claim does not recite any steps describing how the training process is performed; thus the claims simply apply the recited abstract idea to training. Further, the additional element “wherein the pruned neural network is for performing a voice recognition task with limited computational capacity” generally links the use of the judicial exception to a particular technological environment or field of use, see MPEP 2106.05(h). This limitation is merely an incidental or token addition to the claim that does not alter or affect how the claimed steps are performed. In addition, the claim recites the additional element “receiving, by a processing device, a plurality of feature maps produced by an input layer of a neural network;…”, which amounts to adding insignificant extra-solution activity to the judicial exception. See MPEP 2106.05(g). This limitation amounts to “mere data gathering”: the claim does not set limits on how the receiving is performed, only that certain named data is received by a generalized processing device. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
	Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. Further, the additional elements, “receiving, by a processing device, a plurality of feature maps produced by an input layer of a neural network;…” are insignificant extra-solution activities that are considered well-understood, routine, conventional activities, for the following reasons. Examiner notes that “receiving, by a processing device, a plurality of feature maps produced by an input layer of a neural network…” amounts to receiving or transmitting data over a network (MPEP 2106.05(d)(II)(i)). According to MPEP 2106.05(d)(II)(i), “The courts have recognized the following computer functions as well‐understood, routine, and conventional functions when they are claimed in a merely generic manner”. As such, the insignificant extra-solution activities are considered well-understood, routine, conventional activities. Therefore, the claim is not patent eligible.


Regarding Claim 3/5/6/7 and 11/13/14/15 and 20
The claims recite the following limitations:
“wherein each feature map of the plurality of feature maps represents a plurality of responses of the input layer of the neural network at respective portions of input data represented in time-frequency coordinates”
“wherein selecting the pruning mask further comprises: removing the selected pruning mask from the predetermined set of pruning masks.”
“multiplying each element of the feature map by a corresponding element of the selected pruning mask.”
“applying the selected pruning mask to the feature map further comprises: applying a decay factor to the selected pruning mask.”
Under Step 2A Prong 1, these limitations only serve to describe the abstract idea addressed in the independent claim.
Furthermore, under step 2A Prong 2 and 2B, the claim does not recite additional elements to consider other than those considered in the independent claim. Accordingly, the recited additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea, nor do they amount to significantly more than the judicial exception because they do not impose any meaningful limits on practicing the abstract idea.

Regarding Claim 8 and 16
The claim recites the following limitations “responsive to determining that a terminating condition is not satisfied, iteratively repeating the pruning….” Under Step 2A Prong 1, these limitations describe a step performed in the mind.
	Furthermore, under step 2A Prong 2 and 2B: The judicial exception is not integrated into a practical application, nor does the claim provide significantly more. In particular, the claims recite the additional element “iteratively training.”, which describes an application at a high degree of generality which makes use of the recited exception, see MPEP 2106.05(f). Examiner notes that the modifier “iteratively” does not provide any detail about how the training functions. Additionally, while not relied upon in the rejection, Examiner notes that training is generally understood to be an iterative process. For example, a limitation “…is not satisfied, continue training” would be analyzed similarly. Accordingly, the recited additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea, nor do they amount to significantly more than the judicial exception because they do not impose any meaningful limits on practicing the abstract idea.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 1, 3, 5-9, 11, 13-17 and 20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
The term “with limited computational capacity” in parent claims 1, 9 and 17 is a relative term which renders the claims indefinite. The term “limited” is not defined by the claims, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.
The dependent claims are rejected by virtue of their dependency.


Claim Rejections - 35 U.S.C. § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. §§ 102 and 103 (or as subject to pre-AIA  35 U.S.C. §§ 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. § 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1, 3, 5, 6, 8, 9, 11, 13, 14, 16, 17 and 20 is/are rejected under 35 U.S.C. § 103 as being unpatentable over Ding et al “Approximated Oracle Filter Pruning for Destructive CNN Width Optimization”, further in view of Park et al. “Fully Neural Network Based Speech Recognition on Mobile and Embedded Devices”.
	
Claim 1
	Ding teaches, A method, comprising: receiving, by a processing device, a plurality of feature maps produced by an input layer of a neural network; (Section 3.1 pg 3 “Let i be the layer index, M^(i) ∈ R^(h_i × w_i × c_i) be an h_i × w_i feature map with c_i channels and M^(i,j) = M^(i)_(:,:,j) be the j-th channel”; pg 2 Figure 1 shows the neural network and its corresponding feature maps M output by an input layer conv1
[Image: media_image2.png, excerpt of Ding Figure 1]
)

for each feature map of the plurality of feature maps, selecting, by the processing device, from a predetermined set of pruning masks a pruning mask (Section 3.4 pg 4 and 5 “E.g., Fig. 1 shows two scoring paths which each contain only one conv layer (conv2, conv3) as we are pruning conv1 and conv2 simultaneously. The base path forwards the outputs of layer i through a base mask u^(i) ∈ R^(c_i) initialized as 1. The j-th channel of the output of the next layer becomes ….
[Equation image: media_image3.png]
… During the training process, for each batch of input data, we randomly set some bits in the scoring mask to zero, such that the corresponding filters are ablated on the scoring path but still kept on the base path… When enough samples have been collected, we approximate T’ ”. For each of the k feature maps, a corresponding mask “u” is randomly selected from a predetermined set. For example, a random binary vector of k elements is a selection from a finite (i.e., predetermined) set of vectors or, in this case, “masks”.)
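As a rough illustration of the quoted passage, the following Python sketch draws a scoring mask by zeroing a random subset of the bits that are still active in a base mask. It is not Ding's implementation: the function name `sample_scoring_mask`, the `ablate_ratio` parameter and the seeded `random.Random` generator are assumptions made for the example. The last line simply counts the binary vectors of a given length, which is the sense in which the Examiner treats the candidate set as finite.

```python
import itertools
import random


def sample_scoring_mask(base_mask, ablate_ratio, rng):
    """Copy the base mask and set a random subset of its active bits to zero.

    base_mask    : list of 0/1 values, one per channel (left unmodified)
    ablate_ratio : fraction of the active bits to zero (hypothetical knob)
    rng          : random.Random instance, passed in for reproducibility
    """
    active = [k for k, bit in enumerate(base_mask) if bit == 1]
    n_ablate = int(len(active) * ablate_ratio)
    ablated = set(rng.sample(active, n_ablate))
    return [0 if k in ablated else bit for k, bit in enumerate(base_mask)]


base = [1] * 8                      # base mask initialized as 1
scoring = sample_scoring_mask(base, 0.25, random.Random(0))
print(scoring)                      # two of the eight bits are zeroed

# The set of binary masks of length k is finite: there are 2**k of them.
print(len(list(itertools.product((0, 1), repeat=3))))
```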
a corresponding pruning mask that, when applied to the feature map, maximizes a sum of values of the feature map; (pg 6 “We introduce a global hyper-parameter, the refinement threshold θ, which is used to compare with the max Tˆ value (Eq. 10) of the filters in B to judge if the picked set is good (unimportant) enough. We say B is good enough if 
[Equation image: media_image4.png]
” here it is explained that the T value is maximized to pick the best filters in B according to theta. Algorithm 1 annotated and provided below depicts u, the mask, being repeatedly selected (i.e line 19) until p > theta or T has been maximized. 
[Image: media_image5.png, annotated excerpt of Ding Algorithm 1]
. Finally T is a sum of values of the feature map as shown in equations 6 and 7 pg 4 
[Equation image: media_image6.png, Ding equation (6)]
[Equation image: media_image7.png, Ding equation (7)]
Examiner notes that t(F,x) are values “of the feature map” because they are derived from the feature map. It is understood by the examiner that t(F,x) in the art is a value which characterizes the distance between subsequent layer feature maps.)
wherein no pruning mask is assigned to more than one feature map (pg 4 Section 3.4 “The j-th channel of the output of the next layer becomes…
[Equation image: media_image8.png]
It is obvious that setting u^(i)_k = 0 is equivalent to removing the k-th filter at layer i…. During the training process, for each batch of input data, we randomly set some bits in the scoring mask to zero, such that the corresponding filters are ablated on the scoring path but still kept on the base path” u is the pruning mask. The superscripts and subscripts (i and k) indicate that the mask is for a corresponding feature map M. There is a plurality of feature maps for each layer (i) and a plurality for each channel (k), each with its own corresponding mask. Further, as evidenced in Figure 4 of the specification, the pruning masks appear to be re-used for each iteration of training. Nominally each pruning mask assigned to a given mask is considered its own corresponding mask such that no pruning mask is assigned to more than one feature map.)
pruning, by the processing device, the neural network by applying, to each feature map of the plurality of feature maps, a respective selected pruning mask; (pg 4 Section 3.4  “The j-th channel of the output of the next layer becomes… 
[Equation image: media_image8.png]
It is obvious that setting u^(i)_k = 0 is equivalent to removing the k-th filter at layer i…. During the training process, for each batch of input data, we randomly set some bits in the scoring mask to zero, such that the corresponding filters are ablated on the scoring path but still kept on the base path”  the mask u is selected and applied to prune M, the feature maps)
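The quoted statement that zeroing one mask entry is equivalent to removing the k-th filter can be checked numerically with a small Python sketch. As a simplifying assumption, each input channel is combined with a single scalar weight in place of a convolution kernel; `next_channel` and its parameters are hypothetical names, and the equation shown as an image in the action is not reproduced here.

```python
def next_channel(channels, weights, mask, bias=0.0):
    """Masked weighted sum of input channels.

    channels : list of 2-D feature maps (nested lists), one per input channel
    weights  : one scalar per input channel (stand-in for a conv kernel)
    mask     : one 0/1 entry per input channel
    """
    height, width = len(channels[0]), len(channels[0][0])
    out = [[bias] * width for _ in range(height)]
    for chan, weight, u in zip(channels, weights, mask):
        for r in range(height):
            for c in range(width):
                out[r][c] += u * weight * chan[r][c]
    return out


channels = [[[1, 2]], [[3, 4]], [[5, 6]]]
weights = [1, 10, 100]

# Zero the mask entry of the middle channel ...
masked = next_channel(channels, weights, [1, 0, 1])
# ... versus physically dropping that channel and its weight.
removed = next_channel([channels[0], channels[2]], [1, 100], [1, 1])

print(masked == removed)
```

Under this simplification the two outputs coincide because a zero mask entry contributes nothing to the sum.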
and training, by the processing device, the pruned neural network   (pg 4 Section 3.4 “During the training process, for each batch of input data, we randomly set some bits in the scoring mask to zero… comparing the corresponding feature maps on the base and scoring paths, and if it is large, we learn that the ablated filters are important for the current batch of data” the training is updated according to the pruned network performance Also shown in Algorithm 1.)
	Ding does not explicitly teach, for deployment on voice operated devices equipped with general purpose processors, wherein the pruned neural network is for performing a voice recognition task with limited computational capacity
	Park however teaches, for deployment on voice operated devices equipped with general purpose processors, wherein the pruned neural network is for performing a voice recognition task with limited computational capacity (Abstract “real-time automatic speech recognition (ASR) on mobile and embedded devices has been of great interests for many years. We present real-time speech recognition on smartphones or embedded systems by employing recurrent neural network (RNN) based acoustic models,” Section 3 pg 3 “The target speech recognition system in this work consists of CTC-trained AM, RNN LM, and beam search decoder.” Pg 2 introduction “For efficient beam search decoding, we early prune the output symbols of low probability in the acoustic model (AM).” Conclusion pg 9 “Real-time automatic speech recognition (ASR) on embedded CPUs is studied by integrating end-to-end trained acoustic RNN” smartphones have limited computational capacity)
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the neural network system which uses and prunes a model described by Ding to be combined with the neural network system of Park for speech recognition. One would have been motivated to make such a combination because both Ding and Park are concerned with slimming and speeding up neural networks. Park in particular notes “To reduce the DRAM access overhead, we apply multi-frame parallel processing for the AM RNN…The word piece based model shows x2 of the real-time speed on an ARM CPU mainly due to x4 down-sampling of the word piece AM. This study can be applied to all single stream or small batch-size implementation of ASR regardless of the platform, such as GPU or special-purpose hardware.” (Conclusion Park) Further, Park notes “several model compression techniques have been developed, such as pruning …they can be combined with the proposed method. We applied 8-bit quantization to further reduce the execution time and the model size” (Section 2 Park)

Claim 3
Ding/Park teach claim 1
	Ding teaches, each feature map of the plurality of feature maps represents a plurality of responses of the input layer of the neural network at respective portions of input data (Section 3.1 “In this paper, filter j at layer i refers to the tuple comprising the trained parameters related to the output channel j of layer i… Let ∗ be the 2-D convolution operator, an arbitrary output channel j is… 
[Equation image: media_image9.png]
” a channel of a filter is applied to a feature map M via the filter K, which filters a portion of the input data or input data according to the index k. Generally, in mathematics a convolution operator operates on a respective portion of input.)
Park teaches, input data represented in time-frequency coordinates.  (Section 3.1 pg 3 “The AM RNN is trained with CTC loss [23]. The input of the convolutional layer consists of three 2-D feature maps with the time and frequency axes, and each feature map is formed with the mel-filter bank output,”)

Claim 5
	Ding/Park teach claim 1
	Ding teaches, wherein selecting the pruning mask further comprises: removing the selected pruning mask from the predetermined set of pruning masks.  (pg 5 and 6 Section 3.5 “Inspired by the idea of incremental refinement, we propose to search for the least important filters in a binary search manner… remaining filters compose the search space A. We first score every filter in A and pick up |A|/2 filters as the picked set B which are most likely to be unimportant… If the current picked set B is good enough, finish the current move with g = |B| (i.e., permanently mask out the filters in B and start a new move)” filters are searched via binary search, the search space gets progressively smaller thus the corresponding mask, u, is removed in some cases permanently from the search space )
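The shrinking search space described in the quoted passage can be pictured with the Python sketch below. It is loosely modeled on the quoted text and is not a reimplementation of Ding's method: importance scores are supplied as fixed inputs rather than re-estimated at each step, and `pick_unimportant`, `scores` and `threshold` are hypothetical names.

```python
def pick_unimportant(scores, threshold):
    """Halve the candidate set until every picked filter scores at or below
    the threshold; return the picked filter ids (empty list if none qualify).

    scores    : dict mapping filter id -> importance estimate (lower means
                less important); fixed for the whole search in this sketch
    threshold : largest score still considered unimportant enough
    """
    search_space = sorted(scores, key=scores.get)   # least important first
    while search_space:
        picked = search_space[: max(1, len(search_space) // 2)]
        if max(scores[f] for f in picked) <= threshold:
            return picked                # picked set is good enough
        if len(picked) == len(search_space):
            return []                    # a single candidate is still too important
        search_space = picked            # continue the search inside the picked half
    return []


scores = {0: 0.9, 1: 0.1, 2: 0.5, 3: 0.05, 4: 0.7, 5: 0.3, 6: 0.02, 7: 0.8}
print(pick_unimportant(scores, threshold=0.2))
```

Each round discards half of the remaining candidates, which corresponds to the Examiner's observation that the search space gets progressively smaller.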

Claim 6
	Ding/Park teach claim 1
	Ding teaches, multiplying each element of the feature map by a corresponding element of the selected pruning mask. (pg 4 Section 3.4 “The base path forwards the outputs of layer i through a base mask u^(i) ∈ R^(c_i) initialized as 1. The j-th channel of the output of the next layer becomes ….
[Equation image: media_image3.png]
”)

Claim 8
	Ding/Park teach claim 1
	Ding teaches, responsive to determining that a terminating condition is not satisfied, iteratively repeating the pruning and the training. (pg 6 Figure 2. Flow chart of AOFP on a single layer. Examiner notes refining corresponds to training, select half of the filters is pruning 
[Image: media_image10.png, Ding Figure 2 flow chart]
)
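The limitation of claim 8, repeating the pruning and the training while a terminating condition is not satisfied, has the shape of a simple loop. The Python sketch below is illustrative only: the callables `prune_step`, `train_step` and `is_done`, the `max_rounds` safety cap, and the toy dictionary standing in for a model are all assumptions, not details taken from the claims or from Ding.

```python
def prune_and_train(model, prune_step, train_step, is_done, max_rounds=100):
    """Repeat prune-then-train until the terminating condition holds
    (or until the max_rounds safety cap is reached)."""
    rounds = 0
    while not is_done(model) and rounds < max_rounds:
        model = train_step(prune_step(model))
        rounds += 1
    return model, rounds


# Toy stand-ins: the "model" is a dict holding a filter count and an epoch count.
def halve_filters(model):
    return {**model, "filters": model["filters"] // 2}


def fake_train(model):
    return {**model, "epochs": model["epochs"] + 1}


def small_enough(model):
    return model["filters"] <= 4


final_model, rounds = prune_and_train(
    {"filters": 32, "epochs": 0}, halve_filters, fake_train, small_enough
)
print(final_model, rounds)
```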

Claim 9
	Ding teaches, A system, comprising: a memory; and a processing device, coupled to the memory, the processing device configured to: … ( pg 8 “AOFP pruned v.s. uniformly slimmed VGG. All the models are tested on an Nvidia GTX 1080Ti GPU or E5-2680 CPU with batch size 64, measured in examples/sec.”)
	The remaining limitations are rejected for the reasons set forth in the rejection of claim 1.

Claim 11, 13, 14 and 16
	are rejected for the reasons set forth in claims 3, 5, 6 and 8, respectively, in connection with claim 9

Claim 17
	Ding teaches, A non-transitory computer-readable storage medium storing executable instructions which, when executed by a processing device, cause the processing device to:… ( pg 8 “AOFP pruned v.s. uniformly slimmed VGG. All the models are tested on an Nvidia GTX 1080Ti GPU or E5-2680 CPU with batch size 64, measured in examples/sec.”)
	The remaining limitations are rejected for the reasons set forth in the rejection of claim 1.
Claim 20
	is rejected for the reasons set forth in claim 6 in connection with claim 17

Claim(s) 7 and 15 are rejected under 35 U.S.C. § 103 as being unpatentable over Ding/Park further in view of Yu et al. “Scalpel: Customizing DNN Pruning to the Underlying Hardware Parallelism” 

Claim 7
	Ding teaches Claim 1
Ding does not explicitly teach, applying a decay factor to the selected pruning mask.  
Yu however teaches, applying a decay factor to the selected pruning mask.  ( pg 7 “Figure 12 gives an example of a mask layer for fully-connected layers… Each node in the mask layer holds two parameters α and β. α is a boolean variable (α ∈ {0,1}) and β is a floating number between 0 and 1…. Let array Y and Y’ to be the output activation array of the original layer A and the mask layer A’… 
[Equation image: media_image11.png]
… In training iteration k > 1, αi is calculated as… 
[Equation image: media_image12.png]
… βi is updated through back-propagation and truncated to [0,1].” The next mask alpha is updated according to a decay parameter Beta. The next alpha is set to 0 or decayed based on Beta.)
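The two-parameter mask node described in the quoted passage (a boolean α and a β truncated to [0,1]) can be pictured with the Python sketch below. The equations that Yu gives appear only as images in this action and are not reproduced, so the specifics here are assumptions: the output is taken to be α·β·y, and α is switched by comparing β against a hypothetical `threshold` with a hypothetical `margin` band in which the previous α is kept. Only the truncation of β to [0,1] is taken directly from the quoted text.

```python
def clip01(x):
    """Truncate a value to the interval [0, 1], as the quote says for beta."""
    return min(1.0, max(0.0, x))


def update_alpha(beta, prev_alpha, threshold=0.5, margin=0.1):
    """Assumed gating rule (not confirmed by the quoted text):
    switch off at or below the threshold, switch on above threshold + margin,
    otherwise keep the previous value."""
    if beta <= threshold:
        return 0
    if beta > threshold + margin:
        return 1
    return prev_alpha


def mask_layer(outputs, alphas, betas):
    """Assumed mask-layer output: each activation scaled by alpha * beta."""
    return [a * b * y for y, a, b in zip(outputs, alphas, betas)]


betas = [clip01(b) for b in (1.3, 0.2, 0.55)]
alphas = [update_alpha(b, prev_alpha=1) for b in betas]
print(betas, alphas)
print(mask_layer([2.0, 4.0, 6.0], alphas, betas))
```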

Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the neural network system which uses and prunes a model described by Ding to be combined with the neural network pruning system of Yu. One would have been motivated to make such a combination because both Ding and Yu are concerned with slimming and speeding up neural networks. Yu in particular notes “To avoid this performance decrease, node pruning removes DNN redundancy by removing entire nodes instead of weights. It uses mask layers to dynamically find out unimportant nodes and block their outputs.” (Section 3.4) Yu also notes the benefits particularly for parallel hardware “For high parallelism hardware (e.g., GPU), node pruning removes redundant nodes, not redundant weights, thereby reducing computation without sacrificing the dense matrix format” (abstract Yu)

Claim 15
	Claim 15 is rejected for the reasons set forth for claim 7, in connection with claim 9.

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
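As a rough illustration of the reply periods described above, the following sketch computes the two-month, three-month, and six-month dates from a mailing date. The mailing date of Apr 22, 2026 is an assumption taken from the prosecution timeline entry for this final rejection, and `add_months` is a hypothetical helper. The sketch does not adjust for weekends or federal holidays and does not model the advisory-action scenario, so it is not a substitute for docketing the actual deadlines.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the same day-of-month `months` later, clamped to month end."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


# Assumed mailing date (from the timeline entry for this final rejection).
mailing_date = date(2026, 4, 22)

two_month_mark = add_months(mailing_date, 2)        # first-reply window
shortened_period_end = add_months(mailing_date, 3)  # THREE MONTHS
statutory_maximum = add_months(mailing_date, 6)     # SIX MONTHS

print("Two-month mark:      ", two_month_mark)
print("Shortened period end:", shortened_period_end)
print("Six-month maximum:   ", statutory_maximum)
```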
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOHNATHAN R GERMICK whose telephone number is (571)272-8363. The examiner can normally be reached M-F 7:30-4:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki can be reached on 571-272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/J.R.G./
Examiner, Art Unit 2122  

/KAKALI CHAKI/            Supervisory Patent Examiner, Art Unit 2122
Prosecution Timeline

(12 earlier events not shown)
Jun 03, 2025: Final Rejection mailed (§101, §103, §112)
Aug 04, 2025: Response after Non-Final Action
Sep 03, 2025: Request for Continued Examination
Sep 09, 2025: Response after Non-Final Action
Dec 23, 2025: Non-Final Rejection mailed (§101, §103, §112)
Mar 23, 2026: Response Filed
Apr 22, 2026: Final Rejection mailed (§101, §103, §112)
May 27, 2026: Interview Requested
Precedent Cases

Applications granted by this same examiner with similar technology

16/240,514 (Patent 12566962): DITHERED QUANTIZATION OF PARAMETERS DURING TRAINING WITH A MACHINE LEARNING TOOL. 7y 2m to grant; granted Mar 03, 2026.
17/121,871 (Patent 12566983): MACHINE LEARNING CLASSIFIERS PREDICTION CONFIDENCE AND EXPLANATION. 5y 2m to grant; granted Mar 03, 2026.
17/025,845 (Patent 12554977): DEEP NEURAL NETWORK FOR MATCHING ENTITIES IN SEMI-STRUCTURED DATA. 5y 5m to grant; granted Feb 17, 2026.
16/537,752 (Patent 12443829): NEURAL NETWORK PROCESSING METHOD AND APPARATUS BASED ON NESTED BIT REPRESENTATION. 6y 2m to grant; granted Oct 14, 2025.
17/029,290 (Patent 12443868): QUANTUM ERROR MITIGATION USING HARDWARE-FRIENDLY PROBABILISTIC ERROR CORRECTION. 5y 0m to grant; granted Oct 14, 2025.
Study what changed to get past this examiner. Based on 5 most recent grants.
Prosecution Projections

Expected OA Rounds: 7-8
Grant Probability: 46%
With Interview (+27.9%): 74%
Median Time to Grant: 4y 6m (~0m remaining)
PTA Risk: High
Based on 96 resolved cases by this examiner. Grant probability derived from career allowance rate.