DETAILED ACTION
This action is responsive to the Application filed on 04/01/2026. Claims 1-20 are pending in the case. Claims 1, 11 and 18 are independent claims. Claims 1-20 are amended.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 04/01/2026 has been entered.
Response to Arguments
Applicant's arguments filed 04/01/2026 have been fully considered but they are not persuasive.
With respect to prior art:
Applicant principally argues Ren fails to anticipate the amended claims and further notes, “Ren is silent with respect to performing scatter and gather operations in place of a mask operation….Ren mentions a ‘gather kernel’ and ‘scatter kernel’ with respect to an input of the convolution algorithm” but does not describe the amended claims. Applicant specifically noted that Ren is silent with respect to “modifying one or more pruned neural network by replacing a tensor mask with a dense version of the tensor and one or more scatter operations…”
Examiner disagrees.
Ren does not merely mention such kernels. As noted in the Response to Arguments section of the Final Rejection dated 04/01/2026, Ren describes the claimed gather- and scatter-based output generation process.
Ren specifically describes a system which replaces a dense tensor representation of an input with a dense tiled version of the tensor. The tiled tensor is blocked into gather and scatter operations, which generate an output of the modified pruned neural network as claimed.
The rejection has been updated in accordance with the amendments.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claim(s) 1-20 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Ren, “SBNet: Sparse Blocks Network for Fast Inference”.
Claim 1/11/18
Ren teaches, A computer-implemented method comprising (claim 1)…a system comprising one or more processors (claim 11)…one or more processors comprising circuitry (claim 18) (pg 1 Introduction “We implemented our proposed sparse convolution kernels (fragments of parallel code) on graphics processing unit (GPU)”)
Ren teaches, modifying one or more pruned neural networks by at least replacing a tensor and a mask indicative of pruned weights of the one or more pruned neural networks with a dense version of the tensor and one or more scatter operations (pg 3 “The input to our sparse convolution module is a dense binary mask. Just like other standard sparse operations, we first need to extract a list of active location indices, which is named the reduce mask operation. Then, we would like to extract data from the sparse inputs at specified locations and paste the computed results back to the original tensor. To summarize, there are two major building blocks in our approach to sparse block-wise convolution…Reduce mask to indices: converts a binary mask to a list of indices, where each index references the location of the corresponding n-dimensional block in the input tensor and in our current implementation this is a 3-d tuple…For gathering we extract a block from the input tensor…Scatter is the inverse operation.” Here, the conversion/replacement of the binary mask with a list of active indices amounts to replacing the input tensor containing a mask of inactive weights/indices with a dense tile representation, which is further used with scatter operations to extract blocks from the input tensor.) and generating an output of the one or more modified pruned neural networks based, at least in part, on the dense version of the tensor and the one or more scatter operations (pg 1 “In these examples, spatial sparsity can be represented as binary computation masks where ones indicate active locations that need more computation and zeros inactive.” pg 2 “we gather block-wise slices from tensors and maintain the tensor shape instead of lowering them to vectors. Within each active block, we perform a regular dense convolution.” Figure 1 pg 1
[Image: media_image1.png]
As shown and described in the art, a gather operation collects a subset of a feature map, which is the dense version of the tensor. The scatter operation produces the output matrix of the modified pruned network according to the scatter operation and the dense representation.)
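For illustration only, the reduce-mask, gather, dense computation, and scatter sequence quoted above can be sketched as follows. This is a minimal sketch and not code from Ren: every function name, the 2-D single-channel layout, the zero-initialized output, and the value-doubling stand-in for the dense convolution are assumptions made for the example.

```python
# Illustrative sketch only; not code from Ren. All names are hypothetical.

def reduce_mask(mask, bh, bw):
    """Convert a binary mask into start offsets of blocks with any active (1) entry."""
    H, W = len(mask), len(mask[0])
    indices = []
    for r in range(0, H, bh):
        for c in range(0, W, bw):
            if any(mask[i][j]
                   for i in range(r, min(r + bh, H))
                   for j in range(c, min(c + bw, W))):
                indices.append((r, c))
    return indices


def gather(x, indices, bh, bw):
    """Extract the active blocks from x as a dense list of bh x bw blocks."""
    return [[row[c:c + bw] for row in x[r:r + bh]] for (r, c) in indices]


def dense_op(block):
    """Stand-in for the dense convolution inside a block (assumption: doubles values)."""
    return [[2 * v for v in row] for row in block]


def scatter(blocks, indices, H, W):
    """Write processed blocks into a zero-initialized output of the original H x W shape."""
    out = [[0] * W for _ in range(H)]
    for block, (r, c) in zip(blocks, indices):
        for i, row in enumerate(block):
            for j, v in enumerate(row):
                out[r + i][c + j] = v
    return out


def sparse_block_forward(x, mask, bh, bw):
    """Reduce mask to indices, gather, apply the dense op, and scatter back."""
    indices = reduce_mask(mask, bh, bw)
    blocks = [dense_op(b) for b in gather(x, indices, bh, bw)]
    return scatter(blocks, indices, len(x), len(x[0]))
```

In this sketch, only blocks that contain an active mask entry are computed, and the output keeps the height and width of the input.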
Claim 2/12
Ren teaches claim 1/11
Ren teaches, performing the one or more scatter operations to match original dimensionality of one or more representations comprising the one or more pruned weights (Figure 1
[Image: media_image1.png]
Pg 2 “we gather block-wise slices from tensors and maintain the tensor shape instead of lowering them to vectors.” As shown in the figure, scattering remaps the computation to the original dimension. Each of the expanded matrix arrays comprises elements which correspond to the pruned weights.)
Claim 3/16
Ren teaches claim 1/11
Ren teaches, wherein the one or more pruned weights corresponds to a zero value (pg 1 “In these examples, spatial sparsity can be represented as binary computation masks where ones indicate active locations that need more computation and zeros inactive.” The binary masks have zero values for the pruned tensor elements. pg 3 “The resulting non-zero locations are the spatial block locations that we extract the patches from.” The elements used for computation are non-zero values; conversely, zero values are prevented from being used to infer because they are not extracted for computation.)
Claim 4
Ren teaches claim 1
Ren teaches, the pruned weights correspond to one or more functions, wherein the one or more functions comprise at least one of: concatenation function, a matrix multiply function, a convolution function, or a rectifier linear unit (ReLU) function. (pg 1 “In this work, we leverage structured sparsity patterns of computation masks and propose Sparse Blocks Networks (SBNet), which computes convolution on a blockwise decomposition of the mask”)
Claim 5/14/20
Ren teaches claim 1/11/18
Ren teaches, the pruned weights are prevented from being used to infer information by replacing a first tensor that comprises the pruned weights with a second tensor that is generated based, at least in part, on the first tensor. (pg 2 “we gather block-wise slices from tensors and maintain the tensor shape instead of lowering them to vectors” pg 4 “we then slice the blocks out of the 4-d N × H × W × C input tensor using h×w×C slices, where h and w are the blocks’ height and width, and stack the B slices into a new tensor along the batch dimension, yielding a B×h×w×C tensor.” The new tensor is generated from the original tensor and replaces it, thereby preventing certain elements of the original tensor from being used to infer.)
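For illustration only, the slicing and stacking described in the quoted pg 4 passage can be sketched as follows. This is a minimal sketch and not code from Ren: the nested-list layout [N][H][W][C], the (n, y0, x0) form of each block index, and the helper names are assumptions made for the example.

```python
# Illustrative sketch only; not code from Ren. All names are hypothetical.

def gather_blocks(x, block_indices, h, w):
    """Slice h x w x C blocks out of x (shaped [N][H][W][C]) and stack them.

    block_indices is a list of (n, y0, x0) tuples (assumed layout).
    Returns a new nested list shaped [B][h][w][C], where B = len(block_indices).
    """
    return [
        [[list(x[n][y][xx]) for xx in range(x0, x0 + w)]
         for y in range(y0, y0 + h)]
        for (n, y0, x0) in block_indices
    ]


def shape(t):
    """Return the dimensions of a regular nested list as a tuple."""
    dims = []
    while isinstance(t, list):
        dims.append(len(t))
        t = t[0]
    return tuple(dims)


# Example: a 1 x 4 x 4 x 2 input with two active 2 x 2 blocks.
x = [[[[10 * y + xx, -(10 * y + xx)] for xx in range(4)] for y in range(4)]]
stacked = gather_blocks(x, [(0, 0, 0), (0, 2, 2)], h=2, w=2)
```

In this sketch, the stacked tensor is a separate object built from the input, and elements outside the listed blocks do not appear in it.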
Claim 6
Ren teaches claim 1
Ren teaches, the pruned weights correspond to a masked portion of the one or more pruned neural networks (pg 1 “we propose to use the masks to guide the convolutional filters. Computation masks can also be considered as a form of attention mechanism where the attention weights are binary.” Figure 1
[Image: media_image1.png]
The mask selectively attends to a portion of the tensor which is not deactivated, as shown in the figure.)
Claim 7
Ren teaches claim 1
Ren teaches, the pruned weights are identified based, at least in part, on whether the pruned weights contribute to generation of an output tensor. (pg 3 “We observe that many input sources have structured sparsity that meshes well with block sparsity - background pixels are likely to be surrounded by other background pixels. It stands to reason that computations for entire spatial clumps or “blocks” of activations can be skipped.” Pg 6 “Using the same activation size of our detector network, we test the speedup on three types of masks: … Predicted masks obtained from the outputs of PSPNet.” The masks which identify the deactivated weights are based on skipping background pixels, which have minimal impact on the activation of the block. Further, the masks are obtained based on the generation of the output tensor from PSPNet.)
Claim 8
Ren teaches claim 1
Ren teaches, determining whether to perform a scatter operation on a first node that is associated with one of the pruned weights based, at least in part on a function associated with a second node that is subsequent to the first node. (Figure 1 and pg 3 “For gathering, we extract a block from the input tensor, given the start location and the size of the n-d block. Scatter is the inverse operation where we update the output tensor using previously gathered and transformed data.” As shown in Figure 1, each scatter operation, associated with the mask or deactivated weights, is based on subsequent convolutions and gather operations.)
Claim 9/13/19
Ren teaches claim 1/11/18
Ren teaches, combining the one or more scatter operations with one or more additional scatter operations associated with one or more layers of the one or more modified pruned neural networks. (claim 9)…cause coalescing of the one or more scatter operations with an additional scatter operation associated with a second layer of the one or more pruned neural networks that resides after to a first layer of the one or more pruned neural networks in a sequence of layers of the one or more modified pruned neural networks (claim 13)…combine one or more scatter operations with additional scatter operations to align dimensionality of a first portion of the one or more pruned neural networks to dimensionality of a second portion of the one or more modified pruned neural networks. (claim 19) (Section 3 “For speed-up purposes, the same sparsity mask is reused for every layer in our experiments, but it can also be computed from a different source per layer…” pg 3 “In this section, we first go over details of the above two building blocks, and then we introduce a sparse blocks residual unit which groups several layers of computation into sparse blocks” Figure 1
[Image: media_image2.png]
Multiple scatter operations are shown in combination for the convolution module of the neural network over several layers of computation. The dimensionality is aligned by maintaining the dimension of the pre-gathered tensor.)
Claim 10/15
Ren teaches claim 2/11
Ren teaches, propagating one or more scatter operations to one or more portions of the one or more modified pruned neural network to match an original dimensionality of one or more layers of the one or more pruned neural networks (claim 10)…propagate the one or more scatter operations to an match original dimensionality of one or more layers of the one or more modified pruned neural network that comprise the pruned weights. (claim 15) (Figure 1
[Image: media_image2.png]
Multiple scatter operations are shown in combination for the convolution module of the neural network over several layers of computation. The dimensionality is aligned by maintaining the dimension of the pre-gathered tensor, or the original dimensionality.)
Claim 17
Ren teaches claim 16
Ren teaches, combine two or more scatter operations to be performed in response to one or more gather operations (Figure 1
[Image: media_image2.png]
Multiple scatter operations are shown in combination for the convolution module of the neural network over several layers of computation. Pg 3 “For gathering, we extract a block from the input tensor, given the start location and the size of the n-d block. Scatter is the inverse operation where we update the output tensor using previously gathered and transformed data.” Scattering is the inverse operation, performed to undo a gather operation after a convolution operation and to re-map back to the original dense, full-dimensionality representation.)
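For illustration only, the inverse relationship between gathering and scattering noted above can be sketched as follows. This is a minimal sketch and not code from Ren: the function names, the 2-D layout, and the omission of any transformation between gather and scatter are assumptions made for the example.

```python
# Illustrative sketch only; not code from Ren. All names are hypothetical.

def gather_block(x, r, c, bh, bw):
    """Extract the bh x bw block of x whose top-left corner is (r, c)."""
    return [row[c:c + bw] for row in x[r:r + bh]]


def scatter_block(out, block, r, c):
    """Write block into out at (r, c); out keeps its original dimensions."""
    for i, row in enumerate(block):
        out[r + i][c:c + len(row)] = row
    return out


x = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]

# Gathering every 2 x 2 block and scattering it back reproduces x exactly.
restored = [[0] * 4 for _ in range(4)]
for r in (0, 2):
    for c in (0, 2):
        scatter_block(restored, gather_block(x, r, c, 2, 2), r, c)

# Scattering a single block still yields a 4 x 4 output, with zeros elsewhere.
partial = scatter_block([[0] * 4 for _ in range(4)], gather_block(x, 0, 2, 2, 2), 0, 2)
```

In this sketch, the scattered output always has the dimensions of the pre-gathered tensor, regardless of how many blocks were gathered.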
Conclusion
Prior art not relied upon:
Anwar et al., “Structured Pruning of Deep Convolutional Neural Networks”, describes pruning via removal of weight connections in a neural network.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOHNATHAN R GERMICK whose telephone number is (571)272-8363. The examiner can normally be reached M-F 9:30-4:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki, can be reached on 571-272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/J.R.G./
Examiner, Art Unit 2122
/KAKALI CHAKI/ Supervisory Patent Examiner, Art Unit 2122