DETAILED ACTION
This action is responsive to the Amendments filed on 10/7/2025. Claims 1, 3-9, 11-15 and 18-19 are pending in the case. Claims 1 and 9 are independent claims. Claims 1, 6, 9 and 14 are amended.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant's arguments filed 10/7/2025 have been fully considered but they are not persuasive.
With respect to the 112 rejections:
For claims 18 and 19:
Applicant argues that the rejection should be removed because "ranking neurons and removing the low ranking neurons is low rank approximation" and further cites paragraph 0031 for support.
Examiner disagrees. The cited section describes that finding the “best rank” involves pruning, quantizing and a low rank approximation method. Further, neurons could be ranked using the claimed different techniques. Nothing in the cited section suggests that the described ranking uses low rank approximation. Low rank approximation is a term of art that specifically describes approximation of a matrix with another “low rank” matrix; here, rank is not a measure of the importance or value of neurons but the “rank” of a matrix, which is the number of linearly independent rows/columns of the matrix. One of ordinary skill in the art would not understand “ranking neurons and removing the low-ranking neurons” to be low rank approximation. Nevertheless, Examiner points out that if the equivalence supposed by Applicant were accurate, the claims could be amended to replace “low rank approximation” with “ranking neurons and removing the low ranking neurons”.
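By way of illustration only, and not as a finding regarding Applicant's disclosure, the term-of-art meaning of low rank approximation can be expressed as approximating a weight matrix by a product of smaller factors of lower matrix rank:
W \approx \widetilde{W} = U V^{\top}, \qquad W \in \mathbb{R}^{m \times n}, \; U \in \mathbb{R}^{m \times k}, \; V \in \mathbb{R}^{n \times k}, \; k < \operatorname{rank}(W) \le \min(m, n),
where \widetilde{W} is conventionally obtained from a truncated singular value decomposition minimizing \lVert W - \widetilde{W} \rVert_F. The rank k is the number of linearly independent rows/columns retained, not an ordering of neurons by importance.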
For claims 6 and 14:
The rejection is removed in light of the amendments.
For claims 1, 3-9, 11-15 and 18-19:
Applicant argues that replacing the term “acceptable” with “simplified” in the independent claims resolves the relative-term rejection. Applicant further argues that complexity depends on the size of the device and therefore has a definite meaning, citing paragraphs 16 and 24 for support.
Examiner disagrees. Neither the term “simplified” nor the term “acceptable” has a definite meaning. It is not clear to Examiner how or why “simplified” necessarily depends on any particular device; this is not supported by the claims or the specification. Any and all models are simpler in complexity, in some domain, than every other model; neither the claims nor the specification defines a means of measuring what constitutes simplified complexity, and therefore it is impossible to determine the bounds of the claim. The claims could be amended to state, “wherein the reconfigured model has at least one logic circuit removed from a pre-trained DNN model”, which appears to be the intended meaning in paragraph 0024.
With respect to the 101 rejections:
Applicant seems to make a broad holistic argument that the claims amount to more than an abstract idea.
Examiner disagrees. The MPEP sets forth a framework, via a flowchart, for making this determination. The fact that the abstract ideas recited in the claims are in some way associated with vague attributes of computer components such as "size" is not a supported rationale for eligibility. As an example, the human mind is capable of evaluating constraints and making decisions based on them, even those grounded in a description of the size of an electronic device.
To demonstrate an improvement to the functioning of the claimed device, the MPEP clearly specifies that the improvement should be reflected in the "additional elements". The improvement is not so reflected, for the reasons identified in the rejection and further described in MPEP 2106.05(f)-(h); the rejection is maintained.
With respect to the 103 rejections:
Applicant argues the claims describe combining unstructured and structured pruning, and that these terms are well known to a person having ordinary skill in the art. Applicant argues that Han relates to unstructured pruning, further noting that structured pruning amounts to removing entire blocks of weights within given weight matrices, meaning the pruned model can be run using the same hardware and software.
Examiner notes that, based on Applicant's description, it is clear that the cited art Hu relates to removing individual weights and does not specify removing entire channels or filters. Nevertheless, the claims do not require removing entire channels and filters; rather, the claims describe determining a best rank of filters according to sparsity and compression/pruning of a received model. Hu is relied upon for the compression and pruning as well as the ranking and sparsity analysis claimed, while Denton is relied upon for applying low rank approximation as claimed. The claims do not recite such features as “unstructured” or “structured” pruning, nor the removal of entire blocks of weights within given weight matrices. Further, neither the claims nor the specification uses the words “blocks”, “unstructured”, or “structured”, or identifies which functional steps result in the intended use of running on the same hardware and software.
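For context only, and without importing these terms into the claims, the distinction Applicant draws between unstructured and structured pruning can be illustrated by the following sketch; the array sizes and threshold are hypothetical and are not drawn from the cited art or the specification.

import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))   # hypothetical weight matrix of one layer: 8 filters x 16 inputs

# "Unstructured" pruning: zero individual weights below a magnitude threshold;
# the matrix keeps its shape but becomes sparse.
unstructured = np.where(np.abs(W) > 0.5, W, 0.0)

# "Structured" pruning: drop entire rows (filters) with small L1 norm;
# the surviving matrix is simply smaller and runs on unmodified dense kernels.
keep = np.abs(W).sum(axis=1) > np.median(np.abs(W).sum(axis=1))
structured = W[keep]

print(unstructured.shape, structured.shape)  # (8, 16) with zeros vs. fewer rows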
Applicant further notes that Denton does not perform sparsity analysis because structured sparsity removes entire blocks or groups of neurons, and that low rank approximation removes redundancy by approximating full-rank filters, while structured sparsity removes entire groups rather than individual elements.
Examiner notes Denton is relied upon for performing low-rank approximation (LRA) according to the analysis results, as claimed. Applicant appears to argue that low-rank approximation is not structured sparsity because it does not remove entire groups of elements. However, removing entire groups of elements is not claimed; rather, applying low-rank approximation is claimed, which is taught by Denton.
Examiner notes the rejection is made based on a combination of references. It is not clear which limitations Applicant contends are not taught, because Applicant only broadly describes concepts which are distinct in the art, such as structured sparsity and removal of blocks, neither of which is claimed.
For these reasons the rejection is maintained.
Applicant further traverses the remaining rejections of the dependent claims and the similar independent claim without additional argument.
Examiner disagrees for the reasons provided above.
Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.
The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.
Claims 18 and 19 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claims contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
Claims 18 and 19 recite “the low-rank approximation is according to L1/L2 mean…mean activations…a number of times not being zero”. Examiner does not find evidence of support in the original disclosure. Applicant points to paragraph 0031 of the specification of the instant application for support of such amendments. The paragraph describes various steps “in order to find the best rank of filters” and states “the ranking can be done according to the L1/L2 mean…”. This does not suggest that “low rank approximation is according to [the] mean”, but rather that ranking generally includes these steps. Therefore, Examiner does not find evidence to suggest that low rank approximation as claimed is according to the claimed features, but rather that ranking of filters generally is according to L1/L2 mean, mean activations, or a number of times of not being zero.
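By way of illustration only of what the Examiner understands the ranking described in paragraph 0031 to entail (a hypothetical sketch; the shapes and names are not taken from the disclosure), ranking of filters by L1/L2 mean, mean activations, or a count of non-zero activations involves no approximation of a matrix by a lower-rank matrix:

import numpy as np

rng = np.random.default_rng(1)
weights = rng.normal(size=(32, 64))                       # hypothetical: 32 filters, 64 weights each
activations = np.maximum(rng.normal(size=(1000, 32)), 0)  # hypothetical post-ReLU outputs on a validation set

l1_score = np.abs(weights).mean(axis=1)             # L1 mean of each filter's weights
l2_score = np.sqrt((weights ** 2).mean(axis=1))     # L2 mean of each filter's weights
act_score = activations.mean(axis=0)                # mean activation per filter
nonzero_count = (activations != 0).sum(axis=0)      # number of times each filter is not zero

order = np.argsort(l1_score)                        # low-contribution filters first; pruning removes them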
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1, 3-9, 11-15 and 18-19 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
The term “simplified computational complexity” in claims 1 and 9 is a relative term which renders the claims indefinite. The term is not defined by the claims, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. At best, such a limitation is not given patentable weight, as it amounts to non-functional language which does not affect the structure or function of the invention.
Dependent claims 3-8, 11-15 and 18-19 are rejected for the same reason.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1, 3-9, 11-15 and 18-19 are rejected under 35 U.S.C. 101 because the claims are directed to an abstract idea without significantly more.
Regarding Claim 1,
Step 1 Analysis
Claim 1 is directed to a self-tuning model compression methodology, which is a process, one of the statutory categories.
Step 2A Prong One Analysis:
Claim 1 recites the abstract ideas in the following limitations:
compressing the pre-trained DNN model into a reconfigured model according to the data set
analyzing sparsity of the pre-trained DNN model to generate analysis results for determining redundancies of parameters and feature maps of the pre-trained DNN model
ranking a plurality of neurons in the pre-trained DNN model according to the contribution of their parameters,
performing pruning and quantizing of a network redundancy of the pre-trained DNN model
applying a low-rank approximation method to said at least one hidden layer and the output layer of the DNN model according to the analysis results comprising ranking the neurons according to their respective contribution in order to determine a best rank of filters for representing the DNN model;
wherein the compression parameter for the pruning, quantizing and low-rank approximation is determined according to the analysis results in order to compress the pre-trained DNN model into a reconfigured model having the simplified computational complexity.
Each of the above steps describes a mental process regarding the evaluation of abstract data and may be performed in the human mind, and thus falls under the mental processes grouping of abstract ideas.
Thus, the claim falls within the judicial exception of an abstract idea and requires further analysis under Step 2A Prong Two.
Step 2A Prong Two Analysis:
Claim 1 recites the following additional elements along with the abstract ideas:
an electronic device
executed by a processor
receiving a pre-trained DNN model and a data set
the processor receiving a compression parameter
wherein the pre-trained DNN model comprises an input layer, at least one hidden layer and an output layer, and said at least one hidden layer and the output layer of the pre-trained DNN model comprise a plurality of neurons
wherein the reconfigured model comprises an input layer, at least one hidden layer and an output layer, and said at least one hidden layer and the output layer of the reconfigured model comprise a plurality of neurons, and a size of the reconfigured model is smaller than a size of the pre-trained DNN model and has a simplified computational complexity for being executed on the electronic device
the processor executing the reconfigured model on a user terminal of the electronic device for an end-user application
The additional elements related to execution on the electronic device are recited at a high level of generality and amount to no more than a recitation of the words "apply it" (or an equivalent), or no more than mere instructions to implement an abstract idea or other exception on a computer (MPEP 2106.05(f)).
The additional elements which describe receiving information (a DNN model, a data set, a compression parameter) amount to adding insignificant extra-solution activity to the judicial exception (MPEP 2106.05(g)).
The wherein clauses are additional elements which are recited at a high level of generality and generally link the use of the judicial exception to a particular technological environment or field of use (MPEP 2106.05(h)).
Claim 1 does not integrate the abstract idea into a practical application.
Step 2B Analysis:
The additional element of executing is recited at a high level of generality, generically recites an effect of the judicial exception or claims every mode of accomplishing that effect, and thus amounts to a claim that is merely adding the words "apply it" to the judicial exception (MPEP 2106.05(f)).
The additional elements in the wherein clauses are recited at a high level of generality and generally link the use of the judicial exception to a particular technological environment or field of use (MPEP 2106.05(h)).
The step of receiving is well-understood, routine, conventional activity recognized in MPEP 2106.05(d)(i) - receiving or transmitting data over a network.
Claim 1 does not recite an inventive concept.
Regarding Claim 3,
Step 2A Prong One Analysis:
Claim 3 does not recite additional limitations that are directed to an abstract idea.
Step 2A Prong Two and 2B Analysis:
Claim 3 recites the following additional elements:
wherein a number of the plurality of neurons of the reconfigured model is less than a number of the plurality of neurons of the DNN model
These additional elements amount to generally linking the use of the judicial exception to a particular technological environment or field of use (MPEP 2106.05(h)).
The claim therefore does not integrate the abstract idea into a practical application, nor provide significantly more.
Regarding Claim 4,
Step 2A Prong One Analysis:
The abstract idea limitations of claim 1 are incorporated. Claim 4 recites the following additional abstract idea limitations:
compressing the DNN model into a reconfigured model comprises: removing a portion of the logic circuits in the DNN model according to the analysis result so that a number of logic circuits in the reconfigured model is less than a number of logic circuits in the DNN model
The step of compressing the DNN model involves removing a portion of logic circuits. Removing logic circuits amounts to a mental decision to ignore the elements of certain neurons of the model, which is a decision that can be made practically in the human mind and thus falls under the mental processes grouping of abstract ideas.
The claim recites an abstract idea.
Step 2A Prong Two and 2B Analysis:
Claim 4 recites the following additional elements:
wherein each of the plurality of neurons of the reconfigured model corresponds to at least one logic circuit comprising at least one of a multiplexer and an adder, each of the plurality of neurons of the DNN model corresponds to at least one logic circuit comprising at least one of a multiplexer and an adder
These additional elements amount to generally linking the use of the judicial exception to a particular technological environment or field of use (MPEP 2106.05(h)).
The claim therefore does not integrate the abstract idea into a practical application, nor provide significantly more.
Regarding Claim 5,
Step 2A Prong One Analysis:
Claim 5 does not recite any additional abstract ideas beyond those recited in a parent claim.
Step 2A Prong Two and 2B Analysis:
Claim 5 recites the following additional elements along with the abstract ideas:
retraining the reconfigured model with the data set
The additional element of retraining is recited at a high level of generality and amounts to no more than a recitation of the words "apply it" (or an equivalent), or no more than mere instructions to implement an abstract idea or other exception on a computer (MPEP 2106.05(f)).
The claim therefore does not integrate the abstract idea into a practical application, nor provide significantly more.
Regarding Claims 6-8,
Step 2A Prong One Analysis:
Claims 6-8 do not recite additional abstract ideas beyond those recited in the parent claim.
Step 2A Prong Two and 2B Analysis:
Claims 6-8 recite the following additional elements along with the abstract ideas:
wherein the DNN model is used for computer vision targeted application models selected from the group consisting of AlexNet, a VGG16, a ResNet, and a MobileNet
wherein each of said at least one hidden layer and the output layer of the reconfigured model is a convolutional layer or a fully-connected layer
wherein the end-user application is a visual recognition application or a speech recognition application
The additional elements generally link the use of the judicial exception to a particular technological environment or field of use (MPEP 2106.05(h)).
The claims therefore do not integrate the abstract idea into a practical application, nor provide significantly more.
Regarding Claim 9,
Claim 9 is directed to an electronic device, which is one of the statutory categories.
Claim 9 is the corresponding electronic device claim of Claim 1. The recited limitations of a storage device, program code and a processor are recited at a high level of generality, which amounts to no more than a recitation of the words "apply it" (or an equivalent), or no more than mere instructions to implement an abstract idea or other exception on a computer (MPEP 2106.05(f)). Claim 9 is rejected for the same reasons as Claim 1.
Regarding Claims 11-15,
Claims 11-15 are the corresponding electronic device claims of Claims 3-7. Claims 11-15 are rejected for the same reasons as Claims 3-7 in view of the rejection of claim 1.
Regarding Claim 18
The claim is directed to the same category identified in the parent claim. The dependent claim recites the following additional limitations which recite abstract ideas:
“wherein the low-rank approximation is according to L1/L2 mean of neuron weights, mean activations, or a number of times of not being zero on a validation set of the pre-trained DNN model.”
Under Step 2A Prong 1, these limitations correspond to a mental evaluation. These limitations describe a decision about approximation made based on abstract information. Such an evaluation can be performed in the human mind.
Therefore the claim recites an abstract idea.
Furthermore, under Step 2A Prong 2 and Step 2B, the claim does not recite additional elements to consider other than those considered in the independent claim. Accordingly, the recited additional elements do not integrate the abstract idea into a practical application, nor do they amount to significantly more than the judicial exception, because they do not impose any meaningful limits on practicing the abstract idea.
Regarding Claim 19
The claim is directed to the same category identified in the parent claim. The dependent claim recites the following additional limitations which recite abstract ideas:
“wherein the low-rank approximation is according to L1/L2 mean of neuron weights, mean activations, or a number of times of not being zero on a validation set of the pre-trained DNN model.”
Under Step 2A Prong 1, these limitations correspond to a mental evaluation. These limitations describe a decision about approximation made based on abstract information. Such an evaluation can be performed in the human mind.
Therefore, the claim recites an abstract idea.
Furthermore, under Step 2A Prong 2 and Step 2B, the claim does not recite additional elements to consider other than those considered in the independent claim. Accordingly, the recited additional elements do not integrate the abstract idea into a practical application, nor do they amount to significantly more than the judicial exception, because they do not impose any meaningful limits on practicing the abstract idea.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 3, 5-9, 11, 13-15 and 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over Hu (“Network Trimming: A Data-Driven Neuron Pruning”), in view of Han (“Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding”) and Denton (“Exploiting Linear Structure within Convolutional Networks for Efficient Evaluation”).
Regarding Claim 1,
Hu discloses a self-tuning model compression methodology executed by a processor for reconfiguring a Deep Neural Network (DNN) on an electronic device, comprising the processor, (Hu, sec. 4 “implement our algorithm using the standard Caffe library (program code)” Section 4.2.2 “After trimming the CONV5-3 and FC6 layers, we continue to trim their neighboring layers. We experimented with three sets of trimming layouts:” the programming code is executed on computing hardware, which is an electronic device, and trimming amounts to reconfiguring a deep network with multiple layers), the method comprising: receiving a pre-trained DNN model and a data set (Hu Section 3.2 pg 3 “Our network trimming method consists of three main steps, as illustrated in Figure 3. First the network is trained under conventional process and the number of neurons in each layer is set empirically… Next, we run the network on a large validation dataset” the network is first trained before pruning.) wherein the pre-trained DNN model comprises an input layer, at least one hidden layer and an output layer, and said at least one hidden layer and the output layer of the pre-trained DNN model comprise a plurality of neurons (pg 4 Figure 4, before pruning, it is clear the DNN is composed of 3 layers and neurons, thus corresponding to the claims requiring at least 3 named layers.) the processor receiving a compression parameter (pg 3-4 Section 3.1 “We define Average Percentage of Zeros (APoZ) to measure the percentage of zero activations of a neuron after the ReLU mapping…
[APoZ equation of Hu, reproduced in the record as media_image1.png]
” this equation is used for compressing the claimed model via pruning. These parameters are received by the processor for performing the compression described below.) for compressing the pre-trained DNN model into a reconfigured model according to the data set, wherein the reconfigured model comprises an input layer, at least one hidden layer and an output layer, and said at least one hidden layer and the output layer of the reconfigured model comprise a plurality of neurons, and a size of the reconfigured model is smaller than a size of the pre-trained DNN model and has a simplified computational complexity for being executed on the electronic device… wherein the compression parameter for the pruning, … is determined according to the analysis results in order to compress the pre-trained DNN model into a reconfigured model having the simplified computational complexity (Hu, Figure 5, after pruning at least one neuron is removed, thus the size is smaller. See the 112 rejection regarding “simplified complexity”. A model which can be and is executed on a device, as in Hu, has a simplified complexity. The parameters for compression are determined based on analysis and compress the model) wherein the step of compressing the pre-trained DNN model into a reconfigured model according to the data set comprises …analyzing a sparsity of the pre-trained DNN model to generate analysis results for determining redundancies of parameters and feature maps of the pre-trained DNN model; performing pruning…of a network redundancy of the pre-trained DNN model according to the analysis results….comprising ranking the neurons according to their respective contribution order… (Hu, 3.1 – 3.2, “Average Percentage of Zeros (APoZ) to measure the percentage of zero activations of a neuron … we use the definition of APoZ to evaluate the importance of each neuron … starting to trim from a few layers with high mean APoZ (single layer sparsity analysis on each layer) and then progressively trim its neighboring layers (multi-layer sparsity analysis)”; abs. “pruning unimportant neurons based on analysis” measuring the percentage of zeros is a sparsity analysis of the model as claimed… the trimming of unimportant neurons amounts to determining redundancies of parameters and feature maps, as the determination of important vs. unimportant neurons amounts to ranking neurons as either important or not according to their contribution) and the processor executing the reconfigured model on a user terminal for an end-user application (Hu, sec. 3.2, “the image classification task (end-user application)”; the application is executed on a computer environment (user terminal))
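For clarity of the record only, the Examiner's reading of Hu's APoZ measure as a sparsity analysis may be illustrated by the following simplified sketch; the dimensions, names and trimming budget are hypothetical and the sketch is not a reproduction of Hu's implementation.

import numpy as np

# Hypothetical post-ReLU activations of one layer over a validation set:
# N validation examples, M output positions per neuron, C neurons.
N, M, C = 500, 49, 128
acts = np.maximum(np.random.default_rng(3).normal(size=(N, M, C)), 0)

# APoZ-style measure: fraction of zero activations per neuron over the validation data.
apoz = (acts == 0).mean(axis=(0, 1))

# Neurons with a high percentage of zeros are treated as redundant and trimmed.
redundant = np.argsort(apoz)[::-1][: int(0.3 * C)]   # hypothetical 30% trimming budget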
Hu does not explicitly teach, and performing quantizing of a network redundancy of the pre-trained model according to the analysis result… and applying a low-rank approximation method to said at least one hidden layer and the output layer of the DNN model according to the analysis results to determine a best rank of filters for representing the DNN model; [the compression parameter] for… quantization and low rank approximation
Han, when addressing combining quantization and pruning to reduce the size of a pruned neural network, teaches, and performing quantizing of a network redundancy of the pre-trained model according to the analysis result [the compression parameter] for… quantization (fig. 1, “compression pipeline: pruning, quantization” as shown in the figure, quantization via codebooks is performed after pruning, which is in part an analysis result. When combined with Hu, the compression parameter is in part for quantization)
Hu and Han both teach neural network compression techniques and are analogous art. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Hu’s teaching of neuron pruning with Han’s teaching of neural network compression with pruning and quantization to achieve the claimed teaching. One of ordinary skill in the art would have been motivated to make this modification because “pruning and trained quantization are able to compress the network without interfering each other, thus lead to surprisingly high compression rate” (Han, sec. 1).
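For illustration of the kind of codebook quantization relied upon from Han (a simplified sketch; the codebook size and weight vector are hypothetical and this is not Han's exact procedure):

import numpy as np

rng = np.random.default_rng(4)
w = rng.normal(size=4096)                  # hypothetical flattened layer weights after pruning

k = 16                                     # hypothetical codebook size (4-bit indices)
centroids = np.linspace(w.min(), w.max(), k)
for _ in range(10):                        # a few iterations of 1-D k-means to learn shared weights
    idx = np.abs(w[:, None] - centroids[None, :]).argmin(axis=1)
    for j in range(k):
        if np.any(idx == j):
            centroids[j] = w[idx == j].mean()

quantized = centroids[idx]                 # each weight stored as an index into the shared codebook
print(np.unique(quantized).size)           # at most k distinct weight values remain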
Hu/Han do not explicitly teach, and applying a low-rank approximation method to said at least one hidden layer and the output layer of the DNN model according to the analysis results to determine a best rank of filters for representing the DNN model; [the compression parameter] for… quantization and low rank approximation
Denton, when addressing low-rank approximation of neural network layers, teaches, and applying a low-rank approximation method to said at least one hidden layer and the output layer of the DNN model according to the analysis results to determine a best rank of filters for representing the DNN model; [the compression parameter] for… low rank approximation (Denton, sec. 1, “start by compressing each convolutional layer by finding an approximate low-rank approximation … reduction of parameters in fully connected layer”; i.e., the parameters that are used to represent the network are selected (filtered) as the best-ranked representation of the network. When combined with Hu/Han, the compression parameter is for low rank approximation)
Hu/Han and Denton both teach techniques to reduce redundancy in neural networks and are analogous art. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the Hu/Han teaching of neuron pruning with Denton’s teaching of low-rank analysis to approximate the network output to achieve the claimed teaching. One of ordinary skill in the art would have been motivated to make this modification to speed up processing while keeping high accuracy (“We exploit the redundancy present within the convolutional filters to derive approximations that significantly reduce the required computation” Denton, abs.).
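For illustration of the type of low-rank approximation relied upon from Denton (a generic truncated-SVD sketch applied to a fully-connected weight matrix; the rank and sizes are hypothetical and the sketch is not Denton's exact tensor decomposition):

import numpy as np

rng = np.random.default_rng(5)
W = rng.normal(size=(256, 512))            # hypothetical fully-connected layer weights

U, s, Vt = np.linalg.svd(W, full_matrices=False)
k = 32                                     # hypothetical target rank chosen from the analysis results
W_approx = (U[:, :k] * s[:k]) @ Vt[:k, :]  # best rank-k approximation in the Frobenius norm

# The layer can then be evaluated as two smaller matrix products, reducing the
# parameter count from 256*512 to (256 + 512)*32.
print(np.linalg.matrix_rank(W_approx))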
Regarding Claim 3,
Hu/Han/Denton teach claim 1
Hu teaches, wherein a number of the plurality of neurons of the reconfigured model is less than a number of the plurality of neurons of the DNN model (Hu, fig. 4 & fig. 5, the model after pruning has fewer neurons than the model before pruning).
Regarding Claim 5,
Hu/Han/Denton teach claim 1
Hu teaches, retraining the reconfigured model with the data set (Hu, pg 2 Section 1 “Then those weak neurons are pruned while others are kept to initialize a new model. Finally, the new model is retrained or fine-tuned depending on the performance drop”).
Regarding Claim 6
Hu/Han/Denton teach claim 1
Hu further teaches, wherein the DNN model is used for computer vision targeted application models selected from the group consisting of AlexNet, a VGG16, a ResNet, and a MobileNet, (Hu, sec. 5.2, “VGG-16”; Section 6 pg 8 “We experimented our algorithm on LeNet and VGG-16 achieving the same accuracy with 2∼3× less parameters. In VGG-16, the trimmed models can even surpass the original one” VGG-16 is a computer vision model, and AlexNet, ResNet and MobileNet are all computer vision models involving images)
Regarding Claim 7
Hu/Han/Denton teach claim 1
Hu teaches, wherein each of said at least one hidden layer and the output layer of the reconfigured model is a convolutional layer or a fully-connected layer (Hu, sec. 4.1, “LeNet network consists of two convolutional layers followed by two fully connected layers”).
Regarding Claim 8
Hu/Han/Denton teach claim 1
Hu teaches, wherein the end-user application is a visual recognition application or a speech recognition application (Hu, sec. 3.2, “image classification task”).
Regarding Claim 9,
Hu teaches, an electronic device, comprising: a storage device, arranged to store a program code; and a processor, arranged to execute the program code; wherein when loaded and executed by the processor, the program code instructs the processor to execute the following steps (Hu, sec. 4 “implement our algorithm using the standard Caffe library (program code)”; the programming code is executed on computing hardware with a memory and a processor).
The remaining limitations of Claim 9 are rejected for the same reasons as set forth in the rejection of Claim 1 in view of Hu/Han/Denton.
Regarding Claims 11, 13, 14 and 15
Claims 11, 13, 14 and 15 recite limitations which are rejected for the same reasons as Claims 3, 5, 6 and 7, respectively, in connection with claim 9.
Regarding Claim 18
Hu/Han/Denton teach claim 1
Further, Hu teaches, wherein the low-rank approximation is according to L1/L2 mean of neuron weights, mean activations, or a number of times of not being zero on a validation set of the pre-trained DNN model (Hu, 3.1 – 3.2, “Average Percentage of Zeros (APoZ) to measure the percentage of zero activations of a neuron … we use the definition of APoZ to evaluate the importance of each neuron … starting to trim from a few layers with high mean APoZ (single layer sparsity analysis on each layer) and then progressively trim its neighboring layers (multi-layer sparsity analysis)”; abs. “pruning unimportant neurons based on analysis” measuring the percentage of zeros necessarily involves a number of times the neuron activation is not zero on a validation set. To the extent that low rank approximation includes these steps in order to rank, as addressed in the 112 rejections above, Hu’s discussion of selecting via importance of rank according to a number of times not being zero corresponds to the claim.)
Regarding Claim 19
Hu/Han/Denton teach claim 9
Further, Hu teaches, wherein the low-rank approximation is according to L1/L2 mean of neuron weights, mean activations, or a number of times of not being zero on a validation set of the pre-trained DNN model (Hu, 3.1 – 3.2, “Average Percentage of Zeros (APoZ) to measure the percentage of zero activations of a neuron … we use the definition of APoZ to evaluate the importance of each neuron … starting to trim from a few layers with high mean APoZ (single layer sparsity analysis on each layer) and then progressively trim its neighboring layers (multi-layer sparsity analysis)”; abs. “pruning unimportant neurons based on analysis” measuring the percentage of zeros necessarily involves a number of times the neuron activation is not zero on a validation set. To the extent that low rank approximation includes these steps in order to rank, as addressed in the 112 rejections above, Hu’s discussion of selecting via importance of rank according to a number of times not being zero corresponds to the claim.)
Claims 4 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Hu/Han/Denton, further in view of Saffar (“A Neural Network Architecture Using High Resolution Multiplying Digital to Analog Converters”).
Regarding Claim 4,
Hu/Han/Denton teach claim 1
Hu does not teach, wherein each of the plurality of neurons of the reconfigured model corresponds to at least one logic circuit comprising at least one of a multiplexer and an adder, each of the plurality of neurons of the DNN model corresponds to at least one logic circuit comprising at least one of a multiplexer and an adder, and compressing the DNN model into a reconfigured model comprises: removing a portion of the logic circuits in the DNN model according to the analysis result so that a number of logic circuits in the reconfigured model is less than a number of logic circuits in the DNN model.
Saffar, which explicitly discloses a logic circuit, multiplexer and adder hardware realization of a neural network, teaches, wherein each of the plurality of neurons of the reconfigured model corresponds to at least one logic circuit comprising at least one of a multiplexer and an adder, each of the plurality of neurons of the DNN model corresponds to at least one logic circuit comprising at least one of a multiplexer and an adder, (Saffar Section 2 “Fig. 1 shows the block diagram of one layer of the proposed network… Figure 1, where m and n represent the number of corresponding neuron and inputs to each layer, respectively….The number of MDACs in each layer is equal to the number of inputs to that layer… The adder block is simply a current-mode summation node…There is one time-multiplexer per each layer, which acts as the selection signal controlling inputs to the MDAC block…” the network consists of a plurality of neurons and layers composed of multiplexers and adders.)
Hu/Han/Denton, when combined with Saffar, teach, removing a portion of the logic circuits in the DNN model according to the analysis result so that a number of logic circuits in the reconfigured model is less than a number of logic circuits in the DNN model, (Hu, fig. 4 & fig. 5, the model after pruning has fewer neurons than the model before pruning. Figures 1-3 of Saffar depict many circuit components related to the neural network; when a neuron is removed, the number of circuit components necessarily is reduced.)
Hu/Han/Denton and Saffar both disclose neural network acceleration applications and are analogous art. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the Hu/Han/Denton teaching of a neural network compression method with Saffar’s teaching of an acceleration apparatus to achieve the claimed teaching. One of ordinary skill in the art would have been motivated to make this modification in order to reduce circuit size (Saffar, Conclusion, “can significantly reduce the circuit size and makes the proposed method feasible to be implemented in mixed-signal neural network”).
Regarding Claim 12
Claim 12 is rejected for the reasons set forth in the rejection of claim 4, in connection with parent claim 9.
Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOHNATHAN R GERMICK whose telephone number is (571)272-8363. The examiner can normally be reached M-F 7:30-4:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki can be reached on 571-272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/J.R.G./
Examiner, Art Unit 2122
/KAKALI CHAKI/Supervisory Patent Examiner, Art Unit 2122