Detailed Action
This Office Action is in response to the Appeal Brief filed on 09/25/2025. Claims 1-5 and 7-20 are currently pending.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In view of the Appeal Brief filed on 09/25/2025, PROSECUTION IS HEREBY REOPENED. New grounds of rejection are set forth below.
To avoid abandonment of the application, appellant must exercise one of the following two options:
(1) file a reply under 37 CFR 1.111 (if this Office action is non-final) or a reply under 37 CFR 1.113 (if this Office action is final); or,
(2) initiate a new appeal by filing a notice of appeal under 37 CFR 41.31 followed by an appeal brief under 37 CFR 41.37. The previously paid notice of appeal fee and appeal brief fee can be applied to the new appeal. If, however, the appeal fees set forth in 37 CFR 41.20 have been increased since they were previously paid, then appellant must pay the difference between the increased fees and the amount previously paid.
A Supervisory Patent Examiner (SPE) has approved of reopening prosecution by signing below:
/ABDULLAH AL KAWSAR/ Supervisory Patent Examiner, Art Unit 2127
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
The term “similar weighted” in claim 1, line 1; claim 7, line 2; claim 11, line 2; claim 15, lines 2-3; claim 16, line 2; and claim 18, line 6 is a relative term which renders the claims indefinite. The term “similar” is not defined by the claims, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. The term “similar weighted” is unclear because the claims provide no standard for measuring the similarity; the claims fail to define what value the weighted values must be similar to.
Claim 13 is a non-transitory machine-readable medium claim reciting limitations similar to those of claim 1. Therefore, claim 13 is rejected under the same rationale as claim 1 above.
Dependent claims 2-5, 7-12, and 14-17 inherit the same deficiency as claims 1 and 13.
Claim 2 recites “wherein the functions are in a same level of the machine learning algorithm.” It is unclear what it means for the functions to be ‘in a same level of the machine learning algorithm.’ Does it mean that the functions must be part of the machine learning algorithm, or that the functions are arranged in parallel with the machine learning algorithm?
For purposes of examination, the examiner interprets the above limitation to mean that the functions are part of the machine learning algorithm.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-5 and 7-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Regarding claim 1,
Step 1: Claim 1 recites a process comprising: identifying functions, determining a representative weighted value, calculating a summation, and multiplying the summation. Therefore, it is directed to the statutory category of processes.
2A Prong 1: A process comprising: identifying functions in a machine learning algorithm that comprise similar weighted values; (a mental process of evaluation. The limitation merely recites grouping similar functions by comparing the values of the functions, which can be done with the aid of pen and paper)
determining a representative weighted value for the identified functions; (a mental process of evaluation. Selecting a representative value from a set of values does not require a computer component and can be performed in the human mind)
calculating a summation of a plurality of input values for the identified functions; and (As stated in the claim, the claim recites a mathematical concept of calculating a summation of values)
multiplying the summation by the representative weighted value, thereby generating an output for the identified functions; (As stated in the claim, the claim recites a mathematical concept of multiplying a value)
2A Prong 2: providing the output to a next level in the machine learning algorithm. (insignificant extra-solution activity MPEP 2106.05(g) of transmitting data over a network)
The additional elements identified above, alone or in combination, do not integrate the judicial exception into a practical application because they amount to insignificant extra-solution activity and generic computer functions restricted to a field of use, implemented merely to perform the abstract idea identified above.
2B: providing the output to a next level in the machine learning algorithm. (indicated as insignificant extra-solution activity MPEP 2106.05(g) in Step 2A Prong 2; therefore, the limitation is re-evaluated under Step 2B as well-understood, routine, and conventional activity MPEP 2106.05(d) of transmitting data over a network)
The additional elements identified above, considered in combination with the abstract idea, are not sufficient to amount to significantly more than the judicial exception because they are well-understood, routine, and conventional activity combined with generic computer functions and elements restricted to a field of use, implemented merely to perform the abstract idea identified above.
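For clarity of the record, the examiner notes that the steps recited in claim 1 reduce to elementary arithmetic. The following sketch uses hypothetical weight and input values chosen by the examiner solely to illustrate the recited operations; it is not the applicant's implementation:

    # Hypothetical values; illustrates only the arithmetic the claim steps recite.
    weights = [0.49, 0.51, 0.50]                   # weighted values of the identified functions
    inputs = [2.0, 3.0, 5.0]                       # a plurality of input values for those functions
    representative = sum(weights) / len(weights)   # determining a representative weighted value (0.5)
    summation = sum(inputs)                        # calculating a summation of the input values (10.0)
    output = summation * representative            # multiplying the summation by the representative value (5.0)
    # "providing the output to a next level" merely transmits the result onward.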
Regarding claim 2,
Step 1: Processes, as above.
2A Prong 1: Incorporates the rejection of claim 1.
2A Prong 2: wherein the functions are in a same level of the machine learning algorithm. (a field of use and technological environment MPEP 2106.05(h) as the limitation merely defines where the functions are located)
2B: wherein the functions are in a same level of the machine learning algorithm. (a field of use and technological environment MPEP 2106.05(h) as the limitation merely defines where the functions are located)
Regarding claim 3,
Step 1: Processes, as above.
2A Prong 1: Incorporates the rejection of claim 1.
2A Prong 2: wherein the machine learning algorithm comprises one or more of an artificial neural network and a support vector machine. (a field of use and technological environment MPEP 2106.05(h) as the limitation merely defines a type of the function used to implement the method)
2B: wherein the machine learning algorithm comprises one or more of an artificial neural network and a support vector machine. (a field of use and technological environment MPEP 2106.05(h) as the limitation merely defines a type of the function used to implement the method)
Regarding claim 4,
Step 1: Processes, as above.
2A Prong 1: Incorporates the rejection of claim 3.
2A Prong 2: wherein the artificial neural network comprises a trained artificial neural network. (a field of use and technological environment MPEP 2106.05(h) as the limitation merely defines a type of the function used to implement the method)
2B: wherein the artificial neural network comprises a trained artificial neural network. (a field of use and technological environment MPEP 2106.05(h) as the limitation merely defines a type of the function used to implement the method)
Regarding claim 5,
Step 1: Processes, as above.
2A Prong 1: Incorporates the rejection of claim 3.
2A Prong 2: wherein the functions are associated with perceptrons. (a field of use and technological environment MPEP 2106.05(h) as the limitation merely defines a type of the function used to implement the method)
2B: wherein the functions are associated with perceptrons. (a field of use and technological environment MPEP 2106.05(h) as the limitation merely defines a type of the function used to implement the method)
Regarding claim 7,
Step 1: Processes, as above.
2A Prong 1: The process of claim 1, wherein the identifying functions in the machine learning algorithm that comprise similar weighted values comprises clustering the functions based on the weighted values of the functions. (a mental process of evaluation. The limitation merely recites grouping similar functions by comparing the values of the functions, which can be done with the aid of pen and paper)
2A Prong 2: This judicial exception is not integrated into a practical application.
2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Regarding claim 8,
Step 1: Processes, as above.
2A Prong 1: wherein the clustering comprises a one-dimensional k-means algorithm. (a mathematical concept)
2A Prong 2: This judicial exception is not integrated into a practical application.
2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Regarding claim 9,
Step 1: Processes, as above.
2A Prong 1: wherein the representative weighted value is determined by averaging the weighted values of the functions for the cluster. (a mental process of evaluation. Selecting a representative value by comparing the value to the average value can be done in the human mind without a computer component)
2A Prong 2: This judicial exception is not integrated into a practical application.
2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Regarding claim 10,
Step 1: Processes, as above.
2A Prong 1: comprising checking an accuracy of the machine learning algorithm after the multiplying the summation by the representative weighted value. (a mental process of evaluation. Checking an accuracy of the algorithm by comparing the result of the algorithm to a value does not require a computer and can be performed in the human mind)
2A Prong 2: This judicial exception is not integrated into a practical application.
2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Regarding claim 11,
Step 1: Processes, as above.
2A Prong 1: The process of claim 1, wherein the identifying functions in the machine learning algorithm that comprise similar weighted values and the determining a representative weighted value for the functions comprise: (a mental process of evaluation. The limitation merely recites grouping similar functions by comparing the values of the functions, which can be done with the aid of pen and paper)
selecting a threshold number of clusters; (a mental process of evaluation. The limitation merely recites picking a predetermined number of groups of data, which can be done in the human mind)
assigning cluster ranges to the clusters based on a range of the weighted values and the threshold number; (The broadest reasonable interpretation of ‘Assigning cluster ranges’ encompasses the mental process of recording scores which can be done with the aid of pen and paper)
determining a particular cluster with which a particular function is associated based on the weighted value of the particular function and the cluster range of the particular cluster; (a mental process of evaluation. The limitation merely recites grouping similar functions by comparing the values of the functions, which can be done with the aid of pen and paper)
determining an average weighted value for the particular cluster; and (As stated in the claim, the limitation recites a mathematical concept of calculating a mean value)
replacing the weighted value of the particular function with the average weighted value for the particular cluster. (The broadest reasonable interpretation of ‘replacing the value of the particular function’ encompasses the mental process of recording updated scores which can be done with the aid of pen and paper)
2A Prong 2: This judicial exception is not integrated into a practical application.
2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
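The steps recited in claim 11 above amount to equal-width binning of the weighted values followed by averaging within each bin. The following sketch uses hypothetical weight values and helper names chosen by the examiner solely for illustration; it is not the applicant's disclosed implementation:

    # Hypothetical weights; equal-width binning per the recited steps of claim 11.
    weights = {"f1": 0.12, "f2": 0.15, "f3": 0.88, "f4": 0.95}
    k = 2                                                    # selecting a threshold number of clusters
    lo, hi = min(weights.values()), max(weights.values())
    width = (hi - lo) / k                                    # assigning cluster ranges from the weight range and k

    def cluster_of(w):                                       # determining the cluster for a particular weight
        return min(int((w - lo) / width), k - 1)

    clusters = {}
    for name, w in weights.items():
        clusters.setdefault(cluster_of(w), []).append(name)
    for members in clusters.values():
        avg = sum(weights[n] for n in members) / len(members)   # determining an average weighted value
        for n in members:
            weights[n] = avg                                     # replacing each weight with the cluster average
    # weights is now {"f1": 0.135, "f2": 0.135, "f3": 0.915, "f4": 0.915}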
Regarding claim 12,
Step 1: Processes, as above.
2A Prong 1: The process of claim 11, comprising:
determining an accuracy of the machine learning algorithm after replacing the weighted value of the particular function with the average weighted value for the particular cluster; (a mental process of evaluation. The limitation recites evaluating the accuracy of the machine learning algorithm after the update by calculating an average value for the scores, which can be done with the aid of pen and paper)
comparing the accuracy of the machine learning algorithm to a previous accuracy of the machine learning algorithm; (a mental process of evaluation. Comparing accuracy value of programs can be done in the human mind and does not require a computer component)
incrementing the threshold number of clusters when the accuracy of the machine learning algorithm transgresses a threshold; and (The broadest reasonable interpretation of ‘incrementing the threshold number when the accuracy transgresses a threshold’ encompasses the mental process of evaluating whether the accuracy value is greater or less than the threshold value)
repeating the assigning cluster ranges to the clusters based on a range of the weighted values and the threshold number, the determining a particular cluster into which a particular function is associated based on the weighted value of the particular function and the range, the determining an average weighted value for the particular cluster, and the replacing the weighted value of the particular function with the average weighted value for the particular cluster. (The broadest reasonable interpretation of ‘assigning cluster ranges to the clusters’ and ‘replacing the value of the particular function’ encompasses the mental process of recording updated scores which can be done with the aid of pen and paper. ‘The determining a particular cluster into which particular function is associated’ is a mental process of selecting (judging) a particular function that is relevant)
2A Prong 2: This judicial exception is not integrated into a practical application.
2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Regarding claim 13,
Step 1: Claim 13 recites a non-transitory machine-readable medium comprising instructions. Therefore, it is directed to the statutory category of a machine.
2A Prong 1: Claim 13 is a non-transitory machine-readable medium claim reciting limitations similar to those of claim 1 above. Therefore, the claim is rejected under the same rationale as claim 1.
2A Prong 2: A non-transitory machine-readable medium comprising instructions that when executed by a processor execute a process comprising: (mere instructions to apply an exception using a computer MPEP 2106.05(f))
2B: A non-transitory machine-readable medium comprising instructions that when executed by a processor execute a process comprising: (mere instructions to apply an exception using a computer MPEP 2106.05(f))
Regarding claim 14,
Step 1: A machine, as above.
2A Prong 1: Incorporates the rejection of claim 13.
2A Prong 2: wherein the machine learning algorithm comprises one or more of an artificial neural network and a support vector machine; wherein the artificial neural network comprises a trained artificial neural network; wherein the functions are associated with perceptrons. (a field of use and technological environment MPEP 2106.05(h))
2B: wherein the machine learning algorithm comprises one or more of an artificial neural network and a support vector machine; wherein the artificial neural network comprises a trained artificial neural network; wherein the functions are associated with perceptrons. (a field of use and technological environment MPEP 2106.05(h))
Regarding claim 15,
Step 1: A machine, as above.
2A Prong 1: The non-transitory machine-readable medium of claim 13,
wherein the identifying functions in the machine learning algorithm that comprise similar weighted values comprises clustering the functions based on the weighted values of the functions; and (a mental process of evaluation. The limitation merely recites grouping similar functions by comparing the values of the functions, which can be done with the aid of pen and paper)
wherein the clustering comprises a one-dimensional k-means algorithm; (a mathematical concept)
wherein the representative weighted value is determined by averaging the weighted values of the functions for the cluster. (a mental process of evaluation. Selecting a representative value by comparing the value to the average value can be done in the human mind without a computer component)
2A Prong 2: This judicial exception is not integrated into a practical application.
2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Regarding claim 16,
Step 1: A machine, as above.
2A Prong 1: Claim 16 is a non-transitory machine-readable medium claim reciting limitations similar to those of claim 11 above. Therefore, the claim is rejected under the same rationale as claim 11.
2A Prong 2: The non-transitory machine-readable medium of claim 13 (mere instructions to apply an exception using a computer MPEP 2106.05(f))
2B: The non-transitory machine-readable medium of claim 13 (mere instructions to apply an exception using a computer MPEP 2106.05(f))
Regarding claim 17,
Step 1: A machine, as above.
2A Prong 1: Claim 17 is a non-transitory machine-readable medium claim reciting limitations similar to those of claim 12 above. Therefore, the claim is rejected under the same rationale as claim 12.
2A Prong 2: The non-transitory machine-readable medium of claim 16, comprising instructions for: (mere instructions to apply an exception using a computer MPEP 2106.05(f))
2B: The non-transitory machine-readable medium of claim 16, comprising instructions for: (mere instructions to apply an exception using a computer MPEP 2106.05(f))
Regarding claim 18,
Step 1: Claim 18 recites a system comprising: a computer processor; and a computer memory coupled to the computer processor. Therefore, it is directed to the statutory category of a machine.
2A Prong 1: Claim 18 is a system claim reciting limitations similar to those of claim 1 above. Therefore, claim 18 is rejected under the same rationale as claim 1.
2A Prong 2: A system comprising: a computer processor; and a computer memory coupled to the computer processor; wherein the computer processor and computer memory are operable for: (mere instructions to apply an exception using a computer MPEP 2106.05(f))
2B: A system comprising: a computer processor; and a computer memory coupled to the computer processor; wherein the computer processor and computer memory are operable for: (mere instructions to apply an exception using a computer MPEP 2106.05(f))
Regarding claim 19,
Step 1: A machine, as above.
2A Prong 1: Claim 19 is a system claim reciting limitations similar to those of claim 11 above. Therefore, the claim is rejected under the same rationale as claim 11.
2A Prong 2: The system of claim 18, wherein the computer processor and computer memory are operable for: (mere instructions to apply an exception using a computer MPEP 2106.05(f))
2B: The system of claim 18, wherein the computer processor and computer memory are operable for: (mere instructions to apply an exception using a computer MPEP 2106.05(f))
Regarding claim 20,
Step 1: A machine, as above.
2A Prong 1: Claim 20 is a system claim reciting limitations similar to those of claim 12 above. Therefore, the claim is rejected under the same rationale as claim 12.
2A Prong 2: The system of claim 19, wherein the computer processor and computer memory are operable for: (mere instructions to apply an exception using a computer MPEP 2106.05(f))
2B: The system of claim 19, wherein the computer processor and computer memory are operable for: (mere instructions to apply an exception using a computer MPEP 2106.05(f))
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-5, 7-8, 10, 13-14, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Han et al., "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding," 2019 (hereinafter ‘Han’), in view of Garland and Gregg, "Low Complexity Multiply Accumulate Unit for Weight-Sharing Convolutional Neural Networks," 2017 (hereinafter ‘Garland’).
Regarding claim 1, Han teaches a process comprising:
identifying functions in a machine learning algorithm that comprise similar weighted values; ([Han, page 3, Figure 3] and [Han, page 3, the last paragraph, line 1 – page 4, line 2] collectively disclose grouping similar weights (identifying similar weighted values) to share the same value (blue, green, red and orange))
determining a representative weighted value for the identified functions; ([Han, page 3, Figure 3] and [Han, page 3, the last paragraph, line 1 – page 4, line 2] collectively disclose grouping similar weights to share the same value (blue, green, red and orange). [Han, page 4, 3.1 WEIGHT SHARING, line 1-8] discloses generating a representative weight (shared weights) for each layer. The weights are not shared across layers. [Han, 3.3 FEED-FORWARD AND BACK-PROPAGATION, line 1-9] discloses that the centroids of the one-dimensional k-means clustering are the shared weights)
However, Han does not specifically disclose:
calculating a summation of a plurality of input values for the identified functions; and
multiplying the summation by the representative weighted value, thereby generating an output for the identified functions;
providing the output to a next level in the machine learning algorithm.
Garland teaches:
calculating a summation of a plurality of input values for the identified functions; and ([Garland, page 133, left col, line 7-21] and [Garland, page 133, Fig. 3 and Fig. 4] show how image values 26.7 and 6.1 are accumulated to 26.7 + 6.1 = 32.8 (a summation of a plurality of input values 26.7 and 6.1). The paragraph further discloses multiplying the accumulated image values (the summation) by the representative weight value)
multiplying the summation by the representative weighted value, thereby generating an output for the identified functions; ([Garland, page 133, left col, line 7-21] and [Garland, page 133, Fig. 3 and Fig. 5] disclose multiplying the accumulated image values (the summation) by the representative weight value. For example, Figs. 3 and 4 show how image values 26.7 and 6.1 are accumulated to 26.7 + 6.1 = 32.8 and multiplied by the representative weight value 1.7 for bin 0 to obtain the partial result 55.76)
providing the output to a next level in the machine learning algorithm. ([Garland, page 133, right col, Fig. 5] discloses that the accumulated and multiplied values are summed again (the next level) to generate the final result 32.8*1.7 + 0.4*3.4 + 1.3*4.8 + 2.0*17.7 ≈ 98.8)
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art, having the teachings of Han and Garland, to use the method of multiplying the summation of input data of Garland to implement the machine learning method of Han. Han teaches how to compress a network using weight sharing, but is silent on performing inference on a weight-shared network. Garland discloses how to efficiently perform inference in a network with weight sharing in [Garland, page 135, left col, 5 RELATED WORK, line 18-27]: "Both weight sharing [1], [2] and weight reuse reduce redundant data movement through different but complementary approaches." Therefore, applying the method of Garland to implement the method of Han improves the technology.
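The accumulate-then-multiply arithmetic cited from Garland above can be checked numerically. The bin weights and accumulated input totals below are taken from the citation of Garland Figs. 3-5 in the rejection above (the remaining bins are shown as single accumulated values for brevity); the code structure itself is only the examiner's illustration, not Garland's implementation:

    # Values from the Garland citation above; structure is illustrative only.
    bins = [
        (1.7,  [26.7, 6.1]),     # bin 0: image values sharing weight 1.7
        (3.4,  [0.4]),           # remaining bins shown with accumulated totals
        (4.8,  [1.3]),
        (17.7, [2.0]),
    ]
    partials = [sum(inputs) * weight for weight, inputs in bins]
    # partials[0] = (26.7 + 6.1) * 1.7 = 32.8 * 1.7 = 55.76
    final = sum(partials)        # 55.76 + 1.36 + 6.24 + 35.4 = 98.76, reported as ~98.8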
Regarding claim 2, Han teaches:
wherein the functions are in a same level of the machine learning algorithm. ([Han, page 4, 3.1 WEIGHT SHARING, line 1-8] discloses generating a representative weight (shared weights) for each layer. The weights are not shared across layers. This indicates that the clustered weights are in a same level (layer) of the machine learning algorithm)
Regarding claim 3, Han teaches:
The process of claim 1, wherein the machine learning algorithm comprises one or more of an artificial neural network and a support vector machine. ([Han, page 3, 3 TRAINED QUANTIZATION AND WEIGHT SHARING, last para; Figure 3] Since the limitation recites ‘one or more of’ an artificial neural network and a support vector machine, an algorithm comprising only an ANN teaches the claim element. The paragraph discloses a single-layer neural network with four input units and four output units)
Regarding claim 4, Han teaches:
The process of claim 3, wherein the artificial neural network comprises a trained artificial neural network. ([Han, page 4, 3.1 WEIGHT SHARING, line 1-8] discloses using k-means clustering to identify the shared weights for each layer of a trained neural network)
Regarding claim 5, Han teaches:
The process of claim 3, wherein the functions are associated with perceptrons. ([Han, page 3, 3 TRAINED QUANTIZATION AND WEIGHT SHARING, last para; Figure 3] discloses a single-layer neural network with four input units and four output units. A perceptron is the simplest form of a neural network, which makes a decision by receiving input data and generating output data)
Regarding claim 7, Han teaches:
The process of claim 1, wherein the identifying functions in the machine learning algorithm that comprise similar weighted values comprises clustering the functions based on the weighted values of the functions. ([Han, page 3, Figure 3] and [Han, page 3, the last paragraph, line 1 – page 4, line 2] collectively disclose grouping similar weights to share the same value (blue, green, red and orange). [Han, page 4, 3.1 WEIGHT SHARING, line 1-8] discloses generating a representative weight (shared weights) for each layer based on k-means clustering method)
Regarding claim 8, Han teaches:
The process of claim 7, wherein the clustering comprises a one-dimensional k-means algorithm. ([Han, page 4, 3.1 WEIGHT SHARING, line 1-8] and [Han, 3.3 FEED-FORWARD AND BACK-PROPAGATION, line 1-9] collectively disclose that the centroids of the one-dimensional k-means clustering are the shared weights)
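For completeness of the record, one-dimensional k-means clustering of weights, in which the cluster centroids serve as the shared (representative) weights as characterized above, may be sketched as follows. The weight values and the initialization are hypothetical and chosen by the examiner for illustration; this is not a reproduction of Han's implementation:

    # Hypothetical weights; centroids of 1-D k-means serve as the shared weights.
    def kmeans_1d(values, k, iters=20):
        centroids = sorted(values)[:: max(1, len(values) // k)][:k]   # naive initialization
        for _ in range(iters):
            groups = [[] for _ in range(k)]
            for v in values:                                           # assign each weight to its nearest centroid
                groups[min(range(k), key=lambda i: abs(v - centroids[i]))].append(v)
            centroids = [sum(g) / len(g) if g else centroids[i]        # centroid = mean of its cluster
                         for i, g in enumerate(groups)]
        return centroids

    shared_weights = kmeans_1d([2.09, 2.12, 1.92, 1.87, 0.05, -0.03, -0.91, -1.08], k=4)
    # each original weight is then replaced by (indexed to) its cluster's centroid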
Regarding claim 10, Han teaches:
The process of claim 1, comprising checking an accuracy of the machine learning algorithm after the … ([Han, page 8, Figure 6, Figure 7, and Figure 8] discloses comparing accuracy versus compression rate for different compression methods. [Han, page 4, 3.1 WEIGHT SHARING, line 1-8] discloses compressing a neural network by identifying representative weights)
Han does not specifically disclose:
checking an accuracy of the machine learning algorithm after the multiplying the summation by the representative weighted value.
Garland teaches:
checking an … ([Garland, page 133, right col, 4 EVALUATION, line 1 – page 134, right col, 4th para] discloses measuring power consumption and performance after the multiplication disclosed in [Garland, page 133, left col, line 7-21] and [Garland, page 133, Fig. 3 and Fig. 4])
Regarding claim 13,
Han teaches:
A non-transitory machine-readable medium comprising instructions that when executed by a processor execute a process comprising: ([Han, page 9, 6.3 SPEEDUP AND ENERGY EFFICIENCY, 2nd and 3rd para] discloses that the system is implemented using the NVIDIA GeForce GTX Titan X and the Intel Core i7 5930K as desktop processors (same package as NVIDIA Digits Dev Box) and NVIDIA Tegra K1 as mobile processor)
Claim 13 is a non-transitory machine-readable medium claim reciting limitations similar to those of claim 1 above. Therefore, the claim is rejected under the same rationale as claim 1.
Regarding claim 14, Han teaches:
The non-transitory machine-readable medium of claim 13, wherein the machine learning algorithm comprises one or more of an artificial neural network and a support vector machine; ([Han, page 3, 3 TRAINED QUANTIZATION AND WEIGHT SHARING, last para; Figure 3] Since the limitation recites ‘one or more of’ an artificial neural network and a support vector machine, an algorithm comprising only an ANN teaches the claim element. The paragraph discloses a single-layer neural network with four input units and four output units)
wherein the artificial neural network comprises a trained artificial neural network; ([Han, page 4, 3.1 WEIGHT SHARING, line 1-8] discloses using k-means clustering to identify the shared weights for each layer of a trained neural network)
wherein the functions are associated with perceptrons. ([Han, page 3, 3 TRAINED QUANTIZATION AND WEIGHT SHARING, last para; Figure 3] discloses a single-layer neural network with four input units and four output units. A perceptron is the simplest form of a neural network, which makes a decision by receiving input data and generating output data)
Regarding claim 18,
Han teaches:
A system comprising: a computer processor; and a computer memory coupled to the computer processor; wherein the computer processor and computer memory are operable for: ([Han, page 9, 6.3 SPEEDUP AND ENERGY EFFICIENCY, 2nd and 3rd para] discloses that the system is implemented using the NVIDIA GeForce GTX Titan X and the Intel Core i7 5930K as desktop processors (same package as NVIDIA Digits Dev Box) and NVIDIA Tegra K1 as mobile processor)
Claim 18 is a system claim reciting limitations similar to those of claim 1 above. Therefore, the claim is rejected under the same rationale as claim 1.
Claims 9, 11-12, 15-17 and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Han in view of Garland and further in view of Bradley et al. (US 6449612 B1, hereinafter ‘Bradley’).
Regarding claim 9, Han teaches:
The process of claim 7, wherein the representative weighted value is determined by … ([Han, page 3, Figure 3] and [Han, page 3, the last paragraph, line 1 – page 4, line 2] collectively disclose grouping similar weights to share the same value (blue, green, red and orange). [Han, page 4, 3.1 WEIGHT SHARING, line 1-8] discloses generating a representative weight (shared weights) for each layer. The weights are not shared across layers. [Han, 3.3 FEED-FORWARD AND BACK-PROPAGATION, line 1-9] discloses that the centroids of the one-dimensional k-means clustering, which are mean values of the clusters, are the shared weights. The calculation of centroids is disclosed in detail by Bradley)
However, Han in view of Garland does not specifically disclose:
the representative weighted value is determined by averaging the weighted values of the functions for the cluster.
Bradley teaches:
the representative … averaging the … ([Bradley, col 13, line 64-66] and [Bradley, col 4, line 59-67] specifically disclose that the centroids are determined based on means (averages) of the data values within a cluster)
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art, having the teachings of Han, Garland, and Bradley, to use Bradley's method of calculating the average (mean, centroid) value of a cluster to implement the clustering method of Han. The suggestion and/or motivation to do so is to improve the accuracy of the clustering system. The combination of Han and Garland teaches how to compress a network with weight sharing and k-means clustering, but is silent on adjusting the threshold number of clusters in the k-means clustering algorithm based on an evaluation and on calculating the ‘average’ of a cluster. Bradley discloses how to improve the accuracy of a clustering algorithm by calculating centroids using an ‘average’ and adjusting the threshold number of clusters in [Bradley, ABSTRACT]. Therefore, combining Han, Garland, and Bradley improves the technology.
Regarding claim 11, Han teaches:
The process of claim 1, wherein the identifying functions in the machine learning algorithm that comprise similar weighted values and the determining a representative weighted value for the functions comprise: ([Han, page 3, Figure 3] and [Han, page 3, the last paragraph, line 1 – page 4, line 2] collectively disclose grouping similar weights to share the same value (blue, green, red and orange). [Han, page 4, 3.1 WEIGHT SHARING, line 1-8] discloses generating a representative weight (shared weights) for each layer based on a one-dimensional k-means algorithm. The weights are not shared across layers)
determining an … ([Han, page 4, 3.1 WEIGHT SHARING, line 1-8] discloses generating a representative weight (shared weights) for each layer. The weights are not shared across layers. [Han, 3.3 FEED-FORWARD AND BACK-PROPAGATION, line 1-9] discloses that the centroids of the one-dimensional k-means clustering, which are the mean values of each cluster, are the shared weights. The calculation of centroids is disclosed in detail by Bradley)
replacing the weighted value of the particular function with the … ([Han, page 4, 3.1 WEIGHT SHARING, line 1-8] discloses generating a representative weight (shared weights) for each layer. The weights are not shared across layers. [Han, 3.3 FEED-FORWARD AND BACK-PROPAGATION, line 1-9] discloses that the centroids of the one-dimensional k-means clustering, which are the mean values of each cluster, are the shared weights. The calculation of centroids is disclosed in detail by Bradley)
Han in view of Garland does not specifically disclose:
selecting a threshold number of clusters;
assigning cluster ranges to the clusters based on a range of the weighted values and the threshold number;
determining a particular cluster with which a particular function is associated based on the weighted value of the particular function and the cluster range of the particular cluster;
determining an average weighted value for the particular cluster;
Bradley teaches:
selecting a threshold number of clusters; ([Bradley, col 13, line 9-15] and [Bradley, Fig. 7, block 115] disclose updating the cluster number K (threshold number) at Step 115. The cluster number may increase by one. The iteration continues when stopping criterion 140 has not been met)
assigning cluster ranges to the clusters based on a range of the weighted values and the threshold number; ([Bradley, col 17, line 9-51] and [Bradley, Fig. 3, block 140] collectively disclose a first stopping criterion that is defined by a probability function p(x). The maximum difference parameter over the last r iterations is evaluated, and if no difference exceeds a stopping tolerance ST (cluster ranges), then the first stopping criterion has been satisfied and the model is output)
determining a particular cluster with which a particular … ([Bradley, col 4, line 59 – col 5, line 34] discloses assigning a particular data point (weight value) to a particular cluster based on its Gaussian G2 with a weighting factor proportional to h2 (probability density value) that is given by the vertical distance from the horizontal axis to the curve G2 or G1 (cluster range))
determining an average weighted value for the particular cluster; ([Bradley, col 13, line 64-66] and [col 4, line 59-67] specifically disclose that the centroids are determined based on means (average) of the data value within a cluster)
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art, having the teachings of Han, Garland, and Bradley, to use Bradley's method of adjusting a threshold number of clusters in a k-means clustering system based on an evaluation to implement the machine learning method of Han. The suggestion and/or motivation to do so is to improve the accuracy of the clustering system. The combination of Han and Garland teaches how to compress a network with weight sharing and k-means clustering, but is silent on adjusting the threshold number of clusters in the k-means clustering algorithm based on the evaluation. Bradley discloses how to improve the accuracy of a clustering algorithm by adjusting the threshold number of clusters and evaluating the cluster model in [Bradley, ABSTRACT]. Therefore, combining Han, Garland, and Bradley improves the technology.
Regarding claim 12, Han teaches:
The process of claim 11, comprising: determining an accuracy of the machine learning algorithm after replacing the weighted value of the particular function with the average weighted value for the particular cluster; ([Han, page 8, Figure 6, Figure 7, and Figure 8] discloses comparing accuracy versus compression rate for different compression methods. [Han, page 4, 3.1 WEIGHT SHARING, line 1-8] discloses compressing a neural network by identifying representative weights, and [Han, 3.3 FEED-FORWARD AND BACK-PROPAGATION, line 1-5] discloses that the centroids of the one-dimensional k-means clustering are the shared weights. It is inherent that the centroids of the one-dimensional k-means clustering are determined by averaging all data points (weights) assigned to the cluster)
comparing the accuracy of the machine learning algorithm to a previous accuracy of the machine learning algorithm; ([Han, page 6, Table 1, Table 2, and Table 3] discloses comparing the original accuracy of LeNet 300-100 Ref, LeNet-5 Ref, AlexNet Ref, and VGG-16 Ref to the accuracy of the compressed neural networks LeNet 300-100 Compressed, LeNet-5 Compressed, AlexNet Compressed, and VGG-16 Compressed)
the determining an … ([Han, page 4, 3.1 WEIGHT SHARING, line 1-8] discloses generating a representative weight (shared weights) for each layer. The weights are not shared across layers. [Han, 3.3 FEED-FORWARD AND BACK-PROPAGATION, line 1-9] discloses that the centroids of the one-dimensional k-means clustering, which are the mean values of each cluster, are the shared weights. The calculation of centroids is disclosed in detail by Bradley)
However, Han in view of Garland does not specifically disclose:
incrementing the threshold number of clusters when the accuracy of the machine learning algorithm transgresses a threshold; and
repeating the assigning cluster ranges to the clusters based on a range of the weighted values and the threshold number, the determining a particular cluster into which a particular function is associated based on the weighted value of the particular function and the range;
the determining an average weighted value for the particular cluster
Bradley teaches:
incrementing the threshold number of clusters when the accuracy of the machine learning algorithm transgresses a threshold; and ([Bradley, col 13, line 9-15] and [Bradley, Fig. 7, block 115] disclose updating the cluster number K (threshold number) at Step 115. The cluster number may increase by one. The iteration continues when stopping criterion 140 has not been met. [Bradley, col 17, line 9-51] and [Bradley, Fig. 3, block 140] collectively disclose a first stopping criterion that is defined by a probability function p(x). The maximum difference parameter over the last r iterations is evaluated, and if no difference exceeds a stopping tolerance ST (cluster ranges), then the first stopping criterion has been satisfied and the model is output)
repeating the assigning cluster ranges to the clusters based on a range of the weighted values and the threshold number, the determining a particular cluster into which a particular … average … ([Bradley, col 13, line 9-15] and [Bradley, Fig. 7, block 115] disclose updating the cluster number K (threshold number) at Step 115. The cluster number may increase by one. The iteration continues when stopping criterion 140 has not been met (repeating the assigning of cluster ranges to the clusters). The maximum difference parameter over the last r iterations is evaluated, and if no difference exceeds a stopping tolerance ST (cluster ranges), then the first stopping criterion has been satisfied and the model is output. [Bradley, col 4, line 59 – col 5, line 34] discloses assigning a particular data point (weight value) to a particular cluster based on its Gaussian G2 with a weighting factor proportional to h2 (probability density value) that is given by the vertical distance from the horizontal axis to the curve G2 or G1 (cluster range). [Bradley, col 13, line 64-66] and [Bradley, col 4, line 59-67] specifically disclose that the centroids are determined based on means (averages) of the data values within a cluster)
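The iterative adjustment that the combination is read to teach (replace the weights, determine accuracy, compare to a previous accuracy, increment the number of clusters, and repeat) may be sketched as follows. The helper names quantize_weights and evaluate_accuracy, the accuracy threshold, and the stopping logic are hypothetical placeholders supplied by the examiner for illustration only; they are not taken from Han, Garland, or Bradley:

    # Hypothetical sketch; quantize_weights and evaluate_accuracy are placeholder callables.
    def tune_cluster_count(weights, quantize_weights, evaluate_accuracy,
                           k=2, accuracy_floor=0.90, k_max=64):
        previous = evaluate_accuracy(weights)            # baseline accuracy before any replacement
        quantized = weights
        while k <= k_max:
            quantized = quantize_weights(weights, k)     # assign clusters, average, replace weights
            accuracy = evaluate_accuracy(quantized)      # determine accuracy after the replacement
            drop = previous - accuracy                   # compare against the previous accuracy
            if accuracy >= accuracy_floor and drop <= 0.01:
                return quantized, k                      # accuracy acceptable: stop
            k += 1                                       # accuracy transgressed a threshold: increment k, repeat
        return quantized, k_max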
Regarding claim 15, Han teaches:
The non-transitory machine-readable medium of claim 13, wherein the identifying functions in the machine learning algorithm that comprise similar weighted values comprises clustering the functions based on the weighted values of the functions; ([Han, page 3, Figure 3] and [Han, page 3, the last paragraph, line 1 – page 4, line 2] collectively disclose grouping similar weights (identifying similar weighted values) to share the same value (blue, green, red and orange))
wherein the clustering comprises a one-dimensional k-means algorithm; ([Han, page 3, Figure 3] and [Han, page 3, the last paragraph, line 1 – page 4, line 2] collectively disclose grouping similar weights to share the same value (blue, green, red and orange). [Han, page 4, 3.1 WEIGHT SHARING, line 1-8] and [Han, 3.3 FEED-FORWARD AND BACK-PROPAGATION, line 1-9] collectively disclose that the centroids of the one-dimensional k-means clustering are the shared weights)
wherein the representative weighted value is determined by … ([Han, page 4, 3.1 WEIGHT SHARING, line 1-8] discloses generating a representative weight (shared weights) for each layer. The weights are not shared across layers. [Han, 3.3 FEED-FORWARD AND BACK-PROPAGATION, line 1-9] discloses that the centroids of the one-dimensional k-means clustering, which are the mean values of each cluster, are the shared weights)
Han in view of Garland does not specifically disclose:
wherein the representative weighted value is determined by averaging the weighted values of the functions for the cluster;
Bradley teaches:
wherein the representative … averaging the … ([Bradley, col 13, line 64-66] and [Bradley, col 4, line 59-67] specifically disclose that the centroids are determined based on means (averages) of the data values within a cluster)
Claim 16 is a non-transitory machine-readable medium claim reciting limitations similar to those of claim 11 above. Therefore, the claim is rejected under the same rationale as claim 11.
Claim 17 is a non-transitory machine-readable medium claim reciting limitations similar to those of claim 12 above. Therefore, the claim is rejected under the same rationale as claim 12.
Regarding claim 19, Han teaches:
The system of claim 18, wherein the computer processor and computer memory are operable for: ([Han, page 9, 6.3 SPEEDUP AND ENERGY EFFICIENCY, 2nd and 3rd para] discloses that the system is implemented using the NVIDIA GeForce GTX Titan X and the Intel Core i7 5930K as desktop processors (same package as NVIDIA Digits Dev Box) and NVIDIA Tegra K1 as mobile processor)
Claim 19 is a system claim reciting limitations similar to those of claim 11 above. Therefore, the claim is rejected under the same rationale as claim 11.
Regarding claim 20, Han teaches:
The system of claim 19, wherein the computer processor and the computer memory are operable for: ([Han, page 9, 6.3 SPEEDUP AND ENERGY EFFICIENCY, 2nd and 3rd para] discloses that the system is implemented using the NVIDIA GeForce GTX Titan X and the Intel Core i7 5930K as desktop processors (same package as NVIDIA Digits Dev Box) and NVIDIA Tegra K1 as mobile processor)
Claim 20 is a system claim reciting limitations similar to those of claim 12 above. Therefore, the claim is rejected under the same rationale as claim 12.
Response to Arguments
Response to Arguments under 35 U.S.C. 101
Arguments: [Remarks, page 7-8] Applicant asserts that the consolidation is an improvement to neural networks, and is recited in the claims by determining a representative weighted value for functions in a neural network, calculating a summation of a plurality of input values for the functions, and multiplying the summation by the representative weighted value, thereby generating an output for the functions.
Examiner’s Response: Examiner respectfully disagrees. First, as discussed in the 35 U.S.C. 101 rejection above, the claim merely recites mental processes and mathematical calculations that can be performed in one’s mind or with the aid of pencil and paper. The claim merely recites comparing and identifying functions with similar weighted values, determining a representative value, and calculating a summation of the values, all of which can practically be performed mentally. ‘Providing the output to a next level in the machine learning algorithm’ is an insignificant extra-solution activity.
MPEP 2106.04(d)(1) explains how examiners should evaluate this consideration, and a detailed explanation is further provided in MPEP 2106.05(a). MPEP 2106.05(a) states that a claim whose entire scope can be performed mentally cannot be said to improve computer technology. Synopsys, Inc. v. Mentor Graphics Corp., 839 F.3d 1138, 120 USPQ2d 1473 (Fed. Cir. 2016) (a method of translating a logic circuit into a hardware component description of a logic circuit was found to be ineligible because the method did not employ a computer and a skilled artisan could perform all the steps mentally). Similarly, a claimed process covering embodiments that can be performed on a computer, as well as embodiments that can be practiced verbally or with a telephone, cannot improve computer technology. See RecogniCorp, LLC v. Nintendo Co., 855 F.3d 1322, 1328, 122 USPQ2d 1377, 1381 (Fed. Cir. 2017) (process for encoding/decoding facial data using image codes assigned to particular facial features held ineligible because the process did not require a computer). Since the entire scope of the claim can be performed mentally, the applicant’s claimed subject matter cannot be said to improve computer technology.
Accordingly, arguments to claims 1, 13 and 18 are not persuasive. Therefore, arguments to dependent claims 2-5, 7-12, 14-17 and 19-20 are not persuasive.
Response to Arguments under 35 U.S.C. 112
Arguments: [Remarks, page 8] Applicant asserts that the specification provides a standard for ascertaining the requisite degree of the claim term “similarly weighted” in paragraphs [0018], [0019], and [0020].
Examiner’s Response: Examiner respectfully disagrees. Paragraphs [0018], [0019], and [0020] do not appear to provide any standard for ascertaining the requisite degree of the claim term “similarly weighted”. The paragraphs merely disclose that the functions in the machine learning algorithm are identified based on the weighted values of the functions, and do not provide any standard for ‘similarly weighted’.
Accordingly, arguments to claims 1, 13 and 18 are not persuasive. Therefore, arguments to dependent claims 2-5, 7-12, 14-17 and 19-20 are not persuasive.
Arguments: [Remarks, page 9] Applicant asserts that any value that is input into a function of the machine learning algorithm is an input value for the function, and that the term is therefore not overly broad.
Examiner’s Response: The 35 U.S.C. 112(b) rejection regarding the claim phrase “input values for the identified functions” has been withdrawn.
Response to Arguments under 35 U.S.C. 102 and 103
Arguments: [Remarks, page 10] Applicant asserts that the claimed subject matter sums the inputs associated with the weights in a group to form a single input value and conducts a single multiplication between the aggregated input and the averaged group weight, which is different from performing multiple multiplications between individual inputs and weights as in Kung. [Remarks, page 11] Applicant further asserts that the effects of Kung and Applicant’s claimed subject matter are different, as Kung’s method achieves efficiency but Applicant’s claimed subject matter reduces both time and power.
Examiner’s Response: Applicant’s arguments with respect to the rejection of claim 1 under 35 U.S.C. 102 have been fully considered and are persuasive. Therefore, the rejection has been withdrawn. However, upon further consideration, a new ground of rejection is made in view of Garland (Garland and Gregg), as set forth above.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Han et al., "EIE: Efficient Inference Engine on Compressed Deep Neural Network", 2016 (This prior art is pertinent as it discloses compressing a deep neural network based on similarity between the weight values)
Chen et al., "Compressing Neural Networks with the Hashing Trick", 2015 (This prior art is pertinent as it discloses compressing a deep neural network by grouping the same neural network weight values)
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JUN KWON whose telephone number is (571)272-2072. The examiner can normally be reached Monday – Friday 7:30AM – 4:30PM ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdullah Kawsar can be reached at (571)270-3169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/JUN KWON/Examiner, Art Unit 2127
/ABDULLAH AL KAWSAR/Supervisory Patent Examiner, Art Unit 2127