DETAILED ACTION
This action is in response to the claims filed 03/28/2023 for Application No. 18/191,700, which claims foreign priority to IN202241047830, filed 08/22/2022. Claims 1-20 are currently pending.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Acknowledgment is made of applicant's claim for foreign priority based on an application filed in India on 08/22/2022. It is noted, however, that applicant has not filed a certified copy of the IN202241047830 application as required by 37 CFR 1.55.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Regarding claim 1,
Step 1 Analysis: Claim 1 is directed to a process, which falls within one of the four statutory categories.
Step 2A Prong 1 Analysis: Claim 1 recites, in part, the limitations of:
determining an accuracy improvement of a layer of a neural network implemented using a first bit precision relative to using a second bit precision, which can be performed as an evaluation in the human mind;
determining a latency degradation of the layer of the neural network implemented using the first bit precision relative to using the second bit precision, which can be performed as an evaluation in the human mind; and
selecting, based on the accuracy improvement and the latency degradation, the first bit precision or the second bit precision for use in implementing the layer of the neural network, which can be performed as an evaluation in the human mind.
These limitations, as drafted, are processes that, under their broadest reasonable interpretation, cover performance in the mind or with the aid of pen and paper, which falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
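For illustration only, the three recited determinations reduce to simple arithmetic and a comparison, which underscores that each can be practically performed in the human mind or with pen and paper. The following Python sketch is hypothetical; the measurement callables, the selection rule, and all numeric values are assumptions and are not drawn from the claims or the record:

```python
# Hypothetical sketch only; the measurement callables, selection rule,
# and numbers below are assumptions, not taken from the claims or record.

def select_bit_precision(measure_accuracy, measure_latency,
                         first_bits=16, second_bits=8):
    # Determining an accuracy improvement of the layer implemented using
    # the first bit precision relative to using the second bit precision.
    accuracy_improvement = measure_accuracy(first_bits) - measure_accuracy(second_bits)

    # Determining a latency degradation of the layer implemented using
    # the first bit precision relative to using the second bit precision.
    latency_degradation = measure_latency(first_bits) - measure_latency(second_bits)

    # Selecting, based on both quantities, the first or the second bit
    # precision; this particular rule is an assumed example.
    if accuracy_improvement >= latency_degradation:
        return first_bits
    return second_bits

# Example usage with toy, hypothetical measurements:
chosen = select_bit_precision(
    measure_accuracy=lambda bits: {8: 0.71, 16: 0.74}[bits],
    measure_latency=lambda bits: {8: 1.0, 16: 1.6}[bits],
)
print(chosen)  # 8: the accuracy gain does not outweigh the latency cost here
```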
Step 2A Prong 2 Analysis: This judicial exception is not integrated into a practical application. In particular, the claim recites the additional element of a “layer of a neural network”. This element is only generally linked to the judicial exception. See MPEP 2106.05(h). Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of utilizing a layer of a neural network to perform the claimed steps merely generally links the additional element to the judicial exception. The claim is not patent eligible.
Regarding claim 2, the rejection of claim 1 is further incorporated, and further, the claim recites: wherein the first bit precision is based on a first quantization of floating point data to fixed point data of a first length, and wherein the second bit precision is based on a second quantization of the floating point data to fixed point data of a second length that differs relative to the first length. This limitation merely adds further specificity to the judicial exception identified in the rejection of claim 1 above.
The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible.
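For illustration only, the quantization recited in claim 2 (floating point data reduced to fixed point data of two different lengths) can be sketched as follows. The fixed-point format, bit widths, and rounding behavior below are assumptions for illustration, not limitations drawn from the claim:

```python
# Hypothetical sketch of quantizing floating-point data to fixed-point
# data of two different lengths; the Q-format and rounding are assumed.

def quantize_fixed_point(values, total_bits, frac_bits):
    # Scale by 2**frac_bits, round to the nearest integer, and clamp to
    # the signed integer range representable in total_bits.
    scale = 1 << frac_bits
    lo, hi = -(1 << (total_bits - 1)), (1 << (total_bits - 1)) - 1
    return [max(lo, min(hi, round(v * scale))) for v in values]

data = [0.1234, -1.5, 3.75]
first_length = quantize_fixed_point(data, total_bits=8, frac_bits=4)    # first quantization
second_length = quantize_fixed_point(data, total_bits=16, frac_bits=8)  # second quantization
```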
Regarding claim 3, the rejection of claim 1 is further incorporated, and further, the claim recites: wherein determining the accuracy improvement comprises:
determining a first accuracy of the layer of the neural network when implemented using the first bit precision;
determining a second accuracy of the layer of the neural network when implemented using the second bit precision; and
calculating a difference between the first accuracy and the second accuracy. This claim recites additional mental steps in addition to the judicial exception identified in the rejection of claim 1, and thus recites a judicial exception.
The claim does not include any additional elements that amount to an integration of the judicial exceptions into a practical application, nor to significantly more than the judicial exceptions. The claim is not patent eligible.
Regarding claim 4, the rejection of claim 1 is further incorporated, and further, the claim recites: wherein determining the latency degradation comprises:
determining a first latency of the layer of the neural network when implemented using the first bit precision;
determining a second latency of the layer of the neural network when implemented using the second bit precision; and
calculating a difference between the first latency and the second latency. This claim recites additional mental steps in addition to the judicial exception identified in the rejection of claim 1, and thus recites a judicial exception.
The claim does not include any additional elements that amount to an integration of the judicial exceptions into a practical application, nor to significantly more than the judicial exceptions. The claim is not patent eligible.
Regarding claim 5, the rejection of claim 1 is further incorporated, and further, the claim recites: determining an impact factor of the layer based on the accuracy improvement and the latency degradation, wherein selecting the first bit precision or the second bit precision is based further on the impact factor. This claim recites additional mental steps in addition to the judicial exception identified in the rejection of claim 1, and thus recites a judicial exception.
The claim does not include any additional elements that amount to an integration of the judicial exceptions into a practical application, nor to significantly more than the judicial exceptions. The claim is not patent eligible.
Regarding claim 6, the rejection of claim 1 is further incorporated, and further, the claim recites: further comprising
determining a mixed precision factor for the neural network in which the layer uses the second bit precision and one or more further layers use the first bit precision, wherein selecting the first bit precision or the second bit precision is based further on a comparison between the mixed precision factor and a threshold mixed precision factor. This claim recites additional mental steps in addition to the judicial exception identified in the rejection of claim 1, and thus recites a judicial exception.
The claim does not include any additional elements that amount to an integration of the judicial exceptions into a practical application, nor to significantly more than the judicial exceptions. The claim is not patent eligible.
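For illustration only, one plausible reading of the comparison recited in claim 6 follows; the claim does not define how the mixed precision factor is computed, so the fraction-of-layers formulation and the threshold value below are assumptions:

```python
# Assumed formulation: the "mixed precision factor" is modeled here as the
# fraction of layers assigned the second bit precision, compared against
# an assumed threshold; neither choice is defined by the claim.

def mixed_precision_factor(layer_precisions, second_bits=8):
    second_count = sum(1 for bits in layer_precisions if bits == second_bits)
    return second_count / len(layer_precisions)

layer_precisions = [16, 16, 8, 16]            # one layer uses the second precision
factor = mixed_precision_factor(layer_precisions)
threshold = 0.5                                # assumed threshold mixed precision factor
use_second_precision = factor <= threshold     # selection based on the comparison
```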
Regarding claim 7, the rejection of claim 1 is further incorporated, and further, the claim recites: further comprising
selecting the second bit precision for use in implementing a further layer of the neural network;
determining a further mixed precision factor for the neural network in which the layer and the further layer use the second bit precision;
comparing the further mixed precision factor and the threshold mixed precision factor; and
selecting the first bit precision for use in implementing the further layer of the neural network based on the further mixed precision factor exceeding the threshold mixed precision factor. This claim recites additional mental steps in addition to the judicial exception identified in the rejection of claim 1, and thus recites a judicial exception.
The claim does not include any additional elements that amount to an integration of the judicial exceptions into a practical application, nor to significantly more than the judicial exceptions. The claim is not patent eligible.
Claim 8 recites features similar to claim 1 and is rejected for at least the same reasons set forth therein. Claim 8 additionally requires analysis for “A configuration engine, comprising: one or more computer-readable storage media; a processing system coupled to the one or more computer-readable storage media; and program instructions stored on the one or more computer-readable storage media that, based on being read and executed by the processing system, direct the configuration engine to…”. However, this additional element is recited at a high level of generality such that it amounts to no more than mere instructions to apply the exception using a generic computer component. See MPEP 2106.05(f).
Regarding claims 9-14, they are substantially similar to claims 2-7, respectively, and are rejected in the same manner, with the same reasoning applying.
Claim 15 recites features similar to claim 1 and is rejected for at least the same reasons set forth therein. Claim 15 additionally requires analysis for “One or more computer-readable storage media having program instructions stored thereon, wherein the program instructions, when read and executed by a processing system, direct the processing system to…”. However, this additional element is recited at a high level of generality such that it amounts to no more than mere instructions to apply the exception using a generic computer component. See MPEP 2106.05(f).
Regarding claims 16-20, they are substantially similar to claims 2-6, respectively, and are rejected in the same manner, with the same reasoning applying.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1-3, 5, 8-10, 12, 15-17, and 19 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Darvish Rouhani et al. (US 20200210840 A1, hereinafter “Darvish”).
Regarding claim 1, Darvish teaches A method, comprising:
determining an accuracy improvement of a layer of a neural network implemented using a first bit precision relative to using a second bit precision (“In some examples of the disclosed technology, the training performance metric is based on accuracy or change in accuracy of at least one layer of the trained neural network.” [¶0162]; see also ¶0146: “At process block 1230, a floating-point precision parameter of the neural network is adjusted based on the determined performance metric. The parameters selected to improve the metric in the neural network once the parameter has been adjusted. For example, it is expected that accuracy of the neural network will improved with increased floating-point precision.”);
determining a latency degradation of the layer of the neural network implemented using the first bit precision relative to using the second bit precision (“Traditionally NNs have been trained and deployed using single-precision floating-point (32-bit floating-point or float32 format). However, it has been shown that lower precision floating-point formats, such as 16-bit floating-point (float16) or fixed-point formats can be used to perform inference operations with minimal loss in accuracy. On specialized hardware, such as FPGAs, reduced precision formats can greatly improve the latency and throughput of DNN processing.” [¶0027]); and
selecting, based on the accuracy improvement and the latency degradation, the first bit precision or the second bit precision for use in implementing the layer of the neural network. (“Performance metrics can be determined for the neural network at different points during training processes. Based on determine performance metrics, parameters of the neural network or individual layers of the neural network can be adjusted. One adjustment parameter is floating-point format used to represent values for a layer of a neural network. Values can be converted from normal precision floating-point to block floating-point, from normal precision floating-point to a different normal precision floating-point format, or from a first block floating-point format to a second block floating-point format.” [¶0108])
Regarding claim 2, Darvish teaches The method of claim 1, wherein the first bit precision is based on a first quantization of floating point data to fixed point data of a first length, and wherein the second bit precision is based on a second quantization of the floating point data to fixed point data of a second length that differs relative to the first length. ("A given number can be represented using different precision (e.g., different quantized precision) formats. For example, a number can be represented in a higher precision format (e.g., float32) and a lower precision format (e.g., float16). Lowering the precision of a number can include reducing the number of bits used to represent the mantissa or exponent of the number…” [¶0039])
Regarding claim 3, Darvish teaches The method of claim 1, wherein determining the accuracy improvement comprises:
determining a first accuracy of the layer of the neural network when implemented using the first bit precision;
determining a second accuracy of the layer of the neural network when implemented using the second bit precision; and
calculating a difference between the first accuracy and the second accuracy.
(“As discussed further herein, the training performance metric can be used to determine when to adjust parameters of the neural network. Examples of suitable training performance metrics include: accuracy of a neural network, a change in accuracy of a neural network over two or more training epochs, accuracy of at least one layer of a trained neural network, a change in accuracy of at least one layer of a neural network over two or more training epochs, entropy of at least one layer or all of a neural network, or a change in entropy of at least one layer or all of a neural network.” [¶0112; calculating the first and second accuracies is implied as part of Darvish’s performance metrics in order to determine the change in accuracy.])
Regarding claim 5, Darvish teaches The method of claim 1, further comprising determining an impact factor of the layer based on the accuracy improvement and the latency degradation, wherein selecting the first bit precision or the second bit precision is based further on the impact factor. (“Parameters of particular BFP formats can be selected for a particular implementation to tradeoff precision and storage requirements (“impact factor”). For example, rather than storing an exponent with every floating-point number, a group of numbers can share the same exponent. To share exponents while maintaining a high level of accuracy, the numbers should have close to the same magnitude, since differences in magnitude are expressed in the mantissa. If the differences in magnitude are too great, the mantissa will overflow for the large values, or may be zero (“underflow”) for the smaller values. Depending on a particular application, some amount of overflow and/or underflow may be acceptable.” [¶0033])
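For illustration only, since neither the claim nor the cited portion of Darvish specifies a formula for the impact factor, the sketch below adopts one plausible formulation of the recited accuracy/latency tradeoff; the ratio form and the threshold value are assumptions:

```python
# Hypothetical formulation: the "impact factor" is modeled as the ratio of
# accuracy gain to latency cost; the formula and threshold are assumptions.

def impact_factor(accuracy_improvement, latency_degradation, eps=1e-9):
    # A layer whose accuracy gain is large relative to its latency cost
    # receives a high impact factor, favoring the higher precision.
    return accuracy_improvement / (latency_degradation + eps)

def select_by_impact(accuracy_improvement, latency_degradation,
                     first_bits=16, second_bits=8, threshold=0.05):
    # Selecting the precision based further on the impact factor; the
    # threshold value here is an assumed example.
    if impact_factor(accuracy_improvement, latency_degradation) > threshold:
        return first_bits
    return second_bits
```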
Claim 8 recites features similar to claim 1 and is rejected for at least the same reasons set forth therein. Claim 8 additionally requires A configuration engine, comprising: one or more computer-readable storage media; a processing system coupled to the one or more computer-readable storage media; and program instructions stored on the one or more computer-readable storage media that, based on being read and executed by the processing system, direct the configuration engine to (Darvish, ¶0023, “Any of the computer-executable instructions for implementing the disclosed techniques, as well as any data created and used during implementation of the disclosed embodiments, can be stored on one or more computer-readable media (e.g., computer-readable storage media)”).
Regarding claims 9-10 and 12, they are substantially similar to claims 2, 3, and 5, respectively, and are rejected in the same manner, with the same art and reasoning applying.
Regarding claim 15, it is substantially similar to claims 1 and 8 and is rejected in the same manner, with the same art and reasoning applying.
Regarding claims 16-17 and 19, they are substantially similar to claims 2, 3, and 5, respectively, and are rejected in the same manner, with the same art and reasoning applying.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 4, 11, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Darvish in view of Li et al. (US 20230325656 A1, hereinafter “Li”).
Regarding claim 4, Darvish teaches The method of claim 1, but fails to explicitly teach wherein determining the latency degradation comprises:
determining a first latency of the layer of the neural network when implemented using the first bit precision;
determining a second latency of the layer of the neural network when implemented using the second bit precision; and
calculating a difference between the first latency and the second latency.
Li teaches wherein determining the latency degradation comprises:
determining a first latency of the layer of the neural network when implemented using the first bit precision;
determining a second latency of the layer of the neural network when implemented using the second bit precision; (“In at least one embodiment, training system 106 calculates a layer's runtime cost (“first/second latency”) based in part on any suitable operations associated with said layer, such as multiply-add operations, multiply-accumulate operations, and/or any suitable mathematical operation and/or sequence of operations.” [¶0070]) and
calculating a difference between the first latency and the second latency. (“In at least one embodiment, training system 106 trains one or more portions (e.g., one or more layers) of neural network 102 by iteratively adjusting precision of weight parameters associated with said one or more portions by at least training neural network 102 utilizing bit-width settings with highest runtime costs, then switching to next less costly bit-width settings after two training epochs, or any suitable number of training epochs, until target bit-width is reached and/or no more bit-width settings remain.” [¶0076; calculating a difference between the latencies of each layer is implied given the iterative adjustment procedure disclosed by Li])
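For illustration only, a minimal sketch of determining per-layer latencies at two bit widths and taking their difference, in the manner mapped above. The multiply-accumulate cost model is an assumed stand-in, not Li’s actual implementation:

```python
# Hypothetical cost model standing in for Li's per-layer runtime cost;
# the proportionality to MAC operations and bit width is an assumption.

def layer_runtime_cost(mac_ops, bits, cost_per_mac_bit=1.0):
    # Model a layer's latency as proportional to its multiply-accumulate
    # operation count scaled by operand bit width.
    return mac_ops * bits * cost_per_mac_bit

mac_ops = 1_000_000                                   # assumed layer size
first_latency = layer_runtime_cost(mac_ops, bits=16)  # first bit precision
second_latency = layer_runtime_cost(mac_ops, bits=8)  # second bit precision
latency_difference = first_latency - second_latency   # claimed difference
```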
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Darvish’s teachings by determining a latency at each layer of a neural network and adjusting precision settings based on the calculated latencies, as taught by Li. One would have been motivated to make this modification in order to reduce the amount of memory, time, or computing resources a neural network uses. [¶0003, Li]
Regarding claims 11 and 18, they are substantially similar to claim 4 and are rejected in the same manner, with the same art and reasoning applying.
Claims 6, 7, 13, 14, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Darvish in view of Yoo et al. (US 20220222523 A1, hereinafter “Yoo”).
Regarding claim 6, Darvish teaches The method of claim 1, but fails to explicitly teach further comprising determining a mixed precision factor for the neural network in which the layer uses the second bit precision and one or more further layers use the first bit precision, wherein selecting the first bit precision or the second bit precision is based further on a comparison between the mixed precision factor and a threshold mixed precision factor.
Yoo teaches further comprising determining a mixed precision factor for the neural network in which the layer uses the second bit precision and one or more further layers use the first bit precision, wherein selecting the first bit precision or the second bit precision is based further on a comparison between the mixed precision factor and a threshold mixed precision factor. (“In an embodiment, the layer-wise precision determination module may store a first threshold value (“mixed precision factor”), which is preset as a reference for a similarity based on which whether the precision of each layer is to be changed is determined and a second threshold value (“threshold mixed precision factor”), which is preset to a value greater than the first threshold value, and the layer-wise precision determination module may be configured to, when the similarity is less than or equal to the first threshold value, change the precision of the corresponding layer to a value higher than the first precision, when the similarity is equal to or greater than the second threshold value, change the precision of the corresponding layer to a value lower than the first precision, and when the similarity is between the first threshold value and the second threshold value, maintain the first precision.” [¶0025])
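For illustration only, the two-threshold rule quoted from Yoo at ¶0025 can be sketched as follows; the concrete threshold values and precision steps below are assumptions, not Yoo’s disclosed parameters:

```python
# Sketch of Yoo's quoted two-threshold rule [¶0025]; the numeric
# thresholds and precision step sizes here are assumptions.

def determine_layer_precision(similarity, first_precision=8,
                              first_threshold=0.3, second_threshold=0.7):
    if similarity <= first_threshold:
        # At or below the first threshold: raise the layer's precision.
        return first_precision * 2
    if similarity >= second_threshold:
        # At or above the second threshold: lower the layer's precision.
        return first_precision // 2
    # Between the two thresholds: maintain the first precision.
    return first_precision
```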
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Darvish’s teachings by implementing a threshold factor in order to determine/select the precision of the corresponding layer, as taught by Yoo. One would have been motivated to make this modification to minimize the loss of accuracy. [¶0004, Yoo]
Regarding claim 7, Darvish/Yoo teaches The method of claim 6, wherein Yoo teaches further comprising:
selecting the second bit precision for use in implementing a further layer of the neural network (“In an embodiment, the layer-wise precision determination module may store a first threshold value, which is preset as a reference for a similarity based on which whether the precision of each layer is to be changed is determined and a second threshold value, which is preset to a value greater than the first threshold value” [¶0025]);
determining a further mixed precision factor for the neural network in which the layer and the further layer use the second bit precision (“the layer-wise precision determination module may be configured to, when the similarity is less than or equal to the first threshold value, change the precision of the corresponding layer to a value higher than the first precision” [¶0025]);
comparing the further mixed precision factor and the threshold mixed precision factor (“when the similarity is equal to or greater than the second threshold value” [¶0025]); and
selecting the first bit precision for use in implementing the further layer of the neural network based on the further mixed precision factor exceeding the threshold mixed precision factor (“and when the similarity is between the first threshold value and the second threshold value, maintain the first precision.” [¶0025]).
The same motivation to combine the teachings of Darvish and Yoo applies as set forth for claim 6.
Regarding claims 13 and 14, they are substantially similar to claims 6 and 7, respectively, and are rejected in the same manner, with the same art and reasoning applying.
Regarding claim 20, it is substantially similar to claim 6 and is rejected in the same manner, with the same art and reasoning applying.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAEL H HOANG whose telephone number is (571)272-8491. The examiner can normally be reached Mon-Fri 8:30AM-4:30PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki can be reached at (571) 272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MICHAEL H HOANG/Examiner, Art Unit 2122