Prosecution Insights
Last updated: April 19, 2026
Application No. 18/241,469

4.6-BIT QUANTIZATION FOR FAST AND ACCURATE NEURAL NETWORK INFERENCE

Non-Final OA (§101, §103)
Filed: Sep 01, 2023
Examiner: PEACH, POLINA G
Art Unit: 2165
Tech Center: 2100 — Computer Architecture & Software
Assignee: Smart Engines Service LLC
OA Round: 1 (Non-Final)

Grant Probability: 50% (Moderate)
OA Rounds: 1-2
To Grant: 3y 7m
With Interview: 73%

Examiner Intelligence

Career Allow Rate: 50% (grants 50% of resolved cases; 229 granted / 461 resolved; -5.3% vs TC avg)
Interview Lift: +23.2% for resolved cases with interview (a strong lift)
Avg Prosecution (typical timeline): 3y 7m
Currently Pending: 34
Total Applications (career history): 495, across all art units

Statute-Specific Performance

§101: 17.9% (-22.1% vs TC avg)
§103: 49.9% (+9.9% vs TC avg)
§102: 14.5% (-25.5% vs TC avg)
§112: 11.2% (-28.8% vs TC avg)

Tech Center averages are estimates. Based on career data from 461 resolved cases.

Office Action

Rejections: §101, §103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claims at a high level recite classifying and matching documents.

Step 1: Does the Claim Fall within a Statutory Category? Yes. Claims 1-20 recite a method and a system and, therefore, are directed to the statutory classes of machine and product.

The USPTO Guidance recites: (1) any judicial exceptions, including certain groupings of abstract ideas (i.e., mathematical concepts, certain methods of organizing human activity such as a fundamental economic practice, or mental processes) (Step 2A, Prong 1); and (2) additional elements that integrate the judicial exception into a practical application (Step 2A, Prong 2). MPEP §§ 2106.04(a), (d). Only if the claim (1) recites a judicial exception and (2) does not integrate that exception into a practical application, do we then look in Step 2B to whether the claim: (3) adds a specific limitation beyond the judicial exception that is not "well-understood, routine, conventional" in the field; or (4) simply appends well-understood, routine, conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception. MPEP § 2106.05(d).

Step 2A, Prong One: Is a Judicial Exception Recited?
First, determine whether the claims recite any judicial exceptions, including certain groupings of abstract ideas (i.e., mathematical concepts, certain methods of organizing human activity, or mental processes). MPEP § 2106.04(a). Claim 1 recites:

▪ determine a number of quantization bins for weights and a number of quantization bins for activations that ensure that a product of a quantized weight and a quantized activation coefficient is representable by a signed 8-bit integer (Abstract idea of a mental process, see MPEP § 2106.04(a)(2)(III). Under the broadest reasonable interpretation, this limitation is an abstract idea of "a mental process" because it recites a process that can be performed in the human mind (i.e., observation, determination, evaluation, judgment, and opinion) — a mathematical evaluation, which performs the determination, thereby further defining the abstract idea. A human being may use this mathematical calculation to facilitate the mental evaluation in order to arrive at the necessary determination. This claim limitation appears to recite both a mathematical formula and a mental process);

▪ during training of each of the one or more layers to be quantized in the neural network, quantize weights of the layer into the quantization bins for weights (Abstract idea of a mental process, see MPEP § 2106.04(a)(2)(III). Under the broadest reasonable interpretation, this limitation is an abstract idea of "a mental process" because it recites a process that can be performed in the human mind (i.e., observation, determination, evaluation, judgment, and opinion) — a mathematical evaluation, which performs the determination, thereby further defining the abstract idea. A human being may use this mathematical calculation to facilitate the mental evaluation in order to arrive at the necessary determination. This claim limitation appears to recite both a mathematical formula and a mental process).

These limitations, based on their broadest reasonable interpretation, recite a mental process, i.e., a judicial exception. For these reasons, independent claim 1, as well as independent claims 10 and 19, which include limitations commensurate in scope with claim 1, recite a judicial exception. A method, like the claimed method, that is "a process that employs mathematical algorithms to manipulate existing information to generate additional information is not patent eligible." See Digitech Image Techs., LLC v. Elecs. for Imaging, Inc., 758 F.3d 1344, 1351 (Fed. Cir. 2014). See Electric Power Group, LLC v. Alstom S.A., 830 F.3d 1350 (Fed. Cir. 2016), where collecting information, analyzing it, and displaying certain results of the collection and analysis was held to be an abstract idea. See In re Meyer, 688 F.2d 789, 795-96 (CCPA 1982), which held that "a mental process that a neurologist should follow" when testing a patient for nervous system malfunctions was not patentable. Accordingly, the claims recite an abstract idea.

Step 2A, Prong Two: Is the Abstract Idea Integrated into a Practical Application?

Next, determine whether the claims recite additional elements that integrate the judicial exception into a practical application (see MPEP §§ 2106.05(a)-(c), (e)-(h)). To integrate the exception into a practical application, the additional claim elements must, for example, improve the functioning of a computer or any other technology or technical field (see MPEP § 2106.05(a)), apply the judicial exception with a particular machine (see MPEP § 2106.05(b)), or apply or use the judicial exception in some other meaningful way beyond generally linking the use of the judicial exception to a particular technological environment (see MPEP § 2106.05(e)).

Additional elements:

▪ a neural network (Amount to "apply it".
Merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea, see MPEP § 2106.05(f). Examiner's note: high-level application of using a machine learning model to be trained amounts to merely invoking a computer component to apply the exception);

▪ training of each of the one or more layers (A computer function that is well-understood, routine, conventional activity previously known to the industry).

The term "additional elements" refers to claim features, limitations, or steps that the claim recites beyond the identified judicial exception. Claim 1 recites the additional element of "hardware processor", claim 19 additionally recites "least one hardware processor; and one or more software modules", and claim 20 recites "computer-readable medium." However, the claims do not recite any improvements to these additional elements, nor do the claims recite any particularly programmed or configured computer system, device, or machine learning. Rather, the additional elements in claims 1, 19 and 20 serve merely to automate the abstract idea. See Int'l Bus. Machs. Corp. v. Zillow Group, Inc., 50 F.4th 1371, 1382 (Fed. Cir. 2022) ("[A] patent that 'automate[s] "pen and paper methodologies" to conserve human resources and minimize errors' is a 'quintessential "do it on a computer" patent' directed to an abstract idea.") (quoting Univ. of Fla. Rsch. Found., Inc. v. Gen. Elec. Co., 916 F.3d 1363, 1367 (Fed. Cir. 2019)). Therefore, none of these recited additional elements, whether considered individually or in combination, integrates the judicial exception into a practical application.

The additional elements listed above that relate to computing components are recited at a high level of generality (i.e., as generic components performing generic computer functions such as communicating and processing known data) such that they amount to no more than mere instructions to apply the exception using generic computing components. Simply implementing the abstract idea on a generic computer is not a practical application of the abstract idea. Additionally, the claims do not purport to improve the functioning of the computer itself. There is no technological problem that the claimed invention solves. Rather, the computer system is invoked merely as a tool. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. Therefore, these claims are directed to an abstract idea.

Step 2B: The additional elements are not sufficient to amount to significantly more than the judicial exception. For these reasons, independent claim 1, as well as independent claims 19 and 20, which include similar additional elements as claim 1, are directed to an abstract idea.

Step 2B: Does the Claim Provide an Inventive Concept?

Next, determine whether the claims recite an "inventive concept" that "must be significantly more than the abstract idea itself, and cannot simply be an instruction to implement or apply the abstract idea on a computer." BASCOM Glob. Internet Servs., Inc. v. AT&T Mobility LLC, 827 F.3d 1341, 1349 (Fed. Cir. 2016); see MPEP § 2106.05(d). There must be more than "computer functions [that] are 'well-understood, routine, conventional activit[ies]' previously known to the industry." Alice Corp. v. CLS Bank Int'l, 573 U.S. 208, 225 (2014) (second alteration in original) (quoting Mayo Collaborative Servs. v. Prometheus Labs., Inc., 566 U.S. 66, 73 (2012)); see MPEP § 2106.05(d).
Step 2B: The additional elements are not sufficient to amount to significantly more than the judicial exception. Additional elements: see MPEP § 2106.05(d)(II). Taking the claim elements separately, the function performed by the computer at each step of the process is purely conventional. Using a computer and associated computer network to obtain data, use data to identify other data, and compare data are some of the most basic functions of a computer. All of these computer functions are well-understood, routine, conventional activities previously known to the industry. The method claims do not, for example, purport to improve the functioning of the computer itself. Nor do they effect an improvement in any other technology or technical field. Instead, the claims at issue amount to nothing significantly more than an instruction to apply the abstract idea of displaying, processing and storing data using some unspecified, generic computer. Note that in a similar case, such as collecting information, analyzing it, and displaying certain results of the collection and analysis (Electric Power Group), the courts have identified that the additional elements of displaying and analyzing data, as shown in independent claims 1, 19, 20, do not amount to significantly more than the judicial exception. Consequently, that is not enough to transform an abstract idea into a patent-eligible invention. There is no "inventive concept" sufficient to transform the abstract method of organizing human activity into a patent-eligible application. See MPEP § 2106.05. Rather, the additional elements identified above are merely well-understood, conventional computer components, as confirmed by the Specification. See MPEP § 2106.05(d)(1). For example, the Specification refers to the additional elements in generic terms.
As discussed above with respect to integration of the abstract idea into a practical application, the additional elements relating to computing components amount to no more than applying the exception using generic computing components. Mere instructions to apply an exception using a generic computing component cannot provide an inventive concept. Furthermore, the broadest reasonable interpretation of the claimed computer components (i.e., additional elements) includes any generic computing components that are capable of being programmed to communicate and process known data. Additionally, the computer components are used for performing insignificant extra-solution activity and well-understood, routine, and conventional functions. For example, the claimed processor and machine learning merely communicate and process known data. Activities such as these are insignificant extra-solution activity and, therefore, well understood, routine, and conventional. See MPEP § 2106.05(d); see also, e.g., OIP Techs., Inc. v. Amazon.com, Inc., 788 F.3d at 1363, 115 USPQ2d at 1092-93 (presenting offers to potential customers and gathering statistics generated based on the testing about how potential customers responded to the offers; the statistics are then used to calculate an optimized price); CyberSource v. Retail Decisions, Inc., 654 F.3d 1366, 1375, 99 USPQ2d 1690, 1694 (Fed. Cir. 2011) (obtaining information about transactions using the Internet to verify credit card transactions); Ultramercial, Inc. v. Hulu, LLC, 772 F.3d at 715, 112 USPQ2d at 1754 (consulting and updating an activity log); Electric Power Group, LLC v. Alstom S.A., 830 F.3d 1350, 1354-55, 119 USPQ2d 1739, 1742 (Fed. Cir. 2016) (selecting information, based on types of information and availability of information in a power-grid environment, for collection, analysis and display); Apple, Inc. v. Ameranth, Inc., 842 F.3d 1229, 1244, 120 USPQ2d 1844, 1856 (Fed. Cir. 2016) (recording a customer's order); Return Mail, Inc. v. U.S. Postal Service, -- F.3d --, -- USPQ2d --, slip op. at 32 (Fed. Cir. August 28, 2017) (identifying undeliverable mail items, decoding data on those mail items, and creating output data); Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1331, 115 USPQ2d 1681, 1699 (Fed. Cir. 2015) (arranging a hierarchy of groups, sorting information, eliminating less restrictive pricing information and determining the price). Furthermore, limitations such as integrating account details are well-understood, routine, and conventional activity. See Alice Corp., 134 S. Ct. at 2359, 110 USPQ2d at 1984 (creating and maintaining "shadow accounts"); Ultramercial, 772 F.3d at 716, 112 USPQ2d at 1755 (updating an activity log).

Independent claims 1, 19 and 20 contain the identified abstract ideas, with the additional elements of a processor, hardware and the media, which are generic computer components, and thus are not significantly more for the same reasons and rationale above. Dependent claims 2-18 further describe the abstract idea. The additional elements of the dependent claims fail to integrate the abstract idea into a practical application and do not amount to significantly more than the abstract idea. Thus, as the dependent claims remain directed to a judicial exception, and as the additional elements of the claims do not amount to significantly more, the dependent claims are not patent eligible. As such, the claims are not patent eligible.
With respect to claim 2: Step 2A Prong 1: the claim recites a judicial exception (an abstract idea):

▪ over a plurality of iterations in an inner loop, accumulating products of matrix multiplication, on blocks representing reordered subsets of values from left and right matrices, into inner accumulators, and over one or more iterations of an outer loop, accumulating the accumulated products from the plurality of iterations in the inner loop in outer accumulators (Abstract idea of a mental process. Under the broadest reasonable interpretation, the obtaining/determining probability distribution and divergence, as drafted, is an abstract idea of "a mental process" because it recites a process that can be performed in the human mind (i.e., observation, determination, evaluation, judgment, and opinion) — a mathematical evaluation, which performs the determination, thereby further defining the abstract idea. A human being may use this mathematical calculation to facilitate the mental evaluation in order to arrive at the necessary determination. This claim limitation appears to recite both a mathematical formula and a mental process.)

Step 2A Prong 1: The claim does not recite any of the judicial exceptions enumerated in the 2019 PEG. Step 2A Prong 2: The judicial exception is not integrated into a practical application. Additional elements: the accumulators are a generic computer component. Therefore, the additional element does not amount to an inventive concept, particularly when the activity is well understood or conventional (MPEP 2106.05(d)).

With respect to claims 3, 7:

▪ wherein each of the inner accumulators is 16 bits; wherein each of the outer accumulators is 32 bits (a mathematical evaluation, which performs the determination, thereby further defining the abstract idea. A human being may use this mathematical calculation to facilitate the mental evaluation in order to arrive at the necessary determination. This claim limitation appears to recite both a mathematical formula and a mental process).

Step 2A Prong 1: The claim does not recite any of the judicial exceptions enumerated in the 2019 PEG. Step 2A Prong 2: The judicial exception is not integrated into a practical application. Additional elements: the accumulator is a generic computer component. Therefore, the additional element does not amount to an inventive concept, particularly when the activity is well understood or conventional (MPEP 2106.05(d)).

With respect to claims 4-6, 8:

▪ wherein each of the inner accumulators is stored in a register of the at least one hardware processor and wherein the at least one hardware processor comprises an Advanced Reduced Instruction Set Computer Machine (ARM) processor; wherein the products of matrix multiplication are computed using Single Instruction, Multiple Data (SIMD) instructions; wherein each of the outer accumulators is stored in a Level 1 (L1) cache of the at least one hardware processor (Amounts to mere instruction to apply the abstract idea using a generic computer component. Mere instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea - see MPEP 2106.05(f).)

Step 2A Prong 1: The claim does not recite any of the judicial exceptions enumerated in the 2019 PEG. Step 2A Prong 2: The judicial exception is not integrated into a practical application. Additional elements: the processor is merely a generic computer component. Therefore, the additional element does not amount to an inventive concept, particularly when the activity is well understood or conventional (MPEP 2106.05(d)).
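For technical context (this note is not part of the Office Action), the two-level accumulation scheme recited in claims 2-8 can be sketched as follows. The block size and matrix shapes are illustrative assumptions, not values taken from the claims:

```python
import numpy as np

def blocked_quantized_matmul(A, B, block_k=64):
    """Multiply int8 matrices A (m x k) and B (k x n) using two levels of
    accumulation: 16-bit inner accumulators over short blocks of the shared
    dimension, flushed into 32-bit outer accumulators (illustrative sketch)."""
    m, k = A.shape
    k2, n = B.shape
    assert k == k2
    outer = np.zeros((m, n), dtype=np.int32)        # 32-bit outer accumulators
    for k0 in range(0, k, block_k):                 # outer loop over blocks
        inner = np.zeros((m, n), dtype=np.int16)    # 16-bit inner accumulators
        for kk in range(k0, min(k0 + block_k, k)):  # inner loop within one block
            # the claims choose bin counts so each weight-activation product
            # fits in a signed 8-bit range; a short run of such products then
            # cannot overflow a 16-bit accumulator
            inner += A[:, kk:kk + 1].astype(np.int16) * B[kk:kk + 1, :].astype(np.int16)
        outer += inner                              # flush the block into int32
    return outer
```

If every product is bounded by 127 in magnitude, a signed 16-bit accumulator can absorb up to 258 of them (258 x 127 = 32766, within 32767), which is consistent with the 258-iteration cap recited in claim 9; that reading is an inference, not something stated in the record.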
With respect to claims 9-10:

▪ wherein the outer loop comprises a plurality of iterations, and wherein a number of the plurality of iterations in the outer loop is limited to no more than 258 iterations (is an abstract idea of "a mental process" because it recites a process using an algorithm of language processing (software). A human is able to perform a plurality of iterations over computations);

▪ using the at least one hardware processor to, after the training, store the quantized weights of the one or more layers (i.e., additional generic mathematical calculations that do not represent significantly more than the abstract idea. See at least MPEP § 2106.05(a) ("Improvements to the Functioning of a Computer or to Any Other Technology or Technical Field")).

Step 2A Prong 1: The claim does not recite any of the judicial exceptions enumerated in the 2019 PEG. Step 2A Prong 2: The judicial exception is not integrated into a practical application. Additional elements: the additional element listed above in Step 2A Prong 2 is merely instructions to be implemented on a generic computer component. Therefore, the additional element does not amount to an inventive concept, particularly when the activity is well understood or conventional (MPEP 2106.05(d)).

With respect to claim 11:

▪ after the training, deploy the neural network, as a quantized neural network, including the quantized weights, to an application for execution on a mobile or embedded device (Running/executing software is a generic computer function of receiving and processing that is well-understood, routine, and conventional activity previously known to the industry.)

Step 2A Prong 1: The claim does not recite any of the judicial exceptions enumerated in the 2019 PEG. Step 2A Prong 2: The judicial exception is not integrated into a practical application. Additional elements: the additional element listed above in Step 2A Prong 2 is merely instructions to be implemented on a generic computer component (i.e., neural network). Therefore, the additional element does not amount to an inventive concept, particularly when the activity is well understood or conventional (MPEP 2106.05(d)).

With respect to claim 12:

▪ using the at least one hardware processor to quantize each of the one or more layers to be quantized in the neural network by: training the layer without quantizing inputs to the layer; collecting a histogram of the inputs to the layer; determining input quantization parameters that minimize quantization error for the inputs to the layer based on the histogram; quantizing the inputs to the layer channel by channel using the input quantization parameters; determining weight quantization parameters that minimize quantization error for weights of the layer; quantizing the weights of the layer filter by filter using the weight quantization parameters; and quantize a bias of the layer based on one or both of the input quantization parameters and the weight quantization parameters (is an abstract idea of "a mental process" because it recites a process that can be performed in the human mind (i.e., observation, determination, evaluation, judgment, and opinion). Here, an additional mathematical/logical reasoning).

Step 2A Prong 2: the additional elements are not sufficient to integrate the judicial exception into a practical application. Additional elements: hardware processor (Amounts to mere instruction to apply the abstract idea using a generic computer component. Mere instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea - see MPEP 2106.05(f).) Step 2B: the additional element is not sufficient to amount to significantly more than the judicial exception.
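The per-layer procedure recited in claim 12 (train unquantized, pick error-minimizing parameters, quantize inputs channel by channel and weights filter by filter, then the bias) can be sketched roughly as below. This is a hypothetical editorial illustration, not the applicant's implementation; the grid-search scale selection, the bin counts, and the bias convention are all assumptions:

```python
import numpy as np

def mse_scale(x, half, n_candidates=100):
    """Grid-search a symmetric uniform quantization scale minimizing mean
    squared error; a simple stand-in for the claimed 'parameters that
    minimize quantization error' (assumption, not the recited method)."""
    vmax = float(np.abs(x).max()) or 1.0
    cands = np.linspace(vmax / half / 4, vmax / half, n_candidates)
    errs = [float(np.mean((x - np.clip(np.round(x / s), -half, half) * s) ** 2))
            for s in cands]
    return float(cands[int(np.argmin(errs))])

def quantize_layer(inputs, weights, bias, n_x=15, n_m=37):
    """Hypothetical per-layer flow: inputs (N, C) quantized channel by channel,
    weights (F, C) filter by filter, bias (F,) from the combined scales.
    n_x = 15 and n_m = 37 form one (Nx, Nm) pairing from the recited list."""
    hx, hm = (n_x - 1) // 2, (n_m - 1) // 2  # symmetric levels, e.g. -7..7
    x_scale = np.array([mse_scale(inputs[:, c], hx) for c in range(inputs.shape[1])])
    w_scale = np.array([mse_scale(weights[f], hm) for f in range(weights.shape[0])])
    q_x = np.clip(np.round(inputs / x_scale), -hx, hx).astype(np.int8)
    q_w = np.clip(np.round(weights / w_scale[:, None]), -hm, hm).astype(np.int8)
    # bias rescaled by mean input scale times per-filter weight scale (a common
    # convention, assumed here; the claim only says "based on" those parameters)
    q_b = np.round(bias / (x_scale.mean() * w_scale)).astype(np.int32)
    return q_x, q_w, q_b, x_scale, w_scale
```

With hx = 7 and hm = 18, any single weight-activation product is at most 7 x 18 = 126, i.e. representable by a signed 8-bit integer, matching the constraint of claim 1.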
With respect to claims 13-14, 16-18:

▪ input quantization parameters are frozen after being determined; wherein the weights and the bias of the layer are not frozen during the quantization; wherein the number of quantization bins for weights is between 9 and 37; wherein the number of quantization bins for weights is between 13 and 29; wherein, for the one or more layers to be quantized in the neural network, a pairing (Nm, Nx) of the number (Nm) of quantization bins for weights and the number (Nx) of quantization bins for activations is one of: (127, 5); (85, 7); (63, 9); (51, 11); (43, 13); (37, 15); (31, 17); (29, 19); (25, 21); (23, 23); (21, 25); (19, 29); (17, 31); (15, 37); (13, 43); (11, 51); (9, 63); (7, 85); and (5, 127) (Abstract idea of a mental process. Under the broadest reasonable interpretation, the obtaining/determining probability distribution and divergence, as drafted, is an abstract idea of "a mental process" because it recites a process that can be performed in the human mind (i.e., observation, determination, evaluation, judgment, and opinion) — a user can manually determine necessary parameters and weights).

Step 2A Prong 1: The claim does not recite any of the judicial exceptions enumerated in the 2019 PEG. Step 2A Prong 2: The judicial exception is not integrated into a practical application. Additional elements: the additional element listed above in Step 2A Prong 2 is merely instructions to be implemented on a generic computer component. Therefore, the additional element does not amount to an inventive concept, particularly when the activity is well understood or conventional (MPEP 2106.05(d)).
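One observation about the recited (Nm, Nx) pairings (an editorial note, assuming symmetric bins centered at zero; not from the record): every pairing keeps the largest possible weight-activation product within a signed 8-bit integer, tying the list back to the representability condition of claim 1:

```python
# With an odd number of symmetric quantization bins, the largest quantized
# weight magnitude is (Nm - 1) // 2 and the largest quantized activation
# magnitude is (Nx - 1) // 2. For every recited pairing, their product
# stays within the signed 8-bit range (at most 127).
pairings = [(127, 5), (85, 7), (63, 9), (51, 11), (43, 13), (37, 15),
            (31, 17), (29, 19), (25, 21), (23, 23), (21, 25), (19, 29),
            (17, 31), (15, 37), (13, 43), (11, 51), (9, 63), (7, 85), (5, 127)]

for nm, nx in pairings:
    max_product = ((nm - 1) // 2) * ((nx - 1) // 2)
    assert max_product <= 127, (nm, nx, max_product)
```

For example, (127, 5) gives 63 x 2 = 126 and (23, 23) gives 11 x 11 = 121, both within the signed 8-bit range.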
With respect to claim 15:

▪ using the at least one hardware processor to, during quantization of each of the one or more layers to be quantized in the neural network, fine-tune the layer: after quantizing the inputs to the layer and before determining the weight quantization parameters; after quantizing the weights of the layer and before quantizing the bias of the layer; and after quantizing the bias of the layer (Abstract idea of a mental process. Under the broadest reasonable interpretation, the obtaining/determining probability distribution and divergence, as drafted, is an abstract idea of "a mental process" because it recites a process that can be performed in the human mind (i.e., observation, determination, evaluation, judgment, and opinion). Note: under the broadest reasonable interpretation of the claim, the claimed invention encompasses a mathematical concept (e.g., mathematical formula or equations), and generic computer implementation does not provide significantly more than the abstract idea. Amounts to no more than mere instructions to apply the abstract idea using a generic computer component - see MPEP 2106.05(f)).

Step 2A Prong 1: The claim does not recite any of the judicial exceptions enumerated in the 2019 PEG. Step 2A Prong 2: The judicial exception is not integrated into a practical application. Additional elements: the additional element listed above in Step 2A Prong 2 is merely instructions to be implemented on a generic computer component. Therefore, the additional element does not amount to an inventive concept, particularly when the activity is well understood or conventional (MPEP 2106.05(d)).

Dependent claims 2-18 are thus also patent ineligible for the reasons discussed above.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 19-20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Iandola et al. (US 2020/0074304) in view of the applicant's admitted prior art JACOB B. et al., "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference" (see IDS 04/19/2024 #20 and attached with page numbers).

Regarding claim 1, Iandola teaches a method comprising using at least one hardware processor to ([0018]), for one or more layers to be quantized in a neural network ([0035]): determine a number of quantization bins for weights and a number of quantization bins for activations ([0030], [0048]-[0050]) that ensure that a product of a quantized weight ([0046]) and a quantized activation coefficient is representable by a signed 8-bit integer ([0031], [0057]); and during training of each of the one or more layers to be quantized in the neural network, quantize weights of the layer into the quantization bins for weights ([0041]-[0043]). Iandola does not explicitly teach, however JACOB discloses, during training of each of the one or more layers to be quantized in the neural network, quantize weights of the layer into the quantization bins for weights (p.2707 C2 ¶3 – p.2708 C1-C2 L1-20).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Iandola to quantize weights of the layer into the quantization bins during training as disclosed by JACOB. Doing so would improve the inference speed vs. accuracy tradeoff on mobile CPUs (JACOB p.2705 C2 L5-6).

Regarding claim 19, Iandola teaches a system comprising: at least one hardware processor; and one or more software modules that are configured to, when executed by the at least one hardware processor, for one or more layers to be quantized in a neural network, determine a number of quantization bins for weights and a number of quantization bins for activations that ensure that a product of a quantized weight and a quantized activation coefficient is representable by a signed 8-bit integer, and during training of each of the one or more layers to be quantized in the neural network, quantize weights of the layer into the quantization bins for weights. Claim 19 recites substantially the same limitations as claim 1 and is rejected for substantially the same reasons.

Regarding claim 20, Iandola teaches a non-transitory computer-readable medium having instructions stored therein, wherein the instructions, when executed by a processor, cause the processor to, for one or more layers to be quantized in a neural network: determine a number of quantization bins for weights and a number of quantization bins for activations that ensure that a product of a quantized weight and a quantized activation coefficient is representable by a signed 8-bit integer; and during training of each of the one or more layers to be quantized in the neural network, quantize weights of the layer into the quantization bins for weights. Claim 20 recites substantially the same limitations as claim 1 and is rejected for substantially the same reasons.

Claims 2-8, 10-11 is/are rejected under 35 U.S.C. 103 as being unpatentable over Iandola as modified and in further view of Pillai et al. (US 20200026745) and Jiang et al. (US 11475102).

Regarding claim 2, Iandola as modified does not explicitly teach, however Pillai discloses, the method of Claim 1, wherein matrix multiplication in each of the one or more layers comprises: over a plurality of iterations in an inner loop ([0148]), accumulating products of matrix multiplication ([0141]), on blocks representing reordered subsets of values from left and right matrices, into inner accumulators ([0144]-[0146]); and over one or more iterations of an outer loop, accumulating the accumulated products from the plurality of iterations in the inner loop in outer accumulators ([0148], [0182], F7). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Iandola to include matrix multiplications as disclosed by Pillai. Doing so allows hardware to be more flexible, and thus have maximum utilization under more conditions than other solutions (Pillai [0176]).

Jiang additionally discloses over a plurality of iterations in an inner loop, accumulating products of matrix multiplication (C11 L1-16), on blocks representing reordered subsets of values from left and right matrices, into inner accumulators (C9 L30-49, 62-65); and over one or more iterations of an outer loop, accumulating the accumulated products from the plurality of iterations in the inner loop in outer accumulators (C9 L7-29, C10 L8-24). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Iandola to include matrix multiplications as disclosed by Jiang. Doing so improves performance of a deep neural network (Jiang C1 L29).
Regarding claim 3, Iandola as modified teaches the method of Claim 2, wherein each of the inner accumulators is 16 bits (Pillai [0116], JACOB p.2706 C2).

Regarding claim 4, Iandola as modified teaches the method of Claim 3, wherein each of the inner accumulators is stored in a register of the at least one hardware processor (Pillai [0180], [0052]).

Regarding claim 5, Iandola as modified teaches the method of Claim 4, wherein the at least one hardware processor comprises an Advanced Reduced Instruction Set Computer Machine (ARM) processor (JACOB p.2705 C1, p.2707 C1 ¶2.4 last para, Pillai [0128]).

Regarding claim 6, Iandola as modified teaches the method of Claim 5, wherein the products of matrix multiplication are computed using Single Instruction, Multiple Data (SIMD) instructions (Pillai [0116], [0126]).

Regarding claim 7, Iandola as modified teaches the method of Claim 3, wherein each of the outer accumulators is 32 bits (Pillai [0116]-[0117], JACOB p.2706 C2, p.2707 C2L5).

Regarding claim 8, Iandola as modified teaches the method of Claim 7, wherein each of the outer accumulators is stored in a Level 1 (L1) cache of the at least one hardware processor (Pillai [0125], [0133]).

Regarding claim 10, Iandola as modified teaches the method of Claim 1, further comprising using the at least one hardware processor to, after the training, store the quantized weights of the one or more layers (Iandola [0054], JACOB p.2708 C1).

Regarding claim 11, Iandola as modified teaches the method of Claim 10, further comprising using the at least one hardware processor to, after the training, deploy the neural network, as a quantized neural network (Iandola [0028], [0041]), including the quantized weights, to an application for execution on a mobile or embedded device (JACOB p.2710 C1).

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Iandola as modified and in further view of Wang et al.
(US 20180321347) and Rivet-Sabourin et al. (US 9607241).

Regarding claim 9, Iandola as modified does not explicitly teach, however Wang and Rivet-Sabourin disclose, the method of Claim 2, wherein the outer loop comprises a plurality of iterations, and wherein a number of the plurality of iterations in the outer loop is limited to no more than 258 iterations (Wang [0118], [0152], [0158], Rivet-Sabourin C10L61-67). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Iandola to limit the number of the plurality of iterations in the outer loop as disclosed by Wang and Rivet-Sabourin. Doing so prevents infinite loops.

Claims 12-18 are rejected under 35 U.S.C. 103 as being unpatentable over Iandola as modified and in further view of HA et al. (US 20210201117) and GAO (US 20200082269).

Regarding claim 12, Iandola as modified teaches the method of Claim 1, further comprising using the at least one hardware processor to quantize each of the one or more layers to be quantized in the neural network by: training the layer without quantizing inputs to the layer; collecting a histogram of the inputs to the layer; determining input quantization parameters that minimize quantization error for the inputs to the layer based on the histogram; quantizing the inputs to the layer channel by channel using the input quantization parameters (Iandola [0041], [0045]-[0046]); determining weight quantization parameters that minimize quantization error for weights of the layer (JACOB p.2707 C2); quantizing the weights of the layer filter by filter using the weight quantization parameters (Iandola [0041], [0043], [0057], [0074]); and quantizing a bias of the layer based on one or both of the input quantization parameters and the weight quantization parameters (Iandola [0046], [0049]-[0050], [0063], [0065] “each element in the filters when quantized, may have a
minimum to maximum range of integer values”).

Iandola as modified does not explicitly teach, however HA discloses, training the layer without quantizing inputs to the layer ([0081], [0088], [0098], [0133]); collecting a histogram of the inputs to the layer ([0082], [0099]); determining input quantization parameters that minimize quantization error for the inputs to the layer based on the histogram ([0083], [0096], [0106], [0109]); quantizing the inputs to the layer channel by channel using the input quantization parameters ([0092], [0105]); determining weight quantization parameters that minimize quantization error for weights of the layer ([0109], [0149], [0155]); quantizing the weights of the layer filter by filter using the weight quantization parameters ([0098], [0118]-[0119], [0150]-[0151]); and quantizing a bias of the layer based on one or both of the input quantization parameters and the weight quantization parameters ([0064], [0066], [0106]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Iandola to include training the layer without quantizing as disclosed by HA. Doing so allows quantization accuracy to be significantly increased (HA [0150]).

Additionally, GAO discloses training the layer without quantizing inputs to the layer ([0034]-[0037]) and quantizing a bias of the layer based on one or both of the input quantization parameters and the weight quantization parameters ([0024], [0026], [0033]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Iandola to include training the layer without quantizing as disclosed by GAO. Doing so provides a faster and more efficient neural network (GAO [0044]).
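The calibration procedure recited in claim 12 (train in float, collect an input histogram, then choose parameters minimizing quantization error) can be sketched as follows. This is a generic minimal illustration under assumed symmetric uniform quantization, not the procedure of Iandola, HA, or GAO; the histogram resolution, bin count, and search grid are arbitrary choices for illustration:

```python
import numpy as np

def calibrate_scale(samples, n_bins=15, n_candidates=100):
    """Choose a quantization scale for layer inputs by minimizing
    histogram-weighted mean squared quantization error."""
    hist, edges = np.histogram(np.abs(samples), bins=512)
    centers = (edges[:-1] + edges[1:]) / 2
    half = (n_bins - 1) // 2                       # max quantized magnitude
    best_err, best_scale = np.inf, None
    # Grid-search candidate clipping thresholds over the observed range.
    for clip in np.linspace(edges[-1] / n_candidates, edges[-1], n_candidates):
        scale = clip / half
        q = np.clip(np.round(centers / scale), 0, half) * scale
        err = np.sum(hist * (centers - q) ** 2)    # histogram-weighted MSE
        if err < best_err:
            best_err, best_scale = err, scale
    return best_scale

# The bias can then be quantized from the already-determined parameters,
# e.g. with the conventional product-of-scales rule (an assumption here):
#   bias_scale = input_scale * weight_scale
```

The same search applies per channel for inputs and per filter for weights, matching the channel-by-channel and filter-by-filter limitations of the claim.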
Regarding claim 13, Iandola as modified teaches the method of Claim 12, wherein the input quantization parameters are frozen after being determined (JACOB p.2708 C2L7-22, GAO [0039], [0050], [0053]).

Regarding claim 14, Iandola as modified teaches the method of Claim 12, wherein the weights and the bias of the layer are not frozen during the quantization (JACOB F1.1, p.2707 ¶2.4, GAO [0039], [0052]-[0053]).

Regarding claim 15, Iandola as modified teaches the method of Claim 12, further comprising using the at least one hardware processor to, during quantization of each of the one or more layers to be quantized in the neural network, fine-tune the layer (GAO [0052]): after quantizing the inputs to the layer and before determining the weight quantization parameters; after quantizing the weights of the layer and before quantizing the bias of the layer (JACOB p.2708 C1); and after quantizing the bias of the layer (JACOB p.2708 C1, p.2709 C1 ¶3.2, GAO [0042], [0046]-[0048], [0052]-[0053]).

Regarding claim 16, Iandola as modified teaches the method of Claim 1, wherein the number of quantization bins for weights is between 9 and 37 (Iandola [0042]-[0045], GAO [0034]).

Regarding claim 17, Iandola as modified teaches the method of Claim 1, wherein the number of quantization bins for weights is between 13 and 29 (Iandola [0042]-[0045], GAO [0034]).

Regarding claim 18, Iandola as modified teaches the method of Claim 1, wherein, for the one or more layers to be quantized in the neural network, a pairing (Nm, Nx) of the number (Nm) of quantization bins for weights and the number (Nx) of quantization bins for activations is one of: (127, 5); (85, 7); (63, 9); (51, 11); (43, 13); (37, 15); (31, 17); (29, 19); (25, 21); (23, 23); (21, 25); (19, 29); (17, 31); (15, 37); (13, 43); (11, 51); (9, 63); (7, 85); and (5, 127) (Iandola [0042]-[0045], [0052], [0057], JACOB p.2708 C2L1-4, GAO [0034]).
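The specific (Nm, Nx) pairings recited in claim 18 are consistent with a simple arithmetic rule: with symmetric bins, the largest quantized magnitudes are (Nm-1)/2 and (Nx-1)/2, and each listed pair takes the most weight bins possible for the given activation bins while keeping the worst-case product within a signed 8-bit integer. A small self-check of that reading (an inference from the numbers themselves, not a statement from the record):

```python
# The (Nm, Nx) pairings recited in claim 18.
PAIRS = [(127, 5), (85, 7), (63, 9), (51, 11), (43, 13), (37, 15), (31, 17),
         (29, 19), (25, 21), (23, 23), (21, 25), (19, 29), (17, 31), (15, 37),
         (13, 43), (11, 51), (9, 63), (7, 85), (5, 127)]

def fits_int8(nm, nx):
    """Worst-case product of symmetric quantized values stays in [-127, 127]."""
    return ((nm - 1) // 2) * ((nx - 1) // 2) <= 127

# Every listed pair fits, and none could afford even two more weight bins
# (the next odd bin count) without breaking the int8 product guarantee.
assert all(fits_int8(nm, nx) for nm, nx in PAIRS)
assert not any(fits_int8(nm + 2, nx) for nm, nx in PAIRS)
```

This is also the constraint that lets the matrix-multiplication claims accumulate products in narrow 16-bit inner accumulators.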
Claims 11 and 13-18 are additionally and alternatively rejected under 35 U.S.C. 103 as being unpatentable over Iandola as modified and in further view of Karabutov et al. (US 20230106778).

Regarding claim 11, Iandola as modified teaches the method of Claim 10, as disclosed above, and Karabutov further teaches using the at least one hardware processor to, after the training, deploy the neural network, as a quantized neural network, including the quantized weights, to an application for execution on a mobile or embedded device ([0058], [0087], [0089]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Iandola as modified to include the quantized weights in an application for execution on a mobile or embedded device as disclosed by Karabutov. Doing so would minimize the complexity of the compression process, which is of great importance given that mobile devices also have limitations on energy consumption and hardware resources (Karabutov [0004]).

Regarding claim 13, Iandola as modified teaches the method of Claim 12, as disclosed above, and Karabutov further teaches wherein the input quantization parameters are frozen after being determined ([0094]-[0095], [0097]-[0099]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Iandola as modified to freeze quantization parameters as disclosed by Karabutov. Doing so would increase network accuracy (Karabutov [0096]).

Regarding claim 14, Iandola as modified teaches the method of Claim 12, as disclosed above, and Karabutov further teaches wherein the weights and the bias of the layer are not frozen during the quantization ([0073], [0081], [0084], [0094]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Iandola as modified to not freeze quantization parameters as disclosed by Karabutov. Doing so would increase network accuracy despite possibly increased quantization error (Karabutov [0096]).

Regarding claim 15, Iandola as modified teaches the method of Claim 12, as disclosed above, and Karabutov further teaches, during quantization of each of the one or more layers to be quantized in the neural network, fine-tuning the layer ([0052]): after quantizing the inputs to the layer and before determining the weight quantization parameters ([0071]-[0072], [0076]); after quantizing the weights of the layer and before quantizing the bias of the layer ([0025], [007], [0131]); and after quantizing the bias of the layer ([0115]-[0116]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Iandola as modified to include determining the weight quantization parameters and fine-tuning after quantizing the weights of the layer and before quantizing the bias of the layer as disclosed by Karabutov. Doing so optimizes the quantization levels and/or thresholds for a certain set of training data and minimizes the cost function (Karabutov [0073]).

Regarding claim 16, Iandola as modified teaches the method of Claim 1, as disclosed above, and Karabutov further teaches wherein the number of quantization bins for weights is between 9 and 37 ([0064]-[0066], [0070]-[0072], [0077], [0102], [0143]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Iandola as modified to include quantization bins for weights as disclosed by Karabutov. Doing so would increase network accuracy (Karabutov [0096]).
Regarding claim 17, Iandola as modified teaches the method of Claim 1, as disclosed above, and Karabutov further teaches wherein the number of quantization bins for weights is between 13 and 29 ([0064]-[0066], [0070]-[0072], [0077], [0102], [0143]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Iandola as modified to include quantization bins for weights as disclosed by Karabutov. Doing so would increase network accuracy (Karabutov [0096]).

Regarding claim 18, Iandola as modified teaches the method of Claim 1, as disclosed above, and Karabutov further teaches wherein, for the one or more layers to be quantized in the neural network, a pairing (Nm, Nx) of the number (Nm) of quantization bins for weights and the number (Nx) of quantization bins for activations is one of: (127, 5); (85, 7); (63, 9); (51, 11); (43, 13); (37, 15); (31, 17); (29, 19); (25, 21); (23, 23); (21, 25); (19, 29); (17, 31); (15, 37); (13, 43); (11, 51); (9, 63); (7, 85); and (5, 127) ([0064]-[0066], [0070]-[0072], [0077], [0143]-[0144]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Iandola as modified to include quantization bins for weights as disclosed by Karabutov. Doing so would increase network accuracy and facilitate efficient operation of a distributed neural network (Karabutov [0096], [0133]).

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure and is indicated on PTO-892. Any inquiry concerning this communication or earlier communications from the examiner should be directed to POLINA G PEACH, whose telephone number is (571)270-7646. The examiner can normally be reached Monday-Friday, 9:30 - 5:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Aleksandr Kerzhner, can be reached at 571-270-1760. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/POLINA G PEACH/
Primary Examiner, Art Unit 2165
March 16, 2026
Prosecution Timeline

Sep 01, 2023
Application Filed
Mar 16, 2026
Non-Final Rejection — §101, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12596921
Stochastic Bitstream Generation with In-Situ Function Mapping
2y 5m to grant Granted Apr 07, 2026
Patent 12585998
DETERMINING QUALITY OF MACHINE LEARNING MODEL OUTPUT
2y 5m to grant Granted Mar 24, 2026
Patent 12585632
METHOD, DEVICE, AND MEDIUM FOR MANAGING ACTIVITY DATA WITHIN AN APPLICATION
2y 5m to grant Granted Mar 24, 2026
Patent 12579191
IDENTIFYING SEARCH RESULTS IN A HISTORY REPOSITORY
2y 5m to grant Granted Mar 17, 2026
Patent 12572575
USING LARGE LANGUAGE MODELS TO GENERATE SEARCH QUERY ANSWERS
2y 5m to grant Granted Mar 10, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 50%
With Interview: 73% (+23.2%)
Median Time to Grant: 3y 7m
PTA Risk: Low
Based on 461 resolved cases by this examiner. Grant probability derived from career allow rate.

Sign in with your work email

Enter your email to receive a magic link. No password needed.

Personal email addresses (Gmail, Yahoo, etc.) are not accepted.

Free tier: 3 strategy analyses per month