Prosecution Insights
Last updated: April 19, 2026
Application No. 18/129,165

METHODS AND SYSTEMS FOR EVALUATING QUANTIZABILITY OF A COMPUTATION GRAPH

Non-Final OA (§102, §103, §112)
Filed: Mar 31, 2023
Examiner: BOSTWICK, SIDNEY VINCENT
Art Unit: 2124
Tech Center: 2100 — Computer Architecture & Software
Assignee: Aptiv Technologies AG
OA Round: 1 (Non-Final)
Grant Probability: 52% (Moderate)
Expected OA Rounds: 1-2
Time to Grant: 4y 7m
With Interview: 90%

Examiner Intelligence

Career Allow Rate: 52% of resolved cases (71 granted / 136 resolved; -2.8% vs TC avg)
Interview Lift: +38.2% for resolved cases with interview (strong)
Typical Timeline: 4y 7m average prosecution; 68 currently pending
Total Applications: 204 across all art units

Statute-Specific Performance

§101: 24.4% (-15.6% vs TC avg)
§103: 40.9% (+0.9% vs TC avg)
§102: 12.0% (-28.0% vs TC avg)
§112: 21.9% (-18.1% vs TC avg)
Deltas are measured against a Tech Center average estimate • Based on career data from 136 resolved cases
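As a consistency check on the figures above, the implied Tech Center average for each statute can be recovered as the reported rate minus the reported delta. This small sketch (values read directly from the card) shows that all four statutes imply the same 40.0% Tech Center baseline:

```python
# (allow rate %, delta vs Tech Center average %) per statute, from the card above
examiner = {
    "101": (24.4, -15.6),
    "103": (40.9, +0.9),
    "102": (12.0, -28.0),
    "112": (21.9, -18.1),
}

# Implied TC average = examiner's rate minus the delta against that average.
tc_avg = {s: round(rate - delta, 1) for s, (rate, delta) in examiner.items()}

# All four statutes point at the same baseline.
assert all(v == 40.0 for v in tc_avg.values())
```

The uniform 40.0% baseline suggests the deltas were all computed against a single Tech Center-wide allow rate rather than per-statute averages.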

Office Action

§102, §103, §112
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Detailed Action

This action is in response to the claims filed 3/31/2023: Claims 1-15 are pending. Claim 1 is independent.

Claim Objections

Claim 3 is objected to because of the following informality: "the method claim 1" should read "the method of claim 1". Appropriate correction is required.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 4-6 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

Regarding claim 4, "the user-provided reference" lacks antecedent basis. "A user-provided reference" is recommended.

Regarding claim 5, "wherein the performance metric comprises an Euclidean metric or a Manhattan metric between a number of detections in an output [...] and an output" is indefinite. First, it is grammatically unclear whether the limitation should be read as "a Euclidean metric between a number of detections [...] or a Manhattan metric between a number of detections," or whether the Manhattan metric alone is between a number of detections in an output.
Second, it is grammatically unclear what the metric is actually computed over and what the second operand is intended to be. Specifically, Euclidean and Manhattan distances are computed over two things of the same type; however, the claim appears to compute a distance between "a number of detections in an output of the computational graph" (a scalar) and "an output of the computational graph" (which one of ordinary skill in the art would expect to be a vector), such that it would be unclear how a Euclidean distance could be determined between a scalar and a vector. Examiner recommends clarifying the language to specify that the comparison is either to "a number of detections in an output of the computational graph to which the quantization is not applied" or simply to "an output of the computational graph to which the quantization is applied".

Regarding claim 6, "the quantization" lacks antecedent basis. Claim 1 introduces "a quantization scheme"; however, the language of claim 6 makes clear that the quantization scheme is comprised by a quantization, and that quantization has not been introduced. Similarly, if "the quantization" reads on the "quantizing" introduced in claim 1, Examiner notes that claim 1 performs quantizing "for each of the nodes in the subset and for each of the at least one quantization parameter", such that it would be unclear which respective node "the quantization" was with respect to. "A quantization" is recommended.

Claim Rejections - 35 USC § 102

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of the appropriate paragraphs of 35 U.S.C.
102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless – (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1, 3-6, and 9-15 are rejected under 35 U.S.C. §102(a)(1) as being anticipated by Qadeer (US20210174172A1).

[FIG. 1 of US20210174172A1]

Regarding claim 1, Qadeer teaches A computer implemented method ([¶0066] "The systems and methods described herein can be embodied and/or implemented at least in part as a machine configured to receive a computer-readable medium storing computer-readable instructions") for evaluating a performance metric of a computational graph, ([¶0006] "a method S100 for quantizing artificial neural networks includes: accessing a floating-point network including a set of floating-point layers in Block S110; and accessing a set of validation examples for the floating-point network in Block S112. [...] calculating an accuracy of the quantized network based on the set of validation examples in Block S170" floating-point network interpreted as synonymous with computational graph; the validation examples interpreted as the basis for evaluating accuracy as a performance metric.) the method comprising: acquiring an input data set associated with the computational graph ([¶0006] "accessing a set of validation examples for the floating-point network in Block S112." Accessing a set of validation examples interpreted as synonymous with acquiring an input data set associated with the computational graph) comprising a plurality of nodes, ([¶0010] "each layer includes a set of pretrained weights and/or biases represented as floating-point numbers" See FIG.
1 S112 where each point with a corresponding connecting edge is interpreted as a node in the floating-point network) and a quantization scheme defined by at least one quantization parameter;([¶0012] "the system can convert each set of floating-point weights for each floating-point layer of the floating-point network in a set of low-bit-width weights. Therefore, the quantized network initially includes a set of low-bit-width layers, wherein each low-bit-width layer includes a set of weights represented as a low-bit-width fixed-point number " bit-width interpreted as a quantization parameter) processing the input data set by feeding it into the computational graph;([¶0013] "the system can calculate a set of example input activations for each floating-point layer and a set of example output activations for each floating-point layer by executing the floating-point network on the set of validation examples and recording the resulting input activations and output activations for each floating-point layer") extracting data output of at least a non-empty subset of the plurality of nodes of the computational graph;([¶0013] "the system can calculate a set of example input activations for each floating-point layer and a set of example output activations for each floating-point layer by executing the floating-point network on the set of validation examples and recording the resulting input activations and output activations for each floating-point layer" calculating a set of example output activations interpreted as synonymous with extracting data output of at least a non-empty subset of the plurality of nodes of the computational graph) evaluating the performance metric of the computational graph by performing for each of the nodes in the subset and for each of the at least one quantization parameter the steps of: varying a value of the respective quantization parameter; quantizing the respective node based on the value of the respective quantization parameter; and evaluating 
the computational graph based on the quantized respective node. ([¶0012] "The system can selectively increase the bit-width (e.g., to sixteen-bit fixed-point numbers), and therefore resolution, of specific low-bit-width layers on a per-layer basis within the quantized network until this hybrid quantized network satisfies the user-provided loss-of-accuracy threshold" As previously indicated, each layer is a set of nodes (as would be understood by one of ordinary skill in the art and as is explicitly reinforced by Qadeer), such that quantizing the layer necessarily quantizes respective nodes of the layer based on the bit-width, which is iteratively increased in Qadeer and evaluated based on accuracy.).

Regarding claim 3, Qadeer teaches The method claim 1, wherein performance metric is based on a) a user-provided reference; or evaluation of the computational graph using the input data set; and (Qadeer [¶0013] "the system can calculate a set of example input activations for each floating-point layer and a set of example output activations for each floating-point layer by executing the floating-point network on the set of validation examples and recording the resulting input activations and output activations for each floating-point layer") b) evaluation of the computational graph based on the quantization of the respective node. (Qadeer [¶0012] "The system can selectively increase the bit-width (e.g., to sixteen-bit fixed-point numbers), and therefore resolution, of specific low-bit-width layers on a per-layer basis within the quantized network until this hybrid quantized network satisfies the user-provided loss-of-accuracy threshold" As previously indicated, each layer is a set of nodes (as would be understood by one of ordinary skill in the art and as is explicitly reinforced by Qadeer), such that quantizing the layer necessarily quantizes respective nodes of the layer based on the bit-width, which is iteratively increased in Qadeer and evaluated based on accuracy.).
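The claim 1 method as mapped above (vary a quantization parameter for each node in the subset, quantize that node, and re-evaluate a performance metric) can be sketched as follows. The helper names, the candidate bit widths, and the stand-in node outputs are all illustrative and come from neither the application nor Qadeer:

```python
def uniform_quantize(x, bits):
    """Illustrative uniform quantizer: snap each value to a 2**bits level grid on [-1, 1]."""
    step = 2.0 / (2 ** bits - 1)
    return [round((v + 1.0) / step) * step - 1.0 for v in x]

def performance_metric(reference, quantized):
    """Illustrative metric: mean absolute error between reference and quantized outputs."""
    return sum(abs(a - b) for a, b in zip(reference, quantized)) / len(reference)

# Data outputs "extracted" from a non-empty subset of nodes (stand-in values).
extracted = {"conv1": [0.12, -0.55, 0.80], "fc1": [0.40, -0.10, 0.95]}

# For each node in the subset and each value of the quantization
# parameter (here, bit width), quantize the node and evaluate the metric.
results = {}
for node, outputs in extracted.items():
    for bits in (2, 4, 8):
        results[(node, bits)] = performance_metric(outputs, uniform_quantize(outputs, bits))

# The error shrinks as the bit width grows, for every node in the subset.
for node in extracted:
    assert results[(node, 2)] >= results[(node, 4)] >= results[(node, 8)]
```

This is the per-node, per-parameter sweep the claim recites; Qadeer's per-layer bit-width increase maps onto it only to the extent that a layer is read as a set of nodes, which is the point the rejection turns on.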
Regarding claim 4, Qadeer teaches The method of claim 1, wherein the performance metric compares: a part of the evaluation of the computational graph based on the quantization of the respective node to the evaluation of the computational graph using the input data set; (Qadeer [¶0012] "The system can selectively increase the bit-width (e.g., to sixteen-bit fixed-point numbers), and therefore resolution, of specific low-bit-width layers on a per-layer basis within the quantized network until this hybrid quantized network satisfies the user-provided loss-of-accuracy threshold" As previously indicated, each layer is a set of nodes (as would be understood by one of ordinary skill in the art and as is explicitly reinforced by Qadeer), such that quantizing the layer necessarily quantizes respective nodes of the layer based on the bit-width, which is iteratively increased in Qadeer and evaluated based on accuracy.) and the rest of evaluation of the computational graph based on the quantization of the respective node to the user-provided reference. (Qadeer [¶0009] "iteratively (in the sorted order) increase the bit-width of each successive quantized layer until the quantized network satisfies a loss-of-accuracy threshold specified by a user." The user-specified accuracy threshold interpreted as the user-provided reference).
Regarding claim 5, Qadeer teaches The method of claim 1, wherein the at least one quantization parameter is user defined; and/or wherein the subset is user defined; and/or wherein the performance metric is user defined; and/or wherein the performance metric comprises an Euclidean metric or a Manhattan metric between a number of detections in an output of the computational graph to which the quantization is applied and an output of the computational graph to which the quantization is not applied.(Qadeer [¶0023] "the system can receive and execute a more complex accuracy measure defined as a statistical similarity (between the output vectors or matrices/tensors) of the quantized network and the expected output vectors (or matrices/tensors) provided in the set of validation examples. The system can, therefore, receive and execute accuracy measures such as cosine similarity, Euclidian distance, or Manhattan distance"). Regarding claim 6, Qadeer teaches The method of claim 1, wherein the quantization comprises a symmetrical quantization scheme or an asymmetrical quantization scheme; and/or wherein the quantization indicates at least one clipping limit and a quantization step size.(Qadeer [¶0033] "Based on these weight statistics, the system can calculate a low-bit-width Q-format representation for the set of floating-point weights and/or a scaling factor for the set of floating-point weights in order to best represent these floating-point weights as low-bit-width fixed-point weights" Quantization with scaling factor interpreted as symmetrical quantization scheme). 
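Qadeer's ¶0023, cited for claim 5 above, treats Euclidean and Manhattan distance as measures between like-shaped outputs, which is also the reading the §112 rejection says the claim should make explicit. A minimal sketch of that reading, with hypothetical output vectors (the values are invented for illustration):

```python
import math

def euclidean(a, b):
    """Euclidean (L2) distance between two same-length output vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Manhattan (L1) distance between two same-length output vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Hypothetical outputs of the same graph without and with quantization.
# Both operands are vectors of the same shape, so the metrics are well
# defined, unlike a comparison of a scalar detection count to a vector.
unquantized = [0.91, 0.12, 0.77]
quantized = [0.88, 0.10, 0.80]

print(euclidean(unquantized, quantized))  # ≈ 0.047
print(manhattan(unquantized, quantized))  # ≈ 0.08
```

Comparing a number of detections to a number of detections (scalar to scalar) would be equally well defined; the indefiniteness arises only from mixing the two operand types.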
Regarding claim 9, Qadeer teaches The method of claim 1, further comprising: determining for each node of the subset a number of quantization bits. (Qadeer [¶0009] "the system generates a hybrid quantized network characterized by a select set of layers represented at a high bit-width (e.g., sixteen-bit fixed-point), while most layers of the network are represented at a low-bit-width (e.g., eight-bit fixed-point).").

Regarding claim 10, Qadeer teaches The method of claim 1, the method further comprising: quantizing the computational graph based on the performance metric. (Qadeer [¶0012] "The system can selectively increase the bit-width (e.g., to sixteen-bit fixed-point numbers), and therefore resolution, of specific low-bit-width layers on a per-layer basis within the quantized network until this hybrid quantized network satisfies the user-provided loss-of-accuracy threshold" As previously indicated, each layer is a set of nodes (as would be understood by one of ordinary skill in the art and as is explicitly reinforced by Qadeer), such that quantizing the layer necessarily quantizes respective nodes of the layer based on the bit-width, which is iteratively increased in Qadeer and evaluated based on accuracy.).

Regarding claim 11, Qadeer teaches The method of claim 10, further comprising: deploying the quantized computational graph on a resource-constrained embedded system. (Qadeer [¶0009] "the quantized network is executable on specialized hardware at edge devices with limited memory and/or power supply" [¶0053] "the system can, in response to the accuracy of the quantized network exceeding the loss-of-accuracy threshold, load the quantized network onto an edge device in Block S198" Edge device interpreted as resource-constrained embedded system).
Regarding claim 12, Qadeer teaches The method of claim 11, wherein the embedded system is a mobile computing device, a mobile phone, a tablet computing device, or a vehicle.(Qadeer [¶0066] "The instructions can be executed by computer-executable components integrated with the application, applet, host, server, network, website, communication service, communication interface, hardware/firmware/software elements of a user computer or mobile device, wristband, smartphone, or any suitable combination thereof."). Regarding claim 13, Qadeer teaches A method for evaluating deployability of a computational graph, the method comprising evaluating a performance metric of the computational graph based on the method of claim 1, wherein the deployability is evaluated based on the performance metric.(Qadeer [¶0031] "the system initially quantizes the floating-point network into an entirely low-bit-width network, which provides substantially improved performance (e.g., inferences per-second, memory footprint) over a floating-point network when executed on application specific hardware or on any device with stringent power requirements or storage space" Measuring inferences per second, memory footprint, peak power, total energy, etc. interpreted as evaluating deployability based on the performance metric (validation set used for bit-width determination and accuracy validation)). Regarding claim 14, Qadeer teaches A computer system comprising a plurality of computer hardware components configured to carry out steps of the method of claim 1.(Qadeer [¶0066] "The systems and methods described herein can be embodied and/or implemented at least in part as a machine configured to receive a computer-readable medium storing computer-readable instructions. 
The instructions can be executed by computer-executable components integrated with the application, applet, host, server, network, website, communication service, communication interface, hardware/firmware/software elements of a user computer or mobile device, wristband, smartphone, or any suitable combination thereof. Other systems and methods of the embodiment can be embodied and/or implemented at least in part as a machine configured to receive a computer-readable medium storing computer-readable instructions. The instructions can be executed by computer-executable components integrated by computer-executable components integrated with apparatuses and networks of the type described above. The computer-readable medium can be stored on any suitable computer readable media such as RAMs, ROMs, flash memory, EEPROMs, optical devices (CD or DVD), hard drives, floppy drives, or any suitable device. The computer-executable component can be a processor but any suitable dedicated hardware device can (alternatively or additionally) execute the instructions."). Regarding claim 15, Qadeer teaches Non-transitory computer readable medium comprising instructions for carrying out the method of claim 1.(Qadeer [¶0066] "The systems and methods described herein can be embodied and/or implemented at least in part as a machine configured to receive a computer-readable medium storing computer-readable instructions. The instructions can be executed by computer-executable components integrated with the application, applet, host, server, network, website, communication service, communication interface, hardware/firmware/software elements of a user computer or mobile device, wristband, smartphone, or any suitable combination thereof. Other systems and methods of the embodiment can be embodied and/or implemented at least in part as a machine configured to receive a computer-readable medium storing computer-readable instructions. 
The instructions can be executed by computer-executable components integrated by computer-executable components integrated with apparatuses and networks of the type described above. The computer-readable medium can be stored on any suitable computer readable media such as RAMs, ROMs, flash memory, EEPROMs, optical devices (CD or DVD), hard drives, floppy drives, or any suitable device. The computer-executable component can be a processor but any suitable dedicated hardware device can (alternatively or additionally) execute the instructions."). Claim Rejections - 35 USC § 103 In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status. The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made. The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows: 1. Determining the scope and contents of the prior art. 2. Ascertaining the differences between the prior art and the claims at issue. 3. Resolving the level of ordinary skill in the pertinent art. 4. 
Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 2, 7, and 8 are rejected under 35 U.S.C. §103 as being unpatentable over the combination of Qadeer and Burger (US20190340499A1).

Regarding claim 2, Qadeer teaches The method of claim 1. However, Qadeer does not explicitly teach wherein evaluating the performance metric further comprises dequantizing the quantized respective node. Burger, in the same field of endeavor, teaches The method of claim 1, wherein evaluating the performance metric further comprises dequantizing the quantized respective node. ([¶0050] "The quantized values can then be converted back to a normal-floating-point format using a quantized floating-point to normal-floating-point converter which produces normal-precision floating-point values." [¶0051] "operations that may be desirable in particular neural network implementations can be performed based on normal-precision formats including adding a bias to one or more nodes of a neural network, applying a hyperbolic tangent function or other such sigmoid function, or rectification functions (e.g., ReLU operations) to normal-precision values that are converted back from the quantized floating-point format." [¶0085] "an exponent can be selected such that accuracy of the representation is improved overall, or according to another metric."). Qadeer as well as Burger are directed towards mixed-precision neural network quantization. Therefore, Qadeer and Burger are analogous art in the same field of endeavor. It would have been obvious before the effective filing date of the claimed invention to combine the teachings of Qadeer with the teachings of Burger by using the block floating-point quantization scheme in Burger as the mixed-precision quantization scheme described at a high level in Qadeer.
Burger provides as additional motivation for combination ([¶0028] “Typically, NNs are created with floating-point computation in mind, but when an FPGA is targeted for NN processing it would be beneficial if the neural network could be expressed using integer arithmetic. Examples of the disclosed technology include hardware implementations of block Floating-point (BFP), including the use of BFP in NN, FPGA, and other hardware environments.”). Regarding claim 7, Qadeer teaches The method of claim 1. However, Qadeer doesn't explicitly teach, further comprising: for each node of the subset, determining quantization parameter value thresholds that separates quantization parameters which result in acceptable performance metric from those quantization parameters which result in not acceptable performance metric; and/or for each node of the subset, determining a set of quantization parameter values that result in close-to-optimum or optimum performance metric values for a given quantization bit width. Burger, in the same field of endeavor, teaches The method of claim 1, further comprising: for each node of the subset, determining quantization parameter value thresholds that separates quantization parameters which result in acceptable performance metric from those quantization parameters which result in not acceptable performance metric; and/or for each node of the subset, determining a set of quantization parameter values that result in close-to-optimum or optimum performance metric values for a given quantization bit width.([¶0059] "Each of the nodes produces an output by applying a weight to each input generated from the preceding node and collecting the weights to produce an output value. In some examples, each individual node can have an activation function and/or a bias applied" [¶0077] "block floating-point numbers can be used to represent neural network nodes and edges in the quantized model 630. 
In some examples, one or more aspects of the quantization process can be specified by a user or programmer. One or more operations are performed with the quantized model 630, thereby producing quantized results 640. The quantized results can be evaluated by performing inference evaluation with inputs that were used to train the input neural network model W 610. In addition or alternatively, the quantized results 640 can be evaluated by providing new input data to the quantized model 630, and evaluating the accuracy of the results." [¶0089] "A number of different quantization parameters can be selected. These include bit widths for node weights, for example 3, 4, or 5 bits for values; bit widths for input or activation values for a neural network, for example 3, 4, or 5 bits for representing values;" [¶0124] "The instructions can include instructions that cause the system to evaluate the neural network having its node weights and edges stored in the memory as a normal-precision floating-point format, instructions that cause the system to convert at least one of the tensors to values expressed in a quantized-precision format, instructions that cause the system to perform at least one mathematical operation with the at least one of the quantized tensors, producing modified tensors, and instructions that cause the system to convert the modified tensors to a normal-precision floating-point format."). Qadeer as well as Burger are directed towards mixed-precision neural network quantization. Therefore, Qadeer as well as Burger are analogous art in the same field of endeavor. It would have been obvious before the effective filing date of the claimed invention to combine the teachings of Qadeer with the teachings of Burger by using the block floating-point quantization scheme in Burger as the mixed-precision quantization scheme described at a high level in Qadeer. 
Burger provides as additional motivation for combination ([¶0028] “Typically, NNs are created with floating-point computation in mind, but when an FPGA is targeted for NN processing it would be beneficial if the neural network could be expressed using integer arithmetic. Examples of the disclosed technology include hardware implementations of block Floating-point (BFP), including the use of BFP in NN, FPGA, and other hardware environments.”). Regarding claim 8, the combination of Qadeer, and Burger teaches The method of claim 7, wherein the set of quantization parameter values comprises at least one clipping limit (Burger [¶0033] " If the differences in magnitude are too great, the mantissa will overflow for the large values, or may be zero (“underflow”) for the smaller values. Depending on a particular application, some amount of overflow and/or underflow may be acceptable." Acceptable range of Overflow/underflow interpreted as clipping limit) and a quantization step size.(Burger [¶0086] "the mantissa width can be selected for the selected shared exponent as follows. A floating-point number is re-scaled such that mantissa bits that will be kept in the block floating-point representation are expressed in the integer portion of the scale value. The scale value is then rounded to remove fractional bits, and then scaling is reversed" mantissa width interpreted as quantization step size). Conclusion The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Tailor (“Degree-Quant: Quantization-Aware Training for Graph Neural Networks”, 2021) is directed towards a performance aware quantization system for graph neural networks. Any inquiry concerning this communication or earlier communications from the examiner should be directed to SIDNEY VINCENT BOSTWICK whose telephone number is (571)272-4720. The examiner can normally be reached M-F 7:30am-5:00pm EST. 
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang can be reached on (571)270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /SIDNEY VINCENT BOSTWICK/Examiner, Art Unit 2124
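The two schemes the rejections rely on, Qadeer's symmetric scale-factor quantization (claim 6) and Burger's conversion back to floating point (claim 2), can be illustrated as one round trip. This is a toy sketch: the bit width, clipping limit, and input values are invented for illustration, not taken from either reference:

```python
def symmetric_quantize(x, bits, max_abs):
    """Symmetric scheme: a single scale factor, zero point at 0, clipping at +/- max_abs."""
    qmax = 2 ** (bits - 1) - 1          # e.g. 127 for 8 bits
    scale = max_abs / qmax              # the quantization step size
    q = [max(-qmax, min(qmax, round(v / scale))) for v in x]
    return q, scale

def dequantize(q, scale):
    """Map integer codes back to floating point, as in Burger's converter conceptually."""
    return [v * scale for v in q]

x = [0.5, -1.2, 3.0, -0.01]
q, scale = symmetric_quantize(x, bits=8, max_abs=2.0)  # clipping limit = 2.0
x_hat = dequantize(q, scale)

# Inside the clipping limits the round-trip error is bounded by half a
# step; 3.0 exceeds the limit and saturates at the +2.0 clip.
assert q[2] == 127
```

The dequantized values are what a performance metric would then compare against the unquantized outputs, which is how the claim 2 combination reads Qadeer onto Burger.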

Prosecution Timeline

Mar 31, 2023
Application Filed
Jan 15, 2026
Non-Final Rejection — §102, §103, §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12561604: SYSTEM AND METHOD FOR ITERATIVE DATA CLUSTERING USING MACHINE LEARNING
Granted Feb 24, 2026 (2y 5m to grant)
Patent 12547878: Highly Efficient Convolutional Neural Networks
Granted Feb 10, 2026 (2y 5m to grant)
Patent 12536426: Smooth Continuous Piecewise Constructed Activation Functions
Granted Jan 27, 2026 (2y 5m to grant)
Patent 12518143: FEEDFORWARD GENERATIVE NEURAL NETWORKS
Granted Jan 06, 2026 (2y 5m to grant)
Patent 12505340: STASH BALANCING IN MODEL PARALLELISM
Granted Dec 23, 2025 (2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 52%
With Interview: 90% (+38.2%)
Median Time to Grant: 4y 7m
PTA Risk: Low
Based on 136 resolved cases by this examiner. Grant probability derived from career allow rate.
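The 90% "with interview" figure appears to follow from adding the +38.2 percentage-point interview lift to the 52% career allow rate, then rounding. A sketch of that arithmetic (this is the page's apparent method, not a documented formula):

```python
base_grant = 52.0        # career allow rate, percent
interview_lift = 38.2    # percentage-point lift observed with interview

with_interview = base_grant + interview_lift
print(round(with_interview))  # 90
```

Note this treats the lift as additive in percentage points; a lift expressed as a relative multiplier would give a different (and lower) figure.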
