Prosecution Insights
Last updated: May 29, 2026
Application No. 17/467,079

CONFIGURABLE NONLINEAR ACTIVATION FUNCTION CIRCUITS

Status: Non-Final OA (§103)
Filed: Sep 03, 2021
Examiner: MAC, GARY
Art Unit: 2127
Tech Center: 2100 (Computer Architecture & Software)
Assignee: Qualcomm Incorporated
OA Round: 5 (Non-Final)
Grant Probability: 41% (Moderate)
OA Rounds: 5-6
Est. Remaining: 0m
Grant Probability With Interview: 72%

Examiner Intelligence

Career Allowance Rate: 41% (7 granted / 17 resolved; -13.8% vs TC avg)
Interview Lift: +30.6% (resolved cases with interview)
Avg Prosecution: 4y 3m (13 currently pending)
Total Applications: 50 (across all art units)

Statute-Specific Performance

§101: 10.2% (-29.8% vs TC avg)
§103: 89.8% (+49.8% vs TC avg)
Based on career data from 17 resolved cases

Office Action

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 02/02/2026 has been entered.

Response to Arguments

Applicant’s arguments filed 02/02/2026 have been fully considered but they are not persuasive.
Applicant’s Argument: On pages 10-13 of Applicant’s response, Applicant states that Singh in view of Deville and Ware fails to teach the claim limitations as recited in amended independent claim 1.
Examiner’s Response: Applicant’s argument is not persuasive. Applicant’s arguments with respect to claim 1 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Double Patenting

The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s).
See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The filing of a terminal disclaimer by itself is not a complete reply to a nonstatutory double patenting (NSDP) rejection. A complete reply requires that the terminal disclaimer be accompanied by a reply requesting reconsideration of the prior Office action. Even where the NSDP rejection is provisional the reply must be complete. See MPEP § 804, subsection I.B.1. For a reply to a non-final Office action, see 37 CFR 1.111(a). For a reply to a final Office action, see 37 CFR 1.113(c). A request for reconsideration, while not provided for in 37 CFR 1.113(c), may be filed after final for consideration. See MPEP §§ 706.07(e) and 714.13.
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms.
The actual filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/apply/applying-online/eterminal-disclaimer.
Claims 1 and 18 are provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1 and 10 of copending Application No. 17807125 (reference application). This is a provisional nonstatutory double patenting rejection because the patentably indistinct claims have not in fact been patented.
Although the claims at issue are not identical, they are not patentably distinct from each other because the reference claims are an obvious variation of the instant claims. Reference claim 1 discloses a similar version of instant claim 1 with similar intended scope. Claim 1 of the instant application discloses a nonlinear activation function circuit that comprises a first approximator circuit and a second approximator circuit connected in series to approximate the nonlinear activation function. Each of the approximator circuits uses one or more function parameters. A bypass circuit is also disclosed that provides the input data to the second approximator circuit without being processed by the first approximator circuit.
Claim 1 of the reference application discloses at least one nonlinear approximator that consists of two successive linear approximators that approximate a linear function. Claim 1 of the instant application does not explicitly disclose whether the first and second functions are linear or nonlinear functions. Under the broadest reasonable interpretation, the first and second functions in the instant application can be linear.
The two successive linear approximators in the reference application also use one or more function parameters, and the reference claim discloses a bypass circuit that bypasses at least one of the successive linear approximators. Under the broadest reasonable interpretation, one embodiment places the bypass circuit on the first linear approximator. Thus, the data will pass to the second linear approximator without being processed by the first linear approximator.
Reference claim 1 does not explicitly recite “determine a nonlinear activation function for application to input data”, but it does disclose a “selected nonlinear activation function”. It would have been obvious to a person having ordinary skill in the art that a prior step of determining a nonlinear activation function is implied in the reference application.
The following table compares the instant claims to the reference claims, where underlines indicate differences between the instant claims and the reference claims.
Instant Application (17/467,079), claim 1:
A processor comprising: a configurable nonlinear activation function circuit configured to: determine a nonlinear activation function for application to input data; determine, based on the determined nonlinear activation function, a set of parameters for the nonlinear activation function; and generate output data based on application of the set of parameters for the nonlinear activation function, wherein the configurable nonlinear activation function circuit comprises: a first approximator circuit configured to implement a first function using one or more first function parameters of the set of parameters; a second approximator circuit coupled in series with the first approximator circuit and configured to implement a second function using one or more second function parameters of the set of parameters, wherein the first function, the second function, or a combination including the first function and the second function approximates the determined nonlinear activation function; and a first bypass circuit coupled between an input of the first approximator circuit and an output of the first approximator circuit, the first bypass circuit being configured to selectively bypass the first approximator circuit such that when the first approximator circuit is bypassed, the input data is provided to an input of the second approximator circuit without being processed by the first approximator circuit.
Reference Application #1 (17/807,125), claim 1:
A processor comprising: a configurable nonlinear activation function circuit configured to: determine, based on a selected nonlinear activation function, a set of parameters for the selected nonlinear activation function; and generate output data based on application of the set of parameters for the selected nonlinear activation function, wherein: the configurable nonlinear activation function circuit comprises at least one nonlinear approximator comprising at least two successive linear approximators; each linear approximator of the at least two successive linear approximators is configured to approximate a linear function using one or more function parameters of the set of parameters; and the configurable nonlinear activation function circuit further comprises a bypass circuit configured to selectively bypass at least one of the successive linear approximators.
Instant Application (17/467,079), claim 18:
A method for processing input data by a configurable nonlinear activation function circuit, comprising: determining a nonlinear activation function for application to the input data; determining, based on the determined nonlinear activation function, a set of parameters for the configurable nonlinear activation function circuit; and processing the input data with the configurable nonlinear activation function circuit based on the set of parameters to generate output data, wherein the configurable nonlinear activation function circuit comprises: a first approximator circuit configured to implement a first function using one or more first function parameters of the set of parameters; a second approximator circuit coupled in series with the first approximator circuit and configured to implement a second function using one or more second function parameters of the set of parameters, wherein the first function, the second function, or a combination including the first function and the second function approximates the determined nonlinear activation function; and a first bypass circuit coupled between an input of the first approximator circuit and an output of the first approximator circuit, the first bypass circuit being configured to selectively bypass the first approximator circuit such that when the first approximator circuit is bypassed, the input data is provided to an input of the second approximator circuit without being processed by the first approximator circuit.
Reference Application #1 (17/807,125), claim 10:
A method for processing data with a configurable nonlinear activation function circuit, comprising: determining, based on a selected nonlinear activation function, a set of parameters for the selected nonlinear activation function; and generating output data based on application of the set of parameters for the selected nonlinear activation function, wherein: the configurable nonlinear activation function circuit comprises at least one nonlinear approximator comprising at least two successive linear approximators; each linear approximator of the at least two successive linear approximators is configured to approximate a linear function using one or more function parameters of the set of parameters; and the configurable nonlinear activation function circuit further comprises a bypass circuit configured to selectively bypass at least one of the successive linear approximators.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-2, 10-11, 13, 15-22, and 31 are rejected under 35 U.S.C. 103 as being unpatentable over Singh (US20190266479A1) in view of Lim, “AA-ResNet: Energy Efficient All-Analog ResNet Accelerator”.
Regarding claim 1, Singh teaches:
“A processor, comprising: a configurable nonlinear activation function circuit configured to” (abstract, [0070-0071], Singh teaches an integrated circuit with a reconfigurable stream switch for acceleration of convolutional neural network operations. The stream switch is responsible for receiving the input data and directing the input data to multiple activation function accelerators.)
“determine a nonlinear activation function for application to input data” ([0178-0179, Figure 8A], Input data is received by an activation function evaluation selection logic module. The selection logic module sends the input data to different activation units that may perform various activation functions, such as a second order polynomial approximation.)
“determine, based on the determined nonlinear activation function, a set of parameters for the nonlinear activation function” ([0179, Figure 8A], The function evaluator logic can be used to accelerate any activation function by converting the function into a second order polynomial approximation. The coefficients (parameters) are obtained from a coefficient lookup table memory.)
“generate output data based on application of the set of parameters for the nonlinear activation function, wherein the configurable nonlinear activation function circuit comprises” ([0191, Figure 8A], The activation function accelerator uses the input data, activation function, and coefficients to generate output data.)
“a first approximator circuit configured to implement a first function using one or more first function parameters of the set of parameters” ([0179-0181, Figure 8A], Activation units (first approximator) may be added and arranged as a rectified linear unit (ReLU) variant evaluator. Data from the coefficient lookup table memory can be passed into the activation units through communication lines.)
“a second approximator circuit coupled ” ([0179-0181, 190, Figure 8A], The dedicated activation units implement a particular activation function. In some embodiments, the activation function accelerator may include first and second dedicated activation units. The function evaluator logic is an adaptable hardware logic block arranged to accelerate any desirable activation function and represent the function in the form of a piece-wise second order polynomial approximation.)
“a first bypass circuit coupled between an input of the first approximator circuit such that when the first approximator circuit is bypassed, ” ([211-212, 214-219, Figure 8A & 8C], The circuitry includes operating mode circuitry (first bypass circuit), which provides a plurality of operating modes that determine which activation function will be applied to the input data and which activation functions are not selected (bypassed). In Figure 8A, the activation function evaluation selection logic is shown connected to the input of the activation units, but it is not explicitly clear that there is a connection to the output of the activation units. The operating mode circuitry represents a module that selects the desired activation function to process the input data.)
Singh does not explicitly disclose an implementation of “a second approximator circuit coupled in series with the first approximator circuit” and “a first bypass circuit coupled between an input of the first approximator circuit and an output of the first approximator circuit, the first bypass circuit being configured to selectively bypass the first approximator circuit such that when the first approximator circuit is bypassed, the input data is provided to an input of the second approximator circuit without being processed by the first approximator circuit”.
However, Lim discloses in the same field of endeavor:
“a second approximator circuit coupled in series with the first approximator circuit and configured to implement a second function ” ([pg. 1-2, Section A, par. 1-3; pg. 2, Section D, par. 1; pg. 2, Fig. 1; pg. 2, Fig. 3(b)], Figure 1 discloses a circuit that describes a single layer of the AA-ResNet accelerator. The proposed design includes shortcut connections that improve the accuracy of deep networks through residual learning. Figure 3(b) discloses a residual network that consists of a plurality of layers connected in series with one another. The grey colored layers have feedforward shortcut connections and show multiple ReLU functions. Under the broadest reasonable interpretation, the first approximator and second approximator circuits apply a mathematical operation to the input data as recited. The approximator circuit implements a function and the claim does not limit the scope of that definition. Thus, a residual network consisting of a plurality of convolution layers with ReLU activation functions connected in series to process the data teaches the approximator circuits.)
“a first bypass circuit coupled between an input of the first approximator circuit and an output of the first approximator circuit, the first bypass circuit being configured to selectively bypass the first approximator circuit such that when the first approximator circuit is bypassed, the input data is provided to an input of the second approximator circuit without being processed by the first approximator circuit” ([pg. 1-2, Section II.A, par. 3; pg. 2, Section D, par. 1; pg. 2, Section III.A, par. 1; pg. 2, Fig. 3(b)], The accelerator implements a ResNet that includes shortcut connections for skipping one or more layers. Shortcut connections perform an identity mapping, which is a key concept of ResNet. In Figure 3(b), the input data skips the convolution layer and is not processed by the first ReLU activation function. The skip connection directs the input data into an addition block whose output is processed by the second ReLU activation function.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of “a second approximator circuit coupled in series with the first approximator circuit” and “a first bypass circuit coupled between an input of the first approximator circuit and an output of the first approximator circuit, the first bypass circuit being configured to selectively bypass the first approximator circuit such that when the first approximator circuit is bypassed, the input data is provided to an input of the second approximator circuit without being processed by the first approximator circuit” from Lim into the teaching of Singh. Doing so can improve the energy efficiency of machine learning accelerators by implementing an all-analog ResNet accelerator that can improve the accuracy of deep neural networks through residual learning (Lim, abstract).
Claim 18 recites a method that performs the same process as the system in claim 1. Therefore, claim 18 is rejected for the same reasons mentioned for claim 1.
Regarding claim 20, Singh teaches:
“wherein the set of parameters includes a combination of one or more gain parameters, a constant parameter, and one or more approximation functions to apply to the input data via the configurable nonlinear activation function circuit” ([175, 179, Equation 1-11], The activation functions can be approximated using a second order polynomial approximation, and the coefficients of the polynomial equation are stored in the lookup table memory.)
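As an illustrative sketch only (it is not taken from the application, Singh, or Lim), the series arrangement recited in claim 1 can be modeled in Python as two stages with a selectable bypass around the first stage. The names `make_activation`, `first`, `second`, and `bypass_first` are hypothetical, and the quadratic first stage and max-based second stage are assumed examples rather than anything the references specify.

```python
from typing import Callable

Fn = Callable[[float], float]


def make_activation(first: Fn, second: Fn, bypass_first: bool) -> Fn:
    """Model two approximator stages in series with a bypass around stage one.

    Hypothetical helper: when bypass_first is True, the input goes straight
    to the second stage without being processed by the first stage.
    """
    def apply(x: float) -> float:
        stage_one_out = x if bypass_first else first(x)
        return second(stage_one_out)
    return apply


# Assumed example stages: a quadratic first stage and a max(0, x) second stage.
quadratic: Fn = lambda x: x * x
max_zero: Fn = lambda x: max(0.0, x)

# With the first stage bypassed, the circuit model reduces to ReLU.
relu_like = make_activation(quadratic, max_zero, bypass_first=True)

# Without the bypass, the input is squared before the max is applied.
squared_then_max = make_activation(quadratic, max_zero, bypass_first=False)
```

In this model the bypass is a per-configuration choice, which mirrors the "selectively bypass" language of the claim; whether a per-layer skip connection of the kind shown in Lim is equivalent is the question the rejection turns on.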
Regarding claims 2 and 21, Singh teaches:
“a gain multiplier coupled to the second approximator circuit and configured to multiply a gain value based on one or more gain parameters of the set of parameters” ([225, Figure 8G], The function evaluator logic includes multiplier circuitry that can perform operations on the coefficients from the coefficient lookup table.)
“a constant adder coupled to the gain multiplier and configured to add a constant value based on a constant parameter of the set of parameters” ([0228, Figure 8G], The function evaluator logic includes adder circuitry that can perform operations on the coefficients from the coefficient lookup table. The adder circuitry and multiplier circuitry can be of the same configuration.)
Regarding claim 10, Singh teaches:
“the determined nonlinear activation function comprises a hyperbolic tangent (tanh) function” ([175, Equation 5A], The activation function accelerator may process the input data based on a hyperbolic tangent function.)
“the gain parameters comprise a dependent parameter value of 0 and an independent parameter value of 1” ([175, 0185, 225, Equation 5A and 5B], From the specification of the claimed invention, the gain parameters are interpreted to be coefficients used to support the computation of activation functions. The multiplier circuitry may use these coefficients to perform operations on the input data.)
“the constant value is 0” ([0175, 0228, Equation 5A], If the activation function has a constant parameter of zero, the adder circuitry may perform the sum operation based on that constant parameter.)
“the first function is quadratic” ([179], The function evaluator logic converts an activation function into a second order polynomial (quadratic approximator).)
“the second function is a tanh look-up table” ([0175, 0179, Equation 5A], Equation 5A describes a tanh function. The coefficient lookup table stores the coefficient values of any activation function and the activation units may retrieve these parameters from the lookup table.)
Regarding claim 11, Singh teaches:
“the determined nonlinear activation function comprises a sigmoid function” ([175, Equation 4A], Equation 4A describes the sigmoid function.)
“the gain parameters comprise a dependent parameter value of 0 and an independent parameter value of 1” ([175, 0185, 225, Equation 4A and 4B], From the specification of the claimed invention, the gain parameters are interpreted to be coefficients used to support the computation of activation functions. The multiplier circuitry may use these coefficients to perform operations on the input data.)
“the constant value is 0” ([0175, 0228, Equation 4A], If the activation function has a constant parameter of zero, the adder circuitry may perform the sum operation based on that constant parameter.)
“the first function is linear” ([179, Equation 1], The function evaluator logic converts an activation function into a second order polynomial (quadratic approximator). When the first term coefficient, A, is set to zero in Equation 1, the function becomes a first-order polynomial, i.e., a linear function.)
“the second function is a sigmoid look-up table” ([0175, 0179, Equation 4A], Equation 4A describes a sigmoid function. The coefficient lookup table stores the coefficient values of any activation function and the activation units may retrieve these parameters from the lookup table.)
Regarding claim 13, Singh teaches:
“the determined nonlinear activation function comprises a rectified linear unit (ReLU) function” ([175, Equation 8A], Equation 8A describes the ReLU function.)
“the gain parameters comprise a dependent parameter value of 0 and an independent parameter value of 1” ([175, 0185, 225, Equation 8A and 8B], From the specification of the claimed invention, the gain parameters are interpreted to be coefficients used to support the computation of activation functions. The multiplier circuitry may use these coefficients to perform operations on the input data.)
“the constant value is 0” ([0175, 0228, Equation 8A], If the activation function has a constant parameter of zero, the adder circuitry may perform the sum operation based on that constant parameter.)
“the first function is quadratic” ([179, Equation 1], The function evaluator logic converts an activation function into a second order polynomial (quadratic approximator).)
“the second function is a max function” ([0181, Figure 8A], The activation function accelerator may include an activation unit that performs ReLU computation and may include any ReLU variant. The ReLU formula is f(x) = max(0,x), which contains a maximum function.)
Regarding claim 15, Singh teaches:
“the determined nonlinear activation function comprises an exponential linear unit (ELU) function” ([175, Equation 10A], Equation 10A describes the ELU function.)
“the gain parameters comprise a dependent parameter value of 0 and an independent parameter value of a” ([175, 0185, 225, Equation 10A and 10B], From the specification of the claimed invention, the gain parameters are interpreted to be coefficients used to support the computation of activation functions. The multiplier circuitry may use these coefficients to perform operations on the input data.)
“the constant value is 0” ([0175, 0228, Equation 10A], If the activation function has a constant parameter of zero, the adder circuitry may perform the sum operation based on that constant parameter.)
“the first function is: quadratic if an input data value is > 0; or bypassed if the input data value is < 0” ([179, Equation 1 and 10A], The function evaluator logic converts an activation function into a second order polynomial (quadratic approximator). Equation 10A shows the formula for when the input data is greater than zero and when the input data is less than zero. It would have been obvious to a person of ordinary skill in the art to perform different operations on the segmented input data.)
“the second function is: bypassed if the input data value is > 0; or an exponential look-up table if the input data value is < 0” ([179, Equation 10A], Equation 10A shows the formula for when the input data is greater than zero and when the input data is less than zero. It would have been obvious to a person of ordinary skill in the art to perform different operations on the segmented input data. The coefficients for when the input data is less than zero can be obtained from the coefficient lookup table memory.)
Regarding claim 16, Singh teaches:
“an input memory buffer configured to store as the input data one or more outputs received from a processing circuit” ([166], The convolution accelerator contains memory buffers to store the input data.)
“an output memory buffer configured to store the generated output data for output from the configurable nonlinear activation function circuit” ([165, 168, Figure 6], Figure 6 depicts a memory buffer 614 and the output data stream 608. The output data is stored in memory buffer 614, which is configured to send the output data to other components outside of the convolution accelerator.)
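The sign-dependent behavior discussed for claim 15 can be sketched as follows. This is a minimal illustration that assumes the conventional ELU definition (x for x > 0, a(e^x - 1) otherwise), which may not match Singh's Equation 10A term for term. The table range, step size, and the names `EXP_TABLE`, `exp_lookup`, and `elu` are all hypothetical.

```python
import math

# Assumed table layout: exp(x) sampled on [-8, 0] in steps of 1/64.
LO = -8.0
STEP = 1.0 / 64.0
EXP_TABLE = [math.exp(LO + i * STEP) for i in range(int(-LO / STEP) + 1)]


def exp_lookup(x: float) -> float:
    """Nearest-entry lookup of exp(x) for x <= 0, clamped to the table range."""
    x = max(LO, min(0.0, x))
    return EXP_TABLE[round((x - LO) / STEP)]


def elu(x: float, alpha: float = 1.0) -> float:
    """Conventional ELU with the negative branch served from a lookup table."""
    if x > 0:
        # Positive inputs pass through unchanged (the exponential stage is skipped).
        return x
    # Non-positive inputs use the exponential table, scaled by alpha.
    return alpha * (exp_lookup(x) - 1.0)
```

The branch on the sign of `x` is what the claim expresses as one stage being "bypassed" for one half of the input range; the sketch does not model the quadratic first stage recited for positive inputs.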
Regarding claim 17, Singh teaches:
“further comprising a compute-in-memory array configured to provide the input data to the configurable nonlinear activation function circuit” ([178, Figure 8A], The activation function evaluation selection logic receives input data and directs the input data to the activation units to perform the activation functions.)
Regarding claim 19, Singh teaches:
“further comprising retrieving the set of parameters from a memory based on the determined nonlinear activation function” ([179, Figure 8A], The lookup table memory contains coefficients (parameters) to support the activation accelerator.)
Regarding claim 22, Singh in view of Lim teaches:
“a second bypass circuit coupled between an input of the second approximator circuit and an output of the second approximator circuit, the second bypass circuit being configured to selectively bypass the second approximator circuit” ([Singh, 0187-0190, 211-212, 214-219, Figure 8A & 8C], The circuitry includes operating mode circuitry, which provides a plurality of operating modes that determine which activation function will be applied to the input data and which activation functions may be bypassed. The different operating modes can determine whether the function evaluator or a particular activation unit is active. The parametric logic (second bypass circuit) works with the activation function evaluation selection logic to stream the input data into the proper circuitry. In Figure 8A, the parametric logic is shown with communication lines 826 and 834 connected to the input and output of the activation units, respectively. The parametric logic module may include switching logic to stream input data to the proper activation units based on the control information and the operating mode information. Singh teaches a plurality of activation units and the use of operating mode circuitry and parametric logic to pass input data to one or more desired activation units. Lim (Figure 3) teaches a ResNet accelerator that implements feedforward shortcut connections around convolution layers, and the shortcut connection can selectively skip the operations of the convolution layer.)
Regarding claim 31, Singh teaches:
“wherein the first function and the second function are nonlinear functions” ([0175, 0178-0179], The selection logic module sends the input data to different activation units that may perform various activation functions, such as hyperbolic tangent functions, rectified linear unit functions, and exponential linear unit functions.)
Claims 3-7, 23-24, and 26-27 are rejected under 35 U.S.C. 103 as being unpatentable over Singh (US20190266479A1), in view of Lim, “AA-ResNet: Energy Efficient All-Analog ResNet Accelerator” and Deville (US5796925A).
Regarding claims 3 and 23, Singh in view of Lim does not explicitly disclose an implementation of using a cubic approximation for an activation function.
However, Deville discloses in the same field of endeavor:
“wherein at least one of the first approximator circuit or the second approximator circuit is a cubic approximator circuit” ([col. 3, lines 40-53], Deville teaches an implementation where an activation function is converted into a third order polynomial (cubic approximation) and the parameters of the polynomial function are stored in a table.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of a cubic approximator for an activation function from Deville into the teaching of Singh in view of Lim. Doing so can increase the precision of approximation of non-linear activation functions (Deville, abstract).
Regarding claims 4 and 24, Singh teaches: “wherein another one of the first approximator circuit or the second approximator circuit is one of a quadratic approximator circuit or a linear approximator circuit” ([179], The function evaluator logic converts an activation function into a second order polynomial (quadratic approximator).) Regarding claim 5, Singh in view of Lim teaches: “wherein both the first approximator circuit and the second approximator circuit ” ([Singh, 0180-0181, 190, Figure 8A], The activation function acceleration contains multiple units to compute the activation function and it has been shown to improve the efficiency of the neural network operation. It would be obvious to one of the ordinary skill in the art to include multiple approximators in the activation function computation.) Singh in view of Lim does not explicitly disclose an implementation of using a cubic approximation for an activation function. However, Deville discloses in the same field of endeavor: “wherein both the first approximator circuit and the second approximator circuit are cubic approximators circuits” ([col. 3, lines 40-53], Deville teaches an implementation where an activation function is converted into a third order polynomial (cubic approximation) and the parameters of the polynomial function is stored into a table.) It would be obvious to one of the ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of having a cubic approximator for activation function from Deville into the teaching of Singh in view of Lim. Doing so can increase the precision of approximation of non-linear activation functions (Deville, abstract). 
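To make the idea of a table-driven cubic approximator concrete, the following Python sketch evaluates a piecewise third-order polynomial for tanh using per-segment coefficients held in a table. It is an assumption-laden illustration, not Deville's or Singh's method: the coefficients here are Taylor coefficients computed at each segment midpoint, and the segment width, input range, and the names `COEFFS` and `tanh_cubic` are hypothetical.

```python
import math

# Assumed segmentation: [-4, 4] split into 32 segments of width 0.25.
LO, HI, WIDTH = -4.0, 4.0, 0.25
N_SEG = int((HI - LO) / WIDTH)


def _taylor_coeffs(c: float):
    """Third-order Taylor coefficients of tanh about c (assumed fitting method)."""
    t = math.tanh(c)
    s = 1.0 - t * t  # first derivative of tanh at c
    return (t, s, -t * s, -s * (1.0 - 3.0 * t * t) / 3.0)


# Coefficient table: one (a0, a1, a2, a3) tuple per segment midpoint.
COEFFS = [_taylor_coeffs(LO + (i + 0.5) * WIDTH) for i in range(N_SEG)]


def tanh_cubic(x: float) -> float:
    """Approximate tanh(x) with a table-driven piecewise cubic polynomial."""
    if x <= LO:
        return -1.0  # saturate outside the table range
    if x >= HI:
        return 1.0
    i = min(int((x - LO) / WIDTH), N_SEG - 1)
    d = x - (LO + (i + 0.5) * WIDTH)  # offset from the segment midpoint
    a0, a1, a2, a3 = COEFFS[i]
    # Horner's method: three multiplies and three adds per evaluation.
    return ((a3 * d + a2) * d + a1) * d + a0
```

Dropping the `a3` term gives the second-order form the Office action attributes to Singh, which shows why the two references are treated as differing mainly in polynomial degree.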
Regarding claims 6 and 26, Singh teaches: “wherein another one of the first approximator circuit or the second approximator circuit is configured to access a look-up table for an approximated value” ([0179, Figure 8A], The function evaluator logic can be used to accelerate any activation function by converting the function into a second order polynomial approximation. The coefficients (parameters) are obtained from a coefficient lookup table memory.)
Regarding claims 7 and 27, Singh teaches: “wherein another one of the first approximator circuit or the second approximator circuit is configured to perform a minimum or maximum function” ([0181, Figure 8A], The activation function accelerator may include an activation unit that performs ReLU computation and may include any ReLU variant. The ReLU formula is f(x) = max(0,x), which contains a maximum function.)
Claims 8 and 28 are rejected under 35 U.S.C. 103 as being unpatentable over Singh (US20190266479A1), in view of Lim, “AA-ResNet: Energy Efficient All-Analog ResNet Accelerator” and in view of Lin (US20210150306A1).
Regarding claims 8 and 28, Singh in view of Lim teaches: “the gain parameters comprise a dependent parameter value of 1 and an independent parameter value of 0” ([Singh, 0175, Equation 2A], “The activation functions may include identity functions”, From the specification of the claimed invention, a gain is considered to have the form “ax+b”. When the gain parameters have a dependent parameter value of 1 and an independent parameter value of 0, the gain becomes “x”. Equation 2A is an identity function, which is equivalent to the gain having parameters of 1 and 0.) “the constant value is 0” ([Singh, 0175, 0228, Equation 2A], "Adder circuitry 888 may be arranged to sum any one or more values”, If the activation function is an identity function, the constant parameter is zero. The adder circuitry may perform the sum operation based on the constant parameter.)
“the first function is quadratic” ([Singh, 179], “represent the function in the form of a piece-wise second order polynomial approximation”, The function evaluator logic converts an activation function into a second order polynomial (quadratic approximator).) “the second function is a sigmoid look-up table” ([Singh, 0175, 0179, Equation 4A], "the coefficient lookup table memory 812 is a larger memory that stores coefficient data to support many or all of the activation functions”, The equation that describes a logistic function is the same as that of a sigmoid function. The coefficient lookup table stores the coefficient values of any activation function and the activation units may retrieve these parameters from the lookup table.) Singh in view of Lim does not explicitly disclose an implementation where the activation function is a swish function. However, Lin discloses in the same field of endeavor: “the determined nonlinear activation function comprises a swish function” ([0060], Lin proposes an implementation where the activation function may comprise a swish function.) It would be obvious to one of the ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of having the activation function as a swish function from Lin into the teaching of Singh in view of Lim. Doing so can increase the precision of approximation of non-linear activation functions (Lin, abstract).
Claims 9, 14 and 29 are rejected under 35 U.S.C. 103 as being unpatentable over Singh (US20190266479A1) in view of Lim, “AA-ResNet: Energy Efficient All-Analog ResNet Accelerator”, in view of Lin (US20210150306A1), and Ho, "NnCore: A Parameterized Non-Linear Function Generator for Machine Learning Applications in FPGAs".
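For reference, the swish function cited above for claims 8 and 28 is commonly written as swish(x) = x · sigmoid(x). The Python sketch below computes it with a sigmoid look-up table and linear interpolation between entries. The table range, the step size, the interpolation scheme, and the names `SIGMOID_LUT`, `sigmoid_lut`, and `swish` are assumptions chosen for illustration; they are not drawn from Singh, Lin, or the application.

```python
import math

# Assumed table layout for this sketch (not from any cited reference).
LO, HI = -8.0, 8.0   # input range covered by the look-up table
STEP = 0.125         # spacing between table entries

# Precomputed sigmoid samples at LO, LO + STEP, ..., HI.
SIGMOID_LUT = [
    1.0 / (1.0 + math.exp(-(LO + i * STEP)))
    for i in range(int((HI - LO) / STEP) + 1)
]


def sigmoid_lut(x):
    """Approximate sigmoid(x) from the table with linear interpolation."""
    if x <= LO:
        return 0.0   # saturate below the table range
    if x >= HI:
        return 1.0   # saturate above the table range
    pos = (x - LO) / STEP
    i = int(pos)
    frac = pos - i
    return SIGMOID_LUT[i] + frac * (SIGMOID_LUT[i + 1] - SIGMOID_LUT[i])


def swish(x):
    """swish(x) = x * sigmoid(x), with sigmoid taken from the table."""
    return x * sigmoid_lut(x)
```

A finer step or a wider range reduces the approximation error at the cost of a larger table.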
Regarding claims 9 and 29, Singh in view of Lim, and Lin teaches: “the determined nonlinear activation function comprises a hard swish function” ([Lin, 0060], Lin proposes an implementation where the activation function may comprise a hard swish function.) “the gain parameters comprise a dependent parameter value of 1/6 and an independent parameter value of 0” ([Lin, 0060], The equation for a hard swish function is: $\mathrm{h\text{-}swish}(x) = x \cdot \frac{\mathrm{ReLU6}(x + 3)}{6}$. The equation for the hard swish function describes a dependent parameter value of 1/6 and an independent parameter value of 0.) “the constant value is 3” ([Lin, 0060], The equation for a hard swish function describes a formula where the constant value is 3.) Singh in view of Lim, and Lin does not explicitly disclose an implementation where the ReLU variant is maxout or ReLU6 to teach the min and max functionality. However, Ho discloses in the same field of endeavor: “the first function is a max function” ([pg. 3, col. 2, par. 2], The activation function accelerator may include an activation unit that performs ReLU computation and may include any ReLU variant. One of the common ReLU variants is maxout. In maxout, the maximum value is determined from a set of inputs.) “the second function is a min function” ([pg. 3, col. 2, par. 1], The activation function accelerator may include an activation unit that performs ReLU computation and may include any ReLU variant. One of the common ReLU variants is ReLU6. ReLU6 contains a minimum function.) It would be obvious to one of the ordinary skill in the art before the effective filing date of the claimed invention to incorporate Ho's teaching of maxout and ReLU6 as ReLU variants, which supply the max and min functionality, into the teaching of Singh in view of Lim, and Lin. Doing so can increase the precision of approximation of non-linear activation functions (Ho, abstract).
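To make the parameter mapping for the hard swish function concrete, the Python sketch below breaks h-swish(x) = x · ReLU6(x + 3) / 6 into the pieces named in the claim language: an added constant of 3, a max function, a min function, and a gain of the form ax + b with a = 1/6 and b = 0. The ordering of the stages and the helper names `gain`, `relu6`, and `hard_swish` are assumptions for illustration and do not describe the claimed circuit or any cited reference.

```python
def gain(y, a, b):
    """Gain stage of the form a*y + b (dependent a, independent b)."""
    return a * y + b


def relu6(y):
    """ReLU6(y) = min(max(0, y), 6): a max function followed by a min."""
    return min(max(0.0, y), 6.0)


def hard_swish(x):
    """h-swish(x) = x * ReLU6(x + 3) / 6, split into named stages."""
    shifted = x + 3.0                       # constant value of 3
    clipped = relu6(shifted)                # max function, then min function
    scaled = gain(clipped, 1.0 / 6.0, 0.0)  # gain with a = 1/6, b = 0
    return x * scaled
```

For inputs at or below -3 the clipped term is 0, so the output is 0; for inputs at or above 3 the scaled term is 1, so the output equals the input.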
Regarding claim 14, Singh in view of Lim, and Lin teaches: “the gain parameters comprise a dependent parameter value of 0 and an independent parameter value of 1” ([Lin, 175, 0185, 225], From the specification of the claimed invention, the gain parameters are interpreted to be coefficients used to support the computation of activation functions. The multiplier circuitry may use these coefficients to perform operations on the input data.) “the constant value is 0” ([Lin, 0175, 0228], If the activation function has a constant parameter of zero, the adder circuitry may perform the sum operation based on the constant parameter.) Singh in view of Lim, and Lin does not explicitly disclose an implementation where the ReLU variant is ReLU6 to teach the min and max functionality. However, Ho discloses in the same field of endeavor: “the determined nonlinear activation function comprises a rectified linear unit-six (ReLU6) function” ([pg. 3, col. 2, par. 1], The reference proposes an implementation where the activation function may comprise a ReLU6 function.) “the first function is a max function” ([pg. 3, col. 2, par. 2], The activation function accelerator may include an activation unit that performs ReLU computation and may include any ReLU variant. One of the common ReLU variants is maxout. In maxout, the maximum value is determined from a set of inputs.) “the second function is a min function” ([pg. 3, col. 2, par. 1], The activation function accelerator may include an activation unit that performs ReLU computation and may include any ReLU variant. One of the common ReLU variants is ReLU6. ReLU6 contains a minimum function.) It would be obvious to one of the ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of having the activation function as a ReLU6 function from Ho into the teaching of Singh in view of Lim, and Lin.
Doing so can increase the precision of approximation of non-linear activation functions (Ho, abstract).
Claims 12 and 30 are rejected under 35 U.S.C. 103 as being unpatentable over Singh (US20190266479A1), in view of Lim, “AA-ResNet: Energy Efficient All-Analog ResNet Accelerator”, Deville (US5796925A), and Hendrycks, "Gaussian Error Linear Units".
Regarding claims 12 and 30, Singh in view of Lim and Deville teaches: “the gain parameters comprise a dependent parameter value of 0 and an independent parameter value of 1” ([Singh, 175, 0185, 225], From the specification of the claimed invention, the gain parameters are interpreted to be coefficients used to support the computation of activation functions. The multiplier circuitry may use these coefficients to perform operations on the input data.) “the constant value is 1” ([Singh, 0175, 0228], If the activation function has a constant parameter of zero, the adder circuitry may perform the sum operation based on the constant parameter.) “the first function is cubic” ([Deville, col. 3, lines 40-53], Deville teaches an implementation where an activation function is converted into a third order polynomial (cubic approximation) and the parameters of the polynomial function are stored in a table.) “the second function is a tanh look-up table” ([Singh, 0175, 0179], The coefficient lookup table stores the coefficient values of any activation function and the activation units may retrieve these parameters from the lookup table.) Singh in view of Lim and Deville does not explicitly disclose an implementation where the activation function is a GELU function. However, Hendrycks discloses in the same field of endeavor: “the determined nonlinear activation function comprises a Gaussian error linear unit (GELU) function” ([abstract], Hendrycks teaches the GELU activation function.)
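As background for the GELU limitation, the Hendrycks paper gives both the exact form, GELU(x) = x · Φ(x), and a tanh-based approximation, 0.5 · x · (1 + tanh(√(2/π) · (x + 0.044715 · x³))). The Python sketch below evaluates both so the role of the cubic term, the tanh function, and the added constant of 1 can be seen side by side. Reading those three pieces onto the claimed "first function", "second function", and "constant value" is an assumption made here for illustration only, as are the function names; `math.tanh` stands in for a tanh look-up table.

```python
import math


def gelu_exact(x):
    """Exact GELU: x times the standard normal CDF of x."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def cubic_term(x):
    """Cubic polynomial inside the tanh approximation."""
    return x + 0.044715 * x ** 3


def gelu_tanh(x):
    """Tanh-based GELU approximation from the Hendrycks paper."""
    inner = math.sqrt(2.0 / math.pi) * cubic_term(x)  # cubic function
    squashed = math.tanh(inner)    # stand-in for a tanh look-up table
    return 0.5 * x * (1.0 + squashed)                 # constant value of 1
```

Over a typical activation range such as [-5, 5], the two forms differ by well under 0.001 in absolute terms.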
It would be obvious to one of the ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of having the activation function as a GELU function from Hendrycks into the teaching of Singh in view of Lim and Deville. Doing so can increase the precision of approximation of non-linear activation functions (Hendrycks, abstract). Conclusion Any inquiry concerning this communication or earlier communications from the examiner should be directed to GARY MAC whose telephone number is (703)756-1517. The examiner can normally be reached Monday - Friday 8:00 AM - 5:00 PM. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdullah Kawsar can be reached on (571) 270-3169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /GARY MAC/Examiner, Art Unit 2127 /ABDULLAH AL KAWSAR/Supervisory Patent Examiner, Art Unit 2127

Prosecution Timeline

Jun 19, 2025: Response after Non-Final Action
Jul 10, 2025: Non-Final Rejection mailed (§103)
Oct 08, 2025: Response Filed
Dec 04, 2025: Final Rejection mailed (§103)
Feb 02, 2026: Response after Non-Final Action
Mar 03, 2026: Request for Continued Examination
Mar 12, 2026: Response after Non-Final Action
May 07, 2026: Non-Final Rejection mailed (§103, current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12626130
METHOD AND DEVICE FOR COMPRESSING NEURAL NETWORK
4y 5m to grant Granted May 12, 2026
Patent 12608643
GENERATING WORKFLOW REPRESENTATIONS USING REINFORCED FEEDBACK ANALYSIS
4y 7m to grant Granted Apr 21, 2026
Patent 12596907
NEURAL NETWORK OPERATION APPARATUS AND METHOD
4y 8m to grant Granted Apr 07, 2026
Patent 12572842
METHODS AND SYSTEMS FOR DECENTRALIZED FEDERATED LEARNING
5y 5m to grant Granted Mar 10, 2026
Study what changed to get past this examiner. Based on 4 most recent grants.


Prosecution Projections

Expected OA Rounds: 5-6
Grant Probability: 41%
With Interview (+30.6%): 72%
Median Time to Grant: 4y 3m (~0m remaining)
PTA Risk: High
Based on 17 resolved cases by this examiner. Grant probability derived from career allowance rate.
