Last updated: April 19, 2026
Application No. 17/700,911
CHAINED MULTIPLY ACCUMULATE USING AN UNROUNDED PRODUCT

Final Rejection: §101, §102, §103
Filed: Mar 22, 2022
Examiner: GUDAS, JAKOB OSCAR
Art Unit: 2151
Tech Center: 2100 (Computer Architecture & Software)
Assignee: Arm Limited
OA Round: 2 (Final)
This examiner grants 44% of cases after interview (+71.1% interview lift). A telephonic interview to clarify the technical implementation could significantly improve the outcome.
Based on 9 resolved cases, 2023–2026
Examiner Intelligence

Career allow rate: 44% of resolved cases (4 granted / 9 resolved), -10.6% vs TC avg
Interview lift: +71.1% (strong)
Avg prosecution (typical timeline): 4y 2m
Currently pending: 28
Statute-Specific Performance

§101: 33.2% (-6.8% vs TC avg)
§103: 37.0% (-3.0% vs TC avg)
§102: 8.0% (-32.0% vs TC avg)
§112: 19.9% (-20.1% vs TC avg)
Based on career data from 9 resolved cases
Office Action

Detailed Action
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This office action is final and is in response to claims filed on 11/05/2025 via amendment. Claims 1-20 are pending examination. Claims 1, 5, 9-10, and 19 are currently amended. Claims 2-4, 6-8, 11-18, and 20 are as originally filed.

Objection to the Specification
The abstract has been amended, therefore the objection to the specification has been withdrawn.

Rejections Under 35 U.S.C. 101
Applicant’s arguments regarding the 35 U.S.C. 101 rejections have been fully considered. Regarding the rejection under 35 U.S.C. 101, Applicant argues that “the claims do not recite a mental process but defines a specific apparatus configured to perform specific operations in order to improve the efficiency of CMAC operations in a data processing apparatus”. See Remarks 9 filed 11/05/2025. 
	Examiner respectfully disagrees with Applicant’s arguments. The claims do not recite a specific apparatus. They recite many elements as black-box circuitry, which clearly amounts to generally linking the exception to a technological environment and to an “apply it” scenario. See MPEP 2106.05(h). The claims do not recite any specifics of the arrangements, structure, or connections of the circuitry.
	Applicant further argues “This leads to a significant reduction in the time taken to perform the CMAC operation”. See Remarks 10.
	Examiner respectfully disagrees with Applicant’s arguments. Applicant has not linked these purported improvements to the specification.
	Applicant further points to Ex parte Desjardins, 2024-000567, Sept. 26, 2025 and argues “Like the ARP decision in Ex parte Desjardins, this constitutes an improvement to how the data processing apparatus model itself operates, and not, mathematical calculation”. See Remarks 10-11.
	Examiner respectfully disagrees with Applicant’s arguments. The claims fail to recite any language that links the claims to an improvement in neural networks or machine learning.
Further, “it is important to note, the judicial exception alone cannot provide the improvement. The improvement can be provided by one or more additional elements. See the discussion of Diamond v. Diehr, 450 U.S. 175, 187 and 191-92, 209 USPQ 1, 10 (1981)) in subsection II, below. In addition, the improvement can be provided by the additional element(s) in combination with the recited judicial exception... However, it is important to keep in mind that an improvement in the abstract idea itself (e.g. a recited fundamental economic concept) is not an improvement in technology...”. See MPEP 2106.05(a).

Rejections Under 35 U.S.C. 102
Applicant’s arguments regarding the 35 U.S.C. 102 rejections have been fully considered. Regarding the rejection under 35 U.S.C. 102, Applicant argues that “Manzo fails to disclose ‘generate a sum based on adding the unrounded product, a value based on the first rounding increment, and the third floating-point operand;  determine a second rounding increment based on the sum’”. See Remarks 11-12.
	Examiner respectfully disagrees with Applicant’s arguments. Manzo explicitly teaches the above limitations. First, Manzo generates multiplication rounding data based on an unrounded multiplication result (Manzo [0032]). Then Manzo adds the multiplication result to a third operand to generate an unrounded accumulation result (Manzo [0035]). Then Manzo adds the multiplication rounding data to the unrounded accumulation result to generate a sum (Manzo [0037]). Then Manzo generates second rounding data based on the sum (Manzo [0032]). Therefore, Manzo explicitly teaches the claimed limitations. Furthermore, Fig. 5 of Manzo shows that the steps of the present application are explicitly taught by Manzo.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to abstract ideas without significantly more.

With regards to claim 1, at step 1, the claim is directed to a machine, which is a statutory category of invention.

At Step 2A Prong 1, the examiner notes that the claim is directed to mental processes and/or mathematical concepts. The claim language has been reproduced below:
An apparatus comprising: (mental process, evaluation)
instruction decode circuitry configured to decode instructions; and (mental process, evaluation)
processing circuitry configured to execute the instructions decoded by the instruction decode circuitry, (mental process, evaluation)
wherein the processing circuitry comprises chained-floating-point-multiply-accumulate circuitry responsive to a chained-floating-point-multiply-accumulate instruction decoded by the instruction decoder, (mental process, evaluation) the chained-floating-point-multiply-accumulate instruction specifying a first floating-point operand, a second floating-point operand and a third floating-point operand, to: (mental process, evaluation)
generate an unrounded product based on multiplying the first floating-point operand and the second floating-point operand; (mathematical calculation)
generate a first rounding increment based on the unrounded product; (mathematical calculation)
generate a sum based on adding the unrounded product, a value based on the first rounding increment, and the third floating-point operand; (mathematical calculation)
determine a second rounding increment based on the sum; and (mathematical calculation)
perform rounding based on the second rounding increment. (mathematical calculation)

	Each of the non-bolded limitations is a mental process and/or mathematical calculation. The “An apparatus comprising:” limitation is an evaluation mental process that can be performed by choosing what the apparatus comprises. The “instruction decode circuitry configured to decode instructions” limitation is an evaluation mental process that can be performed by choosing what the instruction decode circuitry does. The “processing circuitry configured to execute the instructions decoded by” limitation is an evaluation mental process that can be performed by choosing what the processing circuitry does. The “the chained-floating-point-multiply-accumulate instruction specifying” limitation is an evaluation mental process that can be performed by choosing what the chained-floating-point-multiply-accumulate instruction specifies. The “generate an unrounded product based on multiplying” limitation is a mathematical calculation that can be performed by generating the product by hand using pen and paper. The “generate a first rounding increment” limitation is a mathematical calculation that can be performed by generating the first rounding increment by hand using pen and paper. The “generate a sum based on adding” limitation is a mathematical calculation that can be performed by generating the sum by hand using pen and paper. The “determine a second rounding increment” limitation is a mathematical calculation that can be performed by determining the second rounding increment by hand using pen and paper. The “perform rounding based on the” limitation is a mathematical calculation that can be performed by rounding by hand using pen and paper.
At step 2A Prong 2, the additional elements are bolded above. The additional elements amount to no more than components comprising mere instructions to apply the exception and do not integrate the judicial exception into a practical application. See MPEP 2106.05(f).
Under Step 2B, the claim does not recite any additional elements that integrate the abstract idea into a practical application, nor do they amount to significantly more than the judicial exception.
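
For readers following the arithmetic, the five calculation steps reproduced from claim 1 can be sketched as below. This is a minimal illustration under stated assumptions, not the circuitry of the application and not Manzo's design: it uses exact rational arithmetic, a hypothetical toy format with an 8-bit significand (`P`), unbounded exponent range, round-to-nearest-even, and it truncates the product before the addition. All function names are invented for the sketch. It shows that injecting the first rounding increment into the sum yields the same value as rounding the product first, then adding and rounding again.

```python
from fractions import Fraction

P = 8  # significand width of a toy format (assumption for this sketch)


def split_rne(x: Fraction, p: int = P) -> tuple[Fraction, Fraction]:
    """Return (truncated value, signed increment value) for round-to-nearest-even.

    truncated + increment equals x rounded to p significant bits.
    """
    if x == 0:
        return Fraction(0), Fraction(0)
    sign = 1 if x > 0 else -1
    mag = abs(x)
    # Find e with 2**e <= mag < 2**(e + 1).
    e = mag.numerator.bit_length() - mag.denominator.bit_length()
    if Fraction(2) ** e > mag:
        e -= 1
    ulp = Fraction(2) ** (e - p + 1)   # spacing of p-bit values in this binade
    k, rem = divmod(mag, ulp)          # k whole ulps plus a remainder below one ulp
    half = ulp / 2
    round_up = rem > half or (rem == half and k % 2 == 1)
    return sign * k * ulp, sign * ulp * int(round_up)


def round_rne(x: Fraction) -> Fraction:
    truncated, increment = split_rne(x)
    return truncated + increment


def chained_reference(a: Fraction, b: Fraction, c: Fraction) -> Fraction:
    """Two separate operations: round the product, then add and round again."""
    return round_rne(round_rne(a * b) + c)


def chained_with_unrounded_product(a: Fraction, b: Fraction, c: Fraction) -> Fraction:
    product = a * b                       # exact, unrounded product
    truncated, inc1 = split_rne(product)  # first rounding increment, from the product
    total = truncated + inc1 + c          # one three-input sum
    t2, inc2 = split_rne(total)           # second rounding increment, from the sum
    return t2 + inc2                      # final rounding
```

In this model the two functions agree by construction, because the truncated product plus the first increment is exactly the rounded product. The sketch only makes the order of the recited steps concrete; it says nothing about timing or circuit structure.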

With regards to claim 19, it recites similar language to claim 1 and is rejected for, at least, the same reasons therein. Herein claim 19 is directed towards the statutory category of a method, thus also satisfying step 1. Moreover under step 2A prong 2 and 2B, the claim does not recite any additional elements that integrate the abstract idea into a practical application, nor do they amount to significantly more than the judicial exception.

With regards to claim 20, it recites similar language to claim 1 and is rejected for, at least, the same reasons therein. Herein claim 20 is directed towards the statutory category of a method, thus also satisfying step 1. Moreover under step 2A prong 2 the additional elements are “A non-transitory computer-readable medium”. These are no more than high level generic computer components that amount to no more than components comprising mere instructions to apply the exception and do not integrate the judicial exception into a practical application. See MPEP 2106.05(f). The additional element “to store”, as claimed under BRI, is an additional element that is insignificant extra-solution activity. The ‘store’ in the context of the claim encompasses mere data gathering. Under Step 2B, the claim recites “A non-transitory computer-readable medium to store computer-readable code”, and, per MPEP 2106.05(d)(II), the courts have recognized the following computer functions as well understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity:

i. Receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information); TLI Communications LLC v. AV Auto. LLC, 823 F.3d 607, 610, 118 USPQ2d 1744, 1745 (Fed. Cir. 2016) (using a telephone for image transmission); OIP Techs., Inc., v. Amazon.com, Inc., 788 F.3d 1359, 1363, 115 USPQ2d 1090, 1093 (Fed. Cir. 2015) (sending messages over a network); buySAFE, Inc. v. Google, Inc., 765 F.3d 1350, 1355, 112 USPQ2d 1093, 1096 (Fed. Cir. 2014) (computer receives and sends information over a network);

iv. Storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092- 93.

With regards to claim 2, It is directed to mental processes and/or mathematical concepts. The “wherein the chained-floating-point-multiply-accumulate circuitry is configured to” limitation is an evaluation mental process that can be performed by choosing what the chained-floating-point-multiply-accumulate circuitry does. The “align the unrounded product and the third” limitation is a mathematical calculation that can be performed by aligning the inputs by hand using pen and paper. Under step 2A Prong 2, none of the additional elements regarding the generic computer components (i.e. chained-floating-point-multiply-accumulate circuitry, etc.) are more than high level generic computer components that amount to no more than components comprising mere instructions to apply the exception and do not integrate the judicial exception into a practical application. See MPEP 2106.05(f). Under Step 2B, the claim does not recite any additional elements that integrate the abstract idea into a practical application, nor do they amount to significantly more than the judicial exception.

With regards to claim 3, It is directed to mental processes and/or mathematical concepts. The “wherein the chained-floating-point-multiply-accumulate circuitry is configured to” limitation is an evaluation mental process that can be performed by choosing what the chained-floating-point-multiply-accumulate circuitry does. The “generate the unrounded product” limitation is a mathematical calculation that can be performed by generating the product by hand using pen and paper. The “before aligning the unrounded product and the third floating-point operand, append an” limitation is a mathematical calculation that can be performed by appending the bit by hand using pen and paper. Under step 2A Prong 2, none of the additional elements regarding the generic computer components (i.e. chained-floating-point-multiply-accumulate circuitry, etc.) are more than high level generic computer components that amount to no more than components comprising mere instructions to apply the exception and do not integrate the judicial exception into a practical application. See MPEP 2106.05(f). Under Step 2B, the claim does not recite any additional elements that integrate the abstract idea into a practical application, nor do they amount to significantly more than the judicial exception.

With regards to claim 4, It is directed to mental processes and/or mathematical concepts. The “wherein the chained-floating-point-multiply-accumulate circuitry is configured to” limitation is an evaluation mental process that can be performed by choosing what the chained-floating-point-multiply-accumulate circuitry does. The “align the unrounded product” limitation is a mathematical calculation that can be performed by aligning the inputs by hand using pen and paper. Under step 2A Prong 2, none of the additional elements regarding the generic computer components (i.e. chained-floating-point-multiply-accumulate circuitry, etc.) are more than high level generic computer components that amount to no more than components comprising mere instructions to apply the exception and do not integrate the judicial exception into a practical application. See MPEP 2106.05(f). Under Step 2B, the claim does not recite any additional elements that integrate the abstract idea into a practical application, nor do they amount to significantly more than the judicial exception.

With regards to claim 5, It is directed to mental processes and/or mathematical concepts. The “wherein the chained-floating-point-multiply-accumulate circuitry is configured to” limitation is an evaluation mental process that can be performed by choosing what the chained-floating-point-multiply-accumulate circuitry does. The “increment, after calculating the exponent difference, either the exponent associated” limitation is a mathematical calculation that can be performed by incrementing the exponent by hand using pen and paper. Under step 2A Prong 2, none of the additional elements regarding the generic computer components (i.e. chained-floating-point-multiply-accumulate circuitry, etc.) are more than high level generic computer components that amount to no more than components comprising mere instructions to apply the exception and do not integrate the judicial exception into a practical application. See MPEP 2106.05(f). Under Step 2B, the claim does not recite any additional elements that integrate the abstract idea into a practical application, nor do they amount to significantly more than the judicial exception.

With regards to claim 6, It is directed to mental processes and/or mathematical concepts. The “wherein the chained-floating-point-multiply-accumulate circuitry is configured to” limitation is an evaluation mental process that can be performed by choosing what the chained-floating-point-multiply-accumulate circuitry does. The “generate the unrounded product in an un-normalized form” limitation is a mathematical calculation that can be performed by generating the product by hand using pen and paper. The “generate the unrounded product in an un-normalized form; generate an exponent difference” limitation is a mathematical calculation that can be performed by generating the exponent difference by hand using pen and paper. The “generate the unrounded product in an un-normalized form; generate an exponent difference based on an exponent associated with the unrounded product and an exponent of the third floating-point operand; and align the unrounded product” limitation is a mathematical calculation that can be performed by aligning the inputs by hand using pen and paper. Under step 2A Prong 2, none of the additional elements regarding the generic computer components (i.e. chained-floating-point-multiply-accumulate circuitry, etc.) are more than high level generic computer components that amount to no more than components comprising mere instructions to apply the exception and do not integrate the judicial exception into a practical application. See MPEP 2106.05(f). Under Step 2B, the claim does not recite any additional elements that integrate the abstract idea into a practical application, nor do they amount to significantly more than the judicial exception.

With regards to claim 7, It is directed to mental processes and/or mathematical concepts. The “wherein: the chained-floating-point-multiply-accumulate circuitry comprises” limitation is an evaluation mental process that can be performed by choosing what the chained-floating-point-multiply-accumulate circuitry comprises. The “the floating-point-multiply circuitry is configured” limitation is an evaluation mental process that can be performed by choosing what the floating-point-multiply circuitry is configured to do. The “generate the unrounded product and generate the first rounding increment” limitation is a mathematical calculation that can be performed by generating the unrounded product and rounding increment by hand using pen and paper. The “the floating-point-add circuitry is configured to” limitation is an evaluation mental process that can be performed by choosing what the floating-point-add circuitry is configured to do. The “generate the sum based” limitation is a mathematical calculation that can be performed by generating the sum by hand using pen and paper. The “determine the second rounding increment” limitation is a mathematical calculation that can be performed by determining the rounding increment by hand using pen and paper. The “perform the rounding based on” limitation is a mathematical calculation that can be performed by performing the rounding by hand using pen and paper. Under step 2A Prong 2, none of the additional elements regarding the generic computer components (i.e. chained-floating-point-multiply-accumulate circuitry, floating-point-multiply circuitry, floating-point-add circuitry, etc.) are more than high level generic computer components that amount to no more than components comprising mere instructions to apply the exception and do not integrate the judicial exception into a practical application. See MPEP 2106.05(f). 
Under Step 2B, the claim does not recite any additional elements that integrate the abstract idea into a practical application, nor do they amount to significantly more than the judicial exception.

With regards to claim 8, It is directed to mental processes and/or mathematical concepts. The “wherein the floating-point-multiply circuitry comprises” limitation is an evaluation mental process that can be performed by choosing what the floating-point-multiply circuitry comprises. The “the floating-point-add circuitry comprises” limitation is an evaluation mental process that can be performed by choosing what the floating-point-add circuitry comprises. Under step 2A Prong 2, none of the additional elements regarding the generic computer components (i.e. floating-point-multiply circuitry, floating-point-add circuitry, etc.) are more than high level generic computer components that amount to no more than components comprising mere instructions to apply the exception and do not integrate the judicial exception into a practical application. See MPEP 2106.05(f). Under Step 2B, the claim does not recite any additional elements that integrate the abstract idea into a practical application, nor do they amount to significantly more than the judicial exception.

With regards to claim 9, It is directed to mental processes and/or mathematical concepts. The “comprising issue circuitry to issue the instructions” limitation is an evaluation mental process that can be performed by choosing what issue circuitry does. The “wherein when a first instance of the chained-floating-point-multiply-accumulate instruction and a second instance of the chained-floating-point-multiply-accumulate instruction are issued sequentially” limitation is an evaluation mental process that can be performed by choosing what to do in the event of two instructions being issued sequentially. Under step 2A Prong 2, none of the additional elements regarding the generic computer components (i.e. floating-point-multiply circuitry, floating-point-add circuitry, issue circuitry, instruction decode circuitry, processing circuitry, etc.) are more than high level generic computer components that amount to no more than components comprising mere instructions to apply the exception and do not integrate the judicial exception into a practical application. See MPEP 2106.05(f). Under Step 2B, the claim does not recite any additional elements that integrate the abstract idea into a practical application, nor do they amount to significantly more than the judicial exception.

With regards to claim 10, It is directed to mental processes and/or mathematical concepts. The “wherein the chained-floating-point-multiply-accumulate circuitry is configured to” limitation is an evaluation mental process that can be performed by choosing what the chained-floating-point-multiply-accumulate circuitry does. The “determine the value based on the first rounding increment in dependence on at least one of” limitation is an evaluation mental process that can be performed by choosing how to determine the value. The “whether an exponent associated with the unrounded product is larger” limitation is an evaluation mental process and mathematical calculation that can be performed by seeing which exponent is larger. The “whether the sum based on adding the unrounded product” limitation is an evaluation mental process that can be performed by seeing if the sum is a like-signed or unlike-signed addition. Under step 2A Prong 2, none of the additional elements regarding the generic computer components (i.e. chained-floating-point-multiply-accumulate circuitry, etc.) are more than high level generic computer components that amount to no more than components comprising mere instructions to apply the exception and do not integrate the judicial exception into a practical application. See MPEP 2106.05(f). Under Step 2B, the claim does not recite any additional elements that integrate the abstract idea into a practical application, nor do they amount to significantly more than the judicial exception.

With regards to claim 11, It is directed to mental processes and/or mathematical concepts. The “the chained-floating-point-multiply-accumulate circuitry is configured to” limitation is an evaluation mental process that can be performed by choosing what the chained-floating-point-multiply-accumulate circuitry does. The “align the unrounded product and the third floating-point operand” limitation is a mathematical calculation that can be performed by aligning the inputs by hand using pen and paper. The “and the chained-floating-point-multiply-accumulate circuitry is configured to” limitation is an evaluation mental process that can be performed by choosing what the chained-floating-point-multiply-accumulate circuitry does. The “determine the value based on the first rounding increment in dependence on values” limitation is a mathematical calculation that can be performed by determining the value by hand using pen and paper. Under step 2A Prong 2, none of the additional elements regarding the generic computer components (i.e. chained-floating-point-multiply-accumulate circuitry, etc.) are more than high level generic computer components that amount to no more than components comprising mere instructions to apply the exception and do not integrate the judicial exception into a practical application. See MPEP 2106.05(f). Under Step 2B, the claim does not recite any additional elements that integrate the abstract idea into a practical application, nor do they amount to significantly more than the judicial exception.

With regards to claim 12, It is directed to mental processes and/or mathematical concepts. The “wherein the chained-floating-point-multiply-accumulate circuitry is configured to” limitation is an evaluation mental process that can be performed by choosing what the chained-floating-point-multiply-accumulate circuitry does. The “select, as the value based on the first rounding increment” limitation is an evaluation mental process that can be performed by choosing the value based on the first rounding increment. Under step 2A Prong 2, none of the additional elements regarding the generic computer components (i.e. chained-floating-point-multiply-accumulate circuitry, etc.) are more than high level generic computer components that amount to no more than components comprising mere instructions to apply the exception and do not integrate the judicial exception into a practical application. See MPEP 2106.05(f). Under Step 2B, the claim does not recite any additional elements that integrate the abstract idea into a practical application, nor do they amount to significantly more than the judicial exception.

With regards to claim 13, It is directed to mental processes and/or mathematical concepts. The “comprising a central processing unit or a graphics processing unit” limitation is an evaluation mental process that can be performed by choosing what the apparatus comprises. The “wherein the central processing unit or the graphics processing unit” limitation is an evaluation mental process that can be performed by choosing what the central processing unit or the graphics processing unit comprises. Under step 2A Prong 2, none of the additional elements regarding the generic computer components (i.e. processing circuitry, the central processing unit, the graphics processing unit, etc.) are more than high level generic computer components that amount to no more than components comprising mere instructions to apply the exception and do not integrate the judicial exception into a practical application. See MPEP 2106.05(f). Under Step 2B, the claim does not recite any additional elements that integrate the abstract idea into a practical application, nor do they amount to significantly more than the judicial exception.

With regards to claim 14, It is directed to mental processes and/or mathematical concepts. The “wherein the chained-floating-point-multiply-accumulate circuitry is configured to” limitation is an evaluation mental process that can be performed by choosing what the chained-floating-point-multiply-accumulate circuitry does. The “truncate the unrounded product before generating the sum” limitation is a mathematical calculation that can be performed by truncating the product by hand using pen and paper. Under step 2A Prong 2, none of the additional elements regarding the generic computer components (i.e. chained-floating-point-multiply-accumulate circuitry, etc.) are more than high level generic computer components that amount to no more than components comprising mere instructions to apply the exception and do not integrate the judicial exception into a practical application. See MPEP 2106.05(f). Under Step 2B, the claim does not recite any additional elements that integrate the abstract idea into a practical application, nor do they amount to significantly more than the judicial exception.

With regards to claim 15, It is directed to mental processes and/or mathematical concepts. The “wherein the chained-floating-point-multiply-accumulate circuitry comprises” limitation is an evaluation mental process that can be performed by choosing what the chained-floating-point-multiply-accumulate circuitry comprises. The “the chained-floating-point-multiply-accumulate circuitry is configured to” limitation is an evaluation mental process that can be performed by choosing what the chained-floating-point-multiply-accumulate circuitry does. Under step 2A Prong 2, none of the additional elements regarding the generic computer components (i.e. chained-floating-point-multiply-accumulate circuitry, 3:2 carry save adder, etc.) are more than high level generic computer components that amount to no more than components comprising mere instructions to apply the exception and do not integrate the judicial exception into a practical application. See MPEP 2106.05(f). Under Step 2B, the claim does not recite any additional elements that integrate the abstract idea into a practical application, nor do they amount to significantly more than the judicial exception.
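
As background on the 3:2 carry save adder named among the claim 15 components, the sketch below shows the standard technique: three addends are compressed into a sum word and a carry word without propagating carries, and a single ordinary addition finishes the job. The function name and the use of plain non-negative Python integers are assumptions for illustration. How claim 15 wires the adder is not reproduced above, so mapping the three inputs to the product, the increment value, and the third operand would be a further assumption.

```python
def carry_save_add(x: int, y: int, z: int) -> tuple[int, int]:
    """3:2 compression of three non-negative integers into (sum word, carry word)."""
    sum_word = x ^ y ^ z                             # per-bit sum, ignoring carries
    carry_word = ((x & y) | (x & z) | (y & z)) << 1  # per-bit majority, moved up one bit
    return sum_word, carry_word


# Hypothetical usage: one carry-propagating addition completes the three-input sum.
s, c = carry_save_add(0b1011, 0b0110, 0b0001)
total = s + c
```

The compression step has constant depth regardless of word width, which is why three-input sums are often built this way before a final carry-propagate adder.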

With regards to claim 16, it is directed to mental processes and/or mathematical concepts. The “wherein the chained-floating-point-multiply-accumulate circuitry is configured to” limitation is an evaluation mental process that can be performed by choosing what the chained-floating-point-multiply-accumulate circuitry does. The “output a result value equivalent to generating a” limitation is an evaluation mental process that can be performed by choosing what the chained-floating-point-multiply-accumulate circuitry outputs. Under Step 2A Prong 2, none of the additional elements regarding the generic computer components (i.e., chained-floating-point-multiply-accumulate circuitry, etc.) are more than high-level generic computer components that amount to no more than components comprising mere instructions to apply the exception and do not integrate the judicial exception into a practical application. See MPEP 2106.05(f). Under Step 2B, the claim does not recite any additional elements that integrate the abstract idea into a practical application, nor do they amount to significantly more than the judicial exception.

With regards to claim 17, it is directed to mental processes and/or mathematical concepts. The “wherein when one or more of the first floating-point operand, the second floating-point operand and the third floating-point operand comprises a sub-normal floating point value” limitation is an evaluation mental process that can be performed by choosing what the chained-floating-point-multiply-accumulate circuitry does. The “treat the sub-normal floating point value as zero when processing” limitation is an evaluation mental process that can be performed by choosing how sub-normal values are treated. Under Step 2A Prong 2, none of the additional elements regarding the generic computer components (i.e., chained-floating-point-multiply-accumulate circuitry, etc.) are more than high-level generic computer components that amount to no more than components comprising mere instructions to apply the exception and do not integrate the judicial exception into a practical application. See MPEP 2106.05(f). Under Step 2B, the claim does not recite any additional elements that integrate the abstract idea into a practical application, nor do they amount to significantly more than the judicial exception.

With regards to claim 18, it is directed to mental processes and/or mathematical concepts. The “wherein the chained-floating-point-multiply-accumulate circuitry is configured to” limitation is an evaluation mental process that can be performed by choosing what the chained-floating-point-multiply-accumulate circuitry does. The “flush the unrounded product or a result value” limitation is an evaluation mental process that can be performed by setting the result to zero if the result is too small. Under Step 2A Prong 2, none of the additional elements regarding the generic computer components (i.e., chained-floating-point-multiply-accumulate circuitry, etc.) are more than high-level generic computer components that amount to no more than components comprising mere instructions to apply the exception and do not integrate the judicial exception into a practical application. See MPEP 2106.05(f). Under Step 2B, the claim does not recite any additional elements that integrate the abstract idea into a practical application, nor do they amount to significantly more than the judicial exception.
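
For orientation only, the sub-normal handling discussed for claims 17 and 18 can be pictured with a short software model. This is a sketch under stated assumptions, not the claimed circuitry: the helper name `flush_to_zero_f32` and the choice of the single-precision (binary32) format are hypothetical and do not come from the application or the cited art.

```python
import math

# Smallest positive normal binary32 value (assumed format, not stated in the action).
MIN_NORMAL_F32 = 2.0 ** -126

def flush_to_zero_f32(x: float) -> float:
    """Return a signed zero if x would be sub-normal in binary32, else x unchanged."""
    if x != 0.0 and abs(x) < MIN_NORMAL_F32:
        # Keep the sign so that a tiny negative value becomes -0.0.
        return math.copysign(0.0, x)
    return x
```

The same helper could model either treating a sub-normal input as zero (claim 17) or flushing a too-small product or result (claim 18); which of the two applies depends on where it is called.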

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1, 7, 12, 16, and 19 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Manzo et al. (US 20190026077 A1) hereinafter Manzo.

With regards to claim 1, Manzo teaches An apparatus comprising: instruction decode circuitry configured to decode instructions; (Manzo [0030]: The floating point chained multiply accumulate pipeline of FIG. 1 comprises decoder circuitry 2 for decoding a program instruction specifying a floating point chained multiply accumulate operation to be performed)
	and processing circuitry configured to execute the instructions decoded by the instruction decode circuitry, wherein the processing circuitry comprises chained-floating-point-multiply-accumulate circuitry (Manzo [0030]: processing circuitry for performing a floating point chained multiply accumulate operation)
	responsive to a chained-floating-point-multiply-accumulate instruction decoded by the instruction decoder, (Manzo [0030]: processing circuitry for performing a floating point chained multiply accumulate operation)
	the chained-floating-point-multiply-accumulate instruction specifying a first floating-point operand, a second floating-point operand and a third floating-point operand, (Manzo [0030]: The decoder circuitry 2 responds to the decoding of such an instruction to generate control signals which then control the other portions of the pipeline circuitry illustrated in FIG. 1 to perform the operations described below to perform the specified floating point chained multiply accumulate operation. A floating point register file 4 stores three floating point input operands A, B and C)
	to: generate an unrounded product based on multiplying the first floating-point operand and the second floating-point operand; (Manzo [0031]: The multiplier 14 multiplies the two input operands it receives to form an unrounded multiplication result)
	generate a first rounding increment based on the unrounded product; (Manzo [0032]: The rounding value determination circuitry 22 serves to generate multiplication rounding data)
	generate a sum based on adding the unrounded product, a value based on the first rounding increment, and the third floating-point operand; (Manzo [0035]: During the second processing cycle a third input operand A is read from the floating point register file 4 and supplied as one input operand to the adder 12. The unrounded multiplication result is read from the unrounded multiplication result register 18 and passed via the adder-input multiplexer 10 to the other input of the adder 12. Thus, the adder 12 during the second processing clock cycle serves to add the third input operand A to the unrounded multiplication result and generate an unrounded accumulation; Manzo [0037]: the multiplication rounding data stored within the multiply accumulate rounding data register 26 is supplied via chained multiply accumulate compensation circuitry 24 (where it is subject to any adjustment required to take account of the late application of the rounding associated with the multiplication) and from where it is then passed to the carry-save adder 20 as another input operand. A third input operand to the carry-save adder 20 is the unrounded accumulation result from the unrounded accumulation result register)
	determine a second rounding increment based on the sum; (Manzo [0032]: accumulation rounding data derived from the unrounded accumulate result)
	and perform rounding based on the second rounding increment (Manzo [0037]: Thus, during the third processing clock cycle illustrated in FIG. 4, the rounding circuitry in the final stage of the pipeline serves to generate accumulate rounding data and the rounded accumulate result which is formed from a carry-save add of the unrounded accumulate result, the accumulate rounding data and the multiplication rounding data).
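
The sequence of steps mapped above (unrounded product, first rounding increment, three-input sum) can be illustrated with a simplified integer model. This is a minimal sketch, assuming unsigned fixed-point significands, round-to-nearest-even, and no exponent alignment; the function names `rne_increment` and `chained_sum` are hypothetical and are not taken from Manzo or the application.

```python
def rne_increment(value: int, drop: int) -> int:
    """Increment (0 or 1) that turns value >> drop into round-to-nearest-even."""
    if drop == 0:
        return 0
    half = 1 << (drop - 1)
    remainder = value & ((1 << drop) - 1)   # bits that will be discarded
    kept_lsb = (value >> drop) & 1          # least significant kept bit
    if remainder > half or (remainder == half and kept_lsb == 1):
        return 1
    return 0

def chained_sum(a_sig: int, b_sig: int, c_val: int, drop: int) -> int:
    """Add the truncated product, its rounding increment, and c in one sum."""
    product = a_sig * b_sig                 # unrounded product
    inc = rne_increment(product, drop)      # first rounding increment
    return (product >> drop) + inc + c_val  # sum of three inputs
```

For example, `chained_sum(0b1011, 0b1101, 5, 4)` forms the product 143, keeps the top bits (8), adds an increment of 1 because 143/16 rounds up to 9, and adds 5 in the same sum. A second rounding step would then be applied to that sum; it is omitted here for brevity.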

With regards to claim 7, Manzo teaches all of the limitations of claim 1 above. Manzo further teaches wherein: the chained-floating-point-multiply-accumulate circuitry comprises floating-point-multiply circuitry and floating-point-add circuitry; (Manzo [0030]: FIG. 1 schematically illustrates processing circuitry for performing a floating point chained multiply accumulate operation; Manzo Fig. 1: shows the circuit with adder circuitry, multiplier circuitry, and rounding circuitry which is part of both the adder and multiplier circuitry)
	the floating-point-multiply circuitry is configured to generate the unrounded product and generate the first rounding increment; (Manzo [0031]: The multiplier 14 multiplies the two input operands it receives to form an unrounded multiplication result; Manzo [0032]: The rounding value determination circuitry 22 serves to generate multiplication rounding data)
	and the floating-point-add circuitry is configured to generate the sum based on adding the unrounded product, the value based on the first rounding increment, and the third floating-point operand, (Manzo [0035]: During the second processing cycle a third input operand A is read from the floating point register file 4 and supplied as one input operand to the adder 12. The unrounded multiplication result is read from the unrounded multiplication result register 18 and passed via the adder-input multiplexer 10 to the other input of the adder 12. Thus, the adder 12 during the second processing clock cycle serves to add the third input operand A to the unrounded multiplication result and generate an unrounded accumulation; Manzo [0037]: the multiplication rounding data stored within the multiply accumulate rounding data register 26 is supplied via chained multiply accumulate compensation circuitry 24 (where it is subject to any adjustment required to take account of the late application of the rounding associated with the multiplication) and from where it is then passed to the carry-save adder 20 as another input operand. A third input operand to the carry-save adder 20 is the unrounded accumulation result from the unrounded accumulation result register)
	determine the second rounding increment (Manzo [0032]: accumulation rounding data derived from the unrounded accumulate result)
	and perform the rounding based on the second rounding increment (Manzo [0037]: Thus, during the third processing clock cycle illustrated in FIG. 4, the rounding circuitry in the final stage of the pipeline serves to generate accumulate rounding data and the rounded accumulate result which is formed from a carry-save add of the unrounded accumulate result, the accumulate rounding data and the multiplication rounding data).

With regards to claim 12, Manzo teaches all of the limitations of claim 1 above. Manzo further teaches wherein the chained-floating-point-multiply-accumulate circuitry is configured to select, as the value based on the first rounding increment, one of 0, 1 and 2 (Manzo [0036]: This multiplication rounding data includes a multiplication rounding bit (either a “0” or a “1”) to be added to an unrounded value).

With regards to claim 16, Manzo teaches all of the limitations of claim 1 above. Manzo further teaches  wherein the chained-floating-point-multiply-accumulate circuitry is configured to output a result value equivalent to generating a product of the first floating-point operand and the second floating-point operand, then rounding the product, then adding the rounded product to the third floating-point operand to generate an unrounded sum, and then rounding of the unrounded sum to generate the result value (Manzo [0038]: The multiplication rounding data and the accumulation rounding data are subject to compensation such that the arithmetic result of the chained multiply accumulation operation in which all the rounding is performed in the final stage is the same as a conventional chained multiply accumulation operation during which the rounding of the intermediate multiplication result is performed and applied to that intermediate multiplication result before that intermediate multiplication result is added to the third input operand to perform the accumulate operation).
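
The equivalence recited in claim 16 (round the product, add, then round the sum) can be contrasted with a single final rounding using a small numeric model. The sketch below is illustrative only and assumes binary32 operands modelled in Python doubles; the helper names `to_f32` and `chained_mac_f32` are hypothetical. It relies on the fact that the product of two binary32 values is exact in a double.

```python
import struct

def to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def chained_mac_f32(a: float, b: float, c: float) -> float:
    """Reference chained behaviour: round the product, add c, round the sum."""
    rounded_product = to_f32(a * b)      # first rounding, applied to the product
    return to_f32(rounded_product + c)   # second rounding, applied to the sum

# Operands chosen so that the product lies exactly halfway between two
# binary32 values, which makes the intermediate rounding visible.
a = b = 1.0 + 2.0 ** -12
c = -(1.0 + 2.0 ** -11)

chained = chained_mac_f32(a, b, c)   # two roundings
single = to_f32(a * b + c)           # one rounding at the end
```

With these operands the chained result is 0.0 while the singly rounded result is 2**-24, which is why circuitry that defers the product rounding needs some compensation to reproduce the chained value.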

With regards to claim 19, Manzo teaches A method comprising: decoding instructions with instruction decode circuitry; (Manzo [0030]: The floating point chained multiply accumulate pipeline of FIG. 1 comprises decoder circuitry 2 for decoding a program instruction specifying a floating point chained multiply accumulate operation to be performed)
	executing the instructions decoded by the instruction decode circuitry, in response to a chained-floating-point-multiply-accumulate instruction decoded by the instruction decoder,  (Manzo [0030]: processing circuitry for performing a floating point chained multiply accumulate operation)
	the chained-floating-point-multiply-accumulate instruction specifying a first operand, a second operand and a third operand: (Manzo [0030]: The decoder circuitry 2 responds to the decoding of such an instruction to generate control signals which then control the other portions of the pipeline circuitry illustrated in FIG. 1 to perform the operations described below to perform the specified floating point chained multiply accumulate operation. A floating point register file 4 stores three floating point input operands A, B and C)
	generating, by chained-floating-point-multiply-accumulate circuitry, an unrounded product based on multiplying the first operand and the second operand; (Manzo [0031]: The multiplier 14 multiplies the two input operands it receives to form an unrounded multiplication result)
	generating, by the chained-floating-point-multiply-accumulate circuitry, a first rounding increment based on the unrounded product; (Manzo [0032]: The rounding value determination circuitry 22 serves to generate multiplication rounding data)
	generating, by the chained-floating-point-multiply-accumulate circuitry, a sum based on adding the unrounded product, a value based on the first rounding increment, and the third operand; (Manzo [0035]: During the second processing cycle a third input operand A is read from the floating point register file 4 and supplied as one input operand to the adder 12. The unrounded multiplication result is read from the unrounded multiplication result register 18 and passed via the adder-input multiplexer 10 to the other input of the adder 12. Thus, the adder 12 during the second processing clock cycle serves to add the third input operand A to the unrounded multiplication result and generate an unrounded accumulation; Manzo [0037]: the multiplication rounding data stored within the multiply accumulate rounding data register 26 is supplied via chained multiply accumulate compensation circuitry 24 (where it is subject to any adjustment required to take account of the late application of the rounding associated with the multiplication) and from where it is then passed to the carry-save adder 20 as another input operand. A third input operand to the carry-save adder 20 is the unrounded accumulation result from the unrounded accumulation result register)
	determining, by the chained-floating-point-multiply-accumulate circuitry, a second rounding increment based on the sum; (Manzo [0032]: accumulation rounding data derived from the unrounded accumulate result)
	and performing, by the chained-floating-point-multiply-accumulate circuitry, rounding based on the second rounding increment (Manzo [0037]: Thus, during the third processing clock cycle illustrated in FIG. 4, the rounding circuitry in the final stage of the pipeline serves to generate accumulate rounding data and the rounded accumulate result which is formed from a carry-save add of the unrounded accumulate result, the accumulate rounding data and the multiplication rounding data).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 2 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Manzo in view of Elmer et al. (US 20160004507 A1) hereinafter Elmer.

With regards to claim 2, Manzo teaches all of the limitations of claim 1 above. Manzo further teaches wherein the chained-floating-point-multiply-accumulate circuitry is configured to (Manzo [0030]: processing circuitry for performing a floating point chained multiply accumulate operation)
	generating the sum based on adding the unrounded product, the value based on the first rounding increment, and the third floating-point operand (Manzo [0035]: During the second processing cycle a third input operand A is read from the floating point register file 4 and supplied as one input operand to the adder 12. The unrounded multiplication result is read from the unrounded multiplication result register 18 and passed via the adder-input multiplexer 10 to the other input of the adder 12. Thus, the adder 12 during the second processing clock cycle serves to add the third input operand A to the unrounded multiplication result and generate an unrounded accumulation; Manzo [0037]: the multiplication rounding data stored within the multiply accumulate rounding data register 26 is supplied via chained multiply accumulate compensation circuitry 24 (where it is subject to any adjustment required to take account of the late application of the rounding associated with the multiplication) and from where it is then passed to the carry-save adder 20 as another input operand. A third input operand to the carry-save adder 20 is the unrounded accumulation result from the unrounded accumulation result register).
	Manzo fails to teach align the unrounded product and the third floating-point operand.
	However, Elmer does teach align the unrounded product and the third floating-point operand before generating the sum of Manzo (Elmer [0019]: In one implementation, if A, B and/or C meet a sufficient condition for performing a joint accumulation of C with partial products of A and B, then ExpDelta is used to align a mantissa of C with partial products of mantissas of A and B).
Therefore, it would have been obvious before the effective filing date of the claimed invention for one of ordinary skill in the art to combine the teachings of Manzo with aligning of the inputs as taught by Elmer. One of ordinary skill in the art would be motivated to make this combination because it would ensure that the results are correct when generating the sum.

With regards to claim 14, Manzo teaches all of the limitations of claim 1 above. Manzo further teaches wherein the chained-floating-point-multiply-accumulate circuitry is configured to (Manzo [0030]: processing circuitry for performing a floating point chained multiply accumulate operation).
	Manzo fails to teach truncate the unrounded product before generating the sum.
	However, Elmer does teach truncate the unrounded product before generating the sum of Manzo (Elmer [0017]: The unrounded result is then truncated to generate an unrounded non-redundant intermediate result vector that excludes one or more least significant bits of the unrounded non-redundant result).
Therefore, it would have been obvious before the effective filing date of the claimed invention for one of ordinary skill in the art to combine the teachings of Manzo with truncating the unrounded product as taught by Elmer. One of ordinary skill in the art would be motivated to make this combination because it would generate an unrounded non-redundant intermediate result vector that has a mantissa width equal to a mantissa width of a target data format, which would help make sure there is no overflow as taught by Elmer (Elmer [0018]).

Claims 3 and 4 are rejected under 35 U.S.C. 103 as being unpatentable over Manzo in view of Elmer further in view of Chen et al. (US 20190138274 A1) hereinafter Chen further in view of Langhammer et al. (US 10061579 B1) hereinafter Langhammer.

With regards to claim 3, Manzo in view of Elmer teaches all of the limitations of claim 2 above. Manzo further teaches wherein the chained-floating-point-multiply-accumulate circuitry is configured to: generate the unrounded product (Manzo [0031]: The multiplier 14 multiplies the two input operands it receives to form an unrounded multiplication result).
While Manzo teaches unrounded products, Manzo fails to specifically teach the unrounded product being in an un-normalized form.
However, Chen does teach an unrounded product being in an un-normalized form (Chen [0018]: First, the multiplication circuit 110 receives the operands A and B in stage 1 (step S210) and then multiplies the operands A and B to thereby generate a product D and a product D_r (step S220). The product D is the result which is not trimmed (normalized, or rounded/truncated)).
Therefore, it would have been obvious before the effective filing date of the claimed invention for one of ordinary skill in the art to combine the teachings of Manzo in view of Elmer with the un-normalized form as taught by Chen. One of ordinary skill in the art would be motivated to make this combination because it would speed up calculations as the results would not have to be normalized during calculations.
Manzo in view of Chen fails to teach and before aligning the unrounded product and the third floating-point operand, append an additional bit at a most significant end of a mantissa of the third floating-point operand to align a binary point position of the third floating-point operand with a binary point position of the unrounded product.
However, Langhammer does teach and before aligning the unrounded product and the third floating-point operand, append an additional bit at a most significant end of a mantissa of the third floating-point operand to align a binary point position of the third floating-point operand with a binary point position of the unrounded product (Langhammer Col. 4 lines 32-33: Two bits may be added beyond the most significant bit position; Langhammer Col. 7 Lines 33-35: thereby aligning the MSBs of the mantissas of the first and second double-precision floating-point numbers; the aligning happens after the appending).
Therefore, it would have been obvious before the effective filing date of the claimed invention for one of ordinary skill in the art to combine the teachings of Manzo in view of Elmer further in view of Chen with the appending as taught by Langhammer. One of ordinary skill in the art would be motivated to make this combination because it may absorb any overflow produced by a floating-point arithmetic operation, which would ensure that the results are not wrong due to overflow as taught by Langhammer (Langhammer Col. 4 line 33-34).

With regards to claim 4, Manzo in view of Elmer further in view of Chen further in view of Langhammer teaches all of the limitations of claim 3 above. Manzo further teaches wherein the chained-floating-point-multiply-accumulate circuitry is configured to (Manzo [0030]: processing circuitry for performing a floating point chained multiply accumulate operation).
Manzo fails to teach align the unrounded product and the third floating-point operand based on an exponent difference, wherein the exponent difference has a value corresponding to Exponent difference = |a_exp + b_exp - bias - expc|.
However, Elmer does teach align the unrounded product and the third floating-point operand based on an exponent difference, wherein the exponent difference has a value corresponding to Exponent difference = |a_exp + b_exp - bias - expc| (Elmer [0019]: In one implementation, if A, B and/or C meet a sufficient condition for performing a joint accumulation of C with partial products of A and B, then ExpDelta is used to align a mantissa of C with partial products of mantissas of A and B; Elmer [0026]: The calculation of the exponent difference ExpDelta may also subtract an exponent bias value from the sum of A and B's exponent values minus C's exponent value).
Therefore, it would have been obvious before the effective filing date of the claimed invention for one of ordinary skill in the art to combine the teachings of Manzo in view of Elmer further in view of Chen further in view of Langhammer with the alignment equation as taught by Elmer. One of ordinary skill in the art would be motivated to make this combination because it would ensure that the results are correct when generating the sum.
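
As a worked illustration of the exponent-difference expression quoted for claim 4, the following sketch evaluates |a_exp + b_exp - bias - c_exp| from the biased exponent fields of binary32 values. The format choice, the bias of 127, and the helper names are assumptions made for the example; they are not drawn from Elmer or the application.

```python
import struct

F32_BIAS = 127  # exponent bias of binary32 (assumed format)

def biased_exponent_f32(x: float) -> int:
    """Extract the 8-bit biased exponent field of x encoded as binary32."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return (bits >> 23) & 0xFF

def exponent_difference(a: float, b: float, c: float) -> int:
    """|a_exp + b_exp - bias - c_exp| using biased exponent fields."""
    a_exp = biased_exponent_f32(a)
    b_exp = biased_exponent_f32(b)
    c_exp = biased_exponent_f32(c)
    return abs(a_exp + b_exp - F32_BIAS - c_exp)
```

For a = 2.0, b = 4.0 and c = 1.0 the fields are 128, 129 and 127, giving a difference of 3, which matches the shift needed to line up a product of 8 with an addend of 1.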

Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Manzo in view of Elmer further in view of Chen further in view of Langhammer further in view of Boersma et al (US 20150149522 A1) hereinafter Boersma.

With regards to claim 5, Manzo in view of Elmer further in view of Chen further in view of Langhammer teaches all of the limitations of claim 4 above. Manzo further teaches wherein the chained-floating-point-multiply-accumulate circuitry is configured to: (Manzo [0030]: processing circuitry for performing a floating point chained multiply accumulate operation). 
Manzo fails to teach increment, after calculating the exponent difference, either the exponent associated with the unrounded product or the exponent of the third floating-point operand.
However, Boersma does teach increment, after calculating the exponent difference, either the exponent associated with the unrounded product or the exponent of the third floating-point operand (Boersma [0037]: FPU 102 corrects the exponent, i.e., the exponent for the rounded exponent of the normalized faction result is corrected by incrementing the rounded exponent of the normalized exponent by 1 decimal place).
Therefore, it would have been obvious before the effective filing date of the claimed invention for one of ordinary skill in the art to combine the teachings of Manzo in view of Elmer further in view of Chen further in view of Langhammer with incrementing as taught by Boersma. One of ordinary skill in the art would be motivated to make this combination because it would make sure that the exponents are correct, ensuring that there are no errors during the calculations.

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Manzo in view of Elmer further in view of Chen.

With regards to claim 6, Manzo teaches all of the limitations of claim 1 above. Manzo further teaches wherein the chained-floating-point-multiply-accumulate circuitry is configured to: generate the unrounded product (Manzo [0031]: The multiplier 14 multiplies the two input operands it receives to form an unrounded multiplication result).
While Manzo teaches unrounded products, Manzo fails to specifically teach the unrounded product being in an un-normalized form.
However, Chen does teach an unrounded product being in an un-normalized form (Chen [0018]: First, the multiplication circuit 110 receives the operands A and B in stage 1 (step S210) and then multiplies the operands A and B to thereby generate a product D and a product D_r (step S220). The product D is the result which is not trimmed (normalized, or rounded/truncated)).
Therefore, it would have been obvious before the effective filing date of the claimed invention for one of ordinary skill in the art to combine the teachings of Manzo in view of Elmer with the un-normalized form as taught by Chen. One of ordinary skill in the art would be motivated to make this combination because it would speed up calculations as the results would not have to be normalized during calculations.
	Manzo in view of Chen fails to teach generate an exponent difference based on an exponent associated with the unrounded product and an exponent of the third floating-point operand; and align the unrounded product and the third floating-point operand based on the exponent difference.
	However, Elmer does teach generate an exponent difference based on an exponent associated with the unrounded product and an exponent of the third floating-point operand; (Elmer [0019]: In one implementation, if A, B and/or C meet a sufficient condition for performing a joint accumulation of C with partial products of A and B, then ExpDelta is used to align a mantissa of C with partial products of mantissas of A and B)
	and align the unrounded product and the third floating-point operand based on the exponent difference (Elmer [0019]: In one implementation, if A, B and/or C meet a sufficient condition for performing a joint accumulation of C with partial products of A and B, then ExpDelta is used to align a mantissa of C with partial products of mantissas of A and B).
	Therefore, it would have been obvious before the effective filing date of the claimed invention for one of ordinary skill in the art to combine the teachings of Manzo in view of Chen with aligning of the inputs as taught by Elmer. One of ordinary skill in the art would be motivated to make this combination because it would ensure that the results are correct when generating the sum.

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Manzo in view of Boersma.

With regards to claim 8, Manzo teaches all of the limitations of claim 7 above. Manzo further teaches wherein the floating-point-multiply circuitry comprises (Manzo [0030]: FIG. 1 schematically illustrates processing circuitry for performing a floating point chained multiply accumulate operation; Manzo Fig. 1: shows the circuit with adder circuitry, multiplier circuitry, and rounding circuitry which is part of both the adder and multiplier circuitry)
	and the floating-point-add circuitry comprises (Manzo [0030]: FIG. 1 schematically illustrates processing circuitry for performing a floating point chained multiply accumulate operation; Manzo Fig. 1: shows the circuit with adder circuitry, multiplier circuitry, and rounding circuitry which is part of both the adder and multiplier circuitry).
	Manzo fails to teach a first pipeline stage and a second pipeline stage subsequent to the first pipeline stage.
	However, Boersma teaches a first pipeline stage (Boersma [0023]: In one embodiment, the multiplication of operands may be performed in a first stage of the pipeline)
	a second pipeline stage subsequent to the first pipeline stage (Boersma [0023]: outputting two partial products that need to be added in a later pipeline stage).
	Therefore, it would have been obvious before the effective filing date of the claimed invention for one of ordinary skill in the art to combine the teachings of Manzo with the pipeline stages as taught by Boersma. One of ordinary skill in the art would be motivated to make this combination because it would speed up calculations as the first stage of a second computation could begin before the second stage of the first computation finishes.

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Manzo in view of Boersma further in view of Sanguinetti (“Micro-Analysis of the Titan's Operation Pipe”) hereinafter Sanguinetti.

With regards to claim 9, Manzo in view of Boersma teaches all of the limitations of claim 8 above. Manzo further teaches comprising issue circuitry to issue the instructions decoded by the instruction decode circuitry to the processing circuitry for execution, (Manzo [0030]: The decoder circuitry 2 responds to the decoding of such an instruction to generate control signals which then control the other portions of the pipeline circuitry illustrated in FIG. 1 to perform the operations described below to perform the specified floating point chained multiply accumulate operation)
	[wherein when a first instance of] the chained-floating-point-multiply-accumulate instruction [and a second instance of] the chained-floating-point-multiply-accumulate instruction [are issued sequentially and] input operands [specified by the second instance of] the chained-floating-point-multiply-accumulate instruction [are independent of] a result generated in response to [the first instance of] the chained-floating-point-multiply-accumulate instruction, the floating-point-multiply circuitry [is configured to begin processing the second instance of] the chained-floating-point-multiply-accumulate instruction while the floating-point-add circuitry is processing [the first instance of] the chained-floating-point-multiply-accumulate instruction (Manzo [0030]: processing circuitry for performing a floating point chained multiply accumulate operation... three floating point input operands A, B and C; Manzo [0037]: the rounded accumulate result; Manzo Fig. 1: shows the circuit with adder circuitry, multiplier circuitry, and rounding circuitry which is part of both the adder and multiplier circuitry).
	Manzo fails to teach wherein when a first instance of [the chained-floating-point-multiply-accumulate instruction] and a second instance of [the chained-floating-point-multiply-accumulate instruction] are issued sequentially and [input operands] specified by the second instance of [the chained-floating-point-multiply-accumulate instruction] are independent of [a result generated in response to] the first instance of [the chained-floating-point-multiply-accumulate instruction, the floating-point-multiply circuitry] is configured to begin processing the second instance of [the chained-floating-point-multiply-accumulate instruction while the floating-point-add circuitry is processing] the first instance of [the chained-floating-point-multiply-accumulate instruction].
	However, Sanguinetti does teach wherein when a first instance of [the chained-floating-point-multiply-accumulate instruction] and a second instance of [the chained-floating-point-multiply-accumulate instruction] are issued sequentially and [input operands] specified by the second instance of [the chained-floating-point-multiply-accumulate instruction] are independent of [a result generated in response to] the first instance of [the chained-floating-point-multiply-accumulate instruction, the floating-point-multiply circuitry] is configured to begin processing the second instance of [the chained-floating-point-multiply-accumulate instruction while the floating-point-add circuitry is processing] the first instance of [the chained-floating-point-multiply-accumulate instruction] (Sanguinetti Page 192 The Operation Pipe Section: Standard pipeline design (see [Kogge]) follows a greedy strategy. An entry in the pipe moves to the next stage unless stalled by either a stall condition for the destination stage, e.g. operand fetch stalled, or the stage ahead is stalled).
	Therefore, it would have been obvious before the effective filing date of the claimed invention for one of ordinary skill in the art to combine the teachings of Manzo in view of Boersma with starting the second instance before the first is finished, as taught by Sanguinetti. One of ordinary skill in the art would be motivated to make this combination because it would speed up calculations, as the first stage of a second computation could begin before the second stage of the first computation finishes.
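The overlap described in this rejection can be illustrated with a minimal timing sketch. It assumes a hypothetical two-stage in-order pipeline (one cycle in the multiply stage, one cycle in the add stage, no result forwarding); the function name `schedule` and the register names are illustrative only and do not come from Manzo, Sanguinetti, or the application.

```python
def schedule(instrs: list[tuple[str, tuple[str, ...]]]) -> list[tuple[int, int]]:
    """Return (multiply_cycle, add_cycle) for each instruction, issued in order.

    Assumptions (hypothetical): each stage takes one cycle, and a result
    register becomes readable the cycle after its add stage completes.
    """
    ready: dict[str, int] = {}   # register -> first cycle its value can be read
    timeline: list[tuple[int, int]] = []
    prev_mul = -1                # cycle in which the previous multiply ran
    for dest, srcs in instrs:
        # Greedy strategy: enter the multiply stage as soon as it is free...
        earliest = prev_mul + 1
        # ...unless a source operand is still being produced (stall).
        for src in srcs:
            earliest = max(earliest, ready.get(src, 0))
        mul_cycle = earliest
        add_cycle = mul_cycle + 1
        ready[dest] = add_cycle + 1
        timeline.append((mul_cycle, add_cycle))
        prev_mul = mul_cycle
    return timeline


# Independent operands: the second multiply runs in cycle 1,
# the same cycle in which the first instruction occupies the add stage.
independent = schedule([("r0", ("a", "b", "c")), ("r1", ("d", "e", "f"))])

# Dependent operands: the second instruction reads r0 and must wait.
dependent = schedule([("r0", ("a", "b", "c")), ("r1", ("r0", "e", "f"))])
```

Under these assumptions the independent pair finishes one cycle earlier than the dependent pair, which is the speed-up the stated motivation relies on.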

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Manzo in view of Argade et al (US 20160048374 A1) hereinafter Argade.

With regards to claim 10, Manzo teaches all of the limitations of claim 1 above. Manzo further teaches wherein the chained-floating-point-multiply-accumulate circuitry is configured to determine the value based on the first rounding increment in dependence on at least one of: (Manzo [0032]: The rounding value determination circuitry 22 serves to generate multiplication rounding data)
	Manzo fails to teach whether an exponent associated with the unrounded product is larger than an exponent of the third floating-point operand; and whether the sum based on adding the unrounded product, the value based on the first rounding increment, and the third floating-point operand represents a like-signed addition or an unlike-signed addition.
	However, Argade teaches whether an exponent associated with the unrounded product is larger than an exponent of the third floating-point operand; (Argade [0079]: In some examples, determining the upper value and the lower value based at least in part on adding the third operand to one of the upper intermediate value or the lower intermediate value may include in response to a sign of the product being different from a sign of the third operand, and in response to an exponent of the intermediate value being greater than an exponent of the third operand)
	and whether the sum based on adding the unrounded product, the value based on the first rounding increment, and the third floating-point operand represents a like-signed addition or an unlike-signed addition (Argade [0079]: In some examples, determining the upper value and the lower value based at least in part on adding the third operand to one of the upper intermediate value or the lower intermediate value may include in response to a sign of the product being different from a sign of the third operand, and in response to an exponent of the intermediate value being greater than an exponent of the third operand).
	Therefore, it would have been obvious before the effective filing date of the claimed invention for one of ordinary skill in the art to combine the teachings of Manzo with calculating the value based on which exponent is larger or on whether the addition is like-signed or unlike-signed, as taught by Argade. One of ordinary skill in the art would be motivated to make this combination because it would ensure that the end result is rounded correctly, preventing errors in the calculations.
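The two conditions recited in claim 10 can be stated concretely. The sketch below only evaluates the conditions for a pair of double-precision values; how they feed into the rounding increment is not described in the passages quoted above and is not modeled. The function name `classify_addition` is hypothetical, and Python floats stand in for the hardware operand formats.

```python
import math


def classify_addition(product: float, addend: float) -> tuple[bool, bool]:
    """Return (like_signed, product_exponent_larger) for nonzero finite inputs."""
    # Like-signed addition: both terms carry the same sign bit.
    like_signed = math.copysign(1.0, product) == math.copysign(1.0, addend)
    # math.frexp returns (mantissa, exponent) with 0.5 <= |mantissa| < 1.
    _, product_exp = math.frexp(product)
    _, addend_exp = math.frexp(addend)
    return like_signed, product_exp > addend_exp


# 8.0 + (-1.5): unlike-signed, and the product's exponent is the larger one.
case_unlike = classify_addition(8.0, -1.5)
# 0.25 + 3.0: like-signed, and the addend's exponent is the larger one.
case_like = classify_addition(0.25, 3.0)
```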

Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Manzo in view of Elmer further in view of Burgess et al. (US 20130339412 A1) hereinafter Burgess.

With regards to claim 11, Manzo teaches all of the limitations of claim 1 above. Manzo further teaches
wherein: the chained-floating-point-multiply-accumulate circuitry is configured to ()
and the chained-floating-point-multiply-accumulate circuitry is configured to determine the value based on the first rounding increment [in dependence on values of any shifted-out bits shifted, when aligning] the unrounded product and the third floating-point operand, [to bit positions less significant than a least significant bit position of a result value] obtained by performing the rounding based on the second rounding increment (Manzo [0032]: The rounding value determination circuitry 22 serves to generate multiplication rounding data; Manzo [0031]: The multiplier 14 multiplies the two input operands it receives to form an unrounded multiplication result; Manzo [0035]: During the second processing cycle a third input operand A is read from the floating point register file 4 and supplied as one input operand to the adder 12. The unrounded multiplication result is read from the unrounded multiplication result register 18 and passed via the adder-input multiplexer 10 to the other input of the adder 12; Manzo [0037]: Thus, during the third processing clock cycle illustrated in FIG. 4, the rounding circuitry in the final stage of the pipeline serves to generate accumulate rounding data and the rounded accumulate result which is formed from a carry-save add of the unrounded accumulate result, the accumulate rounding data and the multiplication rounding data).
	Manzo fails to teach align the unrounded product and the third floating-point operand before generating the sum based on adding the unrounded product, the value based on the first rounding increment, and the third floating-point operand; and when aligning.
	However, Elmer does teach align the unrounded product and the third floating-point operand before generating the sum based on adding the unrounded product, the value based on the first rounding increment, and the third floating-point operand (Elmer [0019]: In one implementation, if A, B and/or C meet a sufficient condition for performing a joint accumulation of C with partial products of A and B, then ExpDelta is used to align a mantissa of C with partial products of mantissas of A and B)
	when aligning the unrounded product and the third floating-point operand of Manzo (Elmer [0019]: In one implementation, if A, B and/or C meet a sufficient condition for performing a joint accumulation of C with partial products of A and B, then ExpDelta is used to align a mantissa of C with partial products of mantissas of A and B).
	Therefore, it would have been obvious before the effective filing date of the claimed invention for one of ordinary skill in the art to combine the teachings of Manzo with aligning of the inputs as taught by Elmer. One of ordinary skill in the art would be motivated to make this combination because it would ensure that the results are correct when generating the sum.
	Manzo in view of Elmer fails to teach [and the chained-floating-point-multiply-accumulate circuitry is configured to determine the value based on the first rounding increment] in dependence on values of any shifted-out bits shifted, [when aligning the unrounded product and the third floating-point operand,] to bit positions less significant than a least significant bit position of a result value [obtained by performing the rounding based on the second rounding increment].
	However, Burgess teaches [and the chained-floating-point-multiply-accumulate circuitry is configured to determine the value based on the first rounding increment] in dependence on values of any shifted-out bits shifted, [when aligning the unrounded product and the third floating-point operand,] to bit positions less significant than a least significant bit position of a result value [obtained by performing the rounding based on the second rounding increment] (Burgess [0015]: The value of the rounding value may be dependent on at least one shifted-out bit of the input value that is not present in the shifted value generated by the shifting circuitry. For example, with a right shift, the rounding value may take the value of the most significant shifted-out bit (the bit of the input value lying one place to the right of the bit that becomes the least significant bit of the shifted value); Burgess [0004]: In this way, a sequence of different input values can be added together with the shift operation aligning successive input values with the accumulated value).
	Therefore, it would have been obvious before the effective filing date of the claimed invention for one of ordinary skill in the art to combine the teachings of Manzo in view of Elmer with the value being based on the shifted-out bits as taught by Burgess. One of ordinary skill in the art would be motivated to make this combination because it would ensure that the end result is rounded correctly, preventing errors in the calculations.
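The Burgess passage quoted above describes a rounding value taken from the most significant shifted-out bit of a right shift. A minimal integer sketch of that behavior follows. The function name is hypothetical, the operand is treated as an unsigned significand, and the additional `sticky` flag (whether any lower shifted-out bit is set) is a common convention added here as an assumption; it is not stated in the quoted passage.

```python
def shift_right_with_round_bit(value: int, shift: int) -> tuple[int, int, bool]:
    """Right-shift an unsigned significand and report the bits that fell off.

    Returns (shifted_value, round_bit, sticky), where round_bit is the bit one
    place to the right of the new least significant bit.
    """
    if shift <= 0:
        return value, 0, False
    shifted = value >> shift
    # Most significant shifted-out bit.
    round_bit = (value >> (shift - 1)) & 1
    # Assumption: sticky summarizes all shifted-out bits below the round bit.
    sticky = (value & ((1 << (shift - 1)) - 1)) != 0
    return shifted, round_bit, sticky


# 0b101101 shifted right by 3 keeps 0b101; the shifted-out bits are 1, 0, 1.
aligned, round_bit, sticky = shift_right_with_round_bit(0b101101, 3)
```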

Claims 13, 17-18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Manzo in view of Mueller et al. (US 20210182024 A1) hereinafter Mueller.

With regards to claim 13, Manzo teaches all of the limitations of claim 1 above. Manzo further teaches the processing circuitry (Manzo [0030]: This processing circuitry will typically form a part of the floating point arithmetic pipeline within a processor core).
	Manzo fails to teach comprising a central processing unit or a graphics processing unit, wherein the central processing unit or the graphics processing unit.
	However, Mueller teaches comprising a central processing unit or a graphics processing unit, wherein the central processing unit or the graphics processing unit comprises the processing circuitry of Manzo (Mueller [0091]: In some aspects of the present disclosure, processing system 800 includes a graphics processing unit).
	Therefore, it would have been obvious before the effective filing date of the claimed invention for one of ordinary skill in the art to combine the teachings of Manzo with the graphics processing unit as taught by Mueller. One of ordinary skill in the art would be motivated to make this combination because graphics processing units are more effective than general-purpose CPUs for algorithms where processing of large blocks of data is done in parallel as taught by Mueller (Mueller [0091]).

With regards to claim 17, Manzo teaches all of the limitations of claim 1 above. Manzo further teaches [wherein when one or more of] the first floating-point operand, the second floating-point operand and the third floating-point operand [comprises a sub-normal floating point value,] the chained-floating-point-multiply-accumulate circuitry is configured to [treat the sub-normal floating point value as zero] when processing the chained-floating-point-multiply-accumulate instruction (Manzo [0030]: processing circuitry for performing a floating point chained multiply accumulate operation… three floating point input operands A, B and C).
	Manzo fails to teach wherein when one or more of [the first floating-point operand, the second floating-point operand and the third floating-point operand] comprises a sub-normal floating point value, [the chained-floating-point-multiply-accumulate circuitry is configured to] treat the sub-normal floating point value as zero [when processing the chained-floating-point-multiply-accumulate instruction].
	However, Mueller teaches wherein when one or more of [the first floating-point operand, the second floating-point operand and the third floating-point operand] comprises a sub-normal floating point value, [the chained-floating-point-multiply-accumulate circuitry is configured to] treat the sub-normal floating point value as zero [when processing the chained-floating-point-multiply-accumulate instruction] (Mueller [0028]: this approach flushes subnormal operands to zero).
	Therefore, it would have been obvious before the effective filing date of the claimed invention for one of ordinary skill in the art to combine the teachings of Manzo with flushing the operands to zero as taught by Mueller. One of ordinary skill in the art would be motivated to make this combination because this approach can provide a two-fold increase in throughput over a conventional fp32 FMA as taught by Mueller (Mueller [0028]).

With regards to claim 18, Manzo teaches all of the limitations of claim 1 above. Manzo further teaches wherein the chained-floating-point-multiply-accumulate circuitry is configured to [flush] the unrounded product or a result value calculated by the chained-floating-point-multiply-accumulate circuitry [to zero in response to determining that the] unrounded product or the result value [is too small to represent as a normalized floating-point number in a floating-point format to be used for the result value] (Manzo [0030]: processing circuitry for performing a floating point chained multiply accumulate operation; Manzo [0031]: The multiplier 14 multiplies the two input operands it receives to form an unrounded multiplication result; Manzo [0037]: the rounded accumulate result).
	Manzo fails to teach [wherein the chained-floating-point-multiply-accumulate circuitry is configured to] flush [the unrounded product or a result value calculated by the chained-floating-point-multiply-accumulate circuitry] to zero in response to determining that [the unrounded product or the result value] is too small to represent as a normalized floating-point number in a floating-point format to be used for the result value.
	However, Mueller teaches [wherein the chained-floating-point-multiply-accumulate circuitry is configured to] flush [the unrounded product or a result value calculated by the chained-floating-point-multiply-accumulate circuitry] to zero in response to determining that [the unrounded product or the result value] is too small to represent as a normalized floating-point number in a floating-point format to be used for the result value (Mueller [0028]: forces subnormal results to zero).
	Therefore, it would have been obvious before the effective filing date of the claimed invention for one of ordinary skill in the art to combine the teachings of Manzo with flushing the results to zero as taught by Mueller. One of ordinary skill in the art would be motivated to make this combination because this approach can provide a two-fold increase in throughput over a conventional fp32 FMA as taught by Mueller (Mueller [0028]).
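The flush-to-zero behavior relied on for claims 17 and 18 can be sketched in software. This is an illustration under stated assumptions, not Mueller's circuit: it uses Python's double-precision floats rather than fp32, the helper names `flush_subnormal` and `chained_mac_ftz` are hypothetical, and only the final result (not the intermediate product) is checked on the output side.

```python
import math
import sys


def flush_subnormal(x: float) -> float:
    """Replace a subnormal value with a zero of the same sign."""
    # sys.float_info.min is the smallest positive normalized double.
    if x != 0.0 and abs(x) < sys.float_info.min:
        return math.copysign(0.0, x)
    return x


def chained_mac_ftz(a: float, b: float, c: float) -> float:
    """Compute a*b + c with subnormal inputs and a subnormal result flushed.

    The product is rounded before the addition (two roundings), which matches
    chained rather than fused multiply-accumulate semantics.
    """
    a, b, c = (flush_subnormal(v) for v in (a, b, c))
    return flush_subnormal(a * b + c)


# A subnormal input operand is treated as zero (claim 17 behavior).
input_flushed = chained_mac_ftz(5e-324, 1.0, 2.0)
# A result too small to be normalized is forced to zero (claim 18 behavior).
result_flushed = chained_mac_ftz(1e-160, 1e-160, 0.0)
```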

With regards to claim 20, Manzo teaches an apparatus comprising: instruction decode circuitry to decode instructions; (Manzo [0030]: The floating point chained multiply accumulate pipeline of FIG. 1 comprises decoder circuitry 2 for decoding a program instruction specifying a floating point chained multiply accumulate operation to be performed)
	and processing circuitry to execute the instructions decoded by the instruction decode circuitry, wherein the processing circuitry comprises chained-floating-point-multiply-accumulate circuitry (Manzo [0030]: processing circuitry for performing a floating point chained multiply accumulate operation)
	responsive to a chained-floating-point-multiply-accumulate instruction decoded by the instruction decoder, (Manzo [0030]: processing circuitry for performing a floating point chained multiply accumulate operation)
	the chained-floating-point-multiply-accumulate instruction specifying a first floating-point operand, a second floating-point operand and a third floating-point operand, (Manzo [0030]: The decoder circuitry 2 responds to the decoding of such an instruction to generate control signals which then control the other portions of the pipeline circuitry illustrated in FIG. 1 to perform the operations described below to perform the specified floating point chained multiply accumulate operation. A floating point register file 4 stores three floating point input operands A, B and C)
	to: generate an unrounded product based on multiplying the first floating-point operand and the second floating-point operand; (Manzo [0031]: The multiplier 14 multiplies the two input operands it receives to form an unrounded multiplication result)
	generate a first rounding increment based on the unrounded product; (Manzo [0032]: The rounding value determination circuitry 22 serves to generate multiplication rounding data)
	generate a sum based on adding the unrounded product, a value based on the first rounding increment, and the third floating-point operand; (Manzo [0035]: During the second processing cycle a third input operand A is read from the floating point register file 4 and supplied as one input operand to the adder 12. The unrounded multiplication result is read from the unrounded multiplication result register 18 and passed via the adder-input multiplexer 10 to the other input of the adder 12. Thus, the adder 12 during the second processing clock cycle serves to add the third input operand A to the unrounded multiplication result and generate an unrounded accumulation; Manzo [0037]: the multiplication rounding data stored within the multiply accumulate rounding data register 26 is supplied via chained multiply accumulate compensation circuitry 24 (where it is subject to any adjustment required to take account of the late application of the rounding associated with the multiplication) and from where it is then passed to the carry-save adder 20 as another input operand. A third input operand to the carry-save adder 20 is the unrounded accumulation result from the unrounded accumulation result register)
	determine a second rounding increment based on the sum; (Manzo [0032]: accumulation rounding data derived from the unrounded accumulate result)
	and perform rounding based on the second rounding increment (Manzo [0037]: Thus, during the third processing clock cycle illustrated in FIG. 4, the rounding circuitry in the final stage of the pipeline serves to generate accumulate rounding data and the rounded accumulate result which is formed from a carry-save add of the unrounded accumulate result, the accumulate rounding data and the multiplication rounding data).
	Manzo fails to teach A non-transitory computer-readable medium to store computer-readable code for fabrication of the apparatus.
	However, Mueller teaches A non-transitory computer-readable medium to store computer-readable code for fabrication of the apparatus of Manzo (Mueller [0030]: the various components, modules, engines, etc. described regarding FIG. 1 can be implemented as instructions stored on a computer-readable storage medium).
	Therefore, it would have been obvious before the effective filing date of the claimed invention for one of ordinary skill in the art to combine the teachings of Manzo with the non-transitory computer-readable medium as taught by Mueller. One of ordinary skill in the art would be motivated to make this combination because it would allow the system to be manufactured, enabling widespread adoption of the system.
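The data flow recited in claim 20 (unrounded product, first rounding increment, three-input sum, second rounding increment, final rounding) can be followed in a toy model. The sketch below is a fixed-point simplification with non-negative integers and round-half-up rounding; it ignores exponents, alignment, normalization, signs, and other rounding modes, so it is neither Manzo's circuit nor the claimed apparatus. The bit widths `F` and `R` and all function names are assumptions made for illustration.

```python
F = 8  # assumed: fractional bits kept when the product is rounded
R = 4  # assumed: fractional bits kept in the final result


def round_increment(value: int, drop: int) -> int:
    """Round-half-up increment: the top bit among the `drop` discarded bits."""
    return (value >> (drop - 1)) & 1


def round_then_add(a: int, b: int, c: int) -> int:
    """Reference: round the product first, then add, then round the sum."""
    product = a * b                                  # 2*F fractional bits
    rounded_product = (product >> F) + round_increment(product, F)
    total = rounded_product + c
    return (total >> (F - R)) + round_increment(total, F - R)


def deferred_rounding(a: int, b: int, c: int) -> int:
    """Add the unrounded product, its rounding increment, and the addend together."""
    product = a * b
    unrounded = product >> F                         # truncated, not rounded
    first_increment = round_increment(product, F)
    total = unrounded + first_increment + c          # three-input sum
    second_increment = round_increment(total, F - R)
    return (total >> (F - R)) + second_increment


# With 8 fractional bits, 0x1A3 is about 1.637, 0x0F7 about 0.965, 0x055 about 0.332.
reference = round_then_add(0x1A3, 0x0F7, 0x055)
deferred = deferred_rounding(0x1A3, 0x0F7, 0x055)
```

In this model both functions return the same value for every input, because the first increment is simply carried into the sum instead of being applied beforehand; the hardware benefit discussed in the references is that the separate product-rounding step is taken off the critical path.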

Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Manzo in view of Oklobdzija et al. (WO 2022204069 A1) hereinafter Oklobdzija.

With regards to claim 15, Manzo teaches all of the limitations of claim 1 above. Manzo further teaches wherein: the chained-floating-point-multiply-accumulate circuitry comprises a [3:2 carry save] adder; (Manzo [0031]: The input operands from the input operand registers 6, 8 may be supplied to either the adder)
	and the chained-floating-point-multiply-accumulate circuitry is configured to generate the sum using the [3:2 carry save] adder (Manzo [0031]; Manzo [0035]; Manzo [0037]).
	Manzo fails to teach the 3:2 carry save adder.
	However, Oklobdzija does teach the 3:2 carry save adder (Oklobdzija Page 23 Line 7: The 42-Bits 3:2 Carry-save Adder circuit 614 has three inputs).
	Therefore, it would have been obvious before the effective filing date of the claimed invention for one of ordinary skill in the art to combine the teachings of Manzo with the 3:2 carry save adder as taught by Oklobdzija. One of ordinary skill in the art would be motivated to make this combination because it would avoid long carry propagation and speed up the operation as taught by Oklobdzija (Oklobdzija abstract).
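For readers unfamiliar with the term, a 3:2 carry-save adder reduces three inputs to a sum word and a carry word without propagating carries across bit positions; a single conventional addition then resolves the two words. The sketch below shows the standard bitwise formulation on non-negative Python integers. The function names are illustrative and the code is not taken from Oklobdzija or Manzo.

```python
def carry_save_add(a: int, b: int, c: int) -> tuple[int, int]:
    """3:2 compression of three non-negative integers into (sum_word, carry_word)."""
    # Each bit position acts as an independent full adder.
    sum_word = a ^ b ^ c
    # A carry is generated where at least two inputs have a 1; it has the
    # weight of the next bit position, hence the left shift.
    carry_word = ((a & b) | (a & c) | (b & c)) << 1
    return sum_word, carry_word


def add_three(a: int, b: int, c: int) -> int:
    """Resolve the carry-save pair with one carry-propagating addition."""
    sum_word, carry_word = carry_save_add(a, b, c)
    return sum_word + carry_word


# 5 + 3 + 6: the compressor yields sum word 0 and carry word 14.
pair = carry_save_add(5, 3, 6)
total = add_three(5, 3, 6)
```

Only the final `sum_word + carry_word` step involves a carry chain, which is the source of the speed advantage cited in the motivation above.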

Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Jakob O Gudas, whose telephone number is (571) 272-0695. The examiner can normally be reached Monday-Thursday, 7:30 AM-5:00 PM, and Friday, 7:30 AM-4:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, James Trujillo can be reached at (571) 272-3677. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/J.O.G./Examiner, Art Unit 2151

/James Trujillo/Supervisory Patent Examiner, Art Unit 2151
Prosecution Timeline

Mar 22, 2022
Application Filed
Jul 31, 2025
Non-Final Rejection — §101, §102, §103
Nov 05, 2025
Response Filed
Feb 23, 2026
Final Rejection — §101, §102, §103 (current)
Precedent Cases

Applications granted by this same examiner with similar technology

17/485,179
Patent 12602200
ANALOG MULTIPLY-ACCUMULATE UNIT FOR MULTIBIT IN-MEMORY CELL COMPUTING
2y 5m to grant Granted Apr 14, 2026
17/765,495
Patent 12566586
HIGH-SPEED QUANTUM RANDOM NUMBER GENERATOR BASED ON VACUUM STATE FLUCTUATION TECHNOLOGY
2y 5m to grant Granted Mar 03, 2026
Prosecution Projections

3-4
Expected OA Rounds
44%
Grant Probability
99%
With Interview (+71.1%)
4y 2m
Median Time to Grant
Moderate
PTA Risk
Based on 9 resolved cases by this examiner. Grant probability derived from career allow rate.