DETAILED ACTION
Claims 1-18 are pending in the case. Claims 1, 7, and 13 are independent claims.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Drawings
The drawings are objected to because the instruction encoding tables in Figures 4, 8, and 16 are cut off.
The drawings are further objected to for failing to comply with 37 CFR 1.84(a)(1) and 37 CFR 1.84(l), which require that the drawings be in black and that all drawings be made by a process which will give them satisfactory reproduction characteristics. Every line, number, and letter must be durable, clean, solid black (except for color drawings), sufficiently dense and dark, and uniformly thick and well-defined. The weight of all lines and letters must be heavy enough to permit adequate reproduction. This requirement applies to all lines however fine, to shading, and to lines representing cut surfaces in sectional views. Lines and strokes of different thicknesses may be used in the same drawing where different thicknesses have a different meaning. Zooming in on the figures shows pixelation, which indicates that these drawings were not drawn in black.
Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.
Specification
The lengthy specification has not been checked to the extent necessary to determine the presence of all possible minor errors. Applicant’s cooperation is requested in correcting any errors of which applicant may become aware in the specification.
The disclosure is objected to because of the following informalities:
In the abstract, line 4, replace “odd 16-bit floating point values” with “16-bit floating point values from odd data element positions”.
In paragraph 1, line 5, “disproportional” should read --disproportionately--.
In paragraphs 4, 5, 9, 10, 14, 15, 19, 20, 43, 47, 60, 64, 77, 81, 94, 98, line 1, insert --a-- before “single decoded instruction”.
In paragraphs 6, 11, 16, 21, 51, 68, 85, 102, line 1, insert --a-- before “method”.
In paragraph 42, line 7, insert --an-- before “odd”.
In paragraph 42, line 9, insert --an-- before “even”.
In paragraphs 45, 49, 62, 66, last line, “to BF16” should read --to a BF16--.
In paragraphs 79, 83, last line, “to FP16” should read --to a FP16--.
In paragraphs 52, 69, 86, 103, first lines, insert --a-- before “destination operand”. This additionally applies to the three instances in paragraph 201.
In paragraphs 55, 72, 89, 106, line 2, insert --is/are-- before “scheduled”.
In paragraph 120, line 5, insert --the-- before “first processor 1970”.
In paragraph 138, line 4, “reservations” should read --reservation--.
In paragraph 138, line 19, delete “a” in “a register maps”.
In paragraph 142, line 1, “instructions sets” should read --instruction sets--.
In paragraph 143, line 2, “1,024 bits” should read --1,024-bit--.
In paragraph 144, line 9, “consists” should read --consist--.
In paragraph 153, line 5, “though” should read --through--.
In paragraph 169, line 1, insert --be-- after “may”.
In paragraph 171, line 1, “Bit position B” should read --Bit position 0--.
In paragraph 172, line 8, “25 04 being” should read --2504 is being--.
In paragraphs 178, 183, line 1, insert --an-- before “instruction”.
In paragraphs 178, 183, line 1, “support” should read --supports--.
In paragraph 179, line 1, “in” should be capitalized and read --In--.
In paragraph 179, line 4, “need” should read --needed--.
In paragraph 185, line 4, “register” should read --registers--.
In paragraph 188, line 4, “consist” should read --consists--.
In paragraph 190, line 4, “a opmask” should read --an opmask--.
In paragraph 191, line 2, “an upper” should read --the upper--.
Appropriate correction is required.
Claim Objections
Claims 1, 7, and 13 are objected to because of the following informalities:
In line 4, insert --a-- before “destination operand”.
Claim 7 is objected to because of the following informalities:
In lines 9 and 10, insert --instruction-- before “set”.
Claim 13 is objected to because of the following informalities:
In line 8, insert --an-- before “instance”.
Appropriate correction is required.
Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The filing of a terminal disclaimer by itself is not a complete reply to a nonstatutory double patenting (NSDP) rejection. A complete reply requires that the terminal disclaimer be accompanied by a reply requesting reconsideration of the prior Office action. Even where the NSDP rejection is provisional the reply must be complete. See MPEP § 804, subsection I.B.1. For a reply to a non-final Office action, see 37 CFR 1.111(a). For a reply to final Office action, see 37 CFR 1.113(c). A request for reconsideration while not provided for in 37 CFR 1.113(c) may be filed after final for consideration. See MPEP §§ 706.07(e) and 714.13.
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The actual filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/apply/applying-online/eterminal-disclaimer.
Claims 1-18 are provisionally rejected on the ground of nonstatutory obviousness-type double patenting as being unpatentable over claims 1-18, respectively, of copending Application No. 17/560534 in view of Eapen (US 20170031682 A1) in further view of Kashyap (US 20190042544 A1).
The claims of the instant application and the claims of the reference copending application are compared in the table below.
Instant Application No. 17/560557
Copending Application No. 17/560534
1. An apparatus comprising:
decoder circuitry to decode a single instruction,
the single instruction to include fields for an opcode, an identification of a source operand, and an identification of destination operand,
wherein the opcode is to indicate instruction processing circuitry is to convert 16-bit floating point values from odd data element positions from the identified source operand into 32-bit floating point values using round to nearest even and
store the 32-bit floating point values in data element positions of the identified destination operand; and
instruction processing circuitry to execute the decoded instruction according to the opcode.
1. An apparatus comprising:
decoder circuitry to decode a single instruction,
the single instruction to include fields for an opcode, an identification of source operand location, and an identification of destination operand location,
wherein the opcode is to indicate instruction processing circuitry is to convert a 16-bit floating-point value from the identified source operand location into a 32-bit floating point value and
store that 32-bit floating point value in one or more data element positions of the identified destination operand; and
instruction processing circuitry to execute the decoded instruction according to the opcode.
2. The apparatus of claim 1, wherein the field for the identifier of the source operand is to identify a vector register.
2. The apparatus of claim 1, wherein the field for an identification of the source operand location is to identify a vector register.
3. The apparatus of claim 1, wherein the field for the identifier of the source operand is to identify a memory location.
3. The apparatus of claim 1, wherein the field for an identification of the source operand location is to identify a memory location.
4. The apparatus of claim 1, wherein the 16-bit floating point values are BF16 values.
4. The apparatus of claim 1, wherein the 16-bit floating-point value is a BF16 value.
5. The apparatus of claim 4, wherein instruction processing circuitry is to convert the BF16 values to 32-bit floating point values by appending sixteen zeros to each of the BF16 values.
5. The apparatus of claim 4, wherein to convert the BF16 value to the 32-bit floating point value, the instruction processing circuitry is to append sixteen zeros to the BF16 value.
6. The apparatus of claim 1, wherein the 16-bit floating point values are FP16 values.
6. The apparatus of claim 1, wherein the 16-bit floating-point value is a FP16 value.
7. A method comprising:
translating a single instruction of a first instruction set into one or more instructions of a second instruction set,
the single instruction to include fields for an opcode, an identification of a source operand, and an identification of destination operand,
wherein the opcode is to indicate instruction processing circuitry is to convert 16-bit floating point values from odd data element positions from the identified source operand into 32-bit floating point values using round to nearest even and
store the 32-bit floating point values in data element positions of the identified destination operand decoding a one or more instructions of the second set; and
executing the decoded one or more instructions of the second set according to the opcode of the single instruction of the first instruction set.
7. A method comprising:
translating a single instruction of a first instruction set architecture into one or more instructions of a second, different instruction set architecture,
the single instruction to include fields for an opcode, an identification of source operand location, and an identification of destination operand location,
wherein the opcode is to indicate instruction processing circuitry is to convert a 16-bit floating-point value from the identified source operand location into a 32-bit floating point value and
store that 32-bit floating point value in one or more data element positions of the identified destination operand decoding one or more instructions of a second, different instruction set architecture; and
executing the decoded one or more instructions of a second, different instruction set architecture according to the opcode of the single instruction of the first instruction set architecture.
8. The method of claim 7, wherein the field for the identifier of the source operand is to identify a vector register.
8. The method of claim 7, wherein the field for an identification of the source operand location is to identify a vector register.
9. The method of claim 7, wherein the field for the identifier of the source operand is to identify a memory location.
9. The method of claim 7, wherein the field for an identification of the source operand location is to identify a memory location.
10. The method of claim 7, wherein the 16-bit floating point values are BF16 values.
10. The method of claim 7, wherein the 16-bit floating-point value is a BF16 value.
11. The method of claim 10, wherein converting the BF16 values to 32-bit floating point values comprises appending sixteen zeros to each of the BF16 values.
11. The method of claim 10, wherein converting the BF16 value to the 32-bit floating point value comprises appending sixteen zeros to the BF16 value.
12. The method of claim 7, wherein the 16-bit floating point values are FP16 values.
12. The method of claim 7, wherein the 16-bit floating-point value is a FP16 value.
13. A system comprising:
memory to store an instance of a single instruction,
the single instruction to include fields for an opcode, an identification of a source operand, and an identification of destination operand,
wherein the opcode is to indicate instruction processing circuitry is to convert 16-bit floating point values from odd data element positions from the identified source operand into 32-bit floating point values using round to nearest even and
store the 32-bit floating point values in data element positions of the identified destination operand
decoder circuitry to decode instance of the single instruction; and
instruction processing circuitry to execute the decoded instance of the single instruction according to the opcode.
13. A system comprising:
memory to store at least one instance of a single instruction,
the single instruction to include fields for an opcode, an identification of source operand location, and an identification of destination operand location,
wherein the opcode is to indicate instruction processing circuitry is to convert a 16-bit floating-point value from the identified source operand location into a 32-bit floating point value and
store that 32-bit floating point value in one or more data element positions of the identified destination operand;
decoder circuitry to decode the at least one instance of the single instruction; and
instruction processing circuitry to execute the decoded the at least one instance of the single instruction according to the opcode.
14. The system of claim 13, wherein the field for the identifier of the source operand is to identify a vector register.
14. The system of claim 14, wherein the field for an identification of the source operand location is to identify a vector register.
15. The system of claim 13, wherein the field for the identifier of the source operand is to identify a memory location.
15. The system of claim 14, wherein the field for an identification of the source operand location is to identify a memory location.
16. The system of claim 13, wherein the 16-bit floating point values are BF16 values.
16. The system of claim 14, wherein the 16-bit floating-point value is a BF16 value.
17. The system of claim 16, wherein instruction processing circuitry is to convert the BF16 values to 32-bit floating point values by appending sixteen zeros to each of the BF16 values.
17. The system of claim 16, wherein to convert the BF16 value to the 32-bit floating point value, the instruction processing circuitry is to append sixteen zeros to the BF16 value.
18. The system of claim 13, wherein the 16-bit floating point values are FP16 values.
18. The system of claim 14, wherein the 16-bit floating-point value is a FP16 value.
The majority of bolded differences are stylistic and are herein addressed. The instant application recites “source operand” whereas the reference application recites “source operand location”. The reference application’s “source operand location” reads on the “source operand” of the instant claims. Moreover, the instant application recites “instruction set” whereas the reference application recites “instruction set architecture”. The reference application’s “instruction set architecture” reads on the “instruction set” of the instant claims.
Claim 1
Claim 1 of the reference copending application recites all of the limitations of claim 1 of the instant application except for 16-bit floating point values “from odd data element positions” and “using round to nearest even”. However, Eapen teaches a type 2 instruction which produces a resulting set based on odd-numbered elements of the input. Specifically, for type 2 the result elements “R0, R1, R2, R3 correspond to input elements X1, X3, X5, X7 respectively” (Eapen, FIG. 9 and [0076]).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to have modified the apparatus recited in claim 1 of copending Application No. 17/560534 to include the limitation of 16-bit floating point values “from odd data element positions”, as taught by Eapen. Doing so would allow for a more efficient hardware implementation and decrease overhead by allowing “instructions to operate directly on a mask corresponding to a packed vector”. Each processing lane can work with a subset of the input vector without requiring data unpacking or additional register manipulation (Eapen, [0043-0045]).
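For illustration only, the limitation at issue can be sketched in Python: select the elements at odd positions of a source vector (X1, X3, X5, X7, as in the type 2 example quoted from Eapen) and widen each BF16 bit pattern to a 32-bit floating point value by appending sixteen zero bits, as recited in claim 5. The function names and the representation of a vector as a list of 16-bit integers are assumptions made for this sketch; they do not appear in the claims or in the cited references.

```python
import struct


def bf16_bits_to_fp32(bits):
    """Widen one BF16 bit pattern to FP32 by appending sixteen zero bits."""
    fp32_bits = (bits & 0xFFFF) << 16  # BF16 becomes the upper half of the FP32 word
    return struct.unpack("<f", struct.pack("<I", fp32_bits))[0]


def convert_odd_bf16(source):
    """Convert the elements at odd positions (1, 3, 5, ...) of a BF16 vector."""
    return [bf16_bits_to_fp32(b) for b in source[1::2]]


# Hypothetical source vector holding BF16 encodings of 1.0, 2.0, 0.0, -3.0.
src = [0x3F80, 0x4000, 0x0000, 0xC040]
print(convert_odd_bf16(src))  # [2.0, -3.0]
```

Result elements 0 and 1 correspond to source elements 1 and 3, mirroring the R0 = X1, R1 = X3 mapping quoted above.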
Kashyap teaches using round to nearest even ([0133] “Round operation control field 959A—just as round operation control field 958, its content distinguishes which one of a group of rounding operations to perform (e.g., Round-up, Round-down, Round-towards-zero and Round-to-nearest)” and [0223] “Example 8 includes the substance of the exemplary processor of Example 1, wherein the rounding mode is one of round to nearest even”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination by incorporating the teachings of Kashyap to use round to nearest even as a rounding mode. Doing so would allow for a rounding mode which provides statistically unbiased rounding.
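The “statistically unbiased” property noted above can be illustrated with a short Python sketch that compares round to nearest even against round half up on exact ties. The sample values are assumptions chosen for this illustration and are not taken from Kashyap; Python's built-in round() is used because it resolves ties to the even integer.

```python
import math

# Hypothetical sample: values that lie exactly halfway between two integers.
ties = [0.5, 1.5, 2.5, 3.5]

# Round to nearest even: ties go to the neighbor whose last digit is even.
nearest_even = [round(x) for x in ties]

# Round half up: ties always go to the larger neighbor.
half_up = [math.floor(x + 0.5) for x in ties]

print(sum(ties))                          # 8.0
print(nearest_even, sum(nearest_even))    # [0, 2, 2, 4] 8
print(half_up, sum(half_up))              # [1, 2, 3, 4] 10
```

On this sample the tie errors under round to nearest even cancel, while round half up accumulates a positive drift.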
Claims 2-6
Claims 2-6 of the reference copending application recite the same limitations as claims 2-6 of the instant application, respectively.
Claim 7
Claim 7 of the reference copending application recites all of the limitations of claim 7 of the instant application except for 16-bit floating point values “from odd data element positions” and “using round to nearest even”. Claim 7 of the instant application is therefore rejected on the same premises as claim 1.
Claims 8-12
Claims 8-12 of the reference copending application recite the same limitations as claims 8-12 of the instant application, respectively.
Claim 13
Claim 13 of the reference copending application recites all of the limitations of claim 13 of the instant application except for 16-bit floating point values “from odd data element positions” and “using round to nearest even”. Claim 13 of the instant application is therefore rejected on the same premises as claim 1.
Claims 14-18
Claims 14-18 of the reference copending application recite the same limitations as claims 14-18 of the instant application, respectively.
Claims 1-4, 6-10, 12-16, and 18 are provisionally rejected on the ground of nonstatutory obviousness-type double patenting as being unpatentable over claims 1-4, 6-10, 12-16, and 18 of copending Application No. 17/560547 in view of Eapen (US 20170031682 A1) in further view of Kashyap (US 20190042544 A1).
The claims of the instant application and the claims of the reference copending application are compared in the table below.
Instant Application No. 17/560557
Copending Application No. 17/560547
1. An apparatus comprising:
decoder circuitry to decode a single instruction,
the single instruction to include fields for an opcode, an identification of a source operand, and an identification of destination operand,
wherein the opcode is to indicate instruction processing circuitry is to convert 16-bit floating point values from odd data element positions from the identified source operand into 32-bit floating point values using round to nearest even and
store the 32-bit floating point values in data element positions of the identified destination operand; and
instruction processing circuitry to execute the decoded instruction according to the opcode.
1. An apparatus comprising:
decoder circuitry to decode a single instruction,
the single instruction to include fields for an opcode, an identification of source operands, and an identification of destination operand,
wherein the opcode is to indicate execution circuitry and/or memory access circuitry is to convert 32-bit floating point values from the identified source operands into 16-bit floating point values and
store 16-bit floating point values in data element positions of the identified destination operand; and
instruction processing circuitry to execute the decoded instruction according to the opcode.
2. The apparatus of claim 1, wherein the field for the identifier of the source operand is to identify a vector register.
2. The apparatus of claim 1, wherein the fields for an identification of the source operands location is to identify two vector registers.
3. The apparatus of claim 1, wherein the field for the identifier of the source operand is to identify a memory location.
3. The apparatus of claim 1, wherein the fields for an identification of the source operands location is to identify a memory location.
4. The apparatus of claim 1, wherein the 16-bit floating point values are BF16 values.
4. The apparatus of claim 1, wherein the 16-bit floating-point value is a BF16 value.
6. The apparatus of claim 1, wherein the 16-bit floating point values are FP16 values.
6. The apparatus of claim 1, wherein the 16-bit floating-point value is a FP16 value.
7. A method comprising:
translating a single instruction of a first instruction set into one or more instructions of a second instruction set,
the single instruction to include fields for an opcode, an identification of a source operand, and an identification of destination operand,
wherein the opcode is to indicate instruction processing circuitry is to convert 16-bit floating point values from odd data element positions from the identified source operand into 32-bit floating point values using round to nearest even and
store the 32-bit floating point values in data element positions of the identified destination operand decoding a one or more instructions of the second set; and
executing the decoded one or more instructions of the second set according to the opcode of the single instruction of the first instruction set.
7. A method comprising:
translating an instance of a single instruction from a first instruction set to one or more instructions of a second, different instruction set,
the single instruction to include fields for an opcode, an identification of source operands, and an identification of destination operand,
wherein the opcode is to indicate execution circuitry and/or memory access circuitry is to convert 32-bit floating point values from the identified source operands into 16-bit floating point values and
store 16-bit floating point values in data element positions of the identified destination operand; and
executing the decoded instruction according to the opcode.
8. The method of claim 7, wherein the field for the identifier of the source operand is to identify a vector register.
8. The method of claim 7, wherein the fields for an identification of the source operands location is to identify two vector registers.
9. The method of claim 7, wherein the field for the identifier of the source operand is to identify a memory location.
9. The method of claim 7, wherein the fields for an identification of the source operands location is to identify a memory location.
10. The method of claim 7, wherein the 16-bit floating point values are BF16 values.
10. The method of claim 7, wherein the 16-bit floating-point value is a BF16 value.
12. The method of claim 7, wherein the 16-bit floating point values are FP16 values.
12. The method of claim 7, wherein the 16-bit floating-point value is a FP16 value.
13. A system comprising:
memory to store an instance of a single instruction,
the single instruction to include fields for an opcode, an identification of a source operand, and an identification of destination operand,
wherein the opcode is to indicate instruction processing circuitry is to convert 16-bit floating point values from odd data element positions from the identified source operand into 32-bit floating point values using round to nearest even and
store the 32-bit floating point values in data element positions of the identified destination operand decoder circuitry to decode instance of the single instruction; and
instruction processing circuitry to execute the decoded instance of the single instruction according to the opcode.
13. A system comprising:
a memory to store an instance of single instruction,
the single instruction to include fields for an opcode, an identification of source operands, and an identification of destination operand,
wherein the opcode is to indicate execution circuitry and/or memory access circuitry is to convert 32-bit floating point values from the identified source operands into 16-bit floating point values and
store 16-bit floating point values in data element positions of the identified destination operand; decoder circuitry to decode the instance of the single instruction; and
instruction processing circuitry to execute the decoded instruction according to the opcode.
14. The system of claim 13, wherein the field for the identifier of the source operand is to identify a vector register.
14. The system of claim 13, wherein the fields for an identification of the source operands location is to identify two vector registers.
15. The system of claim 13, wherein the field for the identifier of the source operand is to identify a memory location.
15. The system of claim 13, wherein the fields for an identification of the source operands location is to identify a memory location.
16. The system of claim 13, wherein the 16-bit floating point values are BF16 values.
16. The system of claim 13, wherein the 16-bit floating-point value is a BF16 value.
18. The system of claim 13, wherein the 16-bit floating point values are FP16 values.
18. The system of claim 13, wherein the 16-bit floating-point value is a FP16 value.
The majority of bolded differences are stylistic and are herein addressed. The instant application recites “source operand” whereas the reference application recites “source operand location”. “Source operand location” reads on the claimed “source operand”. Furthermore, the instant application recites “instruction set” whereas the reference application recites “instruction set architecture”. “Instruction set architecture” reads on the claimed “instruction set”. Furthermore, the instant application recites “instruction processing circuitry” whereas the reference application recites “execution circuitry and/or memory access circuitry”. Reference application’s “execution circuitry and/or memory access circuitry” reads on the claimed “instruction processing circuitry”.
Claim 1
Claim 1 of the reference copending application recites all of the limitations of claim 1 of the instant application except for 16-bit floating point values “from odd data element positions” and “using round to nearest even” and the complementary operation of “convert 16-bit floating point values from the identified source operands into 32-bit floating point values and store 32-bit floating point values”. However, Eapen teaches a type 2 instruction which produces a resulting set based on odd-numbered elements of the input. Specifically, for type 2 the result elements “R0, R1, R2, R3 correspond to input elements X1, X3, X5, X7 respectively.” (Eapen, FIG. 9 and [0076]). Furthermore, Eapen teaches an “element size increasing instruction may implement a doubling of a number of bits in each element, i.e. N=2M” (Eapen, [0045 and 0046]), which corresponds to the conversion between 16-bit and 32-bit floating point values in light of the copending application’s complementary conversion between 32-bit and 16-bit floating point values.
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to have modified the apparatus recited in claim 1 of copending Application No. 17/560547 to include the limitation of 16-bit floating point values “from odd data element positions”, as taught by Eapen, and the limitation “convert 16-bit floating point values from the identified source operand into 32-bit floating point values and store the 32-bit floating point values”, also as taught by Eapen. Doing so would allow for a more efficient hardware implementation and decrease overhead by allowing “instructions to operate directly on a mask corresponding to a packed vector”. Each processing lane can work with a subset of the input vector without requiring data unpacking or additional register manipulation (Eapen, [0043-0045]). It would also allow the complementary operation to be performed, which achieves a more versatile floating-point processing system.
Kashyap teaches using round to nearest even ([0133] “Round operation control field 959A—just as round operation control field 958, its content distinguishes which one of a group of rounding operations to perform (e.g., Round-up, Round-down, Round-towards-zero and Round-to-nearest)” and [0223] “Example 8 includes the substance of the exemplary processor of Example 1, wherein the rounding mode is one of round to nearest even”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination by incorporating the teachings of Kashyap to use round to nearest even as a rounding mode. Doing so would allow for a rounding mode which provides statistically unbiased rounding.
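As an illustration of how round to nearest even applies to the complementary 32-bit to 16-bit conversion recited in the copending claims, the following Python sketch narrows an FP32 bit pattern to BF16. It is a minimal sketch under stated assumptions: the function names are hypothetical, the bias-and-truncate technique is a common software idiom rather than anything drawn from the cited references, and NaN inputs are simply returned as a quiet NaN.

```python
import struct


def fp32_bits_to_bf16_rne(bits):
    """Narrow an FP32 bit pattern to BF16 using round to nearest even."""
    bits &= 0xFFFFFFFF
    exponent_all_ones = (bits & 0x7F800000) == 0x7F800000
    if exponent_all_ones and (bits & 0x007FFFFF):
        return ((bits >> 16) | 0x0040) & 0xFFFF  # NaN: keep sign, force quiet bit
    # Add 0x7FFF, plus 1 more when the kept least significant bit is odd,
    # so an exact tie (discarded half == 0x8000) rounds to the even result.
    bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + bias) >> 16) & 0xFFFF


def float_to_bf16_rne(value):
    """Convenience wrapper: encode a Python float as FP32, then narrow it."""
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    return fp32_bits_to_bf16_rne(bits)


print(hex(float_to_bf16_rne(1.0)))              # 0x3f80
print(hex(fp32_bits_to_bf16_rne(0x3F808000)))   # tie, kept bit even -> 0x3f80
print(hex(fp32_bits_to_bf16_rne(0x3F818000)))   # tie, kept bit odd  -> 0x3f82
```

The two tie cases show the defining behavior: the result whose least significant bit is zero is chosen regardless of direction.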
Claims 2-4 and 6
Claims 2-4 and 6 of the reference copending application recite the same limitations as claims 2-4 and 6 of the instant application, respectively.
Claim 7
Claim 7 of the reference copending application recites all of the limitations of claim 7 of the instant application except for 16-bit floating point values “from odd data element positions” and “using round to nearest even”, and instead performs the complementary operation of “convert 32-bit floating point values from the identified source operands into 16-bit floating point values and store 16-bit floating point values”. Claim 7 of the instant application is therefore rejected on the same premises as claim 1.
Claims 8-10 and 12
Claims 8-10 and 12 of the reference copending application recite the same limitations as claims 8-10 and 12 of the instant application, respectively.
Claim 13
Claim 13 of the reference copending application recites all of the limitations of claim 13 of the instant application except for 16-bit floating point values “from odd data element positions” and “using round to nearest even” and performs the complementary operation of “convert 32-bit floating point values from the identified source operands into 16-bit floating point values and store 16-bit floating point values”, similarly to claim 1, and is therefore rejected on the same premises.
Claims 14-16 and 18
Claims 14-16 and 18 of the reference copending application recite the same limitations as claims 14-16 and 18 of the instant application, respectively.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA ) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim(s) 1-3, 7-9, and 13-15 are rejected under 35 U.S.C. 103 as being unpatentable over Eapen (US 20170031682 A1) in view of Valentine (US 20190163474 A1) in further view of Kashyap (US 20190042544 A1).
Regarding claim 1, Eapen teaches an apparatus comprising:
decoder circuitry to decode a single instruction (FIG. 1 Decode Circuitry element 20 and [0057]: Instructions are passed through decode circuitry 20 which decodes each instruction)…, wherein the opcode is to indicate instruction processing circuitry (FIG. 15 and [0100]: the instruction opcode determines the type of the operation to be performed; FIG. 4 and [0069]: “The processing circuitry 100-0 also receives an instruction form signal 102 indicating which form of the instruction is being executed”. The type of operation and form of instruction refer to the same concept, indicating that the opcode specifies the operation to be performed by the processing circuitry. Additionally, [0056] “the different forms of the element size increasing instruction may have different opcodes” where [0066] “the second form of the lengthening instruction acts on the odd-numbered input data elements”) is to convert M-bit N-bit (FIG. 9 and [0076]: conversion instructions convert odd M bits into N bits; claim 7, [0045]: N = 2M) and instruction processing circuitry to execute the decoded instruction according to the opcode (FIG. 1 and [0057]: decoded instructions are passed to issue stage circuitry for issuing to execution pipelines. “The execution pipelines 30, 35, 40, 80 may collectively be considered to form processing circuitry”).
Eapen does not explicitly teach the single instruction to include fields for an opcode, an identification of a source operand, and an identification of destination operand, … converting from 16-bit floating point values from the identified source operand into 32-bit floating point values using round to nearest even … store the 32-bit floating point values in data element positions of the identified destination operand.
Valentine teaches the single instruction to include fields for an opcode, an identification of a source operand, and an identification of destination operand, (FIG. 13 and [0175]: an instruction is fetched having fields to specify an opcode, a source operand, a destination operand), … converting 16-bit floating point values from the identified source operand into 32-bit floating point values (FIG. 12-13 and [0169-0179], and [0195]: convert a half-precision floating-point value to a single-precision floating-point value) using round to nearest ([0059]: “Round operation control field 158—its content distinguishes which one of a group of rounding operations to perform (e.g., Round-up, Round-down, Round-towards-zero and Round-to-nearest). Thus, the round operation control field 158 allows for the changing of the rounding mode on a per instruction basis”) and store the 32-bit floating point values in data element positions of the identified destination operand ([0195-0196]: store the single-precision floating-point value in element locations of a destination).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Eapen by incorporating the teachings of Valentine to specify that for the M-bit data element, its format is half precision which is 16-bit floating point and for the N-bit data element, its format is single precision which is 32-bit floating point; and that the instruction have fields for an opcode, source operand and destination operand; and a rounding mode. Doing so would improve data processing performance while reducing storage requirements compared to other formats. It would also allow for efficient encoding and execution of mixed-precision operations in a processor.
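For illustration only, the operation that results from this combination (half-precision elements taken from odd data element positions and widened to single precision) can be sketched in Python. The function name widen_odd_fp16, the use of a packed little-endian byte string as the source operand, and the choice to count element positions from 0 are assumptions of this sketch and do not come from Eapen, Valentine, or the instant claims.

```python
import struct

def widen_odd_fp16(packed_src: bytes) -> bytes:
    """Convert the FP16 elements at odd positions (1, 3, 5, ...) of a packed
    little-endian source into a packed little-endian FP32 destination."""
    count = len(packed_src) // 2                # two bytes per FP16 element
    halves = struct.unpack(f"<{count}e", packed_src)
    odd = halves[1::2]                          # keep positions 1, 3, 5, ...
    # Every FP16 value is exactly representable in FP32, so no precision
    # is lost in this widening step.
    return struct.pack(f"<{len(odd)}f", *odd)

src = struct.pack("<4e", 1.0, 2.5, -3.0, 0.5)   # hypothetical source operand
dst = widen_odd_fp16(src)
print(struct.unpack("<2f", dst))                # (2.5, 0.5)
```

The destination holds half as many elements as the source, each twice as wide, which corresponds to the N = 2M relationship cited from Eapen.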
Eapen in view of Valentine does not explicitly teach using round to nearest even.
Kashyap teaches using round to nearest even ([0133] “Round operation control field 959A—just as round operation control field 958, its content distinguishes which one of a group of rounding operations to perform (e.g., Round-up, Round-down, Round-towards-zero and Round-to-nearest)” and [0223] “Example 8 includes the substance of the exemplary processor of Example 1, wherein the rounding mode is one of round to nearest even”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Eapen in view of Valentine by incorporating the teachings of Kashyap to use round to nearest even as a rounding mode. Doing so would allow for an additional rounding mode which provides statistically unbiased rounding.
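As a general illustration of why round to nearest even is regarded as statistically unbiased, the following Python sketch rounds an unsigned integer significand while discarding low-order bits. The function name round_nearest_even and the integer-significand framing are assumptions of this sketch; it is not taken from Kashyap or from any other cited reference.

```python
def round_nearest_even(value: int, drop: int) -> int:
    """Discard the low `drop` bits of a non-negative integer, rounding to the
    nearest result and breaking exact ties toward the even result."""
    if drop <= 0:
        return value
    kept = value >> drop
    remainder = value & ((1 << drop) - 1)
    half = 1 << (drop - 1)
    # Round up when above the halfway point, or exactly halfway with an odd
    # kept value (so the result becomes even).
    if remainder > half or (remainder == half and (kept & 1)):
        kept += 1
    return kept

# 0b1010 -> 2.5 (tie, rounds down to 2); 0b1110 -> 3.5 (tie, rounds up to 4)
# 0b1011 -> 2.75 (rounds to 3);          0b1001 -> 2.25 (rounds to 2)
results = [round_nearest_even(v, 2) for v in (0b1010, 0b1110, 0b1011, 0b1001)]
print(results)  # [2, 4, 3, 2]
```

Because ties alternate between rounding down and rounding up, the rounding errors on tie cases tend to cancel rather than accumulate in one direction.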
Regarding claim 2, Eapen in view of Valentine in further view of Kashyap teaches the apparatus of claim 1.
Although Eapen teaches identifying a vector register ([0059-0060]: a vector register from vector register bank 65 is identified), Eapen does not explicitly teach such identification by “the field for the identifier of the source operand”.
Valentine teaches wherein the field for the identifier of the source operand is to identify a vector register ([0178]: the source and destination operand fields may specify registers or memory locations, such registers including vector registers as supported in FIG. 4 and [0126]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Eapen in view of Valentine in further view of Kashyap by incorporating the further teachings of Valentine to specify the field for the identifier of the source operand in an instruction. Doing so would provide a well-known and predictable mechanism for operand identification, allowing the instruction to flexibly reference various kinds of operand types.
Regarding claim 3, Eapen in view of Valentine in further view of Kashyap teaches the apparatus of claim 1. Valentine further teaches wherein the field for the identifier of the source operand is to identify a memory location ([0178]: the source and destination operand fields may specify registers or memory locations).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Eapen in view of Valentine in further view of Kashyap by incorporating the further teachings of Valentine to include wherein the field for the identifier of the source operand is to identify a memory location. Doing so would enable efficient data retrieval, since data can be prefetched, reducing latency and improving parallel execution. Doing so would also maintain compatibility with existing instruction set architectures since many ISAs support operand retrieval via memory locations.
Regarding claim 7, Eapen teaches a method comprising: …
wherein the opcode is to indicate instruction processing circuitry ([0100]: the instruction opcode determines the type of the operation to be performed. [0056] “the different forms of the element size increasing instruction may have different opcodes” where [0066] “the second form of the lengthening instruction acts on the odd-numbered input data elements”) is to convert M-bit N-bit (FIG. 9 and [0076]: conversion instructions convert odd M bits into N bits; claim 7, [0045]: N = 2M) decoding a one or more instructions of the second set; (FIG. 1 Decode Circuitry element 20 and [0057]: Instructions are passed through decode circuitry 20 which decodes each instruction), and executing the decoded one or more instructions of the second set according to the opcode of the single instruction of the first instruction set (FIG. 1 and [0057]: decoded instructions are passed to issue stage circuitry for issuing to execution pipelines. “The execution pipelines 30, 35, 40, 80 may collectively be considered to form processing circuitry”).
Eapen does not explicitly teach translating a single instruction of a first instruction set into one or more instructions of a second instruction set, the single instruction to include fields for an opcode, an identification of a source operand, and an identification of destination operand, … converting from 16-bit floating point values from the identified source operand into 32-bit floating point values using round to nearest even … store the 32-bit floating point values in data element positions of the identified destination operand.
Valentine teaches translating a single instruction of a first instruction set into one or more instructions of a second instruction set (FIG. 11 and [0168]: “block diagram contrasting the use of a software instruction converter to convert binary instructions in a source instruction set to binary instructions in a target instruction set”), the single instruction to include fields for an opcode, an identification of a source operand, and an identification of destination operand, (FIG. 13 and [0175]: an instruction is fetched having fields to specify an opcode, a source operand, a destination operand), … converting from 16-bit floating point values from the identified source operand into 32-bit floating point values (FIG. 12-13 and [0169-0179, 0195]: convert a half-precision floating-point value to a single-precision floating-point value) using round to nearest ([0059]: “Round operation control field 158—its content distinguishes which one of a group of rounding operations to perform (e.g., Round-up, Round-down, Round-towards-zero and Round-to-nearest). Thus, the round operation control field 158 allows for the changing of the rounding mode on a per instruction basis”) and store the 32-bit floating point values in data element positions of the identified destination operand ([0195-0196]: store the single-precision floating-point value in element locations of a destination).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Eapen by incorporating the teachings of Valentine to include translating a single instruction of a first instruction set into one or more instructions of a second instruction set; and to specify that for the M-bit data element, its format is half precision which is 16-bit floating point and for the N-bit data element, its format is single precision which is 32-bit floating point; and that the instruction have fields for an opcode, source operand and destination operand; and a rounding mode. Doing so would improve data processing performance while reducing storage requirements compared to other formats and would allow for efficient encoding and execution of mixed-precision operations in a processor. It would also be beneficial for cross instruction set architecture compatibility which the different ISAs exist due to varying design goals such as efficiency and performance. Compatibility between ISAs is useful to ensure seamless software execution.
Eapen in view of Valentine does not explicitly teach using round to nearest even.
Kashyap teaches using round to nearest even ([0133] “Round operation control field 959A—just as round operation control field 958, its content distinguishes which one of a group of rounding operations to perform (e.g., Round-up, Round-down, Round-towards-zero and Round-to-nearest)” and [0223] “Example 8 includes the substance of the exemplary processor of Example 1, wherein the rounding mode is one of round to nearest even”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Eapen in view of Valentine by incorporating the teachings of Kashyap to use round to nearest even as a rounding mode. Doing so would allow for an additional rounding mode which provides statistically unbiased rounding.
Regarding claim 8, Eapen in view of Valentine in further view of Kashyap teaches the method of claim 7. Although Eapen teaches identifying a vector register ([0059-0060]: a vector register from vector register bank 65 is identified), Eapen does not explicitly teach such identification by “the field for the identifier of the source operand”.
Valentine teaches wherein the field for the identifier of the source operand is to identify a vector register ([0178]: the source and destination operand fields may specify registers or memory locations, such registers including vector registers as supported in FIG. 4 and [0126]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Eapen in view of Valentine in further view of Kashyap by incorporating the further teachings of Valentine to specify the field for the identifier of the source operand in an instruction. Doing so would provide a well-known and predictable mechanism for operand identification, allowing the instruction to flexibly reference various kinds of operand types.
Regarding claim 9, Eapen in view of Valentine in further view of Kashyap teaches the method of claim 7. Valentine further teaches wherein the field for the identifier of the source operand is to identify a memory location ([0178]: the source and destination operand fields may specify registers or memory locations).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Eapen in view of Valentine in further view of Kashyap by incorporating the further teachings of Valentine to include wherein the field for the identifier of the source operand is to identify a memory location. Doing so would enable efficient data retrieval, since data can be prefetched, reducing latency and improving parallel execution. Doing so would also maintain compatibility with existing instruction set architectures since many ISAs support operand retrieval via memory locations.
Regarding claims 13-15, the claims recite a system comprising: memory to store an instance of a single instruction (Eapen, [0057]: An instruction cache which is typically coupled to memory is used to fetch the instructions), the single instruction to include fields for an opcode, an identification of a source operand, and an identification of destination operand, wherein the opcode is to indicate instruction processing circuitry to perform operations corresponding to the apparatus of claims 1-3 respectively, and are therefore rejected on the same premises.
Claim(s) 4-6, 10-12, and 16-18 is/are rejected under 35 U.S.C. 103 as being unpatentable over Eapen (US 20170031682 A1), in view of Valentine (US 20190163474 A1), in further view of Kashyap (US 20190042544 A1), and in further view of Langhammer (US 20190155574 A1).
Regarding claim 4, Eapen in view of Valentine in further view of Kashyap teaches the apparatus of claim 1.
Eapen in view of Valentine in further view of Kashyap does not explicitly teach wherein the 16-bit floating point values are BF16 values.
Langhammer teaches wherein the 16-bit floating point values are BF16 values (FIG. 7 and [0076]: states that the DSP may be configured to receive BFLOAT16 inputs).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Eapen in view of Valentine in further view of Kashyap by incorporating the teachings of Langhammer to include wherein the 16-bit floating point values are BF16 values. Doing so would trade off the mantissa precision for exponent width, which increases the dynamic range in exchange for reduced accuracy (Langhammer, [0077]). Conversion between BFLOAT16 and 32-bit floating point would be greatly simplified because the number of exponent bits is identical (Langhammer, [0083]).
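The range-versus-precision trade-off can be made concrete with a short Python sketch. The field widths used here (BF16: 8 exponent bits and 7 mantissa bits; FP16: 5 and 10; FP32: 8 and 23) are general background on these formats rather than figures quoted in this Office action, and the helper name format_limits is hypothetical.

```python
def format_limits(exp_bits: int, man_bits: int) -> tuple[float, float]:
    """Return (largest finite value, spacing just above 1.0) for an
    IEEE-style binary format with the given exponent and mantissa widths."""
    bias = (1 << (exp_bits - 1)) - 1
    max_finite = (2.0 - 2.0 ** -man_bits) * 2.0 ** bias
    epsilon = 2.0 ** -man_bits
    return max_finite, epsilon

for name, e, m in (("BF16", 8, 7), ("FP16", 5, 10), ("FP32", 8, 23)):
    max_finite, epsilon = format_limits(e, m)
    print(f"{name}: max finite = {max_finite:.6g}, epsilon = {epsilon:.6g}")
```

Under these assumptions, BF16 has nearly the same largest finite value as FP32 but a coarser spacing than FP16, while FP16 has finer spacing and a much smaller largest finite value (65504).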
Regarding claim 5, Eapen in view of Valentine in further view of Kashyap and Langhammer teaches the apparatus of claim 4. Langhammer further teaches wherein instruction processing circuitry is to convert the BF16 values to 32-bit floating point values by appending sixteen zeros to each of the BF16 values (Langhammer, [0083]: “To cast from BFLOAT16 to FP32, 16 zeros can be appended to the LSB of the mantissa”).
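A minimal Python sketch of the cast quoted from Langhammer, in which sixteen zero bits are appended below the BF16 bit pattern to form an FP32 bit pattern, is given below for illustration. The function name bf16_bits_to_float and the representation of a BF16 value as a 16-bit integer are assumptions of this sketch.

```python
import struct

def bf16_bits_to_float(bits16: int) -> float:
    """Interpret a 16-bit BF16 pattern as FP32 by appending sixteen zero bits
    at the least significant end of the pattern."""
    bits32 = (bits16 & 0xFFFF) << 16
    # Reinterpret the 32-bit integer pattern as an IEEE 754 single.
    return struct.unpack("<f", struct.pack("<I", bits32))[0]

print(bf16_bits_to_float(0x3F80))  # 1.0
print(bf16_bits_to_float(0xC020))  # -2.5
```

No rounding step is needed in this direction because the sign and exponent fields line up between the two formats and the appended bits only extend the mantissa with zeros.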
Regarding claim 6, Eapen in view of Valentine in further view of Kashyap teaches the apparatus of claim 1.
Eapen in view of Valentine in further view of Kashyap does not explicitly teach wherein the 16-bit floating point values are FP16 values.
Langhammer teaches wherein the 16-bit floating point values are FP16 values ([0052]: states that an FP16 value may be promoted or cast from FP16 to FP32 using a format casting/promoting circuit).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Eapen in view of Valentine in further view of Kashyap by incorporating the teachings of Langhammer to include wherein the 16-bit floating point values are FP16 values. Doing so would be beneficial for supporting machine learning training procedures such as Convolution Neural Network algorithms or Recursive Neural Network inference algorithms (Langhammer, [0017]). Doing so would also trade off the exponent width for a larger mantissa precision compared to other 16-bit floating point representations (Langhammer, [0077]).
Regarding claim 10, Eapen in view of Valentine in further view of Kashyap teaches the method of claim 7.
Eapen in view of Valentine in further view of Kashyap does not explicitly teach wherein the 16-bit floating point values are BF16 values.
Langhammer teaches wherein the 16-bit floating point values are BF16 values ([0076]: states that the DSP may be configured to receive BFLOAT16 inputs).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Eapen in view of Valentine in further view of Kashyap by incorporating the teachings of Langhammer to include wherein the 16-bit floating point values are BF16 values. Doing so would trade off the mantissa precision for exponent width, which increases the dynamic range in exchange for reduced accuracy (Langhammer, [0077]). Conversion between BFLOAT16 and 32-bit floating point would be greatly simplified because the number of exponent bits is identical (Langhammer, [0083]).
Regarding claim 11, Eapen in view of Valentine in further view of Kashyap and Langhammer teaches the method of claim 10. Langhammer further teaches wherein instruction processing circuitry is to convert the BF16 values to 32-bit floating point values by appending sixteen zeros to each of the BF16 values (Langhammer, [0083]: “To cast from BFLOAT16 to FP32, 16 zeros can be appended to the LSB of the mantissa”).
Regarding claim 12, Eapen in view of Valentine in further view of Kashyap teaches the method of claim 7.
Eapen in view of Valentine in further view of Kashyap does not explicitly teach wherein the 16-bit floating point values are FP16 values.
Langhammer teaches wherein the 16-bit floating point values are FP16 values ([0052]: states that an FP16 value may be promoted or cast from FP16 to FP32 using a format casting/promoting circuit).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Eapen in view of Valentine in further view of Kashyap by incorporating the teachings of Langhammer to include wherein the 16-bit floating point values are FP16 values. Doing so would be beneficial for supporting machine learning training procedures such as Convolution Neural Network algorithms or Recursive Neural Network inference algorithms (Langhammer, [0017]). Doing so would also trade off the exponent width for a larger mantissa precision compared to other 16-bit floating point representations (Langhammer, [0077]).
Regarding claims 16-18, the claims recite a system corresponding to the apparatus of claims 4-6 respectively, and are therefore rejected on the same premises.
Response to Arguments/Amendments
Applicant’s amendments, filed August 25, 2025, have addressed the specification objections previously raised by the Examiner. However, Examiner raises new specification objections. See specification objections above.
Applicant’s amendments, filed August 25, 2025, have addressed the drawing objections previously raised by the Examiner. However, Examiner raises new drawing objections. See drawing objections above.
Applicant’s arguments, see page 1, lines 23-29, filed August 25, 2025, with respect to claims 1-18 have been fully considered and are persuasive. The 112(b) rejections of claims 1-18 have been withdrawn. However, Examiner notes that the clean version of the claims, filed August 25, 2025, does not include the amendments.
Applicant’s arguments, see pages 2-4, filed August 25, 2025, with respect to the rejection(s) of claim(s) 1-3, 7-9, and 13-15 under 35 U.S.C. 103 have been fully considered and are persuasive. Therefore, the rejection has been withdrawn. However, upon further consideration, a new ground of rejection is made under 35 U.S.C. 103 over Eapen in view of Valentine in further view of Kashyap. See new rejections above. Examiner also notes that the clean version of the claims, filed August 25, 2025, does not include the amendments.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ZACHARY RYAN DAMMANN whose telephone number is (571)272-4758. The examiner can normally be reached Mon-Fri 8:30-6:00 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jyoti Mehta, can be reached at (571) 270-3995. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Z.R.D./Examiner, Art Unit 2183
/JYOTI MEHTA/Supervisory Patent Examiner, Art Unit 2183