DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Drawings
The lengthy set of drawings has not been checked to the extent necessary to determine the presence of all possible minor errors. Applicant’s cooperation is requested in correcting any errors of which applicant may become aware in the drawings.
Specification
The lengthy specification has not been checked to the extent necessary to determine the presence of all possible minor errors. Applicant’s cooperation is requested in correcting any errors of which applicant may become aware in the specification.
The title of the invention is not descriptive. A new title is required that is clearly indicative of the invention to which the claims are directed.
The following title is suggested: “INSTRUCTION TO CONDITIONALLY STORE UNALIGNED DATA ELEMENTS USING A MASK OPERAND”
The disclosure is objected to because of the following informalities:
[00169]: The example embodiments suffer the same issues as the claim objections/rejections below. Applicant is advised to fix these issues when appropriate.
Appropriate correction is required.
Claim Objections
Claims 5, 11, and 17 are objected to because of the following informalities:
Claim 5, lines 1-2: Change the phrase “the data elements of the second source operand is 128-bit data element” to “the data elements of the second source operand are 128-bit data elements” to improve the clarity/flow of the sentence.
Claim 11, lines 1-2: Change the phrase “the data elements of the second source operand is 128-bit data element” to “the data elements of the second source operand are 128-bit data elements” to improve the clarity/flow of the sentence.
Claim 17, lines 1-2: Change the phrase “the data elements of the second source operand is 128-bit data element” to “the data elements of the second source operand are 128-bit data elements” to improve the clarity/flow of the sentence.
Appropriate correction is required.
Claim Rejections - 35 USC § 103
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Claims 1-3 and 5-6 are rejected under 35 U.S.C. 103 as being unpatentable over Orenstien et al (US 20090172365 A1) in view of Intel (Intel 64 and IA-32 Architectures Software Developer’s Manual Volume 2C: Instruction Set Reference, V-Z, see Non-Final Office Action mailed 08/26/2025) and Felix Cloutier (“MOVUPS — Move Unaligned Packed Single-Precision Floating-Point Values”).
Regarding claim 1, Orenstien teaches an apparatus (Fig. 6 and [0058]: Processor 400) comprising:
decoder circuitry to decode an instance of a single instruction (Fig. 6 and [0059]: Front end units 410 includes a decoder), the instance of the single instruction to at least include one or more fields for an opcode ([0027]: Table 1 contains an opcode column of the VMASKMOVPS/PD), one or more fields to reference a second source operand ([0027]: xmm1/ymm1 as the second source operand), and one or more fields to reference a destination memory location ([0027]: m128/m256 to hold the destination memory location), wherein the opcode indicates execution circuitry is to conditionally store data elements from data element positions of the second source operand into corresponding data element positions of the destination memory location based on masking information stored in the referenced first source operand ([0057]: Referring to table 3, when the MSB of the ymm0 register (i.e., the first source operand) which corresponds to a data element in the ymm1 register (i.e., the second source operand) is set to 1, the data element stored in the ymm1 register is stored into memory); and
execution circuitry configured to execute the decoded instruction according to the opcode (Fig. 6 and [0060]: Execution units 420).
Orenstien does not explicitly teach one or more fields to reference a first source operand in the single instruction or that the opcode indicates execution circuitry is to conditionally store unaligned data elements.
Note that the instruction taught in Orenstien only implies the register that holds the mask (i.e., xmm0/ymm0).
Intel teaches one or more fields to reference a first source operand in the single instruction (Table 1 and paragraph 1: The xmm1/ymm1 register to hold the mask bits).
It would have been obvious to one of ordinary skill in the art before the effective filing date to provide the instruction with a field to reference the source operand that specifies the mask bits. Having a field to specify the register allows the flexibility to store the mask bits in any chosen register rather than being locked to one specific register, which one of ordinary skill would appreciate.
Orenstien, in view of Intel, still does not teach that the execution circuitry is to conditionally store unaligned data elements.
Felix Cloutier teaches storing unaligned data elements (Table and Section “Description”: The MOVUPS instruction with opcode “VEX.128.0F.WIG 11 /r”, whose ModRM byte indicates a memory location, indicates that the unaligned data from xmm1 is to be moved into a 128-bit memory location indicated by the xmm2/m128 operand).
It would have been obvious to one of ordinary skill in the art before the effective filing date to have combined the teachings of Orenstien, in view of Intel, with the teachings of Felix Cloutier to conditionally store unaligned data elements from data element positions of the second source operand into corresponding data element positions of the destination memory location. Storing unaligned data elements by tightly packing data, rather than filling unaligned data elements with padding such that they can be aligned, conserves memory space, which may be preferred by one of ordinary skill.
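For illustration only, the combined behavior discussed for claim 1 (a store that is conditional on the most significant bit of each mask element and that places no alignment requirement on the destination address) can be sketched in software as follows. This is a minimal model and not the implementation of any cited reference: the names `masked_store` and `is_aligned`, the 32-bit element width, the 128-bit operand width, and the use of a byte array as memory are all assumptions made for the sketch.

```python
import struct
from typing import Sequence

ELEM_BYTES = 4      # assumption: 32-bit single-precision elements
VECTOR_BYTES = 16   # assumption: 128-bit operand


def is_aligned(address: int, boundary: int = VECTOR_BYTES) -> bool:
    """Return True when the address is a multiple of the boundary."""
    return address % boundary == 0


def masked_store(memory: bytearray, address: int,
                 mask: Sequence[int], src: Sequence[float]) -> None:
    """Write src[i] to memory only where the MSB of mask[i] is 1.

    Element positions whose mask MSB is 0 keep their previous contents.
    This model places no alignment requirement on `address`.
    """
    for i, (m, value) in enumerate(zip(mask, src)):
        if (m >> 31) & 1:  # MSB of a 32-bit mask element
            struct.pack_into("<f", memory, address + i * ELEM_BYTES, value)


# Hypothetical usage: the destination starts at byte offset 3 (not 16-byte aligned).
memory = bytearray(32)
masked_store(memory, 3, [0x80000000, 0, 0xFFFFFFFF, 0x7FFFFFFF], [1.0, 2.0, 3.0, 4.0])
stored = [struct.unpack_from("<f", memory, 3 + i * ELEM_BYTES)[0] for i in range(4)]
```

In this sketch only the first and third element positions are written, because only those mask elements have their most significant bit set; the other two positions retain the zero bytes that were already in memory.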
Regarding claim 2, Orenstien, in view of Intel and Felix Cloutier, teaches the apparatus of claim 1, wherein the first source operand is a vector register (Orenstien, [0023, 0027]: XMM registers are SIMD registers, which are a type of vector register).
Regarding claim 3, Orenstien, in view of Intel and Felix Cloutier, teaches the apparatus of claim 1, wherein the masking information is provided by a value of a most significant bit position of each data element of the first source operand (Orenstien, [0025, 0057]: Table 3 provides an example of the VMASKMOVPS using the MSB of each data element in ymm0 as the mask bit).
Regarding claim 5, Orenstien, in view of Intel and Felix Cloutier, teaches the apparatus of claim 1, wherein the data elements of the second source operand is 128-bit data element (Orenstien, [0027, 0060]: The xmm0 and xmm1 registers are 128 bits wide; therefore, the data elements that xmm1 currently stores, combined together, create a data element of 128 bits).
Regarding claim 6, Orenstien, in view of Intel and Felix Cloutier, teaches the apparatus of claim 1, wherein masked data element positions are left unchanged (Orenstien, [0057]: As seen in table 3, the data elements corresponding to data element positions in “src” are left unchanged after the storing operation).
Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Orenstien et al (US 20090172365 A1) in view of Intel (Intel 64 and IA-32 Architectures Software Developer’s Manual Volume 2C: Instruction Set Reference, V-Z, see Non-Final Office Action mailed 08/26/2025), Felix Cloutier (“MOVUPS — Move Unaligned Packed Single-Precision Floating-Point Values”), and Palanca et al (US 6173393 B1).
Regarding claim 4, Orenstien, in view of Intel and Felix Cloutier, teaches the apparatus of claim 1.
Orenstien, in view of Intel and Felix Cloutier, does not teach that the data elements to conditionally store are 8-bit.
Palanca does teach that the data elements to conditionally store are 8-bit data elements (Fig. 8 and Col. 13, lines 58-61: the data elements, which are byte sized (i.e., 8-bit), are stored in src1 900 and are to be conditionally stored to the cache line 906 based on the mask bit).
It would have been obvious to one of ordinary skill in the art before the effective filing date to have combined the teachings of Orenstien, in view of Intel and Felix Cloutier, with the teachings of Palanca to store 8-bit data elements into memory. One of ordinary skill would appreciate holding more packed data elements of 8 bits in width, over holding fewer packed data elements of 16 bits in width or more, as transferring larger amounts of data between a processor and memory is critical in data-focused computing areas.
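The element-count point in the rationale above can be illustrated with simple arithmetic: for a fixed operand width, narrower elements yield more element positions. The sketch below assumes a 128-bit operand, and the helper name `element_count` is hypothetical; neither is taken from the cited references.

```python
def element_count(operand_bits: int, element_bits: int) -> int:
    """Number of packed elements of a given width that fit in one operand."""
    if operand_bits % element_bits != 0:
        raise ValueError("element width must evenly divide the operand width")
    return operand_bits // element_bits


# Assumed 128-bit operand; element widths of 8, 16, and 32 bits.
counts = {width: element_count(128, width) for width in (8, 16, 32)}
```

Under these assumptions a 128-bit operand holds sixteen 8-bit elements, eight 16-bit elements, or four 32-bit elements.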
Claims 7-9, 11-15, and 17-18 are rejected under 35 U.S.C. 103 as being unpatentable over Orenstien et al (US 20090172365 A1) in view of Intel (Intel 64 and IA-32 Architectures Software Developer’s Manual Volume 2C: Instruction Set Reference, V-Z, see Non-Final Office Action mailed 08/26/2025), Felix Cloutier (“MOVUPS — Move Unaligned Packed Single-Precision Floating-Point Values”), and Coleman et al. (US 20170286118 A1).
Regarding claim 7, Orenstien, in view of Intel and Felix Cloutier, teaches most of the method according to the apparatus of claim 1.
Orenstien, in view of Intel and Felix Cloutier, does not teach translating an instance of a single instruction of a first instruction set architecture to one or more instructions of a second instruction set architecture.
Coleman does teach translating an instance of a single instruction of a first instruction set architecture to one or more instructions of a second instruction set architecture (see [0045]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Orenstien, in view of Intel and Felix Cloutier, to incorporate the teachings of Coleman to provide a way to translate an instance of a single instruction from a first instruction set to one or more instructions of a second instruction set architecture to then be sent to the processor. A common motivation for translating an instruction from a first instruction set to a second instruction set is that it allows one of ordinary skill in the art to create a simpler design based on the second instruction set. This simpler design would allow easier implementation of the hardware (e.g., a simpler decoder), would require less time to verify and debug the hardware, and would improve performance. In addition, it would allow the processor to accommodate non-native instructions to increase flexibility.
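As a purely illustrative sketch of the translation concept discussed above, one instruction of a first instruction set can be expanded into several simpler operations of a second instruction set. Everything in the sketch is hypothetical: the dictionary encoding of the source instruction, the mnemonics `MASKED_STORE`, `TEST_MSB`, and `STORE_IF_SET`, and the function name `translate` are assumptions made for illustration and are not drawn from Coleman or any other cited reference.

```python
from typing import Dict, List, Tuple


def translate(instr: Dict[str, object]) -> List[Tuple]:
    """Expand one hypothetical masked-store instruction into per-element
    operations of a hypothetical second instruction set."""
    if instr["opcode"] != "MASKED_STORE":
        raise ValueError("only the hypothetical MASKED_STORE opcode is modeled")
    ops: List[Tuple] = []
    for lane in range(int(instr["lanes"])):
        # Test the mask bit for this element position, then store conditionally.
        ops.append(("TEST_MSB", instr["mask"], lane))
        ops.append(("STORE_IF_SET", instr["dest"], instr["src"], lane))
    return ops


# Hypothetical single instruction with four element positions.
single = {"opcode": "MASKED_STORE", "mask": "v0", "src": "v1", "dest": "mem", "lanes": 4}
translated = translate(single)
```

Here the single source instruction becomes eight target operations, two per element position, which is one way a simpler second instruction set could express the same conditional store.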
Regarding claims 8-9 and 11-12, the claims recite a method similar to the apparatus of claims 2-3 and 5-6; therefore, the claims are rejected on the same premises.
Regarding claim 13, Orenstien, in view of Intel and Felix Cloutier, teaches most of the system according to the apparatus of claim 1.
Orenstien, in view of Intel and Felix Cloutier, does not explicitly teach a system comprising:
a general-purpose processor core; and
a digital signal processing core coupled to the general purpose processor core.
Note that Orenstien suggests that the execution circuitry to process the VMASKMOV instruction can be implemented in a digital signal processor (see [0019]).
Coleman teaches a system (Fig. 12 and [0104]: System 1200) comprising:
a general-purpose processor core (Fig. 12 and [0104]: Processor 1270); and
a digital signal processing core coupled to the general-purpose processor core (Fig. 12 and [0109]: Processor 1215 can be a digital signal processor, coupled to processor 1270 through first bus 1216).
It would have been obvious to one of ordinary skill in the art before the effective filing date to have combined the teachings of Orenstien, in view of Intel and Felix Cloutier, with the teachings of Coleman to have the single instruction executed in a DSP, coupled to a processor, within a system. Digital signal processors are known to effectively process mathematical operations and have better power efficiency compared to general-purpose processors, which one of ordinary skill would prefer.
Regarding claims 14-15 and 17-18, the claims recite a system similar to the apparatus of claims 2-3 and 5-6; therefore, the claims are rejected on the same premises.
Claims 10 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Orenstien et al (US 20090172365 A1) in view of Intel (Intel 64 and IA-32 Architectures Software Developer’s Manual Volume 2C: Instruction Set Reference, V-Z, see Non-Final Office Action mailed 08/26/2025), Felix Cloutier (“MOVUPS — Move Unaligned Packed Single-Precision Floating-Point Values”), Coleman et al. (US 20170286118 A1), and Palanca et al (US 6173393 B1).
Regarding claim 10, Orenstien, in view of Intel, Felix Cloutier, and Coleman, teaches the method of claim 7.
Orenstien, in view of Intel, Felix Cloutier, and Coleman, does not teach that the data elements to conditionally store are 8-bit.
Palanca does teach that the data elements to conditionally store are 8-bit data elements (Fig. 8 and Col. 13, lines 58-61: the data elements, which are byte sized (i.e., 8-bit), are stored in src1 900 and are to be conditionally stored to the cache line 906 based on the mask bit).
It would have been obvious to one of ordinary skill in the art before the effective filing date to have combined the teachings of Orenstien, in view of Intel, Felix Cloutier, and Coleman, with the teachings of Palanca to store 8-bit data elements into memory. One of ordinary skill would appreciate holding more packed data elements of 8 bits in width, over holding fewer packed data elements of 16 bits in width or more, as transferring larger amounts of data between a processor and memory is critical in data-focused computing areas.
Regarding claim 16, Orenstien, in view of Intel, Felix Cloutier, and Coleman, teaches the system of claim 13.
Orenstien, in view of Intel, Felix Cloutier, and Coleman, does not teach that the data elements to conditionally store are 8-bit.
Palanca does teach that the data elements to conditionally store are 8-bit (Fig. 8 and Col. 13, lines 58-61: the data elements, which are byte sized (i.e., 8-bit), are stored in src1 900 and are to be conditionally stored to the cache line 906 based on the mask bit).
It would have been obvious to one of ordinary skill in the art before the effective filing date to have combined the teachings of Orenstien, in view of Intel, Felix Cloutier, and Coleman, with the teachings of Palanca to store 8-bit data elements into memory. One of ordinary skill would appreciate holding more packed data elements of 8 bits in width, over holding fewer packed data elements of 16 bits in width or more, as transferring larger amounts of data between a processor and memory is critical in data-focused computing areas.
Response to Arguments/Amendments
In response to the comments made by Applicant, in the reply filed December 29, 2025, on Page 8, paragraph 2, Examiner did not indicate any allowable subject matter in the Non-Final Office Action, mailed August 26, 2025, as all claims were rejected under prior art rejection(s). The amended claims provided in the reply, filed December 29, 2025, are not allowable as they have been rejected under prior art rejection(s) (see claim rejections above).
If one or more claims become allowable, over the prior art, Examiner will indicate the allowable claims under the section titled “Allowable Subject Matter”.
Applicant's amendments, filed December 29, 2025, with respect to the specification objections raised by the Examiner have mostly been addressed. Applicant has failed to address the objections with respect to the example embodiments suffering the same issues as the claim objections. Furthermore, Applicant indicated that the title is descriptive, but does not further explain why the title is sufficiently descriptive. Examiner has provided a recommendation in the specification objection above for Applicant to consider. The specification objections will be maintained until the issues have been addressed or Applicant provides persuasive arguments regarding the specification objections.
Applicant's amendments, filed December 29, 2025, with respect to the drawing objections raised by the Examiner have been addressed. Therefore, the objection(s) of the drawings has been withdrawn.
Applicant's amendments, filed December 29, 2025, with respect to the claim objections raised by the Examiner have been addressed. Therefore, the objection(s) of the claims has been withdrawn. However, Examiner has raised new claim objections. See new claim objections above.
Applicant's amendments, filed December 29, 2025, with respect to the 112(b) rejections raised by the Examiner have been addressed. Therefore, the 112(b) rejection(s) of claims 1-20 has been withdrawn.
Applicant’s arguments, see Pages 9-12, filed December 29, 2025, with respect to the rejection(s) of claims 1-20 under 35 U.S.C. 103 have been fully considered and are persuasive. Therefore, the rejection(s) has been withdrawn. However, upon further consideration, new grounds of rejection are made under 35 U.S.C. 103 over: Orenstien, in view of Intel and Felix Cloutier; Orenstien, in view of Intel, Felix Cloutier, and Palanca; Orenstien, in view of Intel, Felix Cloutier, and Coleman; and Orenstien, in view of Intel, Felix Cloutier, Coleman, and Palanca. See new 103 rejections above.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to EMILIO ALCANTARA-RAMOS whose telephone number is (571)272-4211. The examiner can normally be reached Mon-Fri 8:30-5:00 PST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jyoti Mehta, can be reached at (571)270-3995. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/E.A./Examiner, Art Unit 2183
/JYOTI MEHTA/Supervisory Patent Examiner, Art Unit 2183