DETAILED ACTION
Claims 1-4, 6-8, 10, 12, 14-15, 17, 19, and 21-27 are pending.
The office acknowledges the following papers:
Claims and remarks filed on 12/16/2025.
Withdrawn objections and rejections
The specification objection has been withdrawn.
The 35 U.S.C. 112(a) rejections have been withdrawn due to cancellation and amendment of the claims.
Allowable Subject Matter
Claims 22 and 26 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the "right to exclude" granted by a patent and to prevent possible harassment by multiple assignees. See In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); and In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) may be used to overcome an actual or provisional rejection based on a nonstatutory double patenting ground provided the conflicting application or patent is shown to be commonly owned with this application. See 37 CFR 1.130(b).
Effective January 1, 1994, a registered attorney or agent of record may sign a terminal disclaimer. A terminal disclaimer signed by the assignee must fully comply with 37 CFR 3.73(b).
Applicants can file an eTerminal Disclaimer (eTD) in utility applications filed under 35 U.S.C. 111(a) or in compliance with 35 U.S.C. 371, and design applications. Filing an eTD via EFS-Web is highly recommended due to an extensive backlog for processing paper TDs. However, applicants may still file a TD for manual review.
Claims 1-3 and 6 are rejected under the judicially created doctrine of obviousness-type double patenting as being unpatentable over claim 1 of U.S. Patent No. 12,111,789. Although the conflicting claims are not identical, they are not patentably distinct from each other because claim 1 of U.S. 12,111,789 contains every element of claims 1-3 and 6 of the instant application and thus anticipates the claims of the instant application. The claims of the instant application therefore are not patentably distinct from the earlier patent claims and as such are unpatentable under obviousness-type double patenting. A later application claim is not patentably distinct from an earlier claim if the later claim is anticipated by the earlier claim.
Instant Application
1. A device comprising:
a set of registers; and
a vector arithmetic logic unit (ALU), wherein the vector ALU is reconfigurable in response to a type of data or instructions received for processing to:
operate as a single vector ALU, operate as multiple parallel vector ALUs, operate as multiple scalar ALUs, or operate as a combination of one or more parallel vector ALUs and one or more scalar ALUs.
Patent 12,111,789
1. A system comprising:
an array of processing nodes, each processing node comprising: a vector-scalar processor (VSP), wherein the VSP can be re-configured to perform vector processing or scalar processing,
the VSP comprising onboard memory via a set of registers,
the VSP comprising onboard memory via a vector arithmetic-logic unit (ALU), wherein the vector ALU and the set of registers can be re-allocated when the vector ALU and set of registers are re-configured to operate on one of scalar or vector data inputs,
wherein the vector ALU is re-configured by re-configuring the vector ALU into a combination of scalar processing elements and multiple vector processing elements, each size of the multiple vector processing elements smaller than a size of the scalar processing elements,
a Random Access Memory (RAM) unit coupled to the VSP, and a non-volatile storage unit coupled to one of the VSP or the RAM unit; and a driver comprising code to control the array of processing nodes to perform an operation by re-configuring the vector ALU to operate on one of scalar or vector data inputs and distributing an execution of the operation across at least a portion of the array of processing nodes.
Dependent claims 2-3 and 6 are read upon by claim 1 of U.S. Patent No. 12,111,789.
Claims 7-8, 10, 14-15, 17, 19, and 23 are rejected on the ground of nonstatutory double patenting as being unpatentable over claim 1 of U.S. Patent No. 12,111,789 in view of Goel et al. (U.S. 2016/0055667).
Claims 4, 12, 21, 24-25, and 27 are rejected on the ground of nonstatutory double patenting as being unpatentable over claim 1 of U.S. Patent No. 12,111,789 in view of Goel et al. (U.S. 2016/0055667), in view of Official Notice.
Maintained and New Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1-3, 6-8, 10, 14-15, 17, 19, and 23 are rejected under 35 U.S.C. 102(a)(1) and (a)(2) as being anticipated by Goel et al. (U.S. 2016/0055667).
As per claim 1:
Goel disclosed a device comprising:
a set of registers (Goel: Figures 2-3 elements 38-40 and 48, paragraph 97); and
a vector arithmetic logic unit (ALU), wherein the vector ALU is reconfigurable in response to a type of data or instructions received (Goel: Figures 2-3 elements 44 and 46A-H, paragraphs 92-95 and 97)(The set of PEs reads upon the vector ALU. The PEs are configured for execution based on instructions from the instruction store.) for processing to:
operate as a single vector ALU, operate as multiple parallel vector ALUs, operate as multiple scalar ALUs, or operate as a combination of one or more parallel vector ALUs and one or more scalar ALUs (Goel: Figures 2-3 elements 46A-H, paragraphs 92-97)(The PEs are configured for operation using multiple scalar ALUs or multiple vector ALUs. The PEs can be configured to operate in parallel using multiple parallel vector ALUs/scalar ALUs.).
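For illustration only, the four alternative operating modes recited in claim 1 can be pictured as different groupings of the lanes of one ALU. The sketch below is hypothetical: the names `configure`, `mode_for`, and `LANES`, the eight-lane width, and the selection policy are assumptions made for this example and are not taken from the instant application or from Goel.

```python
from typing import List

LANES = 8  # assumed lane count, for illustration only


def configure(mode: str, lanes: int = LANES) -> List[List[int]]:
    """Return lane groups: a group of width 1 acts as a scalar ALU,
    a wider group acts as a vector ALU. All mode names are hypothetical."""
    ids = list(range(lanes))
    half = lanes // 2
    if mode == "single_vector":
        return [ids]                                      # one vector ALU
    if mode == "parallel_vectors":
        return [ids[:half], ids[half:]]                   # two vector ALUs
    if mode == "scalars":
        return [[i] for i in ids]                         # one scalar ALU per lane
    if mode == "combined":
        return [ids[:half]] + [[i] for i in ids[half:]]   # one vector ALU plus scalars
    raise ValueError(f"unknown mode: {mode}")


def mode_for(instruction_kind: str) -> str:
    """Pick a mode from the type of instruction received (assumed policy)."""
    return {"vector": "single_vector", "scalar": "scalars"}.get(
        instruction_kind, "combined")
```

Under these assumptions, `configure(mode_for("vector"))` yields one eight-lane group, while `configure("combined")` yields one four-lane group and four single-lane groups.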
As per claim 2:
Goel disclosed the device of claim 1, further comprising a controller configured to dynamically reconfigure the vector ALU based on incoming data types or instruction sets (Goel: Figure 3 element 42, paragraphs 90 and 94).
As per claim 3:
Goel disclosed the device of claim 1, wherein the vector ALU comprises a plurality of processing elements, each processing element capable of operating independently as a scalar ALU or cooperatively as part of a vector ALU (Goel: Figures 2-3 elements 46A-H, paragraphs 92-95 and 97)(The set of PEs reads upon the vector ALU. PEs with scalar ALUs can execute scalar operations or execute vector operations in parallel.).
As per claim 6:
Goel disclosed the device of claim 1, wherein the vector ALU is configured to process different data types simultaneously when operating as a combination of vector and scalar ALUs (Goel: Figures 2-3 elements 46A-H, paragraphs 92-97)(The PEs can be configured to operate in parallel using multiple parallel vector ALUs/scalar ALUs. The scalar ALUs operate on scalar data and the vector ALUs operate on vector data.).
As per claim 7:
Goel disclosed the device of claim 1, further comprising a scheduler configured to optimize task allocation based on a current configuration of the vector ALU (Goel: Figure 2 element 36, paragraph 79).
As per claim 8:
Goel disclosed a method comprising:
receiving a set of instructions or data for processing (Goel: Figure 3 element 44, paragraph 91)(The instruction store receives shader programs.);
analyzing the received instructions or data to determine processing requirements (Goel: Figure 3 element 42, paragraph 90);
dynamically reconfiguring a vector arithmetic logic unit (ALU) based on the determined processing requirements (Goel: Figures 2-3 elements 44 and 46A-H, paragraphs 92-95 and 97)(The set of PEs reads upon the vector ALU. The PEs are configured for execution based on instructions from the instruction store.), wherein the reconfiguration includes at least one of:
configuring the vector ALU to operate as one of a single vector ALU, multiple parallel vector ALUs, multiple scalar ALUs, or a combination of one or more parallel vector ALUs and one or more scalar ALUs (Goel: Figures 2-3 elements 46A-H, paragraphs 92-97)(The PEs are configured for operation using multiple scalar ALUs or multiple vector ALUs. The PEs can be configured to operate in parallel using multiple parallel vector ALUs/scalar ALUs.); and
processing the received instructions or data using the reconfigured vector ALU (Goel: Figures 2-3 elements 46A-H, paragraphs 92-97)(The PEs are configured for operation using multiple scalar ALUs or multiple vector ALUs. The PEs can be configured to operate in parallel using multiple parallel vector ALUs/scalar ALUs. The PEs execute instructions of a shader program using input data from registers and local/external memory.).
As per claim 10:
Goel disclosed the method of claim 8, wherein dynamically reconfiguring the vector ALU includes reallocating register resources among resulting ALU configurations, wherein dynamically reconfiguring the vector ALU includes reconfiguring registers of the vector ALU to operate as registers for the resulting ALU configurations (Goel: Figures 2-3 elements 38-40 and 48, paragraphs 93, 95, and 97)(The registers are configured to output full vectors to the entire set of PEs when SIMD processing uses all PEs. Additionally, the registers are configured to output smaller vectors to a subset of PEs when SIMD processing doesn’t use the full set of PEs.).
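As a non-authoritative illustration of what reallocating register resources among resulting ALU configurations could look like, the sketch below splits one register file among ALU groups in proportion to each group's lane width. The function name `reallocate_registers`, the proportional policy, and the divisibility requirement are assumptions made for this example; neither the claims nor Goel specify them.

```python
from typing import Dict, List, Tuple


def reallocate_registers(total_regs: int,
                         group_widths: List[int]) -> Dict[int, Tuple[int, int]]:
    """Split a register file of `total_regs` entries among ALU groups in
    proportion to lane width. Returns a half-open (start, end) range per
    group index. Hypothetical policy: total_regs must be a multiple of
    the total lane count."""
    total_lanes = sum(group_widths)
    if total_lanes == 0 or total_regs % total_lanes != 0:
        raise ValueError("total_regs must be a multiple of the lane count")
    per_lane = total_regs // total_lanes
    ranges: Dict[int, Tuple[int, int]] = {}
    start = 0
    for idx, width in enumerate(group_widths):
        end = start + width * per_lane   # wider groups receive more registers
        ranges[idx] = (start, end)
        start = end
    return ranges
```

With 32 registers, a single eight-lane group receives the whole file, whereas a four-lane group alongside four single-lane groups receives half of it and each single-lane group receives four entries.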
As per claim 14:
Goel disclosed the method of claim 8, wherein processing the received instructions or data includes executing vector and scalar operations in parallel when the vector ALU is configured as a combination of vector and scalar ALUs (Goel: Figures 2-3 elements 46A-H, paragraphs 92-97)(The PEs are configured for operation using multiple scalar ALUs or multiple vector ALUs. The PEs can be configured to operate in parallel using multiple parallel vector ALUs/scalar ALUs.).
As per claim 15:
Goel disclosed a system comprising:
a host processor (Goel: Figure 8 elements 30, 104, and 116, paragraphs 23, 41, 72, and 162-163)(The GPU receives compiled shader programs from the host CPU for execution.);
a memory (Goel: Figure 8 element 108, paragraph 165); and
a reconfigurable co-processor coupled to the host processor and the memory (Goel: Figure 8 element 30, paragraph 166), the reconfigurable co-processor comprising:
a vector arithmetic logic unit (ALU) capable of dynamically reconfiguring its architecture (Goel: Figures 2-3 elements 44 and 46A-H, paragraphs 92-95 and 97)(The set of PEs reads upon the vector ALU. The PEs are configured for execution based on instructions from the instruction store.);
a configuration controller coupled to the vector ALU (Goel: Figure 3 element 42, paragraphs 72 and 90) and configured to:
receive task information from the host processor (Goel: Figure 3 element 42, paragraph 90)(The controller receives information for shader programs from the host CPU),
determine an optimal ALU configuration based on the task information (Goel: Figure 3 element 42, paragraph 46)(The controller deactivates PEs not needed for execution of shader program instructions.), and
instruct the vector ALU to reconfigure according to the determined optimal configuration, wherein the reconfiguration includes partitioning the vector ALU into one or more of: a single vector ALU, multiple parallel vector ALUs, multiple scalar ALUs, or a combination thereof (Goel: Figures 2-3 elements 46A-H, paragraphs 92-97)(The PEs are configured for operation using multiple scalar ALUs or multiple vector ALUs. The PEs can be configured to operate in parallel using multiple parallel vector ALUs/scalar ALUs.).
As per claim 17:
Goel disclosed the system of claim 15, wherein the vector ALU includes a plurality of processing elements (Goel: Figures 2-3 elements 44 and 46A-H, paragraphs 92-95 and 97)(The set of PEs reads upon the vector ALU.), and wherein reconfiguring the vector ALU includes regrouping the processing elements to form vector or scalar processing units (Goel: Figures 2-3 elements 44 and 46A-H, paragraphs 92-95 and 97)(The set of PEs reads upon the vector ALU. PEs with scalar ALUs can be grouped for vector processing for vector operations.).
As per claim 19:
Goel disclosed the system of claim 15, wherein the host processor is configured to offload vector and scalar processing tasks to the reconfigurable co-processor based on a current configuration of the vector ALU (Goel: Figure 8 elements 30, 104, and 116, paragraphs 23, 41, 72, and 162-163)(The GPU receives compiled shader programs from the host CPU for execution. Vector and scalar instructions are offloaded based on GPU support for executing them.).
As per claim 23:
Goel disclosed the device of claim 1, wherein the device comprises a vector-scalar processor (VSP) that is configurable to perform vector processing or scalar processing (Goel: Figure 3 element 40, paragraphs 92-95)(The shader unit (i.e. VSP) is configured to perform scalar and vector processing.), and wherein the set of registers and the vector ALU comprise onboard memory of the VSP (Goel: Figure 3 elements 46A-H and 48, paragraphs 95 and 97)(The registers are included in the set of on-board memory of the shader unit.).
Maintained and New Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 4, 12, 21, 24-25, and 27 are rejected under 35 U.S.C. 103 as being unpatentable over Goel et al. (U.S. 2016/0055667), in view of Official Notice.
As per claim 4:
Goel disclosed the device of claim 1, wherein the vector ALU is configured to switch between vector and scalar operations within a single clock cycle (Goel: Figure 3 elements 42-44, paragraphs 90 and 94)(Official notice is given that instructions are fetched from memory each clock cycle for the advantage of faster program execution. Thus, it would have been obvious to one of ordinary skill in the art that the controller controls PEs on a clock cycle basis for fetched instructions. In view of the official notice, the set of PEs switches between scalar and vector operations based on fetched scalar and vector instructions.).
As per claim 12:
Goel disclosed the method of claim 8, wherein analyzing the received instructions or data includes identifying patterns indicative of vector or scalar processing suitability, wherein analyzing the received instructions or data includes determining whether the instructions or data are vector data inputs or scalar data inputs (Goel: Figures 2-3 elements 46A-H, paragraphs 92-97)(The PEs are configured for operation using multiple scalar ALUs or multiple vector ALUs. Official notice is given that decoders can be used to detect instruction types and generate corresponding control signals for execution for the advantage of ensuring proper instruction execution. Thus, it would have been obvious to one of ordinary skill in the art to implement decoding shader program instructions in Goel. In view of the above official notice, decoding SIMD instructions (i.e. analyzing) determines that the register inputs to be used are vector inputs.).
As per claim 21:
Goel disclosed the device of claim 1, wherein the vector ALU comprises a vector arithmetic-logic unit that operates on inputs up to a first number of elements (Goel: Figure 3 elements 48A-H, paragraphs 93 and 95)(The set of PEs (i.e. vector ALU) operates on vector lengths that use all or a subset of the PEs.), and wherein when reconfigured to operate as the combination of one or more parallel vector ALUs and one or more scalar ALUs, the vector ALU is partitioned into a plurality of smaller vector ALUs that each operate on a second number of elements less than the first number of elements, and a plurality of scalar ALUs that each operate on a single element (Goel: Figure 3 elements 48A-H, paragraphs 93 and 95)(The set of PEs (i.e. vector ALU) operates on vector lengths that use all or a subset of the PEs. Paragraph 95 gives the example of using half of the PEs to perform a single SIMD operation, leaving the remaining half of PEs idle. Official notice is given that schedulers can be used to issue scalar instructions to be performed concurrently with other instructions on available execution units for the advantage of increased performance. Thus, it would have been obvious to one of ordinary skill in the art to implement a scheduler that dispatches the SIMD operation and multiple scalar operations to the set of PEs for parallel execution.).
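The partitioning recited in claim 21 reduces to simple lane arithmetic, which the following minimal sketch works through. The helper `partition_lanes` and its parameters are hypothetical, and the sketch assumes that a "plurality" means at least two and that a vector ALU spans more than one element; it is not drawn from the instant application or from Goel.

```python
from typing import List


def partition_lanes(first: int, second: int, n_vectors: int) -> List[int]:
    """Widths of the ALUs after partitioning a `first`-element vector ALU
    into `n_vectors` vector ALUs of `second` elements each, with every
    leftover lane acting as a single-element scalar ALU. Hypothetical."""
    if not 1 < second < first:
        raise ValueError("second must be greater than 1 and less than first")
    n_scalars = first - n_vectors * second
    if n_vectors < 2 or n_scalars < 2:
        raise ValueError("need a plurality of vector ALUs and of scalar ALUs")
    return [second] * n_vectors + [1] * n_scalars
```

For example, an eight-element ALU split into two two-element vector ALUs leaves four single-element scalar ALUs, and the widths still sum to the first number of elements.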
As per claim 24:
Goel disclosed the device of claim 1, wherein the vector ALU is reconfigurable to operate on vector data inputs or scalar data inputs based on instructions received from a host processor (Goel: Figures 3 and 8 elements 30, 40, 104, and 116, paragraphs 23, 41, 72, and 162-163)(The GPU receives compiled shader programs from the host CPU for execution. The shader unit includes vector and scalar ALUs. The set of PEs processes vector operations when receiving SIMD instructions. Additionally, official notice is given that scalar ALUs are used to execute scalar instructions for the advantage of executing smaller data sets. Thus, it would have been obvious to one of ordinary skill in the art to implement scalar instructions to be received by the shader unit and executed by the scalar ALUs.).
As per claim 25:
Goel disclosed the method of claim 8, wherein processing the received instructions or data comprises:
buffering at least a portion of the data into onboard memory of a processing node comprising the vector ALU (Goel: Figure 3 elements 46A-H and 48, paragraphs 93, 95, and 97);
performing one or more operations on the buffered data using the reconfigured vector ALU (Goel: Figure 3 elements 46A-H and 48, paragraphs 93, 95, and 97)(The set of PEs performs SIMD operations using vector data stored in the registers.); and
storing processed data in a memory unit coupled to the processing node (Goel: Figure 3 element 50, paragraph 98)(Official notice is given that execution results can be stored in local caches/memory for the advantage of freeing up register space for other instruction processing. Thus, it would have been obvious to one of ordinary skill in the art to store SIMD processing results in the local memory.).
As per claim 27:
Goel disclosed the system of claim 15, wherein the reconfigurable co-processor further comprises:
a RAM unit coupled to the vector ALU (Goel: Figures 2-3 and 8 elements 30, 40, and 108, paragraph 165); and
a storage unit coupled to the vector ALU (Goel: Figures 2-3 and 8 elements 30, 40, and 108, paragraph 165)(Official notice is given that external memory can include storage disks for the advantage of storing programs and data while power is turned off. Thus, it would have been obvious to one of ordinary skill in the art to implement an external disk drive in Goel.), wherein the configuration controller is configured to control data movement between the storage unit and the RAM unit (Goel: Figure 3 element 42, paragraphs 72 and 90)(Official notice is given that internal memory controllers can be implemented to retrieve data externally from disk memory for the advantage of increasing memory access speeds upon reuse. Thus, it would have been obvious to one of ordinary skill in the art to implement an internal memory controller within the shader unit to fetch disk data. The fetched disk data is stored in RAM prior to being brought back into local memory of the shader unit.).
Response to Arguments
The arguments presented by Applicant in the response, received on 12/16/2025, are not considered persuasive.
Applicant argues regarding claim 1:
“Goel describes something architecturally different. Goel’s GPU includes “a plurality of processing elements (e.g., arithmetic logic units (ALUs))” as stated at ¶ 54. These processing elements 46A-H are illustrated in Figures 2 and 3 as discrete, separate components within shader unit 40. At ¶ 92, Goel explains that “Processing elements 46 are configured to execute threads of a shader program. Each of processing elements 46 may execute a different thread.” Goel further states at ¶ 95 that “each of processing elements 46 may include and/or correspond to an arithmetic logic unit (ALU)” and that “each of processing elements 46 may be a scalar ALU or a vector ALU.” The processing elements are thus individual ALUs—each one being either scalar or vector—that exist as separate hardware units within the shader unit.
…
The Office Action asserts that Goel’s processing elements “can be configured to operate in parallel using multiple parallel vector ALUs/scalar ALUs.” However, this characterization conflates task assignment with architectural reconfiguration. In Goel, control unit 42 assigns threads to the various processing elements 46A-H, and those processing elements then execute their assigned threads. At ¶ 100, Goel explains that “control unit 42 receives the information indicative of the thread configuration, and causes processing elements 46A-46H to execute one or more instances of the shader program based on the thread configuration.” The processing elements themselves do not reconfigure; rather, the control unit directs which processing element handles which thread.
…
The claims require the vector ALU itself to be “reconfigurable in response to a type of data or instructions received for processing.” This reconfiguration changes how the vector ALU operates—transforming it from a single large vector ALU into multiple smaller vector ALUs and/or scalar ALUs. In contrast, Goel’s shader unit 40 always contains eight processing elements (46A-46H), and the thread scheduler determines how to distribute work across those eight fixed units. The number and arrangement of processing elements in Goel does not change based on the data or instructions being processed.”
This argument is not found to be persuasive for the following reason. Applicant is correct that the PEs are described as executing threads in a wave (or WARP or SIMT) manner in paragraph 92. Additionally, paragraph 93 further describes the shader unit as comprising “processing elements 46” that “may be single-instruction, multiple-data (SIMD) processing elements.” Further, paragraph 93 states “each of processing elements 46 may execute instructions of a shader program based on a common program counter that points to an instruction contained in instruction store 44.” Due to the stated SIMD processing, the set of processing elements reads upon the claimed vector ALU. The example in paragraph 93 allows for the entire set of PEs to perform SIMD processing based on an instruction. The example in paragraph 95 allows for a subset of PEs to perform SIMD processing based on a vector length of an operation. This allows for reconfiguring the set of PEs to perform different types of processing.
The claims as written only require that the vector ALU is reconfigurable and that the vector ALU operates as a single vector ALU, multiple parallel vector ALUs, multiple scalar ALUs, OR a combination of one or more parallel vector ALUs and one or more scalar ALUs. At minimum, paragraphs 93 and 95 show that the vector ALU is reconfigurable and operates as a single vector ALU. Thus, Goel reads upon the claim limitations.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
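The date arithmetic in the preceding paragraph can be worked through with a short sketch. The function names, the inclusive treatment of the two-month window, and the clamping of month-end dates are assumptions made for illustration, and the dates in the usage note are hypothetical because the mailing date does not appear in this text. The sketch does not model extension fees, weekends, or federal holidays, and it is not a substitute for the cited rules.

```python
import calendar
from datetime import date
from typing import Optional


def add_months(d: date, months: int) -> date:
    """Add calendar months, clamping to the last day of the target month."""
    idx = d.year * 12 + (d.month - 1) + months
    year, month = divmod(idx, 12)
    month += 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def reply_period_end(mailed: date,
                     first_reply: Optional[date] = None,
                     advisory_mailed: Optional[date] = None) -> date:
    """End of the shortened statutory period under the rule stated above."""
    three = add_months(mailed, 3)   # default shortened statutory period
    six = add_months(mailed, 6)     # outer limit on the statutory period
    end = three
    # First reply within two months AND advisory action mailed after the
    # three-month date: the period runs to the advisory mailing date.
    if (first_reply is not None and advisory_mailed is not None
            and first_reply <= add_months(mailed, 2)
            and advisory_mailed > three):
        end = advisory_mailed
    return min(end, six)
```

For a hypothetical mailing date of January 15, 2026, `reply_period_end(date(2026, 1, 15))` gives April 15, 2026; a first reply on March 10, 2026 followed by an advisory action mailed April 20, 2026 moves the end of the period to April 20, 2026.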
The following is text cited from 37 CFR 1.111(c): In amending in reply to a rejection of claims in an application or patent under reexamination, the applicant or patent owner must clearly point out the patentable novelty which he or she thinks the claims present in view of the state of the art disclosed by the references cited or the objections made. The applicant or patent owner must also show how the amendments avoid such references or objections.
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure.
Chen et al. (U.S. 9,799,094) taught PC for SIMT divergence.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JACOB A. PETRANEK whose telephone number is (571)272-5988. The examiner can normally be reached on M-F 8:00-4:30.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jyoti Mehta can be reached on (571) 270-3995. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/JACOB PETRANEK/
Primary Examiner, Art Unit 2183