DETAILED ACTION
Claims 1-25 are pending.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 01/25/2023 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.
Specification
The disclosure is objected to because of the following informalities:
Paragraph [0028] recites “[…] type of core that are more efficient”; the singular “type of core” does not agree with the plural verb “are” (e.g., the passage should read “types of cores that are more efficient”).
Appropriate correction is required.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
(A) the claim limitation uses the term “means” or “step” or a term used as a substitute
for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;
(B) the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and
(C) the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function.
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-25 are rejected under 35 U.S.C. 103 as being unpatentable over Zhu (US 2011/0314458 A1) in view of Brower et al. (US 2014/0149969 A1).
Regarding claim 1, Zhu teaches an apparatus to schedule parallel instructions, the apparatus comprising ([0015] GP executable 32 represents a program intended for execution on one or more processors (e.g., central processing units (CPUs)). GP executable 32 includes low level instructions from an instruction set of one or more central processing units (CPUs). GP executable 32 may also include one or more DP device executables 40. A DP device executable 40 represents a data parallel program (e.g., a shader) intended for execution on one or more data parallel (DP) devices such as DP device 210 shown in FIG. 7 and described in additional detail below. DP devices are typically graphic processing units (GPUs) or the vector execution cores of CPUs but may also include the scalar execution cores of CPUs or other suitable devices in some embodiments. DP device executable 40 may include DP byte code that is converted to low level instructions from an instruction set of a DP device using a device driver (not shown). DP device executable 40 may also include low level instructions from an instruction set of one or more DP devices. Accordingly, GP executable 32 is directly executable by one or more central processing units (CPUs), and a containing DP device executable 40 is either directly executable by one or more DP devices or executable by one or more DP devices subsequent to being converted to the low level instructions of the DP device.; Claim 1: translating a first portion of general purpose data parallel code that is intended for execution on one or more data parallel devices into first data parallel device source code; translating a second portion of the general purpose data parallel code that is intended for execution on the one or more data parallel devices into second data parallel device source code.):
interface circuitry to obtain instructions, the instructions including parallel threads (Fig. 1, General Purpose compiler; obtains GP code 12 that includes data parallel (DP) portions 14, see [0014] and [0016]); and
processor circuitry including one or more of:
at least one of a central processor unit, a graphics processor unit ([0015] CPU and GPU);
thread processing circuitry to split a first thread of the parallel threads into partitions ([0001]; [0014-15]; [0016] GP code 12 includes a sequence of instructions of a high level general purpose programming language with data parallel extensions (hereafter GP language) that form a program stored in a set of one or more modules. The GP language allows the program to be written in different parts (i.e., modules) such that each module may be stored in separate files or locations accessible by the computer system. The GP language provides a single language for programming a computing environment that includes one or more general purpose CPUs and one or more special purpose DP devices. Using the GP language, a programmer may include both CPU and DP device code in GP code 12 for execution by CPUs and DP devices, respectively, and coordinate the execution of the CPU and DP device code. GP code 12 may represent any suitable type of code, such as an application, a library function, or an operating system service.; Claim 1. A method performed by a computer system, the method comprising: translating a first portion of general purpose data parallel code that is intended for execution on one or more data parallel devices into first data parallel device source code; translating a second portion of the general purpose data parallel code that is intended for execution on the one or more data parallel devices into second data parallel device source code.)
While Zhu teaches a parallel input that is translated/separated into two different source codes for CPUs and GPUs (see [0014-16] and Claim 1), Zhu does not teach hybrid cores;
and processor circuitry including one or more of:
at least one of a central processor unit, a graphics processor unit, or a digital signal processor, the at least one of the central processor unit, the graphics processor unit, or the digital signal processor having control circuitry to control data movement within the processor circuitry, arithmetic and logic circuitry to perform one or more first operations corresponding to instructions, and one or more registers to store a result of the one or more first operations, the instructions in the apparatus;
a Field Programmable Gate Array (FPGA), the FPGA including logic gate circuitry, a plurality of configurable interconnections, and storage circuitry, the logic gate circuitry and the plurality of the configurable interconnections to perform one or more second operations, the storage circuitry to store a result of the one or more second operations; or
Application Specific Integrated Circuitry (ASIC) including logic gate circuitry to perform one or more third operations;
the processor circuitry to perform at least one of the first operations, the second operations, or the third operations to instantiate:
scheduling circuitry to: select (a) a first core to execute a first partition of the partitions and (b) a second core different than the first core to execute a second partition of the partitions; and
generate an execution schedule based on the selection, the interface circuitry to transmit the execution schedule to a device that schedules instructions on the first and second core.
In a similar field of endeavor, Brower teaches an apparatus to schedule parallel instructions using hybrid cores ([0030] The present disclosure provides techniques to process source code in computing systems containing a heterogeneous plurality of central processing units (CPUs). The heterogeneous plurality of CPUs may include substantially different multicore CPU types with disparate characteristics of performance, power consumption, size (e.g., volume), and/or weight. As computing systems evolve to include a heterogeneous plurality of CPUs, it may be desirable to enable programmers to easily and precisely designate and target source code portions within their programs to run on a particular CPU of the heterogeneous plurality of CPUs. Further, it may also be desirable to enable programmers to easily and precisely designate and target source code portions within their programs to run on different cores of a particular multicore CPU of the heterogeneous plurality of CPUs. Such source code sections may be compiled into an executable form that fully and optimally utilizes the target CPU or cores of the target CPU type…Examples of fundamentally different CPU types include multicore CPUs designed for HPC (High Performance Computing) and CPUs designed for low power consumption and extended battery life.; [0005] According to an embodiment, the target CPUs of the first- and second-type subsets are of respectively different CPU types selected from the set of: one or more general purpose CPU cores; one or more vector, array, or graphics processing units (GPUs); one or more compute intensive multicore CPUs; and one or more one or more low-power CPU cores.; [0043] Further, the heterogeneous plurality of target CPUs may co-exist within a common computing device (e.g., server or client computing device); [0038]; [0055] The programmer may annotate portions of the source code to indicate which CPU, Which CPU core, degree of parallelism), the apparatus comprising:
processor circuitry including one or more of:
at least one of a central processor unit ([0101] one or more CPUs), a graphics processor unit ([0041] one or more graphics processing units (GPUs)), or a digital signal processor ([0102] A processor 1112, which may be a micro-controller, digital signal processor (DSP), or other processing component), the at least one of the central processor unit, the graphics processor unit, or the digital signal processor having control circuitry to control data movement within the processor circuitry, arithmetic and logic circuitry to perform one or more first operations corresponding to instructions, and one or more registers to store a result of the one or more first operations, the instructions in the apparatus ([0010] a heterogeneous plurality of target CPUs includes an input/output interface that facilitates the retrieval of annotated source code. The annotated source code identifies at least a first portion thereof suitable for execution on a first-type subset of the target CPUs.; [0034]; [0090] In such an example, annotated and subsequently separated source code portions may include input, processing, and output targeted to a run-time data I/O resource.; [0101] The computing device may additionally include one or more storage devices each selected from a group consisting of floppy disk, flexible disk, hard disk, magnetic tape, any other magnetic medium, CD-ROM, any other optical medium, RAM, PROM, EPROM, FLASH-EPROM, any other memory chip or cartridge, and/or any other medium from which a processor or computer is adapted to read (i.e., registers). The one or more storage devices may include stored information that may be made available to one or more computing devices and/or computer programs; [0103]);
a Field Programmable Gate Array (FPGA) ([0086] FPGA (field-programmable gate array)), the FPGA including logic gate circuitry, a plurality of configurable interconnections, and storage circuitry, the logic gate circuitry and the plurality of the configurable interconnections to perform one or more second operations, the storage circuitry to store a result of the one or more second operations; or
Application Specific Integrated Circuitry (ASIC) including logic gate circuitry to perform one or more third operations;
the processor circuitry to perform at least one of the first operations, the second operations, or the third operations to instantiate:
thread processing circuitry to split a first thread of the parallel threads into partitions (Abstract; [0010] The system also includes a source code separator that, based at least in part on a first annotation, separates the source code into first and second source code portions.; [0045] In an example, a source code stream includes threads, processes, tasks, and other standard concepts that define and control execution on multiple CPUs and/or multiple CPU cores.; [0055] The programmer may annotate portions of the source code to indicate which CPU, which CPU core, degree of parallelism);
scheduling circuitry to:
select (a) a first core to execute a first partition of the partitions and (b) a second core different than the first core to execute a second partition of the partitions (Abstract: The target CPUs of the first- and second-type subsets have one or more different functionalities.; [0006] core allocation; [0011] In an example, the target CPUs of the first- and second-type subsets are of respectively different CPU types selected from the set of: one or more general purpose CPU cores; one or more vector, array, or graphics processing units (GPUs); one or more high performance multicore CPUs; and one or more low-power CPUs.; [0012] According to an embodiment, the compiler compiles the augmented first source code stream to create a first binary executable program for execution on the first-type subset of the target CPUs. The first-type subset of the target CPUs may execute the first binary executable program (i.e., selection). In an example, the target CPUs of the first- and second-type subsets reside in a common computing system; [0035]; [0036] The functionally related code portions are allocated to one of many possible target CPU cores, or distributed across many possible target CPU cores of a heterogeneous plurality of target CPUs.; [0039] The heterogeneous plurality of target CPUs may include CPU types having disparate performance and power consumption characteristics, volumes (e.g., package sizes), weights, and internal design and architectures (e.g., dissimilar instruction set).; [0055] The programmer may annotate portions of the source code to indicate… which CPU core, [0056]; [0078-79]; [0099]; Claim 1); and
generate an execution schedule based on the selection, the interface circuitry to transmit the execution schedule to a device that schedules instructions on the first and second core ([0006] In an example, the coordination code may include a prolog and an epilogue for initialization and clean-up tasks and at least one from a group including code download and initialization per CPU type, run-time data transfer, synchronization, resource management, core allocation, and monitoring code. In another example, the coordination code includes API calls to move run-time data, control, and status operands between target CPUs of the first- and second-type subsets. In another example, the coordination code includes code compilable to coordinate sequencing of run-time code concurrently executing on target CPUs of the first- and second-type subsets.; [0052] Further, a source code stream may be augmented in order to create one or more binary executable programs suitable for a heterogeneous plurality of target CPUs. The source code stream may be augmented to include various instructions. For example, the source code stream may be augmented to allow multiple binary executable programs to synchronize and communicate with each other during run-time. In an example, based at least in part on an annotation, source code stream pre-processor 120 augments the first source code stream to include additional coordination code not present in the obtained source code to produce an augmented source code stream 121.; [0060] Augmented source code stream 121A, 121B may include additional coordination code not present in annotated source code 106. The coordination code may include code API calls necessary to move run-time data, control, status, and other run-time operands between the heterogeneous plurality of target CPUs (e.g., the first- and second-type subsets of the target CPUs). 
For example, the coordination code may include code compilable to pass run-time operands between the heterogeneous plurality of target CPUs. The coordination code may also include code compilable to coordinate sequencing of run-time code concurrently executing on the heterogeneous plurality of target CPUs.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Brower's teaching of separating code sections into individually schedulable partitions and distributing them to different heterogeneous cores with different capacities for execution with Zhu's parallel code, which is likewise translated and separated into portions executable by different processing entities having different capabilities. One of ordinary skill would have been motivated to make the combination in order to fully and optimally utilize target CPUs having disparate performance and power consumption characteristics (Brower [0030]).
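For illustration of the claim 1 subject matter as mapped above (splitting a thread into partitions, selecting a different core for each partition, and generating an execution schedule), the following is a minimal sketch. It is not drawn from Zhu or Brower; all names (Partition, split_thread, schedule, the core labels) are hypothetical and purely illustrative.

```python
from dataclasses import dataclass

@dataclass
class Partition:
    work: list        # work items assigned to this partition
    complexity: int   # crude complexity estimate (here, item count)

def split_thread(work_items, n_partitions):
    """Split one parallel thread's work items into partitions."""
    size = max(1, len(work_items) // n_partitions)
    chunks = [work_items[i:i + size] for i in range(0, len(work_items), size)]
    return [Partition(work=c, complexity=len(c)) for c in chunks]

def schedule(partitions, cores):
    """Select a distinct core per partition and build an execution schedule."""
    plan = []
    for i, part in enumerate(partitions):
        core = cores[i % len(cores)]  # first and second partitions land on different cores
        plan.append((core, part))
    return plan

parts = split_thread(list(range(8)), 2)
plan = schedule(parts, ["P-core-0", "E-core-0"])
```

In this sketch the schedule pairs the first partition with a performance core and the second with an efficient core, mirroring the claim-mapped selection of "a second core different than the first core."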
Regarding claim 2, Brower teaches wherein the thread processing circuitry is to determine a first complexity of the first partition and a second complexity of the second partition ([0036] The functionally related code portions are allocated to one of many possible target CPU cores, or distributed across many possible target CPU cores of a heterogeneous plurality of target CPUs. The quantity of target CPU cores may depend on various factors such as the complexity of a computation.; [0078] Compute intensive source code portion 612 includes source code to be compiled by a compiler that is specific to one or more target compute intensive CPUs (e.g., high performance multicore CPUs) of the heterogeneous plurality of target CPUs. [0079] Low-power source code portion 614 includes source code to be compiled by a compiler that is specific to one or more target low-power CPUs of the heterogeneous plurality of target CPUs.).
Regarding claim 3, Brower teaches, wherein the scheduling circuitry is to select the first core based on the first complexity and the second core based on the second complexity ([0036] The functionally related code portions are allocated to one of many possible target CPU cores, or distributed across many possible target CPU cores of a heterogeneous plurality of target CPUs. The quantity of target CPU cores may depend on various factors such as the complexity of a computation or whether a result of the computation is time critical. In an example, two or more target cores are used. In another example, 100 target cores are used. In another example, hundreds of target cores are used. In another example, 1,000 target cores are used.).
Regarding claim 4, Brower teaches wherein the first core is a performance core and the second core is an efficient core ([0005] one or more compute intensive multicore CPUs; and one or more one or more low-power CPU cores).
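The complexity-based selection mapped for claims 2-4 (determining a complexity per partition and routing higher-complexity partitions to a performance core and lower-complexity partitions to an efficient, low-power core) can be sketched as follows. This is an illustrative assumption, not code from either reference; the function name and threshold are hypothetical.

```python
def select_core(partition_complexity, threshold=100):
    """Route a partition by estimated complexity: performance core if
    at or above the threshold, efficient (low-power) core otherwise."""
    return "performance" if partition_complexity >= threshold else "efficient"

# One low-complexity and one high-complexity partition:
assignments = {c: select_core(c) for c in (10, 250)}
```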
Regarding claim 5, Brower teaches wherein the device causes the first core to execute the first partition and causes the second core to execute the second partition (Claim 1. A method of preparing source code for compilation for, and eventual coordinated execution on, a heterogeneous plurality of target CPUs, the method comprising: obtaining source code annotated to identify at least a first portion thereof suitable for execution on a first-type subset of the target CPUs; based at least in part on a first annotation, separating the source code into first and second source code portions; generating from the first source code portion a first source code stream to be supplied for compilation by a first compiler, the first source code stream augmented, based on the first annotation, to include additional coordination code not present in the obtained source code, and the first compiler specific to the first-type subset of the target CPUs; and generating from the second source code portion a second source code stream to be supplied for compilation by a second compiler, the second compiler specific to a second-type subset of the target CPUs, wherein the target CPUs of the first- and second-type subsets have one or more different functionalities.).
Regarding claim 6, Brower teaches wherein the scheduling circuitry is to schedule the second partition to be executed by the second core after the first core begins execution of the first partition ([0006] concurrently executing on target CPUs of the first- and second-type subsets.; Table A; [0036] In an example, source code 104 may be annotated such that functionally related portions of code, including input, processing, and output, are associated.).
Regarding claim 7, Brower teaches wherein the thread processing circuitry is to split the first thread of the parallel threads into the partitions based on a complexity of portions of the first thread ([0036]; [0078-79]).
Regarding claim 8, it is a system-type claim having limitations similar to those of claim 1 above. Therefore, it is rejected under the same rationale.
Regarding claim 9, it is a system-type claim having limitations similar to those of claim 2 above. Therefore, it is rejected under the same rationale.
Regarding claim 10, it is a system-type claim having limitations similar to those of claim 3 above. Therefore, it is rejected under the same rationale.
Regarding claim 11, it is a system-type claim having limitations similar to those of claim 4 above. Therefore, it is rejected under the same rationale.
Regarding claim 12, it is a system-type claim having limitations similar to those of claim 5 above. Therefore, it is rejected under the same rationale.
Regarding claim 13, it is a system-type claim having limitations similar to those of claim 6 above. Therefore, it is rejected under the same rationale.
Regarding claim 14, it is a system-type claim having limitations similar to those of claim 7 above. Therefore, it is rejected under the same rationale.
Regarding claim 15, it is a media/product-type claim having limitations similar to those of claim 1 above. Therefore, it is rejected under the same rationale.
Regarding claim 16, it is a media/product-type claim having limitations similar to those of claim 2 above. Therefore, it is rejected under the same rationale.
Regarding claim 17, it is a media/product-type claim having limitations similar to those of claim 3 above. Therefore, it is rejected under the same rationale.
Regarding claim 18, it is a media/product-type claim having limitations similar to those of claim 4 above. Therefore, it is rejected under the same rationale.
Regarding claim 19, it is a media/product-type claim having limitations similar to those of claim 5 above. Therefore, it is rejected under the same rationale.
Regarding claim 20, it is a media/product-type claim having limitations similar to those of claim 6 above. Therefore, it is rejected under the same rationale.
Regarding claim 21, it is a media/product-type claim having limitations similar to those of claim 7 above. Therefore, it is rejected under the same rationale.
Regarding claim 22, it is a system-type claim having limitations similar to those of claim 1 above. Therefore, it is rejected under the same rationale.
Regarding claim 23, it is a system-type claim having limitations similar to those of claim 2 above. Therefore, it is rejected under the same rationale.
Regarding claim 24, it is a system-type claim having limitations similar to those of claim 3 above. Therefore, it is rejected under the same rationale.
Regarding claim 25, it is a system-type claim having limitations similar to those of claim 4 above. Therefore, it is rejected under the same rationale.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Conte et al. (US 9,189,282 B2) Thread-to-core Mapping Based On Thread Deadline, Thread Demand, And Hardware Characteristics Data Collected By A Performance Counter
Russell et al. (US 2009/0187909 A1) SHARED RESOURCE BASED THREAD SCHEDULING WITH AFFINITY AND/OR SELECTABLE CRITERIA. See at least [0022].
Haber et al. (US 2014/0095832 A1) METHOD AND APPARATUS FOR PERFORMANCE EFFICIENT ISA VIRTUALIZATION USING DYNAMIC PARTIAL BINARY TRANSLATION. See at least [0015].
Latorre et al. (US 2016/0162406 A1) Systems, Methods, And Apparatuses To Decompose A Sequential Program Into Multiple Threads, Execute Said Threads, And Reconstruct The Sequential Execution. See at least [0034] and [0142].
Gove (US 2009/0300643 A1) USING HARDWARE SUPPORT TO REDUCE SYNCHRONIZATION COSTS IN MULTITHREADED APPLICATIONS. See at least [0004].
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JORGE A CHU JOY-DAVILA whose telephone number is (571)270-0692. The examiner can normally be reached Monday-Friday, 6:00am-5:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Aimee J Li can be reached at (571)272-4169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/JORGE A CHU JOY-DAVILA/Primary Examiner, Art Unit 2195