DETAILED ACTION
Response to Amendment
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-5, 7, 22-25, and 27-31 are rejected under 35 U.S.C. 103 as being unpatentable over Koker et al. (U.S. Patent Application Publication Number 2020/0294180) and Ganapathy et al. (U.S. Patent Application Publication Number 2023/0153218).
Regarding Claim 1, Koker discloses a parallel processor (paragraph 0334) comprising:
an active base chiplet die (Figure 28A, item 2810) including hardware logic (Figure 28A, item 2801, paragraph 0337), interconnect logic (Figure 28A, item 2808, paragraph 0336), and a plurality of chiplet slots (paragraph 0400); and
a plurality of chiplets (Figure 28A, items 2804-2806) vertically stacked on the active base chiplet die (Figure 36, items 3602; i.e., stacked on top) and coupled with the plurality of chiplet slots of the active base chiplet die, the plurality of chiplets interchangeable during assembly of the parallel processor (paragraph 0373),
wherein the plurality of chiplets include a first group of chiplets (Figure 28A, item 2805, paragraph 0338) and a second group of chiplets (Figure 28A, item 2804, paragraph 0338; i.e., the reference states “the chiplets include but are not limited to” what is shown in the figure; therefore, there may be two media chiplets 2804), each group of chiplets has an equal number of execution cores (Figure 4B, paragraphs 0338 and 0373; i.e., each of the compute chiplets 2805 corresponds to the processor 407 depicted in Figure 4B [see paragraph 0338 - the compute chiplets 2805 can include streaming multiprocessors]; there are four compute chiplets 2805 [Figure 28A] and each of them comprises four cores 460A-D [Figure 4B]; therefore, the total number of execution cores in the “first group of chiplets” is 16 [four multiplied by four]; the three dots between graphics processor 432 and graphics processor N indicate that there may be any number of graphics processing units; further, the variable "N" is used to also indicate that there may be any number of graphics processors; therefore, Koker discloses that there can be four graphics processors, each with two cores 1715 [Figure 17]; accordingly, assuming there are two media chiplets 2804 as mentioned above, there may be a total of 16 cores in the media chiplets 2804 [two chiplets multiplied by four graphics processors multiplied by two cores], which is equal to the number of cores in the compute chiplets 2805).
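For illustration only, the core-count arithmetic relied upon in the mapping above can be checked with a short sketch. The variable names are hypothetical and do not appear in Koker; the counts are those assumed in the mapping (four compute chiplets with four cores each, and two media chiplets each with four graphics processors of two cores).

```python
# Counts assumed in the mapping above; all names are illustrative only.
compute_chiplets = 4                       # compute chiplets 2805 (Figure 28A)
cores_per_compute_chiplet = 4              # cores 460A-D (Figure 4B)

media_chiplets = 2                         # assumed number of media chiplets 2804
graphics_processors_per_media_chiplet = 4  # assumed value of "N"
cores_per_graphics_processor = 2           # cores 1715 (Figure 17)

# Total execution cores in each group of chiplets.
first_group_cores = compute_chiplets * cores_per_compute_chiplet
second_group_cores = (
    media_chiplets
    * graphics_processors_per_media_chiplet
    * cores_per_graphics_processor
)

groups_equal = first_group_cores == second_group_cores
print(first_group_cores, second_group_cores, groups_equal)  # 16 16 True
```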
Koker does not expressly disclose each of the first group of chiplets and the second group of chiplets comprises chiplets having different numbers of execution cores.
In the same field of endeavor (e.g., chiplet design techniques), Ganapathy teaches each of the first group of chiplets and the second group of chiplets (Figure 2, items 200 and 202, paragraph 0031) comprises chiplets having different numbers of execution cores (Figure 2, items 206-220, paragraph 0037; i.e., the reference states “in some embodiments, processor 102 includes a different number and/or arrangement of chiplets and/or processor cores”; it would therefore have been obvious to one of ordinary skill in the art to have provided four chiplets [two groups of two chiplets], each having a different number of execution cores; further, it has been held that "where the general conditions of a claim are disclosed in the prior art, it is not inventive to discover the optimum or workable ranges by routine experimentation." In re Aller, 220 F.2d 454, 456, 105 USPQ 233, 235 (CCPA 1955); therefore, for this additional reason, it would have been obvious to one of ordinary skill in the art to have provided one number of cores in a first chiplet and different numbers of cores [optimum ranges] in each of the plurality of other chiplets for the purpose of tailoring performance, power consumption, cost, and workload allocation for specific design requirements).
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined Ganapathy’s teachings of chiplet design techniques with the teachings of Koker, for the purpose of tailoring performance, power consumption, cost, and workload allocation for specific design requirements. More specifically, a hardware or software designer would be able to use a chiplet having a particular number of cores based on the application requirements (i.e., an application with low processing needs could use a chiplet with fewer cores, while a more computation-intensive application could use a chiplet with more cores). This flexibility would allow for reduced power consumption.
Regarding Claims 2, 23, and 28, Koker discloses a thread dispatcher configured to dispatch threads to the first group of chiplets and the second group of chiplets according to the equal number of execution cores associated respectively with the first group of chiplets and the second group of chiplets (Figure 27, item 2703, paragraph 0333; i.e., the thread dispatcher 2703 will dispatch threads to whichever chiplets 2804 and 2805 contain cores that can execute the required workload).
Regarding Claims 3, 24, and 29, Koker discloses wherein the equal number of execution cores within each group of chiplets includes at least a pre-determined number of functional execution cores of a first type (e.g., processor execution cores) and the pre-determined number of functional execution cores of the first type is equal between the first group of chiplets and the second group of chiplets (paragraph 0373; i.e., each of the compute chiplets 2805 corresponds to the processor 407 depicted in Figure 4B [see paragraph 0338 - the compute chiplets 2805 can include streaming multiprocessors], which contains a variable number of cores 460; each of the media chiplets 2804 corresponds to the graphics acceleration module 446 depicted in Figure 4B [paragraph 0338 - the media chiplets 2804 can include hardware logic to accelerate media encode and decode operations], which contains a variable number of graphics processing engines 431-N, each of which contains a variable number of cores 1715 [Figure 17]; any number of the cores can be active/functional; therefore, Koker anticipates that each of the various chiplets 2804 and 2805 contains a “pre-determined number of functional execution cores of the first type is equal between the first group of chiplets and the second group of chiplets” as claimed).
Regarding Claims 4 and 30, Koker discloses wherein the first group of chiplets or the second group of chiplets include a first chiplet having a first number of functional execution cores; and a second chiplet having a second number of functional execution cores and a third number of non-functional execution cores due to yield loss (paragraph 0370; i.e., there may be certain units [e.g., cores] within a given chiplet that may be designated as functional, while others may be designated as non-functional; but this can vary from chiplet to chiplet - some chiplets might have all units designated as functional, whereas other chiplets can have some units as functional and others as non-functional; the third number of non-functional execution cores can be due to “yield loss” in that a particular processor/chiplet may have a number of defective compute or graphics cores but still be used in the system [see paragraph 0371]).
Regarding Claims 5 and 31, Koker discloses wherein the first group of chiplets or the second group of chiplets additionally include a third chiplet having a fourth number of functional execution cores and a fifth number of reserved execution cores (paragraph 0370; i.e., the functional units within each chiplet that are chosen to not be used are equivalent to the claimed “reserved execution cores”, since they can be brought back online to process the workload as needed).
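For illustration only, the three chiplet categories discussed with respect to Claims 4, 5, 30, and 31 (all cores functional; some cores non-functional due to yield loss; some working cores held in reserve) can be modeled as in the following sketch. The `Chiplet` class, its field names, and the core counts are hypothetical and are not taken from Koker or from the claims.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chiplet:
    """Hypothetical per-chiplet core accounting (names are assumptions)."""
    functional: int          # cores enabled for workloads
    non_functional: int = 0  # defective cores, e.g. lost to yield
    reserved: int = 0        # working cores deliberately held offline

    @property
    def physical_cores(self) -> int:
        # Every core on the die falls into exactly one category.
        return self.functional + self.non_functional + self.reserved


# Assumed example: three four-core dies in different conditions.
first = Chiplet(functional=4)                     # fully functional
second = Chiplet(functional=3, non_functional=1)  # one core lost to yield
third = Chiplet(functional=3, reserved=1)         # one core held in reserve

group = [first, second, third]
group_functional = sum(c.functional for c in group)
print(group_functional)  # 10
```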
Regarding Claim 7, Koker discloses wherein the plurality of chiplet slots have a plurality of different die aperture sizes (Figure 36, item 3602; i.e., as can be seen in the figure, different chiplets 3602 have different sizes, which would indicate that the slots they connect to have respective different die aperture sizes).
Regarding Claim 22, Koker discloses a method comprising:
selecting chiplets from multiple bins of chiplets (Figure 42, items 4204-4208, paragraphs 0390-0391) to create multiple groups of chiplets (Figure 43, item 4306, paragraph 0392), the multiple bins of chiplets including a first group of chiplets (Figure 28A, item 2805, paragraph 0338) and a second group of chiplets (Figure 28A, item 2804, paragraph 0338; i.e., the reference states “the chiplets include but are not limited to” what is shown in the figure; therefore, there may be two media chiplets 2804), and each group of chiplets has an equal number of execution cores (Figure 4B, paragraphs 0338 and 0373; i.e., each of the compute chiplets 2805 corresponds to the processor 407 depicted in Figure 4B [see paragraph 0338 - the compute chiplets 2805 can include streaming multiprocessors]; there are four compute chiplets 2805 [Figure 28A] and each of them comprises four cores 460A-D [Figure 4B]; therefore, the total number of execution cores in the “first group of chiplets” is 16 [four multiplied by four]; the three dots between graphics processor 432 and graphics processor N indicate that there may be any number of graphics processing units; further, the variable "N" is used to also indicate that there may be any number of graphics processors; therefore, Koker discloses that there can be four graphics processors, each with two cores 1715 [Figure 17]; accordingly, assuming there are two media chiplets 2804 as mentioned above, there may be a total of 16 cores in the media chiplets 2804 [two chiplets multiplied by four graphics processors multiplied by two cores], which is equal to the number of cores in the compute chiplets 2805);
populating multiple chiplet slots (paragraph 0400) of a base chiplet die (Figure 28A, item 2810) of a parallel processor (paragraph 0334) with selected chiplets to create groups of multiple chiplets, each group having the equal number of execution cores (paragraph 0373); and
configuring firmware for the parallel processor according to the equal number of execution cores within each group of chiplets (paragraph 0398; i.e., the firmware considers the functional execution cores as well as the reserved execution cores [i.e., cores that are determined to not be used for the workload] within the various chiplets when executing the workload).
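For illustration only, the three steps recited above (selecting from bins, populating slots, and configuring firmware) can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of Koker or of the application: the bin contents, the `select_group` helper, the group names, and the target of ten cores per group are all hypothetical.

```python
from itertools import combinations


def select_group(bins, group_size, target_cores):
    """Pick chiplets from distinct bins whose core counts sum to the target.

    bins maps a per-chiplet core count to the number of chiplets in stock.
    Stock is decremented for each chiplet selected.
    """
    available = sorted(count for count, stock in bins.items() if stock > 0)
    for combo in combinations(available, group_size):
        if sum(combo) == target_cores:
            for count in combo:
                bins[count] -= 1
            return list(combo)
    raise ValueError("no combination of bins reaches the target core count")


# Hypothetical bins: key = functional cores per chiplet, value = stock.
bins = {2: 1, 4: 2, 6: 2, 8: 1}

slots = {}            # group name -> core counts of the chiplets placed
firmware_config = {}  # group name -> total cores reported to firmware

for name in ("group_a", "group_b"):
    chosen = select_group(bins, group_size=2, target_cores=10)
    slots[name] = chosen                  # populate the chiplet slots
    firmware_config[name] = sum(chosen)   # configure firmware per group

print(slots)            # {'group_a': [2, 8], 'group_b': [4, 6]}
print(firmware_config)  # {'group_a': 10, 'group_b': 10}
```

In this assumed example each group is built from chiplets with different core counts, yet both groups present the same total to firmware.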
Koker does not expressly disclose each of the first group of chiplets and the second group of chiplets comprises chiplets having different numbers of execution cores.
In the same field of endeavor, Ganapathy teaches each of the first group of chiplets and the second group of chiplets (Figure 2, items 200 and 202, paragraph 0031) comprises chiplets having different numbers of execution cores (Figure 2, items 206-220, paragraph 0037; i.e., the reference states “in some embodiments, processor 102 includes a different number and/or arrangement of chiplets and/or processor cores”; it would therefore have been obvious to one of ordinary skill in the art to have provided four chiplets [two groups of two chiplets], each having a different number of execution cores; further, it has been held that "where the general conditions of a claim are disclosed in the prior art, it is not inventive to discover the optimum or workable ranges by routine experimentation." In re Aller, 220 F.2d 454, 456, 105 USPQ 233, 235 (CCPA 1955); therefore, for this additional reason, it would have been obvious to one of ordinary skill in the art to have provided one number of cores in a first chiplet and different numbers of cores [optimum ranges] in each of the plurality of other chiplets for the purpose of tailoring performance, power consumption, cost, and workload allocation for specific design requirements).
The motivation discussed above with regard to Claim 1 applies equally to Claim 22.
Regarding Claim 25, Koker discloses wherein selecting chiplets from multiple bins of chiplets to create multiple groups of chiplets that collectively have pre-determined number of execution cores include: selecting a first chiplet having a first number of functional execution cores; selecting a second chiplet having a second number of functional execution cores and a third number of non-functional execution cores due to yield loss (paragraph 0370; i.e., there may be certain units [e.g., cores] within a given chiplet that may be designated as functional, while others may be designated as non-functional; but this can vary from chiplet to chiplet - some chiplets might have all units designated as functional, whereas other chiplets can have some units as functional and others as non-functional; the third number of non-functional execution cores can be due to “yield loss” in that a particular processor/chiplet may have a number of defective compute or graphics cores but still be used in the system [see paragraph 0371]); and selecting a third chiplet having a fourth number of functional execution cores and a fifth number of reserved execution cores (paragraph 0370; i.e., the functional units within each chiplet that are chosen to not be used are equivalent to the claimed “reserved execution cores”, since they can be brought back online to process the workload as needed).
Regarding Claim 27, Koker discloses a parallel processing system (paragraph 0334) comprising:
an active base chiplet die (paragraph 0334) including hardware logic (Figure 28A, item 2801, paragraph 0337), interconnect logic (Figure 28A, item 2808, paragraph 0336), and a plurality of chiplet slots having a plurality of different die aperture sizes (Figure 36, item 3602, paragraph 0400; i.e., as can be seen in the figure, different chiplets 3602 have different sizes, which would indicate that the slots they connect to have respective different die aperture sizes); and
a plurality of chiplets (Figure 28A, items 2804-2806) vertically stacked on the active base chiplet die (Figure 36, items 3602; i.e., stacked on top) and coupled with the plurality of chiplet slots of the active base chiplet die, the plurality of chiplets interchangeable during assembly of the parallel processing system (paragraph 0373),
wherein the plurality of chiplets include a memory chiplet (Figure 28A, item 2806), a first group of chiplets (Figure 28A, item 2805, paragraph 0338), and a second group of chiplets (Figure 28A, item 2804, paragraph 0338; i.e., the reference states “the chiplets include but are not limited to” what is shown in the figure; therefore, there may be two media chiplets 2804),
each group of chiplets has an equal number of execution cores (Figure 4B, paragraphs 0338 and 0373; i.e., each of the compute chiplets 2805 corresponds to the processor 407 depicted in Figure 4B [see paragraph 0338 - the compute chiplets 2805 can include streaming multiprocessors]; there are four compute chiplets 2805 [Figure 28A] and each of them comprises four cores 460A-D [Figure 4B]; therefore, the total number of execution cores in the “first group of chiplets” is 16 [four multiplied by four]; the three dots between graphics processor 432 and graphics processor N indicate that there may be any number of graphics processing units; further, the variable "N" is used to also indicate that there may be any number of graphics processors; therefore, Koker discloses that there can be four graphics processors, each with two cores 1715 [Figure 17]; accordingly, assuming there are two media chiplets 2804 as mentioned above, there may be a total of 16 cores in the media chiplets 2804 [two chiplets multiplied by four graphics processors multiplied by two cores], which is equal to the number of cores in the compute chiplets 2805).
Koker does not expressly disclose each of the first group of chiplets and the second group of chiplets comprises chiplets having different numbers of execution cores.
In the same field of endeavor, Ganapathy teaches each of the first group of chiplets and the second group of chiplets (Figure 2, items 200 and 202, paragraph 0031) comprises chiplets having different numbers of execution cores (Figure 2, items 206-220, paragraph 0037; i.e., the reference states “in some embodiments, processor 102 includes a different number and/or arrangement of chiplets and/or processor cores”; it would therefore have been obvious to one of ordinary skill in the art to have provided four chiplets [two groups of two chiplets], each having a different number of execution cores; further, it has been held that "where the general conditions of a claim are disclosed in the prior art, it is not inventive to discover the optimum or workable ranges by routine experimentation." In re Aller, 220 F.2d 454, 456, 105 USPQ 233, 235 (CCPA 1955); therefore, for this additional reason, it would have been obvious to one of ordinary skill in the art to have provided one number of cores in a first chiplet and different numbers of cores [optimum ranges] in each of the plurality of other chiplets for the purpose of tailoring performance, power consumption, cost, and workload allocation for specific design requirements).
The motivation discussed above with regard to Claim 1 applies equally to Claim 27.
Claims 6, 21, 26, 32, and 33 are rejected under 35 U.S.C. 103 as being unpatentable over Koker and Ganapathy as applied to Claims 5, 25, and 31, and further in view of Connor et al. (U.S. Patent Application Publication Number 2019/0042351).
Regarding Claims 6, 26, and 32, Koker and Ganapathy do not expressly disclose wherein the fifth number of reserved execution cores are reserved for in-field repair.
In the same field of endeavor (e.g., multi-core processors), Connor teaches wherein the fifth number of reserved execution cores (paragraph 0013) are reserved for in-field repair (Figure 2, items 210-212, paragraph 0029; i.e., a spare/reserved core can replace a failed core [“in-field repair”]).
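For illustration only, the substitution of a reserved core for a failed core (“in-field repair”) can be sketched as follows. The `repair` function and the core identifiers are hypothetical; the sketch is not taken from Connor and only illustrates the general idea of swapping a spare core into the active set.

```python
def repair(active, spares, failed_core):
    """Replace a failed core in the active list with a spare core.

    Returns the identifier of the spare that took over. All names here
    are assumptions made for this sketch.
    """
    if failed_core not in active:
        raise ValueError(f"{failed_core} is not an active core")
    if not spares:
        raise RuntimeError("no reserved core available for repair")
    substitute = spares.pop()
    # The spare takes the failed core's position in the active set.
    active[active.index(failed_core)] = substitute
    return substitute


active = ["core0", "core1", "core2"]
spares = ["spare0"]

substitute = repair(active, spares, "core1")
print(substitute)  # spare0
print(active)      # ['core0', 'spare0', 'core2']
print(spares)      # []
```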
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined Connor’s teachings of multi-core processors with the teachings of Koker and Ganapathy, for the purpose of allowing the processor to continue to function without interruption in the event of core failure.
Regarding Claims 21 and 33, Koker discloses firmware configured with a number of functional execution cores and a number of reserved execution cores for the first group of chiplets and the second group of chiplets (paragraph 0398; i.e., the firmware considers the functional execution cores as well as the reserved execution cores [i.e., cores that are determined to not be used for the workload] within the various chiplets when executing the workload).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure because each reference discloses a parallel processor comprising chiplets with unequal numbers of execution cores.
Response to Arguments
Applicant’s arguments with respect to claim 1 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to FAISAL M ZAMAN whose telephone number is (571)272-6495. The examiner can normally be reached Monday - Friday, 8 am - 5 pm, alternate Fridays.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew J. Jung, can be reached at 571-270-3779. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/FAISAL M ZAMAN/ Primary Examiner, Art Unit 2175