DETAILED ACTION
Status of Claims
This action is in reply to the application filed on 12/23/2021.
Claims 1, 2, 5, 6, 8-15, and 18 have been amended in a preliminary amendment dated 05/26/2022.
Claims 21-25 have been added in a preliminary amendment dated 05/26/2022.
Claims 7, 16-17, and 19-20 have been canceled in a preliminary amendment dated 05/26/2022.
Claims 1-6, 8-15, 18, and 21-25 are currently pending and have been examined.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-6, 8-15, 18, and 21-25 are rejected under 35 U.S.C. 103 as being unpatentable over Xiao et al. (“A Machine Learning Inference Framework for Multi-core Processors”, 2019¹, cited in the 12/23/2021 IDS) in view of Jia et al. (“Exploring Hidden Dimensions in Parallelizing Convolutional Neural Networks”, 2018, cited in the 03/31/2025 IDS).
Claims 1, 9, and 18:
Xiao discloses the limitations as shown in the following rejections:
A computer device comprising processors and a memory that is connected to each of the processors, wherein the processors comprise a [plurality of general-purpose processors] having multiple processor cores, and the memory is configured to store a computer program (framework) comprising a program instruction that, when executed by the general-purpose processors, performs a method for neural network processing (pg. 1; pg. 15, § 4; Figure 1)
obtaining a calculation graph corresponding to a neural network model, wherein the calculation graph includes a plurality of operators (pg. 3; pg. 10, § 3.1; Figure 4).
determining a target splitting policy (optimal splitting strategy/scheme) of a neural network calculation task in a splitting policy set, wherein the splitting policy set (search space of potential splitting strategies) is a set composed of splitting policies corresponding to target operators in the calculation graph (pg. 6, § 2.1; pg. 8-9, § 2.2; pg. 11-12, § 3.1). Exemplary quotation:
“For each specific size of each operator, we use enumeration to traverse each splitting strategy in this search space to obtain its actual performance. From this, we can obtain an optimal splitting strategy, which represents the best performance that the operator can achieve on the underlying processor at that computation scale”
splitting the neural network calculation task according to the target splitting policy to obtain a plurality of sub-calculation tasks; and distributing the plurality of sub-calculation tasks to corresponding [cores of the multi-core] processors for processing (pg. 15-16, §§ 4 and 5; pg. 4, para. 1-3; pg. 6, § 2.1).
As shown above, Xiao discloses splitting operators of an NN and distributing the sub-operators to a plurality of cores of multi-core processors for execution. The cores of the multi-core processors arguably constitute processor cores of an AI processor because they perform NN training/inference tasks; however, because they are not clearly “dedicated processors” (see Applicant’s specification, para. [0120]), they do not clearly anticipate the limitation.
Jia, however, discloses an analogous system and methods for partitioning (splitting) layers/operators of an NN, including splitting each layer according to a parallelization strategy, with the resulting partitions distributed to GPU devices (i.e., AI processor cores) for processing, in at least pg. 2, col. 1, para. 1; pg. 3.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Xiao to distribute NN calculation tasks to GPUs, as taught by Jia, in order to take advantage of a greater range of hardware, including devices with high-performance data-parallel capabilities (Jia pg. 8; pg. 1; pg. 7, col. 1).
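For clarity of the record regarding Xiao’s enumeration-based strategy selection quoted above, the cited mechanism can be sketched as follows. This is an illustrative sketch only: the function names and the toy cost model are hypothetical stand-ins for the actual per-strategy timings Xiao obtains by running each split on the underlying hardware.

```python
from itertools import product

NUM_CORES = 8  # hypothetical core count for a multi-core processor


def candidate_strategies(num_cores):
    """Enumerate power-of-two (N-split, C-split) pairs whose product
    does not exceed the available core count."""
    for n, c in product([1, 2, 4, 8], repeat=2):
        if n * c <= num_cores:
            yield (n, c)


def toy_latency(strategy, op_scale):
    """Hypothetical stand-in for measured performance: ideal parallel
    speedup plus an overhead that grows with the number of sub-operators."""
    n, c = strategy
    return op_scale / (n * c) + 0.05 * (n * c - 1)


def optimal_strategy(op_scale, num_cores=NUM_CORES):
    """Traverse the search space and keep the best-performing strategy,
    mirroring the quoted enumeration over the search space."""
    return min(candidate_strategies(num_cores),
               key=lambda s: toy_latency(s, op_scale))
```

On this toy model, a sufficiently large operator selects a strategy that uses all available cores, consistent with the quoted passage; the actual framework substitutes measured execution times for the hypothetical cost function.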
Claims 2, 10, and 21:
The combination of Xiao/Jia discloses the limitations as shown in the rejections above. Xiao further discloses after obtaining the calculation graph corresponding to the NN model and before determining the target splitting policy of the NN calculation task in the splitting policy set, further comprising: determining the splitting policies corresponding to the target operators respectively, according to a degree of parallelism, a splitting dimension, and a size of the splitting dimension corresponding to each target operator in the calculation graph; and determining the splitting policy set according to the splitting policies corresponding to the target operators (pg. 8-9, § 2.2).
Claims 3, 11, and 22:
The combination of Xiao/Jia discloses the limitations as shown in the rejections above. Xiao further discloses determining the splitting policy set according to the splitting policies corresponding to the target operators includes: determining an intersection of splitting policies supported by each target operator as the splitting policy set (pg. 10-13, § 3).
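The intersection operation referenced above is straightforward; a minimal sketch follows, in which the function name and the policy labels are hypothetical and not drawn from the references:

```python
def splitting_policy_set(policies_per_operator):
    """Keep only the splitting policies supported by every target operator.

    policies_per_operator: one iterable of policy labels per operator.
    Returns the intersection as a set (empty if no operators are given).
    """
    sets = [set(p) for p in policies_per_operator]
    return set.intersection(*sets) if sets else set()
```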
Claims 4, 12, and 23:
The combination of Xiao/Jia discloses the limitations as shown in the rejections above. Xiao further discloses wherein determining the target splitting policy of the NN calculation task in the splitting policy set includes: determining weight values (execution times/compute latency) of the splitting policies corresponding to the target operators in the splitting policy set respectively; and determining the target splitting policy according to the weight values in at least pg. 11-12. See also Jia pg. 1 and pg. 4, “Cost”.
Claims 5, 13, and 24:
The combination of Xiao/Jia discloses the limitations as shown in the rejections above. Xiao further discloses wherein each weight value is determined according to an operational type of the target operator included in the corresponding splitting policy, a data scale involved in the target operator, and hardware parameters of the multiple AI processor cores (pg. 16; pg. 11, § 3.2; Jia pg. 2, col. 1; pg. 7). Exemplary quotation:
“The framework can automatically generate an efficient splitting scheme for a given NN and MCP. During the scheme generation process, it can reasonably adjust the splitting method of individual operators based on the type and scale of the operator, combined with the underlying hardware's computational throughput and memory access bandwidth, achieving a good balance between the computational efficiency of the hardware cores and the degree of operator splitting.”
Claims 6, 14, and 25:
The combination of Xiao/Jia discloses the limitations as shown in the rejections above. Xiao further discloses determining the splitting policy of the target operator according to the operational type of the target operator in at least pg. 7-9.
Claims 8 and 15:
The combination of Xiao/Jia discloses the limitations as shown in the rejections above. Xiao further discloses wherein the degree of parallelism corresponding to the target operator comprises a first degree of parallelism (N dimension) and a second degree of parallelism (C dimension), wherein a multiplication product of the first degree of parallelism and the second degree of parallelism is less than or equal to a count of AI processor cores in the AI processor (pg. 7-8; pg. 11, para. 2). Exemplary quotation:
“Parallelism: This refers to the number of sub-operators into which the operator will be split. This variable is usually limited by the number of cores in the MC architecture, and can be a power of 2, provided it does not exceed the maximum number of cores…Generally, the product of the number of splits along each dimension is the parallelism of the operator.” (pg. 8)
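The constraint in the quoted passage (power-of-two split counts along each dimension, with their product not exceeding the core count) can be sketched as follows; the helper name is hypothetical and offered only to illustrate the claimed relationship:

```python
def parallelism_pairs(num_cores):
    """Enumerate (first, second) degrees of parallelism, e.g. N- and
    C-dimension split counts: each a power of two, with their product
    no greater than the number of available cores."""
    powers = [2 ** k for k in range(num_cores.bit_length())]
    return [(n, c) for n in powers for c in powers if n * c <= num_cores]
```

For a four-core processor this yields only pairs such as (1, 4), (2, 2), and (4, 1), excluding any combination whose product exceeds the core count.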
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure:
The following references are directed to partitioning/parallelizing NN operators: “HyPar: Towards Hybrid Parallelism for Deep Learning Accelerator Array”; “Split-CNN: Splitting Window-based Operations in Convolutional Neural Networks for Memory System Optimization”; US 2020/0372337 A1; US 2020/0302215 A1; US 2020/0050555 A1.
Any inquiry of a general nature or relating to the status of this application or concerning this communication or earlier communications from the Examiner should be directed to Paul Mills whose telephone number is 571-270-5482. The Examiner can normally be reached on Monday-Friday 11:00am-8:00pm. If attempts to reach the examiner by telephone are unsuccessful, the Examiner’s supervisor, April Blair can be reached at 571-270-1014.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/P. M./
Paul Mills
12/12/2025
/APRIL Y BLAIR/Supervisory Patent Examiner, Art Unit 2196
¹ Citations refer to the machine translation prepended to the “A Machine Learning Inference Framework for Multi-core Processors” publication included with the Office Action.