Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Claims 1-21 are presented for examination.
Claim 21 is new.
This is a Final Action.
Response to Arguments
Applicant's arguments filed 01/23/2026 have been fully considered but they are not persuasive.
Applicant makes the following arguments:
1. Applicant argues that Liu does not disclose commands comprising a data structure with fields describing dimensions. Specifically, applicant argues that Liu discloses only processing instructions that include an identifier of a descriptor, while the descriptor itself contains the dimensional fields. Therefore, the commands themselves do not comprise fields describing dimensions of multi-dimensional data.
Examiner respectfully disagrees. The claims are broadly recited and do not require that the dimensional fields be physically embedded within the instruction itself. Liu teaches processing instructions containing operands that include identifiers of descriptors, and the descriptors include parameters describing tensor shapes, including dimensional parameters such as coordinate dimensions and the size of each dimension. Accordingly, the processing instruction together with the referenced descriptor forms the claimed command structure comprising fields describing dimensions of the multi-dimensional data.
2. Applicant argues that Liu's descriptors are not part of the command. Specifically, applicant asserts that Liu in paragraph 15 teaches that the instruction includes an identifier of a descriptor, rather than the descriptor itself being part of the processing instruction. Therefore, Liu does not disclose commands that themselves comprise fields describing tensor dimensions.
Examiner respectfully disagrees. Under the broadest reasonable interpretation (BRI), the claim language does not require that the data structure describing the dimensions be physically embedded within the command itself. Liu teaches that instructions reference descriptors containing dimensional information used to perform tensor processing. The instruction and the descriptor operate together as part of the command execution structure describing multi-dimensional data, and therefore meet the claimed command data structure.
3. Applicant argues that Liu does not disclose fields describing synchronization. Specifically, applicant argues that Liu paragraph 80 discloses only blocking or caching instructions based on incomplete execution of other instructions, which constitutes dependency handling external to the instruction rather than commands comprising fields describing synchronization.
Examiner respectfully disagrees. Liu explicitly teaches determining whether a second processing instruction has not been executed completely and blocking or caching a first processing instruction until completion of the earlier instruction. This disclosure describes synchronization of instruction execution based on partial completion of another instruction. Under the BRI, such dependency handling and execution coordination correspond to the claimed fields describing synchronization of a particular command process with one or more other processes.
4. Applicant argues that Liu does not teach synchronization at occurrences of partial completion. Specifically, applicant argues that Liu does not disclose commands comprising fields describing synchronization at a plurality of occurrences of partial completion of the command process.
Examiner respectfully disagrees. Liu paragraphs 80-82 teach determining whether earlier instructions have not been completed and blocking or caching subsequent instructions until execution conditions are satisfied. Such execution control based on completion status necessarily corresponds to synchronization occurring at stages of partial completion of command execution.
5. Applicant argues that Liu fails to disclose each element of the claims as required for anticipation. Specifically, applicant cites MPEP 2131 and argues that Liu does not teach each element of the claims arranged as required by the claims.
Examiner respectfully disagrees. Liu teaches receiving processing instructions, descriptors containing fields describing tensor dimensions, and synchronization of instruction execution based on the completion status of other instructions. When the claims are interpreted under the broadest reasonable interpretation consistent with the specification, Liu teaches the claimed command structure and execution of machine learning operations on multi-dimensional data.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1-13 and 16-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Liu et al. (EP 3,800,547 - IDS).
1. Liu teaches, A system for machine learning (Abstract; Figure 4; Paragraph 119) comprising:
one or more processors; and a non-transitory computer-readable medium storing a program executable by the one or more processors (Fig 4, Paragraphs 6-8, 28 – neural network chip including the data processing apparatus… An electronic device having the neural network chip… A board card including… a storage device… and the neural network chip… The storage device is configured to store data), the program comprising sets of instructions for:
receiving, by a processor (Fig 4: 389), a plurality of commands to perform machine learning operations on multi-dimensional data (Paragraph 15 – a data processing method includes: when an operand of a decoded first processing instruction includes an identifier of a descriptor, obtaining content of the descriptor… and executing the first processing instruction according to the content of the descriptor), the commands comprising data structures (Paragraphs 15 and 20 – teaches the operand includes an identifier of a descriptor; the descriptor may include an identifier and content… the content of the descriptor may include at least one shape parameter and at least one address parameter), the data structures comprising:
a plurality of fields describing a plurality of dimensions of the multi-dimensional data (Paragraphs 18 and 45 – teaches the descriptor is configured to indicate a shape of tensor… the shape includes dimensions of the tensor and a size of each dimension… the descriptor can use a coordinate (x, y, z) to represent the shape of the tensor data); and
a plurality of fields describing synchronization of a particular command process with one or more other processes at a plurality of occurrences of partial completion of the particular command process (Paragraph 80-82 – teaches determining whether there is a second processing instruction that has not been executed completely… blocking or caching the first processing instruction when there is a second processing instruction that has not been executed completely); and
executing, by the processor, the commands to perform the machine learning operations on the multi-dimensional data (Paragraphs 9 and 15 – teaches executing the first processing instruction according to the content of the descriptor… by introducing a descriptor).
2. Liu teaches, The system of claim 1, wherein the multi-dimensional data comprises tensors, and wherein the commands perform a function on one or more complete tensors without the execution of other commands (Paragraphs 18 & 27 – teaches data to be processed may include N-dimensional tensor data… the tensor may have different dimensions… the processor can read data… execute an addition (add) operation, and obtain an operation result (A+B)).
3. Liu teaches, The system of claim 1, wherein the command performs a data movement or matrix multiplication operation (Paragraphs 23 and 27 – teaches the first processing instruction may include a data access instruction (or) an operation instruction… executing an addition (add) operation).
4. Liu teaches, The system of claim 1, wherein the commands describe operations on the multi-dimensional data (Paragraph 24 – teaches the decoded first processing instruction includes an operation code and one or more operands… the operation code is used to indicate a processing type corresponding to the first processing instruction).
5. Liu teaches, The system of claim 1, wherein the commands repeat a plurality of same operations on the multi-dimensional data (Paragraph 23 – teaches the decoded first processing instruction may include… a data access instruction, an operation instruction… the present disclosure does not limit the specific type of the first processing instruction; repetitive issuance of identical operation instructions (e.g., repeated add, multiply) is implied by Liu's generalized instruction model).
6. Liu teaches, The system of claim 1, wherein at least one command addresses first multi-dimensional data that does not fit in on-chip memory of the at least one processor (Paragraph 40 – teaches the tensor data indicated by the descriptor may be stored in an external memory (an off-chip memory) connected to the control unit).
7. Liu teaches, The system of claim 6, wherein at least a portion of the first multi-dimensional data operated on during execution of the at least one command is stored in main memory (Paragraph 40 – On-chip storage of the identifier and content of the descriptor and off-chip storage of the tensor data indicated by the descriptor may be adopted).
8. Liu teaches, The system of claim 1, wherein the machine learning operations are neural network operations (Paragraph 9 – teaches by introducing a descriptor indicating the shape of a tensor… operator efficiency when an operation of a neural network model is performed can be improved).
9. Liu teaches, The system of claim 1, wherein the multi-dimensional data comprises multi-dimensional matrices of data, and wherein the commands encode the dimensions of the multi-dimensional matrices of data (Paragraph 18 – teaches the descriptor is configured to indicate a shape of a tensor… for example, a matrix can be a tensor of two or more dimensions).
10. Liu teaches, The system of claim 9, wherein the commands specify a plurality of dimension sizes for a plurality of dimensions of one or more matrices (Paragraph 18 – teaches the shape of the tensor includes dimensions of the tensor and a size of each dimension of the tensor).
11. Liu teaches, The system of claim 9, wherein the commands comprise a base address for at least one multi-dimensional matrix of data (Paragraph 20 – teaches the content of the descriptor may include at least one address parameter (such as base address of a datum point) representing an address of the tensor data).
12. Liu teaches, The system of claim 9, wherein the commands comprise a size of each dimension for at least one multi-dimensional matrix of data (Paragraph 45 – teaches the descriptor can use a coordinate (x, y, z) to represent the shape… the size of the data storage space of the tensor data in at least one of the N dimensions).
13. Liu teaches, The system of claim 9, wherein the commands comprise a stride size for at least one multi-dimensional matrix of data (Paragraph 45 – teaches the tensor data is stored in a memory, … the descriptor … indicates offset of the storage area in at least one of the N dimensions).
16. Liu teaches, The system of claim 1, wherein the commands encode synchronization points, and wherein a plurality of commands synchronize on a partially processed multi-dimensional data set at the synchronization points (Paragraphs 80-82 – teaches Fig. 1c shows a flowchart of a data synchronization method… determining whether there is a second processing instruction that has not been executed completely… blocking or caching the first processing instruction).
17. Liu teaches, The system of claim 16, wherein a dependent command synchronizes a partially processed multi-dimensional data set in main memory being operated on by another command (Paragraphs 40, 80-82 – teaches determining whether there is a second processing instruction that has not been executed completely… blocking or caching the first processing instruction).
18. Liu teaches, The system of claim 17, wherein at least one command executes a wait, executes a data transaction, or generates a signal on the occurrence of a predefined event specified in the at least one command (Paragraph 82 – teaches blocking or caching the first processing instruction when there is a second processing instruction that has not been executed completely).
Claims 19 and 20 recite limitations similar to those of claim 1 and are therefore rejected for the same reasons.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 14 & 15 are rejected under 35 U.S.C. 103 as being unpatentable over Liu et al. (EP 3,800,547 - IDS) in view of Yu et al. (US 2023/0244942)
All the limitations of claim 9 are taught above.
14. Liu does not explicitly teach, wherein the commands comprise a data type for at least one multi-dimensional matrix of data.
Liu teaches general descriptor parameters (Paragraph 46 – those skilled in the art may select a shape parameter representing tensor data according to actual conditions).
However, Yu explicitly teaches a field disclosing a tensor data type (Paragraph 47 – GPU includes processing cores capable of operating on different data types (e.g., TF32, Bfloat16, FP32, FP64)).
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which said subject matter pertains to combine Liu's descriptor-based system with Yu's explicit tensor-type and hardware-compatibility fields, as taught by Yu, in order to ensure interoperability and optimized machine learning performance across heterogeneous GPU/AI chips, thereby improving the processing efficiency of tensor operations.
All the limitations of claim 9 are taught above.
15. Liu teaches, wherein the commands comprise a base address (Paragraphs 20, 45 & 47 – teaches a start address PA_start (a base address) of the data storage space 21 is a physical address of a first block 22), a size of each dimension (Paragraph 47 – teaches a size in the X axis direction (a size of each row) is ori_x… a size in the Y axis direction (a total count of rows) is ori_y), and a stride size (Paragraphs 47 & 58 – teaches PA2(x,y) = PA_start + (offset_y + y_q – 1) * ori_x + offset_x + x_q, where a size in the X axis direction (a size of each row) is ori_x) for at least one multi-dimensional matrix of data (Paragraph 18 – teaches data to be processed may include N-dimensional tensor data…).
Liu does not explicitly disclose a data type.
However, Yu teaches, a data type (Paragraph 47 – GPU includes processing cores capable of operating on different data types (e.g. TF32, Bfloat16, FP32, FP64)).
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which said subject matter pertains to combine Liu's descriptor-based system with Yu's explicit tensor-type and hardware-compatibility fields, as taught by Yu, in order to ensure interoperability and optimized machine learning performance across heterogeneous GPU/AI chips, thereby improving the processing efficiency of tensor operations.
Claim 21 is rejected under 35 U.S.C. 103 as being unpatentable over Liu et al. (EP 3,800,547 - IDS) in view of Zhang (US 2014/0137136).
With respect to claim 21, Liu does not explicitly teach, wherein the plurality of fields describing synchronization include a field identifying at least one of:
a semaphore identifier that causes command execution to be contingent upon completion of a partial operation;
an event identifier that causes command execution to be contingent upon detection of a predefined event; or
a parameter defining a synchronization granularity that causes command execution to be contingent upon occurrence of a specified degree of partial completion of an operation.
However, Zhang teaches,
an event identifier that causes command execution to be contingent upon detection of a predefined event (Paragraph 11 – teaches the predefined algorithm includes an identifier of a concerned event, an identifier of an event required for computation, and a computation-trigger condition).
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which said subject matter pertains to incorporate the event-based synchronization techniques of Zhang into the machine learning command processing system of Liu in order to coordinate execution of processing operations based on the occurrence of predefined events. Both Liu and Zhang operate in the field of distributed data processing systems and address execution control of computational tasks. Incorporating Zhang's event-triggered execution mechanism into Liu's machine learning command framework would have been a predictable use of known synchronization techniques to manage dependencies between computational operations.
Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AMRESH SINGH whose telephone number is (571)270-3560. The examiner can normally be reached Monday-Friday 8am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ann J. Lo can be reached at (571) 272-9767. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/AMRESH SINGH/Primary Examiner, Art Unit 2159