DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Drawings
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(5) because they include the following figures not mentioned in the description: Figures 5A-5D and 10A-10B. Furthermore, with respect to Figure 2, reference designator 121 (register renaming table) is not mentioned in the description. Furthermore, the claimed subject matter “register mapping table” is not shown. Corrected drawing sheets in compliance with 37 CFR 1.121(d), or amendment to the specification to add the reference character(s) in the description in compliance with 37 CFR 1.121(b) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 2-8, and 11-19 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 2 line 3, and claim 12 lines 2-3 recite “an identifier of the plurality of components”. It is unclear whether the identifier is a single identifier for all of the components, or an identifier for each of the plurality of components. For purposes of examination, Examiner interprets as an identifier for each of the plurality of components. Claims 3-6 inherit the same deficiency as claim 2 based on dependence. Claims 13-16 inherit the same deficiency as claim 12 based on dependence.
Claim 3 line 5, claim 4 line 5, claim 5 lines 6, 10, 12, and 14, claim 7 line 7, claim 8 line 4, claim 13 line 5, claim 14 line 5, claim 15 lines 4-5, 9, 11, and 14, claim 16 line 6, claim 17 line 7, and claim 18 line 4 recite “the filter matrix”. This limitation lacks antecedent basis. It is unclear whether this refers to the “set filter matrix” or another filter matrix. For purposes of examination, Examiner interprets as “the set filter matrix”. Claims 4-6 inherit the same deficiency as claim 3 based on dependence. Claim 6 inherits the same deficiency as claim 5 based on dependence. Claim 8 inherits the same deficiency as claim 7 based on dependence. Claims 14-16 inherit the same deficiency as claim 13 based on dependence. Claim 16 inherits the same deficiency as claim 15 based on dependence. Claim 18 inherits the same deficiency as claim 17 based on dependence.
Claim 7 lines 2-4, and claim 17 lines 2-4 recite “generating [generating], by the at least one processor, the input data matrix by changing a size of an original input data matrix and a number and an order of the plurality of components into a memory region corresponding to a workspace”. It is unclear how the processor generates the input data matrix into a memory region. It is unclear whether the input data is stored into a memory region, mapped to a memory region, or what the relationship is between the input data and the memory region. Furthermore, it is not clear what is meant by “the original input data matrix outputs the feature data matrix through the filter matrix and the GEMM operation”; it is unclear how the original input data matrix can output a feature data matrix, or how it is output through a filter matrix and GEMM operation. For purposes of examination, Examiner interprets as the input data is stored into a memory region. Furthermore, for purposes of examination, Examiner interprets as the processor outputs the original input data matrix as the feature data matrix as associated with the filter matrix and for a GEMM operation. Claim 8 inherits the same deficiency as claim 7 based on dependence. Claim 18 inherits the same deficiency as claim 17 based on dependence.
Claim 11, line 5 recites “a register mapping table”. For antecedent basis reasons, it is unclear whether this is the same register mapping table as recited in line 2 or a different register mapping table. For purposes of examination, Examiner interprets as the same. Claims 12-19 inherit the same deficiency as claim 11 based on dependence.
Claims 11-19 do not provide a discernable boundary as to what part of the processor performs the claimed functions, rendering the bounds of the claims unclear. The functions which the processor is configured to carry out do not follow from structure recited in the claim, rendering it unclear whether the function requires some other structure or is simply a result of operating the processor in a certain manner. Therefore, one of ordinary skill in the art would not be able to draw a clear boundary between what is and is not covered by the claims. See MPEP 2173.05(g). Dependent claims further inherit the same deficiency as the claims upon which they depend.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claim 10 is rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter.
Claim 10 does not fall within at least one of the four categories of patent eligible subject matter because the claim is directed to a transitory form of signal transmission (signals per se). Claim 10 recites “a computer program stored in a computer-readable recording medium”. The BRI of machine readable media can encompass non-statutory forms of signal transmission, signals per se. Furthermore, the specification provides no further limitation to the scope of the computer-readable recording medium to be limited to non-transitory computer-readable recording medium. For example, [0079] merely describes that a program may be recorded on a recording medium readable by the processor without further discussion.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA ) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
As to treatment of claims, the apparatus claims will be addressed first followed by the method and computer-readable recording medium claims.
Claims 1 and 10-11 are rejected under 35 U.S.C. 103 as being unpatentable over US 20190042262 A1 Espig et al., (hereinafter “Espig”) in view of Lecture 11: Modern Superscalar Processor Models, Iowa State University, 2005, found at https://home.engineering.iastate.edu/~zzhang/courses/cpre585-f04/slides/lecture11.pdf (hereinafter “lecture 11”), in view of P. Warden, Why GEMM is at the heart of deep learning, Pete Warden’s blog, 2015, found at https://petewarden.com/2015/04/20/why-gemm-is-at-the-heart-of-deep-learning/ (hereinafter “Warden”).
Regarding claim 11, Espig teaches the following:
a memory storing a register map (fig 4B physical register files units 458 for memory [0087], register maps); and
a processor configured to perform a general matrix multiplication (GEMM) operation on an input data matrix with a set filter matrix ([0130]), update a register map so that first destination register addresses of redundant components indicating data redundant with each other among a plurality of components of the input data matrix correspond to a same second destination register address ([0134], fig 13, physical input matrix for first destination register address, virtual input matrix for second destination register address, rows of the respective matrices 1311-1314 for plurality of components, [0135] common data elements are reused); and
perform the convolutional operation by reusing a register having the same second destination register address with respect to the redundant components, based on the register map (fig 14A-C, fig 15, [0130], [0139-0143], [0158]).
Espig discloses register maps, but does not explicitly disclose a register mapping table or wherein the reusing a register is based on the register mapping table. Espig further does not explicitly disclose wherein the GEMM operation is used in performing a convolution operation for generating a feature data matrix corresponding to an output data matrix.
However, in the same field of endeavor, Lecture 11 discloses a mechanism for mapping from virtual architectural registers to physical registers using a register mapping table (slide 6, slide 8). It would have been obvious to one of ordinary skill in the art before the effective filing date, to use as the mechanism for register mapping as disclosed by Espig, the register mapping table as disclosed in Lecture 11. It is obvious to use a known technique to improve similar devices in the same way. See MPEP 2141.III.(C). Therefore Espig in view of Lecture 11 teaches a memory storing a register mapping table, and wherein the reusing a register is based on the register mapping table.
Furthermore, Warden discloses wherein the GEMM operation is used in performing a convolution operation for generating a feature data matrix corresponding to an output data matrix (How GEMM works for Convolutions section). It would have been obvious to one of ordinary skill in the art before the effective filing date to use the GEMM operation as disclosed by Espig for performing a convolution operation for generating a feature data matrix corresponding to an output data matrix as disclosed by Warden, to achieve the benefit of very regular patterns of memory access (Why GEMM works for Convolutions section). Therefore Espig in view of Warden teaches wherein the GEMM operation is used in performing a convolution operation for generating a feature data matrix corresponding to an output data matrix, and Espig in view of Lecture 11 in view of Warden teaches the claim 11 limitations.
Claim 1 is directed to a method that would be practiced by the apparatus as in claim 11. All steps performed by the method as in claim 1 are performed by the apparatus as in claim 11 as configured. The claim 11 analysis applies equally to claim 1.
Claim 10 is directed to a computer program stored in a computer-readable recording medium that would execute the method as in claim 1. All steps performed by the method as in claim 1 are executed by the computer program stored in a computer-readable recording medium as in claim 10. The claim 1 analysis applies equally to claim 10.
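For reference, the general technique relied upon from Warden, performing a convolution as a single GEMM over a lowered input matrix, can be sketched as follows. This is a minimal illustration only, assuming a single-channel input, a single square filter, stride 1, and no padding; the function names (`im2col`, `gemm`, `conv_via_gemm`, `conv_direct`) are hypothetical and are not taken from the application or from any cited reference.

```python
# Illustrative sketch only. All names are hypothetical; assumes one channel,
# one k x k filter, stride 1, no padding (cross-correlation, as is usual in
# deep learning frameworks).

def im2col(image, k):
    """Lower a 2-D image into a patch matrix: one row per output position."""
    out_h, out_w = len(image) - k + 1, len(image[0]) - k + 1
    return [
        [image[i + di][j + dj] for di in range(k) for dj in range(k)]
        for i in range(out_h)
        for j in range(out_w)
    ]

def gemm(a, b):
    """Plain matrix multiply of an (m x n) matrix by an (n x p) matrix."""
    n, p = len(b), len(b[0])
    return [[sum(row[t] * b[t][c] for t in range(n)) for c in range(p)] for row in a]

def conv_via_gemm(image, kernel):
    """Convolution computed as im2col followed by one GEMM."""
    k = len(kernel)
    out_w = len(image[0]) - k + 1
    patches = im2col(image, k)
    # Flatten the filter into a (k*k) x 1 column in the same order as im2col.
    flat_kernel = [[kernel[di][dj]] for di in range(k) for dj in range(k)]
    flat_out = [r[0] for r in gemm(patches, flat_kernel)]
    return [flat_out[s:s + out_w] for s in range(0, len(flat_out), out_w)]

def conv_direct(image, kernel):
    """Reference sliding-window computation used to check the GEMM result."""
    k = len(kernel)
    out_h, out_w = len(image) - k + 1, len(image[0]) - k + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(k) for dj in range(k))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernel = [[1, 0], [0, 1]]
result = conv_via_gemm(image, kernel)
```

In this example the lowered matrix repeats several input values across rows (the value 5 appears in all four patches), which is the kind of data redundancy among components that the claim 11 limitations address.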
Claims 2 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Espig in view of Lecture 11 in view of Warden in view of J. Jeon et al., Locality-aware GPU Register File, IEEE Computer Architecture Letters, 2019 (hereinafter “Jeon”).
Regarding claim 12, Espig in view of lecture 11 in view of Warden teach the claim 11 limitations. Espig in view of lecture 11 in view of Warden do not explicitly disclose wherein the processor is further configured to generate an identifier of the plurality of components; and update the register mapping table so that first destination register addresses of components for which a same identifier is generated among the plurality of components correspond to a same second destination register address. However in the same field of endeavor Jeon discloses:
wherein the processor is further configured to generate an identifier of the plurality of components; and update the register mapping table so that first destination register addresses of components for which a same identifier is generated among the plurality of components correspond to a same second destination register address (figure 3 one of the tables inside the register renaming module, section 3, first paragraph register id for identifier).
It would have been obvious to one of ordinary skill in the art before the effective filing date to update the register mapping table of Espig in view of lecture 11 in view of Warden using the identifier as in Jeon. It would have been obvious to achieve the benefit of mapping multiple entries of physical registers to multiple architectural registers (Section 3 first paragraph).
Claim 2 is directed to a method that would be practiced by the apparatus as in claim 12. All steps performed by the method as in claim 2 are performed by the apparatus as in claim 12 as configured. The claim 12 analysis applies equally to claim 2.
Claims 7 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Espig in view of Lecture 11 in view of Warden in view of US 20220066960 A1 Park et al., (hereinafter “Park”).
Regarding claim 17, Espig in view of lecture 11 in view of Warden teach the claim 11 limitations. Espig in view of lecture 11 in view of Warden do not explicitly disclose wherein the processor is further configured to generate the input data matrix by changing a size of an original input data matrix and a number and an order of a plurality of components into a memory region corresponding to a workspace, wherein the input data matrix is a matrix in which the plurality of components of the original input data matrix are recombined and arranged with a rule so that the original input data matrix outputs the feature data matrix through the filter matrix and the GEMM operation. However in the same field of endeavor Park discloses:
wherein the processor is further configured to generate the input data matrix by changing a size of an original input data matrix and a number and an order of a plurality of components into a memory region corresponding to a workspace (fig 5, [0058-0061]),
wherein the input data matrix is a matrix in which the plurality of components of the original input data matrix are recombined and arranged with a rule so that the original input data matrix outputs the feature data matrix through the filter matrix and the GEMM operation (fig 5, [0058-0061], the rule being changing the size based on the spatial size of the filter).
It would have been obvious to one of ordinary skill in the art before the effective filing date to generate the input data matrix into a memory region according to Park, and to recombine and rearrange with a rule as in Park for a GEMM operation associated with the filter matrix as in Espig in view of lecture 11 in view of Warden. It would have been obvious to achieve the benefit of not overlapping redundant data in memory ([0061]).
Claim 7 is directed to a method that would be practiced by the apparatus as in claim 17. All steps performed by the method as in claim 7 are performed by the apparatus as in claim 17 as configured. The claim 17 analysis applies equally to claim 7.
Claims 8 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Espig in view of Lecture 11 in view of Warden in view of Park in view of US 20190205735 A1 Smelyanskiy et al., (hereinafter “Smelyanskiy”).
Regarding claim 18, Espig in view of lecture 11 in view of Warden in view of Park teach the claim 17 limitations. Espig in view of lecture 11 in view of Warden in view of Park do not explicitly disclose wherein the processor is further configured to generate the input data matrix by converting the original input data matrix into the workspace having a same number of rows as a number of rows of the feature data matrix and a same number of columns as a size of the filter matrix. However in the same field of endeavor Smelyanskiy discloses:
wherein the processor is further configured to generate the input data matrix by converting the original input data matrix into the workspace having a same number of rows as a number of rows of the feature data matrix and a same number of columns as a size of the filter matrix (fig 4B, [0048-0053]).
It would have been obvious to one of ordinary skill in the art before the effective filing date to generate the input data matrix of Espig in view of lecture 11 in view of Warden in view of Park using the lowering procedure as disclosed by Smelyanskiy. It would have been obvious to achieve the benefit of allowing a GEMM operation to be performed for a convolution ([0048]).
Claim 8 is directed to a method that would be practiced by the apparatus as in claim 18. All steps performed by the method as in claim 8 are performed by the apparatus as in claim 18 as configured. The claim 18 analysis applies equally to claim 8.
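The workspace dimensions recited in claims 8 and 18 can be illustrated with a short calculation. The sketch below is an assumption-laden example, not a restatement of Smelyanskiy or of the specification: it assumes stride 1, no padding, a feature data matrix with one row per output position, and a filter "size" equal to filter height times filter width times channels. The function names are hypothetical.

```python
# Illustrative sketch only. Assumes stride 1, no padding; names are hypothetical.

def workspace_shape(in_h, in_w, channels, k_h, k_w):
    """Rows match the output positions; columns match the flattened filter size."""
    out_h = in_h - k_h + 1
    out_w = in_w - k_w + 1
    rows = out_h * out_w            # one row per output (feature) position
    cols = k_h * k_w * channels     # one column per filter weight
    return rows, cols

def redundancy(in_h, in_w, channels, k_h, k_w):
    """Compare workspace entries with the distinct input elements they copy."""
    rows, cols = workspace_shape(in_h, in_w, channels, k_h, k_w)
    total_entries = rows * cols
    # With stride 1 and no padding, every input element lies in some patch.
    distinct_elements = in_h * in_w * channels
    return total_entries, distinct_elements

shape = workspace_shape(4, 4, 1, 3, 3)   # 4x4 single-channel input, 3x3 filter
counts = redundancy(4, 4, 1, 3, 3)
```

Under these assumptions a 4x4 input lowered for a 3x3 filter yields a 4-row, 9-column workspace holding 36 entries drawn from only 16 distinct input elements, so more than half of the workspace entries duplicate data held elsewhere in the workspace.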
Allowable Subject Matter
Claim 9 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. Claims 3-6, 13-16, and 19 would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims, and rewritten to overcome the rejections under 35 USC 112(b) and the relevant claim objections.
The following is a statement of reasons for the indication of allowable subject matter.
Applicant claims devices, methods and a computer program stored in a computer-readable recording medium to execute a convolutional operation, wherein the device as in claim 11 comprises:
a memory storing a register mapping table; and
a processor configured to perform a convolutional operation for generating a feature data matrix corresponding to an output data matrix by performing a general matrix multiplication (GEMM) operation on an input data matrix with a set filter matrix, update a register mapping table so that first destination register addresses of redundant components indicating data redundant with each other among a plurality of components of the input data matrix correspond to a same second destination register address; and perform the convolutional operation by reusing a register having the same second destination register address with respect to the redundant components, based on the register mapping table.
Wherein claim 12 comprising the device of claim 11, wherein the processor further is configured to:
generate an identifier of the plurality of components; and update the register mapping table so that first destination register addresses of components for which a same identifier is generated among the plurality of components correspond to a same second destination register address.
Wherein claim 13 comprising the device of claim 12, wherein the identifier comprises: an element ID,
wherein the processor is further configured to generate a patch ID of the plurality of components based on an array index of the plurality of components, a number of rows and columns of the filter matrix, and a number of columns of the output data matrix; and
generate an element ID of the plurality of components based on the patch ID and an offset of the plurality of components,
wherein the offset is a value determined based on the patch ID, the number of columns and a number of channels of the input data matrix, and
wherein the array index is a value indicating a location of each component when the plurality of components are arranged in a single-dimensional array.
Reasons for indication of allowable subject matter include the above highlighted limitations in combination with the remaining limitations. Espig, Lecture 11, Warden, Park, Smelyanskiy, and Jeon disclose the claimed invention according to the above claim mappings. Each of Espig, Lecture 11, Warden, Park, Smelyanskiy, and Jeon is silent with respect to a patch ID and generation of an element ID of the plurality of components based on the patch ID and an offset of the plurality of components, and does not teach or suggest these limitations or the above highlighted limitations in combination with the remaining limitations.
Applicant further claims, wherein claim 19 comprising the device of claim 11, wherein the processor is further configured to:
receive tensor core load data, identify whether an identifier of a component having a first destination register address and a second destination register address included in the tensor core load data are recorded in a load history buffer; when the identifier and the second destination register address are not recorded in the load history buffer, fetch data of the component, record a second destination register address of a register in which the identifier and the fetched data are stored in the load history buffer, and update the register mapping table so that the second destination register address recorded in the load history buffer corresponds to the first destination register address; and when the identifier and the second destination register address are recorded in the load history buffer, update the register mapping table so that the second destination register address recorded in the load history buffer corresponds to the first destination register address.
Reasons for indication of allowable subject matter include the above highlighted limitations in combination with the remaining limitations. Espig, Lecture 11, Warden, Park, Smelyanskiy, and Jeon disclose the claimed invention according to the above claim mappings. Each of Espig, Lecture 11, Warden, Park, Smelyanskiy, and Jeon is silent with respect to a load history buffer, and does not teach or suggest these limitations or the above highlighted limitations in combination with the remaining limitations specifically with respect to identifying whether an identifier of a component having a first destination register address and a second destination register address are recorded in a load history buffer, and fetching data of the component when the identifier and the second destination register address are not recorded in the load history buffer, and updating the register mapping table so that the second destination register address recorded in the load history buffer corresponds to the first destination register address.
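To make the claim 19 control flow easier to follow, the hit and miss paths of the load history buffer can be sketched as below. This is a simplified software model under stated assumptions, not the applicant's implementation: the buffer is modeled as a dictionary keyed by component identifier, the second destination register address is modeled as an index into a list, and the class and method names (`RegisterRenamer`, `load`, `read`) are hypothetical.

```python
# Illustrative sketch only. A simplified software model; all names are hypothetical.

class RegisterRenamer:
    def __init__(self):
        self.mapping_table = {}   # first destination address -> second destination address
        self.load_history = {}    # component identifier -> second destination address
        self.registers = []       # register contents, indexed by second destination address
        self.fetch_count = 0      # number of actual data fetches performed

    def load(self, first_addr, identifier, fetch):
        """Handle one load: reuse a register on a hit, fetch and record on a miss."""
        if identifier in self.load_history:
            # Hit: the data is already in a register, so no fetch is needed.
            second_addr = self.load_history[identifier]
        else:
            # Miss: fetch the data, store it in a new register, record the load.
            self.registers.append(fetch(identifier))
            self.fetch_count += 1
            second_addr = len(self.registers) - 1
            self.load_history[identifier] = second_addr
        # Either way, point the first destination address at the shared register.
        self.mapping_table[first_addr] = second_addr
        return second_addr

    def read(self, first_addr):
        """Read through the mapping table."""
        return self.registers[self.mapping_table[first_addr]]

memory = {"e0": 10, "e1": 20}          # hypothetical backing store keyed by identifier
renamer = RegisterRenamer()
renamer.load(0, "e0", memory.__getitem__)
renamer.load(1, "e1", memory.__getitem__)
renamer.load(2, "e0", memory.__getitem__)   # redundant component: no second fetch
```

In this model, first destination addresses 0 and 2 resolve to the same second destination address, and only two fetches occur for three loads.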
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to EMILY E LAROCQUE whose telephone number is (469)295-9289. The examiner can normally be reached on 10:00am - 12:00pm, 2:00pm - 8:00pm ET M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor Andrew Caldwell can be reached on 571-272-3701. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/EMILY E LAROCQUE/Examiner, Art Unit 2182