DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Examiner’s Remark
The Examiner notes that claim language “grouping… in a column direction” is interpreted as grouping contiguous blocks in one direction based on the description of Figure 6 in applicant’s specification.
Drawings
The drawings are objected to because, in Figure 6, the “16” and “32” should be “8” and “16”, respectively. Based on pg. 6 of applicant’s clean specification, it seems the number of blocks is unchanged and only the grouping of the first column changes between Fig. 5 and Fig. 6. Furthermore, pg. 6 line 23 suggests that “G1” should be “G4”.
Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.
Claim Objections
Claims 9-16 are objected to because of the following informalities:
Claims 9-16, change instances of “the processor” to “the reconfigurable processor” for clearer antecedent basis.
Claim 9, change “instructions is executed” to “instructions are executed”.
Appropriate correction is required.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1-2, 8-10, and 16 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Lee et al. (Optimization of GPU-based Sparse Matrix Multiplication for Large Sparse Networks, hereinafter “Lee”).
As per claim 1, Lee teaches A method for processing sparse data, performed by a reconfigurable processor, wherein the reconfigurable processor comprises a processing element (PE) array, and the PE array comprises PxQ PE units, the method comprising: dividing a sparse weight matrix to be calculated into at least one unit block (Lee: Fig. 4; Section IV.A second paragraph; each streaming multiprocessor (SM) corresponding to a processing element);
grouping a plurality of unit blocks into a computing group (Lee: Fig. 6; Section IV.C.2);
and obtaining an effective weight address corresponding to each effective weight in the computing group (Lee: Fig. 5; Section IV.C.1; wherein the pointer and index corresponds to an effective weight address).
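For illustration only, the pointer-and-index form of addressing referred to above can be sketched as a compressed sparse row (CSR) layout. This is an assumption made for the sketch: the action does not state that Lee's Fig. 5 uses CSR specifically, and the function name `to_csr` and the sample matrix are hypothetical.

```python
def to_csr(matrix):
    """Return (values, col_index, row_pointer) for a dense list-of-lists matrix.

    Assumption: a CSR-style layout, where row_pointer[i]..row_pointer[i + 1]
    delimits the nonzero entries of row i and col_index holds their columns.
    """
    values, col_index, row_pointer = [], [], [0]
    for row in matrix:
        for col, weight in enumerate(row):
            if weight != 0:
                values.append(weight)
                col_index.append(col)
        # After each row, record how many nonzero entries have been seen so far.
        row_pointer.append(len(values))
    return values, col_index, row_pointer


# Hypothetical 3x4 sparse weight matrix.
weights = [
    [0, 5, 0, 0],
    [0, 0, 0, 0],
    [7, 0, 0, 2],
]
values, col_index, row_pointer = to_csr(weights)
```

In this sketch each nonzero weight is located by a row pointer plus a column index, which is the sense in which a pointer and index pair can serve as an address for an effective weight.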
As per claim 2, Lee further teaches The method of claim 1, wherein dividing the sparse weight matrix to be calculated into at least one unit block comprises: dividing the sparse weight matrix into the at least one unit block by taking PxQ unit blocks as a division unit in a row direction and a column direction of the sparse weight matrix, wherein each unit block comprises at least one effective weight (Lee: Fig. 6; Section IV.C.2 paragraphs 4-5).
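For illustration only, one possible reading of the division step recited in claim 2 is sketched below. It assumes that each unit block is a tile of P rows by Q columns and that tiles containing no effective (nonzero) weight are discarded; neither the claim nor Lee is asserted to require exactly this, and the function name is hypothetical.

```python
def divide_into_unit_blocks(matrix, p, q):
    """Tile a dense list-of-lists matrix into p-row by q-column blocks.

    Assumption: only tiles holding at least one nonzero weight are kept.
    Returns a dict mapping (block_row, block_col) to the tile contents.
    """
    rows, cols = len(matrix), len(matrix[0])
    blocks = {}
    for r0 in range(0, rows, p):
        for c0 in range(0, cols, q):
            tile = [row[c0:c0 + q] for row in matrix[r0:r0 + p]]
            if any(weight != 0 for tile_row in tile for weight in tile_row):
                blocks[(r0 // p, c0 // q)] = tile
    return blocks


# Hypothetical 4x4 sparse weight matrix divided into 2x2 tiles.
sample = [
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 3],
    [0, 0, 4, 0],
]
unit_blocks = divide_into_unit_blocks(sample, 2, 2)
```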
As per claim 8, Lee further teaches The method of claim 1, wherein the PxQ PE units in the PE array are 8x8 PE units (Lee does not explicitly state 8x8 SMs; however, Lee does not suggest a limit. It would be reasonable for one of ordinary skill to duplicate SMs such that there is an equivalent of 8x8 SMs. (MPEP 2144.04.VI.B)).
As per claims 9-10 and 16, the claims are directed to an apparatus that implements the same or similar features as the method of claims 1-2 and 8, respectively, and are therefore rejected for at least the same reasons therein. Furthermore, Lee teaches a memory configured to store instructions executable by the processor (Lee: Fig. 1, global memory).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 4-7 and 12-15 are rejected under 35 U.S.C. 103 as being unpatentable over Lee in view of Ro et al. (US 20210117761 A1, hereinafter “Ro”).
As per claim 4, Lee further teaches The method of claim 1.
However, while Lee discloses a method of relative addressing of nonzero weights, Lee does not explicitly disclose determining the number of zero weights between each pair of nonzero weights. Thus, Lee does not teach wherein obtaining the effective weight address corresponding to each effective weight in the computing group comprises: reading each effective weight in the computing group sequentially by the PE array; and determining a number of zero weights between a current effective weight and a previous effective weight as an effective weight address of the current effective weight, and storing the number of zero weights into a storage address corresponding to the current effective weight of the computing group.
Ro teaches wherein obtaining the effective weight address corresponding to each effective weight in the computing group comprises: reading each effective weight in the computing group sequentially by the PE array (Ro: Fig. 5 element 520; [0108]);
and determining a number of zero weights between a current effective weight and a previous effective weight as an effective weight address of the current effective weight, and storing the number of zero weights into a storage address corresponding to the current effective weight of the computing group (Ro: Fig. 4 element 420; [0097]-[0098]).
Therefore, it would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to substitute, with a reasonable expectation of success, the relative addressing of Lee with the relative addressing of Ro. One would have been motivated to combine these references because both references disclose addressing nonzero weight data, and it is a simple substitution of one known element for another to obtain predictable results (addressing nonzero weight data).
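For illustration only, the relative addressing recited in claim 4, in which the address of each effective weight is the number of zero weights since the previous effective weight, can be sketched as follows. The function name and the sample weights are hypothetical, and the sketch is not asserted to reproduce Ro's implementation.

```python
def encode_zero_runs(weights):
    """Scan weights in order and return (values, zero_counts).

    zero_counts[i] is the number of zero weights between the i-th nonzero
    weight and the previous nonzero weight (or the start of the sequence).
    Assumption: trailing zeros after the last nonzero weight are not recorded.
    """
    values, zero_counts = [], []
    zeros_seen = 0
    for weight in weights:
        if weight == 0:
            zeros_seen += 1
        else:
            values.append(weight)
            zero_counts.append(zeros_seen)
            zeros_seen = 0
    return values, zero_counts


# Hypothetical flattened computing group.
group = [0, 0, 3, 0, 5, 7, 0, 0, 0, 2]
nonzero_values, relative_addresses = encode_zero_runs(group)
```

Here the weight 3 is preceded by two zeros, 5 by one, 7 by none, and 2 by three, so the stored addresses are 2, 1, 0, and 3.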
As per claim 5, Lee further teaches The method of claim 1.
However, while Lee discloses a matrix multiplication, Lee does not explicitly disclose that the matrix multiplication is used for convolution. Thus, Lee does not teach further comprising: reading a convolution computation value; and performing convolution computation or fully connected layer computation.
Ro teaches further comprising: reading a convolution computation value; and performing convolution computation or fully connected layer computation (Ro: [0075]).
Therefore, it would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to modify, with a reasonable expectation of success, the matrix multiplication of Lee with the teachings of Ro. One would have been motivated to combine these references because both references disclose matrix multiplication with a weight matrix, and the modification combines prior art elements according to known methods to yield predictable results (performing a convolution operation).
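For illustration only, the general relationship relied on above, that a convolution can be carried out as a product between sliding windows of the input and a weight vector, can be sketched in one dimension. This is a generic textbook formulation and is not taken from Lee or Ro; the function name is hypothetical, and the kernel is applied without flipping, as is conventional for neural network layers.

```python
def conv1d_as_matmul(signal, kernel):
    """Compute a 'valid' 1-D convolution as a matrix-vector product.

    Each row of the window matrix is one sliding window of the input, so
    multiplying that matrix by the kernel yields the convolution outputs.
    """
    k = len(kernel)
    windows = [signal[i:i + k] for i in range(len(signal) - k + 1)]
    return [sum(x * w for x, w in zip(window, kernel)) for window in windows]


# Hypothetical input and kernel; a zero kernel entry contributes nothing,
# which is why only effective (nonzero) weights need to be processed.
outputs = conv1d_as_matmul([1, 2, 3, 4], [1, 0, -1])
```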
As per claim 6, Lee/Ro further teaches The method of claim 5, wherein reading the convolution computation value comprises: obtaining an effective weight corresponding to an effective weight address and a storage address of the effective weight in a non-sparse weight matrix according to the effective weight address of each computing group of the sparse weight matrix through the PxQ PE units in the PE array (Lee: Fig. 2; Section II.B.1);
and reading the convolution computation value corresponding to the effective weight according to the storage address of the effective weight in the non-sparse weight matrix (Ro: Fig. 5; [0108]).
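For illustration only, the reading step discussed for claim 6 can be sketched as recovering the position of each effective weight in the non-sparse layout from its stored zero count and then fetching the input value at that position. The function names are hypothetical, and the zero-count encoding is assumed to follow the scheme recited in claim 4.

```python
def decode_positions(zero_counts):
    """Convert per-weight zero counts into absolute positions.

    Each nonzero weight sits (count + 1) places after the previous one,
    starting from just before the beginning of the sequence.
    """
    positions, position = [], -1
    for count in zero_counts:
        position += count + 1
        positions.append(position)
    return positions


def sparse_dot(values, zero_counts, inputs):
    """Multiply each nonzero weight by the input found at its decoded position."""
    positions = decode_positions(zero_counts)
    return sum(weight * inputs[pos] for weight, pos in zip(values, positions))


# Hypothetical data: weights [0, 0, 3, 0, 5, 7, 0, 0, 0, 2] in compressed form.
positions = decode_positions([2, 1, 0, 3])
result = sparse_dot([3, 5, 7, 2], [2, 1, 0, 3], list(range(10)))
```

The compressed product equals the dense dot product of the original weights with the inputs, while touching only the four effective weights.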
As per claim 7, Lee/Ro further teaches The method of claim 5, wherein performing convolution computation or fully connected layer computation comprises: performing convolution computation or fully connected layer computation in a neural network model based on deep learning according to the convolution computation value corresponding to the effective weight in each computing group (Ro: [0051]; [0075]).
As per claims 12-15, the claims are directed to an apparatus that implements the same or similar features as the method of claims 4-7, respectively, and are therefore rejected for at least the same reasons therein.
Allowable Subject Matter
Claims 3 and 11 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter:
As to claim 3, the prior art of record does not teach or suggest a combination as claimed including:
grouping the plurality of unit blocks in the sparse weight matrix into a computing group in a column direction of the sparse weight matrix;
determining whether a total number of effective weights in the computing group is more than (PxQ)/2;
in response to the total number of effective weights in the computing group being more than (PxQ)/2, splitting the computing group into two computing groups evenly in the column direction of the sparse weight matrix;
repeating the above determining and splitting until the total number of effective weights in each computing group is less than (PxQ)/2.
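For illustration only, the determining and splitting loop listed above can be sketched as follows. Several assumptions are made for the sketch and are not drawn from the claims or the specification: a computing group is modeled as a list of per-unit-block effective weight counts in column order, "evenly" is read as halving the number of unit blocks, a group whose count equals (PxQ)/2 is left unsplit, and a group of a single unit block is never split further. The function name is hypothetical.

```python
def split_computing_groups(block_counts, p, q):
    """Halve a computing group until no group exceeds (p * q) / 2 nonzeros.

    block_counts: nonzero-weight count of each unit block, in column order.
    Returns the final groups in their original column order.
    """
    limit = (p * q) / 2
    pending = [list(block_counts)]
    finished = []
    while pending:
        group = pending.pop()
        if sum(group) > limit and len(group) > 1:
            mid = len(group) // 2
            # Push the second half first so the first half is handled next,
            # which keeps the finished groups in column order.
            pending.append(group[mid:])
            pending.append(group[:mid])
        else:
            finished.append(group)
    return finished


# Hypothetical counts for eight unit blocks with a 2x4 PE array (limit 4).
groups = split_computing_groups([3, 1, 0, 2, 4, 1, 0, 1], 2, 4)
```

With these hypothetical counts the initial group of twelve effective weights is split repeatedly until every resulting group holds at most four.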
Lee discloses optimizing workload across a plurality of streaming multiprocessors (SMs) that contain processing cores (Section III). The workload balancing of Lee determines blocks with a large number of nonzero elements and splits said blocks further (Fig. 4; Section IV). Lee does not suggest workload balancing that divides blocks such that the number of nonzero elements is no more than half of the available SMs. Therefore, Lee does not teach or suggest a combination as claimed including the limitations identified above.
Chinya et al. (US 20210042617 A1, hereinafter “Chinya”) discloses a method of assigning weights to a plurality of processing elements (abstract). Chinya assigns mutually exclusive partitions of nonzero weights, and their corresponding sparsity representation, to each processing element (Fig. 1; [0018]). Chinya does not suggest splitting the partitions further based on the number of nonzero values and number of processing elements. Therefore, Chinya does not teach or suggest a combination as claimed including the limitations identified above.
Pool et al. (US 11392829 B1, hereinafter “Pool”) discloses a method of managing sparse matrices by enforcing sparsity constraints on submatrices (abstract). Pool divides the matrix into equal blocks (Figs. 3A-3B) and arranges data such that each submatrix satisfies a sparsity constraint (Fig. 6), wherein each set of submatrices has a relatively equivalent workload (col 6 lines 7-13). Pool does not suggest that the determination of the set of submatrices, or the sparsity constraint, is based on the number of processing elements. Therefore, Pool does not teach or suggest a combination as claimed including the limitations identified above.
Han et al. (US 20210182666 A1, hereinafter “Han”) discloses a weight data storage method to store only nonzero weights and their corresponding indexes (abstract). Han continually divides the weight matrix to isolate nonzero weights (Fig. 3A; [0044]). Han does not suggest that the dividing of the weight matrix to obtain nonzero weights is based on the number of processing elements. Therefore, Han does not teach or suggest a combination as claimed including the limitations identified above.
As to claim 11, the claim recites the same or similar limitations as claim 3 discussed above and is therefore allowable for at least the same reasons.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PHAT N LE whose telephone number is (571)272-0546. The examiner can normally be reached Monday-Friday 8:30AM-5PM ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew T Caldwell, can be reached at (571) 272-3702. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/P.N.L./
Phat Le
Examiner, Art Unit 2182
(571) 272-0546
/ANDREW CALDWELL/
Supervisory Patent Examiner, Art Unit 2182