DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 16, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Dao, Tri (“FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning”) in view of Wang et al. (US Pub 2023/0133305*).
Regarding claim 1, Dao discloses a method of executing a task for a large language model, comprising: determining, by using a determination unit, a target attention task from a plurality of attention tasks to be processed (see section 3.1, in which the attention computation is partitioned into a plurality of per-block sub-computations), wherein the target attention task is a task corresponding to a non-fully masked region of the feature to be processed (see section 3.1, in which Dao skips only fully masked blocks and still computes the unmasked and partially masked blocks), and the mask position represents mask endpoint positions in at least two non-intersecting intervals in a mask matrix corresponding to the feature to be processed (section 3.1, page 6 – “As FlashAttention and FlashAttention-2 already operate by blocks, for any blocks where all the column indices are more than the row indices (approximately half of the blocks for large sequence length), we can skip the computation of that block” – therefore the mask matrix is divided into two regions); and
executing the target attention task by using a computing unit, so as to obtain an attention feature (see section 3.1, in which the accumulated block outputs produce the final attention feature).
Dao does not disclose that the determining is based on a sparse representation corresponding to a feature to be processed, wherein the sparse representation represents a mask position of the feature to be processed.
Wang discloses determining based on a sparse representation corresponding to a feature to be processed, wherein the sparse representation represents a mask position of the feature to be processed (para 0039-0044).
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify Dao with the teachings of Wang in order to avoid unnecessary computation on masked regions.
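The block-skipping behavior quoted above from section 3.1 of Dao can be illustrated with a minimal sketch. The function name `classify_causal_blocks`, the block sizes, and the three labels are hypothetical and are not taken from Dao or Wang. The sketch assumes a causal mask in which the entry at (row, column) is masked whenever the column index exceeds the row index, and it only classifies blocks; it does not compute attention.

```python
from typing import Dict, List, Tuple

Block = Tuple[int, int]  # (row-block index, column-block index)


def classify_causal_blocks(
    seq_len: int, block_rows: int, block_cols: int
) -> Dict[str, List[Block]]:
    """Classify each block of a causal mask (hypothetical helper).

    "skipped":  every column index exceeds every row index (fully masked).
    "unmasked": no column index exceeds any row index (no masking needed).
    "partial":  the block straddles the diagonal (some entries masked).
    """
    out: Dict[str, List[Block]] = {"skipped": [], "partial": [], "unmasked": []}
    n_row_blocks = -(-seq_len // block_rows)  # ceiling division
    n_col_blocks = -(-seq_len // block_cols)
    for i in range(n_row_blocks):
        row_lo = i * block_rows
        row_hi = min(row_lo + block_rows, seq_len) - 1  # inclusive
        for j in range(n_col_blocks):
            col_lo = j * block_cols
            col_hi = min(col_lo + block_cols, seq_len) - 1  # inclusive
            if col_lo > row_hi:
                out["skipped"].append((i, j))   # all columns > all rows
            elif col_hi <= row_lo:
                out["unmasked"].append((i, j))  # nothing masked in this block
            else:
                out["partial"].append((i, j))   # diagonal crosses this block
    return out


if __name__ == "__main__":
    result = classify_causal_blocks(seq_len=8, block_rows=4, block_cols=4)
    # Blocks that would still be computed (the non-fully masked region).
    print(sorted(result["partial"] + result["unmasked"]))
    print(result["skipped"])
```

Under these assumptions, the blocks labeled "partial" and "unmasked" correspond to the non-fully masked region that is still computed, while the "skipped" blocks correspond to the fully masked region whose computation is omitted.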
Regarding claim 16, see rejection of claim 1.
Regarding claim 20, see rejection of claim 1.
Allowable Subject Matter
Claims 2-15 and 17-19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NAFIZ E HOQUE whose telephone number is (571)270-1811. The examiner can normally be reached M-F 8-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ahmad Matar, can be reached at (571)272-7488. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/NAFIZ E HOQUE/ Primary Examiner, Art Unit 2693