Prosecution Insights
Last updated: April 19, 2026
Application No. 17/864,235

DEVICE FOR ACCELERATING SELF-ATTENTION OPERATION IN NEURAL NETWORKS

Non-Final OA: §101, §103
Filed: Jul 13, 2022
Examiner: OWYANG, MICHELLE N
Art Unit: 2168
Tech Center: 2100 — Computer Architecture & Software
Assignee: Seoul National University R&DB Foundation
OA Round: 7 (Non-Final)
Grant Probability: 76% (Favorable)
OA Rounds: 7-8
To Grant: 3y 1m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 76% (above average; 464 granted / 610 resolved; +21.1% vs TC avg)
Interview Lift: +29.9% on resolved cases with interview (strong)
Typical Timeline: 3y 1m average prosecution; 16 currently pending
Career History: 626 total applications across all art units

Statute-Specific Performance

§101: 18.4% (-21.6% vs TC avg)
§103: 37.6% (-2.4% vs TC avg)
§102: 12.6% (-27.4% vs TC avg)
§112: 19.1% (-20.9% vs TC avg)

Tech Center averages are estimates. Based on career data from 610 resolved cases.

Office Action

Rejections: §101, §103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 12/30/2025 has been entered. Claims 1, 4-8, and 11-15 are pending. Claims 2-3 and 9-10 are cancelled.

Response to Arguments

Applicant's arguments with respect to the rejections previously made and the amended claims filed on 12/30/2025 have been fully considered. In view of the claim amendments, the rejections are updated accordingly.

35 USC 101 Rejections

Applicant's arguments have been fully considered. In response, it is submitted that independent claims 1 & 8 merely recite the amended limitation of "reducing memory access operations and multiply-accumulate operations compared to performing self-attention operation on all of a plurality of keys by" performing a series of mathematical relationships (e.g., calculating a similarity estimate, comparing the similarity estimate, calculating a Hamming distance, converting the distance into an angle, calculating a cosine function, and multiplying the cosine value to obtain an inner product estimate). The cited amended element does not show how the memory access operations and multiply-accumulate operations are reduced, since none of the recited mathematical relationships involves any memory access operations or multiply-accumulate operations relative to performing the self-attention operation on all of a plurality of keys. 
Also, the amended element does not show how the mathematical relationships would add the necessary degree of specificity to the independent claims to integrate the alleged abstract idea into a practical application and satisfy Step 2A of the patent eligibility analysis, since no practical application is generated from the mathematical relationships. Further, merely reciting the amended limitation of "performing a self-attention operation on only the candidates, while maintaining accuracy within the threshold compared to performing the self-attention operation on the all of the plurality of keys, to reduce a computational burden on the processor" as recited in claims 1 & 8 does not necessarily "improve the computational performance of the self-attention process and constitute a practical solution to a known problem in machine learning. Successfully reducing the computational burden of self-attention to enable neural networks to handle large datasets more efficiently is a clear technical improvement in the field of neural networks," as stated by Applicant, since the claims do not show how performing a self-attention operation while maintaining accuracy within the threshold, compared to performing the self-attention operation on all of the plurality of keys, would reduce a computational burden on the processor. 
Neither the claims nor the specification has shown (1) how the self-attention process is improved, (2) how a practical solution to a known problem in machine learning is constituted, (3) how the computational burden of self-attention is reduced to enable neural networks to handle large datasets more efficiently as a clear technical improvement in the field of neural networks, or (4) how the mathematical similarity estimate is tied to a concrete, layer-by-layer control mechanism implemented in hardware for a practical application, since the claims do not recite any functional limitation or step involving those areas (e.g., improving the self-attention process, a technical improvement in neural networks, or a practical application). In addition, the newly added limitation of "determining access on the all of the plurality of keys" is directed to insignificant extra-solution activity at Step 2A Prong Two, and would also be well-understood, routine, and conventional at Step 2B. This is nothing more than gathering key information, which would not be integrated into a practical application or a necessary improvement that would impose a meaningful limit on the judicial exception. Further, the calculating of an attention score, the adjusting of thresholds, the hash computation to generate hash values, and the use of a pipeline structure, key matrix, and degree of approximation determined by a hyperparameter, as recited in the dependent claims, further emphasize that the claimed invention is directed to more mathematical relationships that generate numeric values and do not direct the claimed invention to a practical application. The claimed steps remain directed to a process comprising a series of mathematical correlations, which is one of the abstract idea groupings. Moreover, as previously stated in the office action, it is submitted that "to reduce a computational burden on the processor" suggests an intended outcome of the self-attention operation. 
Such a description does not direct the claimed invention to any improvement or practical solution in the technical field, impose any impact on the functionality of the claimed steps, or impose a meaningful limit on the judicial exception, since the limitation is not functionally involved: all the steps in the claims would be performed the same regardless of whether the computational burden is reduced. The additional elements in the claims (including the amendment), alone or in combination, are not sufficient to amount to significantly more than the judicial exception and fail to (i) integrate the judicial exception into a practical application, and (ii) indicate any improvement whereby neural networks handle large datasets more efficiently as a clear technical improvement in the field of neural networks. Besides, it is determined that the computing elements in the claims (such as a memory and processor) amount to no more than use of a generic computing system having generic computing components, which fails to provide an inventive concept or significantly more than the abstract idea, because the elements do not necessarily improve the functioning of a computing system or provide an improvement to a technical field, since network computing is well known. Therefore, for at least the reasoning above, the amended pending claims are not patent eligible.

35 USC 103 Rejections

Applicant's arguments--which are primarily directed to the amended limitations of "reducing memory access operations and multiply-accumulate operations compared to performing self-attention operation on all of a plurality of keys by: determining access on the all of the plurality of keys… performing a self-attention operation on only the candidates, while maintaining accuracy within the threshold compared to performing the self-attention operation on the all of the plurality of keys,…"--have been fully considered. 
In response to the arguments, it is submitted that, in view of the amendment, the previous rejections have been withdrawn. However, upon further consideration, new grounds of rejection are made, and the claims are addressed accordingly; see the rejections below for detail. Furthermore, it is submitted that all limitations in the pending claims--including those not specifically argued--are properly addressed. The reasons are set forth in the rejections; see the claim analysis below for detail.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1, 4-8, and 11-15 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. Claims 1, 4-8, and 11-15 recite a process comprising calculating a similarity estimate that includes (i) calculating a Hamming distance, (ii) converting the distance into an angle with a bias removal, (iii) calculating a cosine function, and (iv) multiplying the cosine by a normalized key value to obtain an inner product. The claimed process also comprises selecting a subset of keys by comparing the similarity estimate, and performing a self-attention operation on only the candidates, as recited in claims 1 & 8. The claimed processes additionally recite calculating an attention score, adjusting thresholds, hash computation to generate hash values, and using a pipeline structure, as recited in the dependent claims. The claimed process is similar to a method of mathematical relationships, which is one of the groupings of abstract ideas under Prong One of Step 2A of the 2019 Patent Subject Matter Eligibility Guidance, since the claimed steps are directed to a series of mathematical relationships. 
For example, the claimed similarity estimate calculation involves multiple mathematical calculations with respect to different types of data elements, including a Hamming distance calculation and a cosine function calculation as claimed. The selecting step involves comparing the similarity estimate with a threshold as claimed, and the performing of a self-attention operation involves mathematical computation, as indicated in the specification of the instant application (e.g., para [0017]-[0019]). The claimed steps with mathematical relationships do not render the claims a practical application, because the claimed process does not show how performing a series of mathematical relationships would direct the claimed process to a particular useful application. Making a selection of a subset of candidates as claimed does not render the series of mathematical relationships a practical application. Similarly, performing a self-attention operation on the subset does not direct the claimed process to a particular useful application, because the performance of the self-attention does not lead to or produce any practical outcome. Also, independent claims 1 & 8 merely recite the amended limitation of "reducing memory access operations and multiply-accumulate operations compared to performing self-attention operation on all of a plurality of keys by" performing a series of mathematical relationships (e.g., calculating a similarity estimate, comparing the similarity estimate, calculating a Hamming distance, converting the distance into an angle, calculating a cosine function, and multiplying the cosine value to obtain an inner product estimate). The cited amended element does not show how the memory access operations and multiply-accumulate operations are reduced, since none of the recited mathematical relationships involves any memory access operations or multiply-accumulate operations relative to performing the self-attention operation on all of a plurality of keys. 
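For reference, the similarity pipeline recited in the claims (Hamming distance, conversion to an angle with bias removal, cosine, and an inner product estimate) matches the standard random-hyperplane (SimHash) inner-product estimate. The sketch below is an illustration under that assumption; the function name and the `angle_bias` term are hypothetical, not the claimed hardware or any cited reference:

```python
import numpy as np

def similarity_estimate(query_hash, key_hash, key_norm, angle_bias=0.0):
    """Hash-only estimate of the query-key inner product.

    Under random-hyperplane (SimHash) hashing, the expected fraction of
    differing bits equals angle / pi, so the Hamming distance yields an
    angle estimate; multiplying its cosine by the key's norm then
    approximates q . k for a unit-norm query.  `angle_bias` stands in
    for the claimed bias removal and is a hypothetical correction here.
    """
    n_bits = len(query_hash)
    hamming = int(np.count_nonzero(query_hash != key_hash))  # (i) Hamming distance
    angle = np.pi * hamming / n_bits - angle_bias            # (ii) distance -> angle, bias removed
    return float(np.cos(angle) * key_norm)                   # (iii)-(iv) cosine times normalized key value
```

Identical hashes give an angle of zero, so the estimate collapses to the key norm itself; fully opposite hashes give an angle of pi and the negated norm.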
Plus, the amended element does not show how the mathematical relationships would add the necessary degree of specificity to the independent claims to integrate the alleged abstract idea into a practical application and satisfy Step 2A of the patent eligibility analysis, since no practical application is generated from the mathematical relationships. Besides, merely reciting the amended limitation of "performing a self-attention operation on only the candidates, while maintaining accuracy within the threshold compared to performing the self-attention operation on the all of the plurality of keys, to reduce a computational burden on the processor" as recited in claims 1 & 8 does not necessarily "improve the computational performance of the self-attention process and constitute a practical solution to a known problem in machine learning. Successfully reducing the computational burden of self-attention to enable neural networks to handle large datasets more efficiently is a clear technical improvement in the field of neural networks," as stated by Applicant, since the claims do not show how performing a self-attention operation while maintaining accuracy within the threshold, compared to performing the self-attention operation on all of the plurality of keys, would reduce a computational burden on the processor. 
Neither the claims nor the specification has shown (1) how the self-attention process is improved, (2) how a practical solution to a known problem in machine learning is constituted, (3) how the computational burden of self-attention is reduced to enable neural networks to handle large datasets more efficiently as a clear technical improvement in the field of neural networks, or (4) how the mathematical similarity estimate is tied to a concrete, layer-by-layer control mechanism implemented in hardware for a practical application, since the claims do not recite any functional limitation or step involving those areas (e.g., improving the self-attention process, a technical improvement in neural networks, or a practical application). In addition, the newly added limitation of "determining access on the all of the plurality of keys" is directed to insignificant extra-solution activity at Step 2A Prong Two, and would also be well-understood, routine, and conventional at Step 2B. This is nothing more than gathering key information, which would not be integrated into a practical application or a necessary improvement that would impose a meaningful limit on the judicial exception. Additionally, the element "to reduce a computational burden on the processor" is directed to non-functional descriptive material that describes the intended outcome when the self-attention operation is performed, which does not impose a meaningful limit on the judicial exception since the limitation is not functionally involved: all the steps in the claims would be performed the same regardless of whether the operation count is reduced. Merely reciting the intended purpose of computational burden reduction as claimed does not integrate the abstract idea into a practical application.

The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. Additional elements (e.g., queries, keys, input entities) are directed to types of information, which do not impose a meaningful limit on the judicial exception such that the claims are more than a drafting effort designed to monopolize the exception, because the claimed steps could be performed in the same manner, to achieve the same outcome, with types of information other than the ones used in the claims. Hence, the claims do not include additional elements, or a combination of elements, sufficient to amount to significantly more than the judicial exception, and they fail to integrate the judicial exception into a practical application under Prong Two of Step 2A of the 2019 Patent Subject Matter Eligibility Guidance, because the claimed elements and their combination do not impose any meaningful limits on practicing the abstract idea. Further, in view of Step 2B of the 2019 Patent Subject Matter Eligibility Guidance, it is determined that the computing elements in the claims (such as a memory and processor) amount to no more than use of a generic computing system having generic computing components, which fails to provide an inventive concept or significantly more than the abstract idea, because the elements do not necessarily improve the functioning of a computing system or provide an improvement to a technical field, since network computing is well known. Thus, for at least the reasoning above, the pending claims are not patent eligible.

Examiner Notes

It is noted that the phrase "for executing" in claim 15 indicates intended use of the claimed computer readable non-transitory recording medium; Minton v. Nat'l Ass'n of Securities Dealers, Inc., 336 F.3d 1373, 1381, 67 USPQ2d 1614, 1620 (Fed. Cir. 2003) (a "whereby clause in a method claim is not given weight when it simply expresses the intended result of a process step positively recited"). Examples of claim language, although not exhaustive, that may raise a question as to the limiting effect of the language in a claim are: (A) "adapted to" or "adapted for" clauses; (B) "wherein" clauses; and (C) "whereby" clauses. Therefore, intended use limitations are not required to be taught; see MPEP 2103 and MPEP 2111.04. Hence, any computer readable non-transitory recording medium with instructions would read on claim 15.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

This application currently names joint inventors. In considering patentability of the claims, the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. 
Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention, in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1, 4-8, and 11-15 are rejected under 35 U.S.C. 103 as being unpatentable over Wu et al. (Pub. No. US 2021/0026891, hereinafter Wu) in view of McCallie Jr. et al. (Patent No. US 11,526,508, hereinafter McCallie), and further in view of Tan et al. (Pub. No. US 2022/0245928, hereinafter Tan). Wu and McCallie were cited in the previous office action.

With respect to claim 1, Wu discloses an electronic device (abstract) comprising: a memory (Fig. 2 & 5B); and at least one processor (Fig. 2 & 5B) that, for each of a plurality of queries, performs steps (Fig. 1-3) comprising: reducing memory access operations and multiply-accumulate operations on all of a plurality of keys (reducing memory access operations and multiply-accumulate operations are merely operations; [0005], [0118], Fig. 3: performing quick, efficient, and accurate retrieval reduces memory access operations and multiply-accumulate operations on all of a plurality of keys) by: determining access on the all of the plurality of keys ([0069]-[0072]: determining access on all the keys represented by the query information for the 1st retrieval operation); calculating a similarity estimate between a query and each of the plurality of keys ([0007]-[0009], [0013], [0057], Fig. 4: calculating a similarity between a query input by a user, such as a to-be-queried image, and each of the keys representing the images in the image database); selecting a subset of keys of the plurality of keys as candidates by comparing the similarity estimate with a layer-specific threshold that is automatically determined from a user-defined hyper-parameter, each of the candidates having a similarity estimate that surpasses the threshold ([0016], [0057], [0072]-[0074], [0097]-[0098], Fig. 3-4: selecting a subset of keys represented by the image candidates by comparing the similarity estimates with a threshold, where n candidates with similarity greater than a 1st similarity threshold are selected; the 1st threshold is customized on the client side or system side, and the 1st threshold represents one of the sequential operations corresponding to a specific layer of the information processing method of Wu, since the information process comprises a series of sequential operations in which each operation corresponds to a layer as shown in the listing and figures; also, the 1st threshold is automatically determined from a user-defined hyper-parameter via customization on the client side or system side with a parameter set by a user or a component of the system that implements the information processing method); and performing a self-attention operation on only the candidates to reduce a computational burden on the processor ([0010], [0017]-[0021], [0057], [0099], Fig. 4: performing an operation that corresponds to the self-attention operation on only the selected candidates to obtain the final display results, which would reduce a computational burden on the processor since only a subset of data images representing the candidates, instead of the entire set, is processed to obtain the final result); wherein the calculating the similarity estimate comprises: calculating a Hamming distance between the query and each key of the plurality of keys ([0054], [0098], [0104]: calculating the similarity estimate includes calculating a Hamming distance between the query, e.g., an image, and each of the keys represented by the images, using the feature information); converting the Hamming distance between a query hash and a key hash into an angle ([0054], [0098]-[0099]: converting the distance between the query hash and key hash, represented by respective hash codes, into a number corresponding to the angle with respect to the position in the order, since the angle is known as a space close to a point); and calculating, for each key of the plurality of keys, a cosine function value ([0055], [0098]-[0099]: calculating a cosine function value to determine the similarity estimate).

Wu does not explicitly disclose reducing memory access operations and multiply-accumulate operations compared to performing self-attention operation on all of a plurality of keys; performing a self-attention operation on only the candidates, while maintaining accuracy within the threshold compared to performing the self-attention operation on the all of the plurality of keys; performing a bias removal on the angle; calculating, for each key of the plurality of keys, a cosine function value based on the angle; or multiplying, for each key of the plurality of keys, the cosine function value by a normalized key value to obtain an inner product estimate that serves as the similarity estimate.

However, McCallie discloses wherein the calculating the similarity estimate (Col. 4, lines 8-11: calculating a similarity estimate) comprises: performing a bias removal on the angle (Col. 5, lines 4-5 & 51-53, Fig. 6: removing bias on the angle between vectors); calculating, for each key of the plurality of keys, a cosine function value based on the angle (Col. 3, lines 53-57, Col. 9, lines 45-50: calculating a cosine function based on the angle); and multiplying, for each key of the plurality of keys, the cosine function value by a normalized key value to obtain an inner product estimate that serves as the similarity estimate (Col. 9, lines 50-52: multiplying the cosine function by a normalized key, via a normalized cosine function, to obtain the similarity estimate).

Since both Wu and McCallie are from the same field of endeavor, as both are directed to selecting candidates based on similarity estimates via machine learning techniques using different mathematical relationships, which is the same field of endeavor as the claimed invention, it would have been obvious to one skilled in the art before the effective filing date of the claimed invention to modify their teachings to incorporate the bias removal and normalized cosine function value of McCallie's similarity calculation techniques into the similarity estimate calculation of Wu for performing query operations as claimed. The motivation to combine is to enhance relevant information retrieval with an efficient and accurate implementation in information retrieval technologies (Wu, [0004]; McCallie, Col. 1, lines 13-15).

McCallie does not explicitly disclose reducing memory access operations and multiply-accumulate operations compared to performing self-attention operation on all of a plurality of keys; or performing a self-attention operation on only the candidates, while maintaining accuracy within the threshold compared to performing the self-attention operation on the all of the plurality of keys. 
However, Tan discloses reducing memory access operations and multiply-accumulate operations compared to performing self-attention operation on all of a plurality of keys ([0021]-[0022]: reducing operations corresponding to the memory access and multiply-accumulate operations with respect to the self-attention operation on all keys represented by the input data, as further disclosed in [0025] & [0043]); and performing a self-attention operation on only candidates, while maintaining accuracy within the threshold compared to performing the self-attention operation on the all of the plurality of keys ([0025], [0098]-[0099]: performing the self-attention operation only on candidate data, while maintaining a specific accuracy compared to the operation on all key data, with self-attention operation augmentation). Since Wu, McCallie, and Tan are all from the same field of endeavor, as all are directed to efficient and accurate information retrieval processing via machine learning techniques using different mathematical relationships, which is the same field of endeavor as the claimed invention, it would have been obvious to one skilled in the art before the effective filing date of the claimed invention to modify their teachings to incorporate the reduced operation with self-attention and the accuracy management of Tan into the operation processing of Wu and McCallie for performing query operations as claimed. The motivation to combine is to enhance relevant information retrieval with an efficient and accurate implementation, as well as to reduce computational resources, in information retrieval technologies (Wu, [0004]; McCallie, Col. 1, lines 13-15; Tan, [0002]-[0003]). 
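The independent-claim flow mapped above (estimate similarity, keep only keys whose estimate surpasses a threshold, then run exact attention on that subset) can be illustrated with a small sketch. The softmax form and all names here are assumptions of the illustration, not the claimed device or any cited reference:

```python
import numpy as np

def attention_on_candidates(Q, K, V, estimates, threshold):
    """Per query, run exact softmax attention over only the candidate
    keys whose precomputed similarity estimate surpasses `threshold`.

    `estimates` is an (n_queries, n_keys) array, e.g. produced by a
    hash-based similarity estimate; rows with no candidates yield zeros.
    """
    d = Q.shape[1]
    out = np.zeros((Q.shape[0], V.shape[1]))
    for i, q in enumerate(Q):
        cand = np.flatnonzero(estimates[i] > threshold)  # candidate selection
        if cand.size == 0:
            continue
        scores = K[cand] @ q / np.sqrt(d)                # MACs spent on candidates only
        w = np.exp(scores - scores.max())
        out[i] = (w / w.sum()) @ V[cand]                 # weighted sum of candidate values
    return out
```

With the threshold set below every estimate this reduces to dense attention; raising it trades accuracy for fewer key/value reads and multiply-accumulates, which is the reduction the amended limitation asserts.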
With respect to claim 8, Wu discloses a method for accelerating a self-attention operation (abstract; the term "for" indicates intended use of the method), the method comprising: for each of a plurality of queries ([0057], Fig. 1-2: for each of the queries, represented by an image), performing: reducing memory access operations and multiply-accumulate operations on all of a plurality of keys (reducing memory access operations and multiply-accumulate operations are merely operations; [0005], [0118], Fig. 3: performing quick, efficient, and accurate retrieval reduces memory access operations and multiply-accumulate operations on all of a plurality of keys) by: determining access on the all of the plurality of keys ([0069]-[0072]: determining access on all the keys represented by the query information for the 1st retrieval operation); calculating a similarity estimate between a query and each of the plurality of keys ([0007]-[0009], [0013], [0057], Fig. 4: calculating a similarity between a query input by a user, such as a to-be-queried image, and each of the keys representing the images in the image database); selecting a subset of keys of the plurality of keys as candidates by comparing the similarity estimate with a threshold, each of the candidates having a similarity estimate that surpasses the threshold ([0016], [0057], [0072]-[0074], [0097]-[0098], Fig. 4: selecting a subset of keys represented by the image candidates by comparing the similarity estimates with a threshold, where n candidates with similarity greater than a 1st similarity threshold are selected); and performing a self-attention operation on only the candidates to reduce a computational burden on the processor ([0010], [0017]-[0021], [0057], [0099], Fig. 4: performing an operation that corresponds to the self-attention operation on only the selected candidates to obtain the final display results, which would reduce a computational burden on the processor since only a subset of data images representing the candidates, instead of the entire set, is processed to obtain the final result); wherein the calculating the similarity estimate comprises: calculating a Hamming distance between the query and each key of the plurality of keys ([0054], [0098], [0104]: calculating the similarity estimate includes calculating a Hamming distance between the query, e.g., an image, and each of the keys represented by the images, using the feature information); converting the Hamming distance between a query hash and a key hash into an angle ([0054], [0098]-[0099]: converting the distance between the query hash and key hash, represented by respective hash codes, into a number corresponding to the angle with respect to the position in the order, since the angle is known as a space close to a point); and calculating, for each key of the plurality of keys, a cosine function value ([0055], [0098]-[0099]: calculating a cosine function value to determine the similarity estimate).

Wu does not explicitly disclose reducing memory access operations and multiply-accumulate operations compared to performing self-attention operation on all of a plurality of keys; performing a self-attention operation on only the candidates, while maintaining accuracy within the threshold compared to performing the self-attention operation on the all of the plurality of keys; performing a bias removal on the angle; calculating, for each key of the plurality of keys, a cosine function value based on the angle; or multiplying, for each key of the plurality of keys, the cosine function value by a normalized key value to obtain an inner product estimate that serves as the similarity estimate.

However, McCallie discloses wherein the calculating the similarity estimate (Col. 4, lines 8-11: calculating a similarity estimate) comprises: performing a bias removal on the angle (Col. 5, lines 4-5 & 51-53, Fig. 6: removing bias on the angle between vectors); calculating, for each key of the plurality of keys, a cosine function value based on the angle (Col. 3, lines 53-57, Col. 9, lines 45-50: calculating a cosine function based on the angle); and multiplying, for each key of the plurality of keys, the cosine function value by a normalized key value to obtain an inner product estimate that serves as the similarity estimate (Col. 9, lines 50-52: multiplying the cosine function by a normalized key, via a normalized cosine function, to obtain the similarity estimate).

Since both Wu and McCallie are from the same field of endeavor, as both are directed to selecting candidates based on similarity estimates via machine learning techniques using different mathematical relationships, which is the same field of endeavor as the claimed invention, it would have been obvious to one skilled in the art before the effective filing date of the claimed invention to modify their teachings to incorporate the bias removal and normalized cosine function value of McCallie's similarity calculation techniques into the similarity estimate calculation of Wu for performing query operations as claimed. The motivation to combine is to enhance relevant information retrieval with an efficient and accurate implementation in information retrieval technologies (Wu, [0004]; McCallie, Col. 1, lines 13-15).

McCallie does not explicitly disclose reducing memory access operations and multiply-accumulate operations compared to performing self-attention operation on all of a plurality of keys; or performing a self-attention operation on only the candidates, while maintaining accuracy within the threshold compared to performing the self-attention operation on the all of the plurality of keys. 
However, Tan discloses reducing memory access operations and multiply-accumulate operations compared to performing the self-attention operation on all of a plurality of keys ([0021-0022]: the reduced operations correspond to the memory access and multiply-accumulate operations of a self-attention operation over all keys represented by the input data, as further disclosed in [0025] & [0043]); and performing a self-attention operation on only the candidates while maintaining accuracy within the threshold compared to performing the self-attention operation on all of the plurality of keys ([0025], [0098-0099]: perform the self-attention operation only on candidate data while maintaining a specified accuracy compared to performing the operation on all key data, via self-attention operation augmentation).

Since Wu, McCallie, and Tan are all from the same field of endeavor, as all are directed to efficient and accurate information-retrieval processing via machine-learning techniques using different mathematical relationships, which is the same field of endeavor as the claimed invention, it would have been obvious to one skilled in the art before the effective filing date of the claimed invention to modify their teachings to incorporate the reduced operations with self-attention and the accuracy management of Tan into the operation processing of Wu and McCallie for performing query operations as claimed. The motivation to combine is to enhance relevant information retrieval with an efficient and accurate implementation, as well as to reduce computational resources, in information retrieval technologies (Wu, [0004]; McCallie, Col. 1, lines 13-15; Tan, [0002-0003]).
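For orientation, the estimation pipeline recited in independent claims 1 and 8 (Hamming distance → angle → cosine → inner-product estimate, then attention over candidates only) can be sketched in NumPy using random-hyperplane hashing, a standard way to realize such an estimator. This is an illustrative sketch only: all function and variable names are the editor's, not drawn from the application or the cited references, and the claimed "bias removal on the angle" step is omitted because its specific form is not described in this record.

```python
import numpy as np

def hamming_distance(a: np.ndarray, b: np.ndarray) -> int:
    # Number of differing bits between two binary hash codes.
    return int(np.count_nonzero(a != b))

def similarity_estimates(q_hash, key_hashes, key_norms, n_bits):
    # For random-hyperplane (sign) hashing, the expected fraction of
    # differing bits equals theta / pi, so the Hamming distance yields
    # an angle estimate between the query and each key. With a
    # normalized query, cos(theta) * ||key|| approximates the inner
    # product q . k, which serves as the similarity estimate.
    ests = []
    for k_hash, k_norm in zip(key_hashes, key_norms):
        d = hamming_distance(q_hash, k_hash)
        theta = np.pi * d / n_bits        # Hamming distance -> angle
        cos_val = np.cos(theta)           # cosine of the estimated angle
        ests.append(cos_val * k_norm)     # inner-product estimate
    return np.array(ests)

def attention_on_candidates(q, K, V, est, threshold):
    # Keep only keys whose similarity estimate clears the threshold,
    # then run exact softmax attention on that subset. Only the
    # candidate rows of K and V are read and multiplied, which is
    # where the reduction in memory accesses and multiply-accumulate
    # operations comes from.
    idx = np.nonzero(est >= threshold)[0]
    scores = K[idx] @ q                   # exact inner products, candidates only
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ V[idx]
```

Cheap bitwise estimates are used only to prune; the exact attention arithmetic runs on the surviving candidates, which is how the accuracy-within-a-threshold trade-off in the claim language arises.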
With respect to claims 4 and 11, the combined teachings of Wu, McCallie, and Tan further disclose wherein the at least one processor calculates an attention score by performing an inner product operation between a query matrix and a key matrix for the candidates (an attention score is merely a score, and an inner product operation is merely an operation between the matrices; Wu, [0046], [0081-0083], [0087]; McCallie, Col. 18, lines 36-40, Col. 20, lines 32-35; Tan, [0025], Figs. 2A-4: perform an operation whose output corresponds to an attention score on the query/image matrix and the key/candidate matrix to estimate similarity, represented by a calculated score, via an attention-based model such as the self-attention mechanism).

With respect to claims 5 and 12, the combined teachings of Wu, McCallie, and Tan further disclose wherein the at least one processor dynamically adjusts one or more thresholds for each layer based on a degree of approximation determined by a hyperparameter (Wu, [0046], [0057], [0076], [0097]; McCallie, Col. 10, lines 6-15 & 40-42, Col. 16, lines 29-31; Tan, [0025], Figs. 2A-3: dynamically adjust a threshold for each layer of the learned network based on the closeness of the approximation in the space, corresponding to the hyperparameter).

With respect to claims 6 and 13, the combined teachings of Wu, McCallie, and Tan further disclose wherein the at least one processor includes a hash computation module that generates hash values of queries and keys, a candidate selection module that filters keys based on the one or more thresholds, an attention computation module that generates attention scores for the candidates, and an output module, and the memory comprises a hash memory and a matrix memory (Wu, [0053-0059], [0079]; McCallie, Col. 18, lines 36-40, Col. 20, lines 32-35, Figs. 3A-3B; Tan, [0025], Figs. 2A-4: the processor comprises different modules, including hashing that generates hash codes representing hash values, candidate selection based on a threshold, score generation, and a matrix memory used to generate the final result based on the similarity estimates).

With respect to claims 7 and 14, the combined teachings of Wu, McCallie, and Tan further disclose wherein the at least one processor is configured to process operations in parallel using a pipeline structure (Wu, [0057-0059], [0063], Figs. 1 & 4; McCallie, Col. 18, lines 36-40, Col. 20, lines 32-35, Figs. 3A-3B: a pipeline structure, as shown by the chart and/or the figures, with one operation following another via a network, including a neural network with parallel processing).

With respect to claim 15, the combined teachings of Wu, McCallie, and Tan further disclose a computer-readable non-transitory recording medium that stores a computer program comprising instructions that execute the method for accelerating the self-attention operation according to any one of claims 8 and 11 to 14 by an electronic device (Wu, [0129], Figs. 2 & 5B; McCallie, Col. 7, lines 1-20, Figs. 1 & 7; Tan, [0025], Figs. 1-4).

Examiner Comments

The Examiner has cited particular columns, paragraphs, and line numbers in the references applied to the claims above for the convenience of the applicant. Although the specified citations are representative of the teachings of the art and are applied to specific limitations within the individual claims, other passages and figures may apply as well. In preparing responses, the applicant is respectfully requested to fully consider each reference in its entirety as potentially teaching all or part of the claimed invention, as well as the context of the passages as taught by the prior art or discussed by the Examiner.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Michelle Owyang, whose telephone number is (571) 270-1254.
The examiner can normally be reached Monday-Friday, 8am-6pm EST. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Charles Rones, can be reached at (571) 272-4085. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/MICHELLE N OWYANG/
Primary Examiner, Art Unit 2168

Prosecution Timeline

Jul 13, 2022
Application Filed
Aug 09, 2023
Non-Final Rejection — §101, §103
Nov 14, 2023
Response Filed
Dec 08, 2023
Final Rejection — §101, §103
Feb 20, 2024
Response after Non-Final Action
Feb 26, 2024
Response after Non-Final Action
Mar 14, 2024
Request for Continued Examination
Mar 15, 2024
Response after Non-Final Action
May 15, 2024
Non-Final Rejection — §101, §103
Aug 19, 2024
Response Filed
Aug 28, 2024
Final Rejection — §101, §103
Nov 15, 2024
Response after Non-Final Action
Nov 20, 2024
Response after Non-Final Action
Nov 26, 2024
Applicant Interview (Telephonic)
Nov 26, 2024
Examiner Interview Summary
Dec 04, 2024
Request for Continued Examination
Dec 06, 2024
Response after Non-Final Action
Mar 23, 2025
Non-Final Rejection — §101, §103
Jun 26, 2025
Response Filed
Jul 28, 2025
Final Rejection — §101, §103
Dec 22, 2025
Applicant Interview (Telephonic)
Dec 22, 2025
Examiner Interview Summary
Dec 30, 2025
Request for Continued Examination
Jan 20, 2026
Response after Non-Final Action
Feb 05, 2026
Non-Final Rejection — §101, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12566764
Ambient Multi-Device Framework for Agent Companions
2y 5m to grant Granted Mar 03, 2026
Patent 12566799
TRANSACTION EXCHANGE PLATFORM HAVING CONFIGURABLE MICROSERVICES
2y 5m to grant Granted Mar 03, 2026
Patent 12561286
COMPRESSION TECHNIQUES FOR VERTICES OF GRAPHIC MODELS
2y 5m to grant Granted Feb 24, 2026
Patent 12547605
PERFORMING LOAD ERROR TRACKING DURING LOADING OF DATA FOR STORAGE VIA A DATABASE SYSTEM
2y 5m to grant Granted Feb 10, 2026
Patent 12536235
USING A MACHINE LEARNING SYSTEM TO PROCESS A CORPUS OF DOCUMENTS ASSOCIATED WITH A USER TO DETERMINE A USER-SPECIFIC AND/OR PROCESS-SPECIFIC CONSEQUENCE INDEX
2y 5m to grant Granted Jan 27, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

7-8
Expected OA Rounds
76%
Grant Probability
99%
With Interview (+29.9%)
3y 1m
Median Time to Grant
High
PTA Risk
Based on 610 resolved cases by this examiner. Grant probability derived from career allow rate.
