Prosecution Insights
Last updated: April 19, 2026
Application No. 18/365,047

NORMALIZATION SCHEME FOR SELF-ATTENTION NEURAL NETWORKS

Non-Final OA (§101, §102)
Filed
Aug 03, 2023
Examiner
SHANMUGASUNDARAM, KANNAN
Art Unit
2168
Tech Center
2100 — Computer Architecture & Software
Assignee
Huawei Technologies Co., Ltd.
OA Round
1 (Non-Final)
Grant Probability: 72% (Favorable)
OA Rounds: 1-2
To Grant: 3y 9m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 72% — above average (416 granted / 579 resolved; +16.8% vs TC avg)
Interview Lift: +37.0% (resolved cases with interview)
Typical Timeline: 3y 9m avg prosecution; 24 currently pending
Career History: 603 total applications across all art units

Statute-Specific Performance

§101: 12.2% (-27.8% vs TC avg)
§103: 48.8% (+8.8% vs TC avg)
§102: 26.0% (-14.0% vs TC avg)
§112: 6.3% (-33.7% vs TC avg)
Tech Center averages are estimates • Based on career data from 579 resolved cases

Office Action

DETAILED ACTION

Claims 1-16 are pending in the Instant Application. Claims 1-16 are rejected (Non-Final Rejection).

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority

The Instant Application, filed 08/03/2023, is a continuation of PCT/EP2021/052679, filed 02/04/2021, and thus has an effective filing date of 02/04/2021 for what is described therein.

Information Disclosure Statement

The information disclosure statements (IDS) submitted on 10 July 2024 and 07 January 2025 were considered by the examiner.

Claim Objections

Claim 16 is objected to because of the following informality: claim 16 refers to claim 13, but states "any of claim 13." Since there is only one claim 13, this appears to be a typographical error. Appropriate correction is required.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-12 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. Claims 1-12 recite "a data processing system" that can be broadly interpreted as non-statutory "software per se". Claims 1-12 do not expressly recite hardware. The specification at [0028] states, "According to a third aspect there is provided a computer program which, when executed by a computer, causes the computer to perform the method described above." Thus, the system can be "a computer program," allowing it to be interpreted as "software per se". "Software per se" is considered non-statutory subject matter. Please include hardware such as a "processor" or a "memory" to overcome this rejection.
Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action: A person shall be entitled to a patent unless – (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-16 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Veličković et al. ("Velickovic"), "Graph Attention Networks," 2018.

As per claim 1, Velickovic discloses a data processing device for performing an attention-based operation on a graph neural network, the device being configured to receive one or more input graphs each having a plurality of nodes and to, for at least one of the input graphs ([Page 6, 3.1 Datasets] wherein inductive learning with 20 graphs is described, with an average of 2372 nodes per graph): form an input node representation for each node in the respective input graph ([Page 3, 2.1 Graph attentional layer] wherein each input node j is represented by its feature vector [equation image: node feature vector]), wherein a respective norm is defined for each input node representation ([Page 3, 2.1 Graph attentional layer] wherein features, h, are defined for each input node as a vector, which implies a respective norm); multiply each of the input node representations with each of the set of attention parameters to form a score function of the respective input graph ([Page 3, 2.1 Graph attentional layer] wherein the input node representations (input features) are multiplied by the weight matrix to form a score function representative of the importance of node j's features to node i); normalize the score function based on a maximum of the norms of the input node representations to form a normalised score function ([Page 3, 2.1 Graph attentional layer] wherein the score function is normalized using softmax for input node j); and form a weighted node representation by weighting each node in the respective input graph using a respective element of the normalised score function ([Page 3, 2.1 Graph attentional layer] wherein the coefficients form a weighted node representation that can be compared).

As per claim 2, Velickovic discloses the data processing device of claim 1, wherein the score function is normalized such that the elements of the normalized score function sum to 1 ([Page 3, 2.1 Graph attentional layer] wherein softmax is described, which ensures the resulting values are in the range (0,1) and sum to 1).

As per claim 3, Velickovic discloses the data processing device of claim 1, wherein an attention mechanism of the graph neural network is Lipschitz continuous ([Page 3, 2.1 Graph attentional layer] wherein softmax is described, which is Lipschitz continuous with constant ½).

As per claim 4, Velickovic discloses the data processing device of claim 1, wherein a softmax function is applied to the normalized score function ([Page 3, 2.1 Graph attentional layer] wherein softmax is described to normalize the score function).

As per claim 5, Velickovic discloses the data processing device of claim 1, wherein a softmax function is applied to the score function of each node of the graph and the neighbouring nodes of each respective node, such that a set of score function values of each neighborhood sum to 1 ([Page 5, 2.2 Comparisons to related work] wherein softmax can be performed on an entire neighborhood of a node, wherein the function values would then sum to 1).

As per claim 6, Velickovic discloses the data processing device of claim 1, wherein the input node representation gives contextual information about the respective node ([Page 3, 2.1 Graph attentional layer] wherein the feature vectors encode contextual information).
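The mechanism the examiner maps claims 1-6 onto can be sketched as follows. This is a minimal, single-head, NumPy-only illustration of a GAT-style score function followed by per-neighborhood softmax, in the spirit of Velickovic's Section 2.1; the names `h`, `W`, `a` echo the paper's notation, and the LeakyReLU slope of 0.2 is the paper's choice, but the code itself is an editorial sketch, not taken from either document.

```python
import numpy as np

def softmax(x):
    # Subtract the max for numerical stability; the output sums to 1
    # (the property the examiner maps to claims 2 and 5).
    e = np.exp(x - x.max())
    return e / e.sum()

def gat_layer(h, adj, W, a):
    """Single-head GAT-style attention sketch.

    h:   (N, F)  input node representations
    adj: (N, N)  boolean adjacency (True = j is a neighbor of i)
    W:   (F, Fp) shared weight matrix (the "attention parameters")
    a:   (2*Fp,) attention vector producing the score function
    """
    z = h @ W                                   # project node features
    N, Fp = z.shape
    out = np.zeros((N, Fp))
    for i in range(N):
        nbrs = np.flatnonzero(adj[i])
        # Score function e_ij = LeakyReLU(a^T [z_i || z_j])
        scores = np.array([np.concatenate([z[i], z[j]]) @ a for j in nbrs])
        scores = np.where(scores > 0, scores, 0.2 * scores)  # LeakyReLU
        alpha = softmax(scores)                  # normalised score function
        out[i] = alpha @ z[nbrs]                 # weighted node representation
    return out
```

Note that in this Velickovic-style sketch the only normalization is the softmax itself; claim 1 additionally recites normalizing the score function based on the maximum of the input-node norms, which is the textual gap a response to the §102 rejection would likely press.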
As per claim 7, Velickovic discloses the data processing device of claim 6, wherein the contextual information is in the form of a tensor ([Page 3, 2.1 Graph attentional layer] wherein the feature vectors are a type of tensor).

As per claim 8, Velickovic discloses the data processing device of claim 1, wherein for each node, the respective element of the normalised score function is combined with the input representation of the respective node using a dot-product to form the weighted node representation of the node based on the weighted representation of its neighboring nodes ([Pages 3-4, 2.1 Graph attentional layer] wherein α_ij is the normalized score function, which is combined with the input representation h_j using a dot product, which is multiplication in the prior art as shown on page 4, equation 3).

As per claim 9, Velickovic discloses the data processing device of claim 1, wherein the graph neural network is a graph attention network or a graph transformer ([Page 1, Abstract] wherein a graph attention network is described).

As per claim 10, Velickovic discloses the data processing device of claim 1, wherein an attention mechanism of the graph neural network comprises a multi-head attention mechanism ([Page 4, 2.1 Graph attentional layer] wherein multi-head attention is applied).

As per claim 11, Velickovic discloses the data processing device of claim 10, wherein the score function is normalized for every attention head in the multi-head attention mechanism ([Pages 3-4, 2.1 Graph attentional layer] wherein the score function is normalized for each attention head).

As per claim 12, Velickovic discloses the data processing device of claim 1, wherein the system is configured to learn the attention parameters ([Page 3, 1. Introduction] wherein the system is configured to learn the attention parameters).

As per claim 13, claim 13 is the method performed by the system of claim 1 and is rejected for the same rationale and reasoning.
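Claim 1's recited normalization ("based on a maximum of the norms of the input node representations") is distinct from the softmax the examiner cites. A minimal sketch of what that claim language could mean follows; the function name and the choice of Euclidean norm are hypothetical illustrations only, and the application's actual formula is defined in its specification, not reproduced in this document.

```python
import numpy as np

def normalise_scores(scores, h):
    # Hypothetical reading of claim 1's language: divide the raw score
    # function by the maximum norm over the input node representations.
    # Norm choice (Euclidean) is an assumption for illustration.
    max_norm = np.linalg.norm(h, axis=1).max()
    return scores / max_norm
```

For example, with node features whose largest row norm is 5, raw scores of [2.0, 4.0] would scale to [0.4, 0.8] before any subsequent softmax is applied.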
As per claim 14, claim 14 is the method performed by the system of claim 2 and is rejected for the same rationale and reasoning.

As per claim 15, claim 15 is the method performed by the system of claim 3 and is rejected for the same rationale and reasoning.

As per claim 16, claim 16 is the non-transitory computer readable medium storing a computer program which, when executed by a computer, causes the computer to perform the method of "any of claim 13." Thus, claim 16 is rejected for the same rationale and reasoning as claims 1 and 13.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to KANNAN SHANMUGASUNDARAM, whose telephone number is (571) 270-7763. The examiner can normally be reached M-F 9:00 AM - 6:00 PM.

Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Charles Rones, can be reached at (571) 272-4085. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/KANNAN SHANMUGASUNDARAM/
Primary Examiner, Art Unit 2168

Prosecution Timeline

Aug 03, 2023
Application Filed
Mar 23, 2026
Non-Final Rejection — §101, §102 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12596756: PERSONALIZED RELATED QUERIES FOR SEARCH SEGMENTS
Granted Apr 07, 2026 (2y 5m to grant)

Patent 12596729: Value-Directed Parsing for Data Extraction
Granted Apr 07, 2026 (2y 5m to grant)

Patent 12585703: METHOD AND APPARATUS FOR PROCESSING GRAPH DATA, DEVICE, STORAGE MEDIUM, AND PROGRAM PRODUCT
Granted Mar 24, 2026 (2y 5m to grant)

Patent 12579180: EVALUATION SUPPORT PROGRAM, EVALUATION SUPPORT METHOD, AND EVALUATION SUPPORT DEVICE
Granted Mar 17, 2026 (2y 5m to grant)

Patent 12572520: MACHINE-LEARNING-AUTOMATED RECOGNITION AND LABELLING OF COLUMNAR DATA
Granted Mar 10, 2026 (2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 72%
With Interview: 99% (+37.0%)
Median Time to Grant: 3y 9m
PTA Risk: Low
Based on 579 resolved cases by this examiner. Grant probability derived from career allow rate.
