Last updated: May 29, 2026
Application No. 18/322,444
EVENT EXTRACTION METHOD AND APPARATUS, COMPUTER PROGRAM PRODUCT, STORAGE MEDIUM, AND DEVICE

Final Rejection §103
Filed
May 23, 2023
Priority
May 24, 2022 — CN 202210580427.8
Examiner
ORTIZ SANCHEZ, MICHAEL
Art Unit
2656
Tech Center
2600 — Communications
Assignee
Alipay (Hangzhou) Information Technology Co., Ltd.
OA Round
2 (Final)
Interview Optional

— +27.6% interview lift. Interview already conducted in this application's prosecution history. This examiner has a 67% grant rate with +27.6% interview lift. Since an interview has already been tried, recommend written response with narrowed claims based on precedent claim evolution patterns.
Based on 496 resolved cases, 2023–2026
Examiner Intelligence

ORTIZ SANCHEZ, MICHAEL View full profile →
Grants 67% — above average
Career Allowance Rate
331 granted / 496 resolved
+4.7% vs TC avg
Strong +28% interview lift
Without
With
+27.6%
Interview Lift
resolved cases with interview
Typical timeline
3y 10m
Avg Prosecution
13 currently pending
Career history
520
Total Applications
across all art units
Statute-Specific Performance

§101
1.6%
-38.4% vs TC avg
§103
88.4%
+48.4% vs TC avg
§102
7.2%
-32.8% vs TC avg
§112
0.5%
-39.5% vs TC avg
Black line = Tech Center average estimate • Based on career data from 496 resolved cases
Office Action

§103
DETAILED ACTION
Notice of Pre-AIA  or AIA  Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA .
Response to Arguments
Applicant's arguments filed 12/04/2025 have been fully considered but they are not persuasive.
Mehta teaches  train a binary machine learning classifier to summarize descriptions of past feature submissions based on one or more features of the descriptions; perform a text summarization procedure to identify a set of key phrases using the binary machine learning classifier, see col. 1, lines 43-50. In some implementations, generating the index includes training a binary machine learning classifier to summarize descriptions of the past feature submissions based on one or more features of the descriptions, and performing a text summarization procedure to identify a set of key phrases using the binary machine learning classifier. In some implementations, the one or more features include at least one of a length of a key phrase, a frequency of a key phrase, a quantity of recurring words in a key phrase, or a quantity of characters in a key phrase. In some implementations, generating the index further includes concatenating descriptions of the past feature submissions to generate a corpus; splitting the corpus into a set of segments; generating vector representations of the set of segments; determining similarities between the vector representations; generating a similarity matrix based on the similarities; and converting the similarity matrix into a graph, see col. 16 lines 22-38.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1-7, 10-16, 19-20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wang U.S. PAP 2022/0398384 A1 IN VIEW OF Mehta U.S. Patent No. 10,719,736 B1.

Regarding claim 1 Wang teaches a method (A text extraction method and device, computer-readable storage medium, and electronic device are described that relate to the technical field of machine learning, see abstract), comprising: 
identifying at least one trigger word in a target text (acquiring to-be-extracted data, and extracting a current trigger word, see par. [0007]); 
obtaining a trigger word vector corresponding to each of the at least one trigger word (calculating a semantic text vector of the to-be-extracted data, see par. [0012]); 
determining, in the target text, element word information associated with an event type corresponding to each trigger word based on the trigger word vector corresponding to each trigger word, an event type vector corresponding to each trigger word, and a relative location vector corresponding to each trigger word ( an importance degree of each semantic text vector to a to-be-extracted text using a first dynamic weight fusion Bert layer included in the target trigger word extraction model of the target event extraction model, and obtaining a first current encoding vector according to the semantic text vector and the importance degree, see par. [0012]), wherein the element word information includes location information corresponding to each of at least one element word and an element relationship between the element words ( The event trigger word represents the occurrence of the event, and the event argument is a subject, object, time and location of the event or the like, which is the carrier of important information about the event, see par. [0089]); 
and generating an event extraction result corresponding to the target text based on the location information of each element word and the element relationship between the element words, wherein the event type vector corresponding to each trigger word indicates the event type corresponding to the trigger word, and the relative location vector corresponding to each trigger word indicates a relative location relationship between a word and the trigger word in the target text (extracting a current event argument corresponding to the current trigger word according to the current query sentence and a target argument extraction model included in the target event extraction model, wherein the target trigger word extraction model and the target argument extraction model have a same model structure and parameter, and are connected in a cascading manner., see par. [0084]).
However Wang does not teach after the at least one trigger word has been identified, obtaining a trigger word vector corresponding to each of the at least one trigger word by conducting a binary classification processing on an original word vector included in the target text based on the at least one trigger word. 
In the same field of endeavor Mehta teaches  train a binary machine learning classifier to summarize descriptions of past feature submissions based on one or more features of the descriptions; perform a text summarization procedure to identify a set of key phrases using the binary machine learning classifier, see col. 1, lines 43-50. In some implementations, generating the index includes training a binary machine learning classifier to summarize descriptions of the past feature submissions based on one or more features of the descriptions, and performing a text summarization procedure to identify a set of key phrases using the binary machine learning classifier. In some implementations, the one or more features include at least one of a length of a key phrase, a frequency of a key phrase, a quantity of recurring words in a key phrase, or a quantity of characters in a key phrase. In some implementations, generating the index further includes concatenating descriptions of the past feature submissions to generate a corpus; splitting the corpus into a set of segments; generating vector representations of the set of segments; determining similarities between the vector representations; generating a similarity matrix based on the similarities; and converting the similarity matrix into a graph, see col. 16 lines 22-38.
It would have been obvious to one of ordinary skill in the art to combine the Wang invention with the teachings of Mehta for the benefit of generating a summary of the text, see col. 1 lines 43-50.
Regarding claim 2 Wang teaches the method according to claim 1, wherein the obtaining the trigger word vector corresponding to each of the at least one trigger word includes: 
vectoring each word in the target text to obtain at least one original word vector (The first dynamic weight fusion Bert layer is used to calculate a first current encoding vector of the to-be-extracted data, the first fully connected layer 302 is used to extract the current trigger word, and the fully connected unit (dense unit) is used to calculate an importance degree of each transformer model, see par. [0095]); 
and performing binary classification processing on each of the at least one original word vector based on a determined binary classifier, to determine the trigger word vector corresponding to each of the at least one trigger word (the role type of the argument are determined by predicting probabilities that all positions of the start and end pointer vectors corresponding to each role of the input sequence are 0/1 through a plurality of binary-classification networks, see par. [0120]).
Regarding claim 3 Wang teaches the method according to claim 2, wherein the performing binary classification processing on each of the at least one original word vector based on the at least one determined binary classifier, to determine the trigger word vector corresponding to each of the at least one trigger word includes: 
sequentially performing binary classification processing on all of the at least one original word vector based on an initial order of all the original word vectors in the target text, to determine at least one start word vector (In order to extract a plurality of event arguments at the same time, two 0/1 sequences are generated through two binary-classification networks to determine a span of the event argument in the sequence (each span is determined by a start position pointer (start) and an end position pointer (end)), see par. [0119]); 
sequentially identifying, based on the initial order of the original word vectors in the target text, a determined number of original word vectors starting from a location of each start word vector, to determine an end word vector corresponding to each start word vector (Each character in the input sequence may be represented as the start position and the end position of the argument, see par. [0119]); 
and combining original word vectors included between each start word vector and the corresponding end word vector to generate a trigger word vector, thereby obtaining the trigger word vector corresponding to each of the at least one trigger word (the span composed of text between any two characters may be expressed as any event role, see par. [0119]).
Regarding claim 4 Wang teaches the method according to claim 2, wherein the performing binary classification processing on each of the at least one original word vector based on the at least one determined binary classifier, to determine the trigger word vector corresponding to each of the at least one trigger word includes: 
performing binary classification processing on each of the at least one original word vector based on the at least one determined binary classifier, to determine at least one initial trigger word vector ( generating an embedding vector according to the character embedding vector, the character embedding matrix, and the position embedding matrix, and generating a first text semantic vector by inputting the embedding vector into a first transformer model, see par. [0019]); 
and selecting the trigger word vector corresponding to each of the at least one trigger word corresponding to the target text from the at least one initial trigger word vector (obtaining the first current encoding vector according to respective importance degrees, the embedding vector, and respective text semantic vectors, see par. [0021]).
Regarding claim 5 Wang teaches the method according to claim 2, wherein the determining, in the target text, the element word information associated with the event type corresponding to each trigger word based on the trigger word vector corresponding to each trigger word includes: 
fusing each of the at least one original word vector with a target trigger word vector to obtain at least one fused word vector, wherein the target trigger word vector is a trigger word vector corresponding to any one of the at least one trigger word (generating a first sentence pair according to the to-be-extracted data and the current query sentence, and obtaining a second current encoding vector by encoding the first sentence pair with a second dynamic weight fusion Bert layer included in the target argument extraction model of the target event extraction model, see par. [0026]); 
determining an event type vector corresponding to the target trigger word vector based on the binary classifier (obtaining a role label of the argument by classifying roles to which the arguments included in the span belong with a plurality of binary-classification networks, see oar, [0031]); 
generating a relative location vector corresponding to the target trigger word vector based on location information of a target start word vector and location information of a target end word vector corresponding to the target trigger word vector ( generating a start position matrix and an end position matrix according to start position pointers and end position pointers of all the role labels, see par. [0032]); 
fusing the at least one fused word vector, the event type vector, and the relative location vector through a multilayer perceptron to generate a first element matrix (obtaining the probabilities that the characters, at all the positions from the start position pointer to the end position pointer, of the argument included in each span belong to the start position and the end position of the current event argument by performing calculation on the start position matrix and the end position matrix with a second fully connected layer included in the target argument extraction model, see par. [0033]); 
and determining element word information associated with an event type corresponding to a target trigger word based on the first element matrix (the information interaction between the current trigger word and the current event argument is fully considered, see par. [0085]).
Regarding claim 6 Wang teaches the method according to claim 5, wherein the determining the event type vector corresponding to the target trigger word vector based on the binary classifier includes: 
determining the event type corresponding to the target trigger word based on a target binary classifier corresponding to the target trigger word vector, wherein the target binary classifier is a binary classifier that identifies the target trigger word vector (In order to extract a plurality of event arguments at the same time, two 0/1 sequences are generated through two binary-classification networks to determine a span of the event argument in the sequence, see par. [0119]); 
and encoding the event type to obtain the event type vector corresponding to the target trigger word vector (a role classification is performed on the argument span through a plurality of binary-classification networks. Each character in the input sequence may be represented as the start position and the end position of the argument, and the span composed of text between any two characters may be expressed as any event role, see par. [0162]).
Regarding claim 7 Wang teaches the method according to claim 5, wherein the generating the relative location vector corresponding to the target trigger word vector based on the location information of the start word vector and the location information of the end word vector corresponding to the target trigger word vector includes: 
determining the target start word vector and the target end word vector corresponding to the target trigger word vector (each span is determined by a start position pointer (start) and an end position pointer (end), see par. [0162]); 
determining location information of the target trigger word in the target text based on the location information of the target start word vector and the location information of the target end word vector (each character in the input sequence may be represented as the start position and the end position of the argument, and the span composed of text between any two characters may be expressed as any event role, see par. [0162]); 
and generating the relative location vector of the target trigger word relative to each word in the target text based on the location information of the target trigger word in the target text (a start position matrix S.sub.s and a position matrix S.sub.e may be obtained by combining the start and end pointer vectors of all labels together. Each row of the S.sub.s and S.sub.e represents a role type, and each column thereof corresponds to a character in the text. In the present disclosure, the start position and end position and the role type of the argument are determined by predicting probabilities that all positions of the start and end pointer vectors corresponding to each role of the input sequence are 0/1 through a plurality of binary-classification networks, see par. [0162]).
Regarding claim 10 Wang teaches a computing system comprising one or more processors and one or more memory devices, the one or more memory devices having computer executable instructions stored thereon, which when executed by the one or more processors, enable the one or processors (A text extraction method and device, computer-readable storage medium, and electronic device are described that relate to the technical field of machine learning, see abstract) to implement acts including: identifying at least one trigger word in a target text (acquiring to-be-extracted data, and extracting a current trigger word, see par. [0007]); 
obtaining a trigger word vector corresponding to each of the at least one trigger word (calculating a semantic text vector of the to-be-extracted data, see par. [0012]); 
determining, in the target text, element word information associated with an event type corresponding to each trigger word based on the trigger word vector corresponding to each trigger word, an event type vector corresponding to each trigger word, and a relative location vector corresponding to each trigger word ( an importance degree of each semantic text vector to a to-be-extracted text using a first dynamic weight fusion Bert layer included in the target trigger word extraction model of the target event extraction model, and obtaining a first current encoding vector according to the semantic text vector and the importance degree, see par. [0012]), wherein the element word information includes location information corresponding to each of at least one element word and an element relationship between the element words ( The event trigger word represents the occurrence of the event, and the event argument is a subject, object, time and location of the event or the like, which is the carrier of important information about the event, see par. [0089]); 
and generating an event extraction result corresponding to the target text based on the location information of each element word and the element relationship between the element words, wherein the event type vector corresponding to each trigger word indicates the event type corresponding to the trigger word, and the relative location vector corresponding to each trigger word indicates a relative location relationship between a word and the trigger word in the target text (extracting a current event argument corresponding to the current trigger word according to the current query sentence and a target argument extraction model included in the target event extraction model, wherein the target trigger word extraction model and the target argument extraction model have a same model structure and parameter, and are connected in a cascading manner., see par. [0084]).
However Wang does not teach after the at least one trigger word has been identified, obtaining a trigger word vector corresponding to each of the at least one trigger word by conducting a binary classification processing on an original word vector included in the target text based on the at least one trigger word. 
In the same field of endeavor Mehta teaches  train a binary machine learning classifier to summarize descriptions of past feature submissions based on one or more features of the descriptions; perform a text summarization procedure to identify a set of key phrases using the binary machine learning classifier, see col. 1, lines 43-50. In some implementations, generating the index includes training a binary machine learning classifier to summarize descriptions of the past feature submissions based on one or more features of the descriptions, and performing a text summarization procedure to identify a set of key phrases using the binary machine learning classifier. In some implementations, the one or more features include at least one of a length of a key phrase, a frequency of a key phrase, a quantity of recurring words in a key phrase, or a quantity of characters in a key phrase. In some implementations, generating the index further includes concatenating descriptions of the past feature submissions to generate a corpus; splitting the corpus into a set of segments; generating vector representations of the set of segments; determining similarities between the vector representations; generating a similarity matrix based on the similarities; and converting the similarity matrix into a graph, see col. 16 lines 22-38.
It would have been obvious to one of ordinary skill in the art to combine the Wang invention with the teachings of Mehta for the benefit of generating a summary of the text, see col. 1 lines 43-50.
Regarding claim 11 Wang teaches the computing system according to claim 10, wherein the obtaining the trigger word vector corresponding to each of the at least one trigger word includes: vectoring each word in the target text to obtain at least one original word vector (The first dynamic weight fusion Bert layer is used to calculate a first current encoding vector of the to-be-extracted data, the first fully connected layer 302 is used to extract the current trigger word, and the fully connected unit (dense unit) is used to calculate an importance degree of each transformer model, see par. [0095]); 
and performing binary classification processing on each of the at least one original word vector based on a determined binary classifier, to determine the trigger word vector corresponding to each of the at least one trigger word (the role type of the argument are determined by predicting probabilities that all positions of the start and end pointer vectors corresponding to each role of the input sequence are 0/1 through a plurality of binary-classification networks, see par. [0120]).
Regarding claim 12 Wang teaches the computing system according to claim 11, wherein the performing binary classification processing on each of the at least one original word vector based on the at least one determined binary classifier, to determine the trigger word vector corresponding to each of the at least one trigger word includes: 
sequentially performing binary classification processing on all of the at least one original word vector based on an initial order of all the original word vectors in the target text, to determine at least one start word vector (In order to extract a plurality of event arguments at the same time, two 0/1 sequences are generated through two binary-classification networks to determine a span of the event argument in the sequence (each span is determined by a start position pointer (start) and an end position pointer (end)), see par. [0119]); 
sequentially identifying, based on the initial order of the original word vectors in the target text, a determined number of original word vectors starting from a location of each start word vector, to determine an end word vector corresponding to each start word vector (Each character in the input sequence may be represented as the start position and the end position of the argument, see par. [0119]); 
and combining original word vectors included between each start word vector and the corresponding end word vector to generate a trigger word vector, thereby obtaining the trigger word vector corresponding to each of the at least one trigger word (the span composed of text between any two characters may be expressed as any event role, see par. [0119]).
Regarding claim 13 Wang teaches the computing system according to claim 11, wherein the performing binary classification processing on each of the at least one original word vector based on the at least one determined binary classifier, to determine the trigger word vector corresponding to each of the at least one trigger word includes: performing binary classification processing on each of the at least one original word vector based on the at least one determined binary classifier, to determine at least one initial trigger word vector ( generating an embedding vector according to the character embedding vector, the character embedding matrix, and the position embedding matrix, and generating a first text semantic vector by inputting the embedding vector into a first transformer model, see par. [0019]); 
and selecting the trigger word vector corresponding to each of the at least one trigger word corresponding to the target text from the at least one initial trigger word vector (obtaining the first current encoding vector according to respective importance degrees, the embedding vector, and respective text semantic vectors, see par. [0021]).

Regarding claim 14 Wang teaches the computing system according to claim 11, wherein the determining, in the target text, the element word information associated with the event type corresponding to each trigger word based on the trigger word vector corresponding to each trigger word includes: 
fusing each of the at least one original word vector with a target trigger word vector to obtain at least one fused word vector, wherein the target trigger word vector is a trigger word vector corresponding to any one of the at least one trigger word (generating a first sentence pair according to the to-be-extracted data and the current query sentence, and obtaining a second current encoding vector by encoding the first sentence pair with a second dynamic weight fusion Bert layer included in the target argument extraction model of the target event extraction model, see par. [0026]); 
determining an event type vector corresponding to the target trigger word vector based on the binary classifier (obtaining a role label of the argument by classifying roles to which the arguments included in the span belong with a plurality of binary-classification networks, see oar, [0031]); 
generating a relative location vector corresponding to the target trigger word vector based on location information of a target start word vector and location information of a target end word vector corresponding to the target trigger word vector ( generating a start position matrix and an end position matrix according to start position pointers and end position pointers of all the role labels, see par. [0032]); 
fusing the at least one fused word vector, the event type vector, and the relative location vector through a multilayer perceptron to generate a first element matrix (obtaining the probabilities that the characters, at all the positions from the start position pointer to the end position pointer, of the argument included in each span belong to the start position and the end position of the current event argument by performing calculation on the start position matrix and the end position matrix with a second fully connected layer included in the target argument extraction model, see par. [0033]); 
and determining element word information associated with an event type corresponding to a target trigger word based on the first element matrix (the information interaction between the current trigger word and the current event argument is fully considered, see par. [0085]).
Regarding claim 15 Wang teaches the computing system according to claim 14, wherein the determining the event type vector corresponding to the target trigger word vector based on the binary classifier includes: determining the event type corresponding to the target trigger word based on a target binary classifier corresponding to the target trigger word vector, wherein the target binary classifier is a binary classifier that identifies the target trigger word vector (In order to extract a plurality of event arguments at the same time, two 0/1 sequences are generated through two binary-classification networks to determine a span of the event argument in the sequence, see par. [0119]); 
and encoding the event type to obtain the event type vector corresponding to the target trigger word vector (a role classification is performed on the argument span through a plurality of binary-classification networks. Each character in the input sequence may be represented as the start position and the end position of the argument, and the span composed of text between any two characters may be expressed as any event role, see par. [0162]).

Regarding claim 16 Wang teaches the computing system according to claim 14, wherein the generating the relative location vector corresponding to the target trigger word vector based on the location information of the start word vector and the location information of the end word vector corresponding to the target trigger word vector includes: determining the target start word vector and the target end word vector corresponding to the target trigger word vector (each span is determined by a start position pointer (start) and an end position pointer (end), see par. [0162]); 
determining location information of the target trigger word in the target text based on the location information of the target start word vector and the location information of the target end word vector (each character in the input sequence may be represented as the start position and the end position of the argument, and the span composed of text between any two characters may be expressed as any event role, see par. [0162]); 
and generating the relative location vector of the target trigger word relative to each word in the target text based on the location information of the target trigger word in the target text (a start position matrix S.sub.s and a position matrix S.sub.e may be obtained by combining the start and end pointer vectors of all labels together. Each row of the S.sub.s and S.sub.e represents a role type, and each column thereof corresponds to a character in the text. In the present disclosure, the start position and end position and the role type of the argument are determined by predicting probabilities that all positions of the start and end pointer vectors corresponding to each role of the input sequence are 0/1 through a plurality of binary-classification networks, see par. [0162]).

Regarding claim 19 Wang teaches a non-transitory storage medium ( A non-transitory computer-readable storage medium, see claim 13) having computer executable instructions stored thereon, the computer executable instructions, when executed by the one or more processors, configuring the one or processors to implement actions including: 
i identifying at least one trigger word in a target text (acquiring to-be-extracted data, and extracting a current trigger word, see par. [0007]); 
obtaining a trigger word vector corresponding to each of the at least one trigger word (calculating a semantic text vector of the to-be-extracted data, see par. [0012]); 
determining, in the target text, element word information associated with an event type corresponding to each trigger word based on the trigger word vector corresponding to each trigger word, an event type vector corresponding to each trigger word, and a relative location vector corresponding to each trigger word ( an importance degree of each semantic text vector to a to-be-extracted text using a first dynamic weight fusion Bert layer included in the target trigger word extraction model of the target event extraction model, and obtaining a first current encoding vector according to the semantic text vector and the importance degree, see par. [0012]), wherein the element word information includes location information corresponding to each of at least one element word and an element relationship between the element words ( The event trigger word represents the occurrence of the event, and the event argument is a subject, object, time and location of the event or the like, which is the carrier of important information about the event, see par. [0089]); 
and generating an event extraction result corresponding to the target text based on the location information of each element word and the element relationship between the element words, wherein the event type vector corresponding to each trigger word indicates the event type corresponding to the trigger word, and the relative location vector corresponding to each trigger word indicates a relative location relationship between a word and the trigger word in the target text (extracting a current event argument corresponding to the current trigger word according to the current query sentence and a target argument extraction model included in the target event extraction model, wherein the target trigger word extraction model and the target argument extraction model have a same model structure and parameter, and are connected in a cascading manner., see par. [0084]).
However Wang does not teach after the at least one trigger word has been identified, obtaining a trigger word vector corresponding to each of the at least one trigger word by conducting a binary classification processing on an original word vector included in the target text based on the at least one trigger word. 
In the same field of endeavor Mehta teaches  train a binary machine learning classifier to summarize descriptions of past feature submissions based on one or more features of the descriptions; perform a text summarization procedure to identify a set of key phrases using the binary machine learning classifier, see col. 1, lines 43-50. In some implementations, generating the index includes training a binary machine learning classifier to summarize descriptions of the past feature submissions based on one or more features of the descriptions, and performing a text summarization procedure to identify a set of key phrases using the binary machine learning classifier. In some implementations, the one or more features include at least one of a length of a key phrase, a frequency of a key phrase, a quantity of recurring words in a key phrase, or a quantity of characters in a key phrase. In some implementations, generating the index further includes concatenating descriptions of the past feature submissions to generate a corpus; splitting the corpus into a set of segments; generating vector representations of the set of segments; determining similarities between the vector representations; generating a similarity matrix based on the similarities; and converting the similarity matrix into a graph, see col. 16 lines 22-38.
It would have been obvious to one of ordinary skill in the art to combine the Wang invention with the teachings of Mehta for the benefit of generating a summary of the text, see col. 1 lines 43-50.
Regarding claim 20 Wang teaches the non-transitory storage medium according to claim 19, wherein the obtaining the trigger word vector corresponding to each of the at least one trigger word includes: vectoring each word in the target text to obtain at least one original word vector (The first dynamic weight fusion Bert layer is used to calculate a first current encoding vector of the to-be-extracted data, the first fully connected layer 302 is used to extract the current trigger word, and the fully connected unit (dense unit) is used to calculate an importance degree of each transformer model, see par. [0095]); 
and performing binary classification processing on each of the at least one original word vector based on a determined binary classifier, to determine the trigger word vector corresponding to each of the at least one trigger word (the role type of the argument are determined by predicting probabilities that all positions of the start and end pointer vectors corresponding to each role of the input sequence are 0/1 through a plurality of binary-classification networks, see par. [0120]).
Claim(s) 8 and 17 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wang U.S. PAP 2022/0398384 A1 IN VIEW OF Mehta U.S. Patent No. 10,719,736 B1, further in view of Ji “A Biaffine Attention-Based Approach for Event Factor Extraction”.

Regarding claim 8 Wang teaches the method according to claim 5, wherein the determining the element word information associated with the event type corresponding to the target trigger word based on the first element matrix includes: 
extracting a location information vector of each element word associated with the event type corresponding to the target trigger word and an element relationship vector between the element words from the first element matrix each character in the input sequence may be represented as the start position and the end position of the argument, and the span composed of text between any two characters may be expressed as any event role, see par. [0162]). 
However Wang in view of Mehta does not teach performing biaffine transformation on the location information vector and the element relationship vector to obtain a second element matrix; and decoding the second element matrix in a determined order, to obtain location information of each element word and an element relationship between the element words.
In the same field of endeavor Ji teaches state-of-the-art BERT-like base models and the biaffine attention mechanism to build a two-stage model, one stage for event trigger extraction and another for event role extraction, see abstract.
Ji teaches performing biaffine transformation on the location information vector and the element relationship vector to obtain a second element matrix (In recent years attention mechanism has become a key technique to improve performance of NLP models, and the pointer network structure has become the mainstream method in NER problem especially in the nested situation. However the general pointer networks encode start and end pointer together, in [16] the authors propose an attention mechanism considering start and end pointer simultaneously using a biaffine operation which achieve a better result in NER task, see section 2 Related Works); and decoding the second element matrix in a determined order, to obtain location information of each element word and an element relationship between the element words (The CCKS 2021 communication domain process knowledge extraction task aims at extracting event trigger words and event arguments from free text, that is, given text T, extract all event sets E in text T, and for each event e in E, extract trigger words (including words, positions and classifications) and roles (including words, positions and classifications) of E from text T, see  section 1 Introduction).
It would have been obvious to one of ordinary skill in the art to combine the Wang in view of Mehta invention with the teachings of Ji for the benefit of obtaining a better result in named entity recognition tasks, see section 2).
Regarding claim 17 Wang teaches the computing system according to claim 14, wherein the determining the element word information associated with the event type corresponding to the target trigger word based on the first element matrix includes: extracting a location information vector of each element word associated with the event type corresponding to the target trigger word and an element relationship vector between the element words from the first element matrix each character in the input sequence may be represented as the start position and the end position of the argument, and the span composed of text between any two characters may be expressed as any event role, see par. [0162]). 
However Wang in view of Mehta does not teach performing biaffine transformation on the location information vector and the element relationship vector to obtain a second element matrix; and decoding the second element matrix in a determined order, to obtain location information of each element word and an element relationship between the element words.
In the same field of endeavor Ji teaches state-of-the-art BERT-like base models and the biaffine attention mechanism to build a two-stage model, one stage for event trigger extraction and another for event role extraction, see abstract.
Ji teaches performing biaffine transformation on the location information vector and the element relationship vector to obtain a second element matrix (In recent years attention mechanism has become a key technique to improve performance of NLP models, and the pointer network structure has become the mainstream method in NER problem especially in the nested situation. However the general pointer networks encode start and end pointer together, in [16] the authors propose an attention mechanism considering start and end pointer simultaneously using a biaffine operation which achieve a better result in NER task, see section 2 Related Works); and decoding the second element matrix in a determined order, to obtain location information of each element word and an element relationship between the element words (The CCKS 2021 communication domain process knowledge extraction task aims at extracting event trigger words and event arguments from free text, that is, given text T, extract all event sets E in text T, and for each event e in E, extract trigger words (including words, positions and classifications) and roles (including words, positions and classifications) of E from text T, see  section 1 Introduction).
It would have been obvious to one of ordinary skill in the art to combine the Wang in view of Mehta invention with the teachings of Ji for the benefit of obtaining a better result in named entity recognition tasks, see section 2).

Claim(s) 9 and 18 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wang U.S. PAP 2022/0398384 A1 IN VIEW OF Mehta U.S. Patent No. 10,719,736 B1, further in view of Haldar U.S. Patent No.11,687,385 B2 .


Regarding claim 9 Wang in view of Mehta does not teach the method according to claim 1, wherein the generating the event extraction result corresponding to the target text based on the location information of each element word and the element relationship between the element words includes: 
generating a directed acyclic graph including each element word based on the location information of each element word and the element relationship between the element words; determining all element paths between any two element words in the directed acyclic graph; determining at least one event path among the element paths based on a longest non-implication rule; and generating the event extraction result corresponding to the target text based on the event path.
In the same field of endeavor Haldar teaches  computer implemented method can include parsing, by a system operatively coupled to a processor, unstructured text comprising event information to identify candidate event components, see abstract.  The disclosed unsupervised event extraction techniques can be applied to facilitate automated event extraction and related applications in a variety of domains. For example, the disclosed techniques can be applied to facilitate more accurately answering/responding to unstructured user queries for which the correct answers/responses are provided in one or more unstructured text documents/files, see col. 3 lines 57-63.
Haldar teaches generating a directed acyclic graph including each element word based on the location information of each element word and the element relationship between the element words (the parsing component employs abstract meaning representation (AMR) parsing to identify the candidate event components, see col. 1 lines 23-32; AMRs represent sentences as single-rooted, directed acyclic graphs, with labeled roles, see col. 10 lines 18-19); 
determining all element paths between any two element words in the directed acyclic graph (AMR would represent a phrase like “bond investor” using the frame “invest-01”, even though no verbs appear in the phrase, see col. 10 lines 19-24); 
determining at least one event path among the element paths based on a longest non-implication rule (FIGS. 5A and 5B present example AMR graphs representing candidate events extracted from unstructured text using AMR parsing in accordance with one or more embodiments, see col. 10 lines 45-63); 
and generating the event extraction result corresponding to the target text based on the event path (an event extraction component that generates structured event information defining events represented in the unstructured text based on the candidate event components, see col. 1 lines 32-40).
It would have been obvious to one of ordinary skill in the art to combine the Wang in view of Mehta invention with the teachings of Haldar for the benefit of facilitating more accurately answering/responding to unstructured user queries for which the correct answers/responses are provided in one or more unstructured text documents/files, see col. 3 lines 57-63.

Regarding claim 18 Wang in view of Mehta does not teach the computing system according to claim 10, wherein the generating the event extraction result corresponding to the target text based on the location information of each element word and the element relationship between the element words includes: generating a directed acyclic graph including each element word based on the location information of each element word and the element relationship between the element words; determining all element paths between any two element words in the directed acyclic graph; determining at least one event path among the element paths based on a longest non-implication rule; and generating the event extraction result corresponding to the target text based on the event path.
In the same field of endeavor Haldar teaches  computer implemented method can include parsing, by a system operatively coupled to a processor, unstructured text comprising event information to identify candidate event components, see abstract.  The disclosed unsupervised event extraction techniques can be applied to facilitate automated event extraction and related applications in a variety of domains. For example, the disclosed techniques can be applied to facilitate more accurately answering/responding to unstructured user queries for which the correct answers/responses are provided in one or more unstructured text documents/files, see col. 3 lines 57-63.
Haldar teaches generating a directed acyclic graph including each element word based on the location information of each element word and the element relationship between the element words (the parsing component employs abstract meaning representation (AMR) parsing to identify the candidate event components, see col. 1 lines 23-32; AMRs represent sentences as single-rooted, directed acyclic graphs, with labeled roles, see col. 10 lines 18-19); 
determining all element paths between any two element words in the directed acyclic graph (AMR would represent a phrase like “bond investor” using the frame “invest-01”, even though no verbs appear in the phrase, see col. 10 lines 19-24); 
determining at least one event path among the element paths based on a longest non-implication rule (FIGS. 5A and 5B present example AMR graphs representing candidate events extracted from unstructured text using AMR parsing in accordance with one or more embodiments, see col. 10 lines 45-63); 
and generating the event extraction result corresponding to the target text based on the event path (an event extraction component that generates structured event information defining events represented in the unstructured text based on the candidate event components, see col. 1 lines 32-40).
It would have been obvious to one of ordinary skill in the art to combine the Wang in view of Mehta invention with the teachings of Haldar for the benefit of facilitating more accurately answering/responding to unstructured user queries for which the correct answers/responses are provided in one or more unstructured text documents/files, see col. 3 lines 57-63.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Pertinent prior art available on form 892.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Michael Ortiz-Sanchez whose telephone number is (571)270-3711. The examiner can normally be reached Monday- Friday 9AM-6PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta can be reached at 571-272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/MICHAEL ORTIZ-SANCHEZ/Primary Examiner, Art Unit 2656
Read full office action
Prosecution Timeline

May 23, 2023
Application Filed
Sep 08, 2025
Non-Final Rejection mailed — §103
Nov 06, 2025
Examiner Interview Summary
Nov 06, 2025
Applicant Interview (Telephonic)
Dec 04, 2025
Response Filed
Mar 12, 2026
Final Rejection mailed — §103 (current)
Precedent Cases

Applications granted by this same examiner with similar technology

18/426,781
Patent 12639507
Computing a credibility score of an online article
2y 3m to grant Granted May 26, 2026
18/236,071
Patent 12608373
DETECTING OUT-OF-DOMAIN, OUT-OF-SCOPE, AND CONFUSION-SPAN (OOCS) INPUT FOR A NATURAL LANGUAGE TO LOGICAL FORM MODEL
2y 8m to grant Granted Apr 21, 2026
18/460,373
Patent 12596887
SYSTEMS AND METHODS FOR TEXT SIMPLIFICATION WITH DOCUMENT-LEVEL CONTEXT
2y 7m to grant Granted Apr 07, 2026
17/529,344
Patent 12566831
METHODS AND SYSTEMS FOR TRAINING A MACHINE LEARNING MODEL AND AUTHENTICATING A USER WITH THE MODEL
4y 3m to grant Granted Mar 03, 2026
17/908,851
Patent 12567399
MANAGEMENT APPARATUS, MANAGEMENT SYSTEM, MANAGEMENT METHOD, AND RECORDING MEDIUM
3y 6m to grant Granted Mar 03, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.
Strategy Recommendation AI-generated — please review before filing

Get a prosecution strategy drawn from examiner precedents, rejection analysis, and claim mapping.
Typically takes 5-10 seconds — AI-generated, attorney review required before filing
Prosecution Projections

3-4
Expected OA Rounds
67%
Grant Probability
94%
With Interview (+27.6%)
3y 10m (~9m remaining)
Median Time to Grant
Moderate
PTA Risk
Based on 496 resolved cases by this examiner. Grant probability derived from career allowance rate.