Last updated: May 29, 2026
Application No. 18/088,074
PROCESSING EVENT DATA BASED ON MACHINE LEARNING

Final Rejection §103
Filed: Dec 23, 2022
Examiner: MORALES, PEDRO JESUS
Art Unit: 2124
Tech Center: 2100 (Computer Architecture & Software)
Assignee: The Johns Hopkins University
OA Round: 2 (Final)
Interview Optional

An interview has already been conducted in this application's prosecution history. This examiner has a 67% grant rate with a +50.0% interview lift. Since an interview has already been tried, a written response with narrowed claims, based on precedent claim evolution patterns, is recommended.
Based on 9 resolved cases, 2023–2026
Examiner Intelligence

MORALES, PEDRO JESUS
Career Allowance Rate: 67% (above average); 6 granted / 9 resolved; +11.7% vs TC avg
Interview Lift: +50.0% (strong)
Avg Prosecution: 3y 9m (typical timeline); 10 currently pending
Statute-Specific Performance

§101: 3.9% (-36.1% vs TC avg)
§103: 92.2% (+52.2% vs TC avg)
§112: 2.0% (-38.0% vs TC avg)
Based on career data from 9 resolved cases
Office Action

§103
DETAILED ACTION
	This action is responsive to Applicant’s reply filed 22 January 2026. This action is made final. 

Status of the Claims
Claims 1-3 and 13-15 are currently amended. 
Claims 1-20 are currently pending and under examination, of which claims 1 and 13 are independent.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
	Applicant’s amendments to the Claims have overcome the § 101 rejections previously set forth in the Non-Final Office Action mailed October 23, 2025.
	Applicant’s arguments regarding the art rejections are moot in view of the new grounds of rejection necessitated by Applicant’s remarks and amendment. 

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 5, 13-15 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Burns et al. (US 20200226321 A1), hereinafter Burns, in view of Krishnamurti et al. (US 20200051697 A1), hereinafter Krishnamurti, further in view of Gopalakrishnan et al. (US 20230134546 A1), hereinafter Gopalakrishnan. 

With respect to claim 1, Burns teaches:
A computer-implemented method, comprising (Burns discloses “an automated system and methods for assigning billing codes to medical procedures” [0002].): 
receiving, by a data processing system, event data representing a medical safety event (Burns discloses “text processor 12 is configured to receive an input record, where the input record represents a medical procedure performed on a patient and the input record includes a text description describing the medical procedure” [0021]. See Figure 1 depicting an automated system (‘data processing system’) receiving an input medical record.); 
processing the event data, including: parsing, by a parser of the data processing system … the event data to identify the data structure of the event data (Burns discloses a text processor (‘parser’) processes an input record (‘event data’) to identify strings from the text description (‘identify a data structure of the event data’), “to aid in the processing of procedural text and to decrease vocabulary size, the text description from the input record was processed into a standardized form. Because the text description of the medical procedure is typically hand-entered, it is subject to misspellings and frequently contains medical abbreviations and acronyms. First, misspelled words in the text description are corrected. To do so, a dictionary of commonly misspelled words is referenced by the text processor 12, where each entry in this dictionary includes a misspelled word and a corresponding proper spelling for the misspelled word. Upon receiving a text description, each string in the text description is parsed and compared to the entries in the dictionary. If a string in the text description matches a given entry in the dictionary, the string in the text description is replaced with the proper spelling from the dictionary” [0030]. 
See Figure 1 depicting an automated system (‘data processing system’) comprising a text processor.); 
	and identifying a given field and data content of that given field, by: identifying [a field] from the data structure of the event data including data content corresponding to [the field] (Burns discloses a standardized form of the text description (‘given field’)  is generated (‘identified’) from the parsed strings of the text description (‘data structure of the event data’), “Natural language and/or text processing is performed at 22 on the input record to generate a standardized form suitable for machine learning. A feature vector is then constructed at 23 from the standardized form of the input record by extracting one or more features from the input record. In one embodiment, each string in the standardized form of the text description is an element in the feature vector” [0026].
The standardized form of the text description (‘given field’) is comprised of strings (‘data content’).); 
inputting, to a machine learning engine, data content of the given field … (Burns discloses the standardized form of the text description (‘given field’) is input to a classifier (‘machine learning engine’), “the classifier 14 receives the input record, including standardized form of the text description, from the text processor 12” [0022]. See [0026] describing how the standardized form of the text description (‘given field’) is made of strings (‘data content’).);
generating, by the machine learning engine and from contents of the one or more fields, one or more feature vectors by (Burns discloses “The classifier 14 receives the input record, including standardized form of the text description, from the text processor 12. The classifier 14 operates to assign a billing code from the listing of possible billing codes 16 to the input record. Briefly, the classifier 14 constructs a feature vector by extracting one or more features from the input record” [0022].
Burns further discloses “A feature vector is then constructed at 23 from the standardized form of the input record by extracting one or more features from the input record. In one embodiment, each string in the standardized form of the text description is an element in the feature vector” [0026].):
	identifying, in the data content of the given field, one or more values, each value corresponding to a feature of one or more features (When a standardized form of the text description (‘given field’) is inputted into a classifier, the classifier constructs a feature vector comprised of features (elements). Strings (‘data content’) make up the features. Each string is comprised of text (values corresponding to features). 
See [0030] describing how each string is comprised of text that includes words, medical abbreviations, and acronyms.):
accessing, from the hardware storage device, a plurality of indicator candidates (Burns discloses “the classifier 14 operates to assign a billing code from the listing of possible billing codes 16 to the input record” [0022]. See Figure 1 depicting a classifier retrieving a stored list of Current Procedural Terminology (CPT) codes (‘indicator candidates’).); 
determining, by the machine learning engine and based on the one or more feature vectors …, one or more indicators from the plurality of indicator candidates (Burns discloses “billing codes are classified by the classifier 14 using machine learning. In one example embodiment, the billing codes are classified using a deep neural network, such as Label-Embedding Attentive models (LEAM)” [0037].
Burns further discloses “each billing code in the listing of possible billing codes is first classified at 34 using a label-embedding attentive model (LEAM). For a given feature vector, LEAM computes a probability or percentage of confidence score for each billing code in the listing of possible billing codes. The billing codes are then ordered highest to lowest based on the corresponding confidence score. To assist with the assignment process, a confidence parameter is also derived from the confidence scores.” [0043].); 
tagging, by the data processing system, the event data with the one or more indicators (Burns discloses “for each billing code in the listing of possible billing codes, the feature vector is scored at 24 using models created using machine learning. Based on the scores, a billing code from the listing of possible billing codes is assigned at 25 to the input record. Additionally, a confidence score can be calculated for the assigned billing code, where the confidence score quantitates confidence in the assigned billing code. The confidence score may be used in the assignment process and/or may be presented to a billing specialist (along with the assigned billing code)” [0027]. See Figure 1 depicting an automated system (‘data processing system’) that assigns billing codes to input medical records.); 
storing the tagged event data in the hardware storage device (Burns discloses “the input record with an assigned billing code is passed directly to a billing system 17 for processing. In other embodiments, the input record with the assigned billing code are reviewed manually by a billing specialist on a user interface of a computing device 18 before being passed on the billing system 17. The billing specialist may elect to confirm the assignment made by the system or change the assignment made the system” [0023].);
and updating the machine learning engine based on the tagged, processed event data stored in the hardware storage device (Burns discloses “the assignment of the billing code to an input record (either by the system or manually) is used as feedback to improve the machine learning models. That is, the models can be re-trained and/or updated based on the input records with assigned billing codes. Additionally or alternatively, the models can be re-trained and/or updated using feedback from a billing specialist. During a validation process, the billing specialist can indicate whether a billing code assignment was accurate or not and, if not, provide a reason. The feedback from the billing specialist in turn is represented as a vector that is used to re-train the machine leaning models” [0024].).
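For orientation, the Burns steps cited above (dictionary-based spelling correction, one feature-vector element per string, a confidence score per billing code, and assignment of the top-scoring code) can be sketched roughly as follows. This is only an illustrative sketch: the misspelling dictionary, the code names `CODE_A` and `CODE_B`, and the keyword weights are hypothetical, and the simple weighted sum stands in for the LEAM model that Burns actually describes.

```python
from collections import Counter

# Hypothetical dictionary of commonly misspelled words -> proper spelling.
MISSPELLINGS = {"apendectomy": "appendectomy", "laproscopic": "laparoscopic"}

# Hypothetical keyword weights per billing code; a stand-in for a trained model.
CODE_WEIGHTS = {
    "CODE_A": {"laparoscopic": 1.0, "appendectomy": 2.0},
    "CODE_B": {"laparoscopic": 1.0, "cholecystectomy": 2.0},
}


def standardize(text):
    """Split the description into strings and replace known misspellings."""
    return [MISSPELLINGS.get(tok, tok) for tok in text.lower().split()]


def feature_vector(tokens):
    """Each string in the standardized form becomes an element of the vector."""
    return Counter(tokens)


def score_codes(features):
    """Return a normalized confidence score for every candidate code."""
    raw = {
        code: sum(weight * features[term] for term, weight in weights.items())
        for code, weights in CODE_WEIGHTS.items()
    }
    total = sum(raw.values())
    return {code: (score / total if total else 0.0) for code, score in raw.items()}


def assign_code(text):
    """Assign the highest-scoring code and report its confidence."""
    scores = score_codes(feature_vector(standardize(text)))
    best = max(scores, key=scores.get)
    return best, scores[best]


code, confidence = assign_code("Laproscopic apendectomy")
print(code, confidence)
```

In this toy setup the misspelled input is corrected before scoring, so the assignment depends only on the standardized strings.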
However, Burns does not teach a data structure having a plurality of fields delineated by indicators, which is taught by Krishnamurti:
wherein the event data comprises a data structure having a plurality of fields and a plurality of indicators delineating each field of the plurality of fields from other fields of the plurality of fields (Krishnamurti discloses “parse one or more electronic health records to retrieve values of respective fields of the electronic health records, the one or more retrieved values associated with one or more tests performed during a pregnancy” [0021].
Krishnamurti discloses “At step 610, the classification system parses one or more items of structured medical data to retrieve values of respective fields of the one or more items of structured medical data, the one or more retrieved values representing a set of medical attributes” [0094]. 
Electronic health records (‘event data’) are parsed to retrieve values of respective fields (therefore electronic health records are comprised of a data structure having a plurality of fields). Each retrieved value is a medical attribute that is from a respective field, therefore each field is defined (delineated) by a medical attribute (and therefore medical attributes are a plurality of indicators delineating each field of the plurality of fields from other fields of the plurality of fields).);
parsing, by a parser of the data processing system based on the plurality of indicators delineating each field of the plurality of fields, the event data to identify the data structure of the event data (Krishnamurti discloses “parse one or more electronic health records to retrieve values of respective fields of the electronic health records” [0021].
By parsing the electronic health records, values of the respective fields of the records are retrieved (therefore parsing the respective fields for their values identifies the data structure of event data).);
identifying the plurality of fields from the data structure of the event data including data content corresponding to each field of the plurality of fields (Krishnamurti discloses electronic health records are parsed to retrieve values of respective fields (‘data content corresponding to each field’), therefore identifying each respective field and its values, see [0021] above.);
inputting, to a machine learning engine, data content of the given field of the plurality of fields (Krishnamurti discloses medical attributes (retrieved values of respective fields) are applied (‘input’) to a classifier (‘machine learning engine’), “At step 610, the classification system parses one or more items of structured medical data to retrieve values of respective fields of the one or more items of structured medical data, the one or more retrieved values representing a set of medical attributes. At step 620, the classification system 620 selects, from the memory a classifier based at least one of the attributes in the set and further applies the classifier to the set of attributes to classify the one or more items of structured medical data into a particular risk profile that includes a plurality of risk factors” [0094].);
	Krishnamurti teaches that parsing fields of structured medical data to retrieve values is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify the automated system of Burns with the structured medical data disclosed by Krishnamurti to parse data consisting of multiple fields. By parsing data consisting of multiple fields, data can be processed consistently since each value is structured and separated in a standardized manner, thereby improving processing efficiency.
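To make the claim language concrete, the following is a minimal sketch of parsing a record whose fields are delineated by indicators and then selecting the content of one given field. The record layout, the `|` and `=` delimiters, and the field names are assumptions chosen for illustration; neither the application nor Krishnamurti is quoted above as specifying this format.

```python
def parse_fields(record, field_sep="|", kv_sep="="):
    """Split a record into named fields using the delimiting indicators.

    field_sep and kv_sep are hypothetical delimiters, not taken from the
    cited references.
    """
    fields = {}
    for chunk in record.split(field_sep):
        if kv_sep not in chunk:
            continue  # skip fragments that do not form a name/value pair
        name, _, value = chunk.partition(kv_sep)
        fields[name.strip()] = value.strip()
    return fields


# Hypothetical safety-event record with three delimited fields.
record = "event_type=medication error|location=ICU|description=wrong dose administered"
fields = parse_fields(record)

# The content of one given field is what would be passed on to the classifier.
given_field = fields["description"]
print(given_field)
```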
	Furthermore, the combination of Burns in view of Krishnamurti does not teach calculating a frequency score based on historical event data for each feature of the one or more features, which is taught by Gopalakrishnan:
and calculating, based on a database of historical event data stored in a hardware storage device, a frequency score corresponding to each feature of the one or more features (Gopalakrishnan discloses “ML service 112 includes an NLP engine that converts text-based features into numerical values. For instance, a numerical value for a token may be a score that represents what the word means to the log entry versus what the word means to a list of historical events. An example approach for assigning a score is to compute a term frequency inverse-document frequency (TF-IDF) score. With TF-IDF, the score for a token increases proportionally to the frequency the token appears in a log record offset by the number of logs that include the token” [0057].
Gopalakrishnan discloses textual tokens (‘features’) are words, “A textual token as used herein may include words and/or phrases” [0059].
Gopalakrishnan discloses “The process further determines a frequency of the textual token in a list of historical events (operation 206). For example, the process may determine a frequency of the token in a list of log records in the past five days or over some other timeframe. A logarithmically scaled inverse document frequency may be computed by taking the log of the value obtained by dividing the total number of log entries within the specified timeframe by the number of log entries include the token” [0061]. 
Gopalakrishnan discloses “the microservice application may generate and provide an output based on input that identifies, locates, or provides historical data” [0130]. A microservice application receives historical data, therefore it is implied historical data is stored in an accessible database.), 
the frequency score indicating a frequency at which the feature appears in the historical data (Gopalakrishnan discloses “The process further determines a frequency of the textual token in a list of historical events (operation 206). For example, the process may determine a frequency of the token in a list of log records in the past five days or over some other timeframe” [0061].);
determining, by the machine learning engine and based on the one or more feature vectors indicating the frequency score for each feature, one or more indicators from the plurality of indicator candidates (Gopalakrishnan discloses “The process may then generate TF-IDF scores for the individual textual tokens and/or the log records as previously described. The process may generate a feature vector for an example that includes the scores” [0070].
Gopalakrishnan discloses “The ML model may receive the feature vector as input and output a set of one or more predictions about whether observed behavior is a network attack” [0032]. 
Gopalakrishnan discloses “The process further generates a score for the log record based on the scores of the textual tokens included therein (operation 606). For example, the process may sum, average, or otherwise aggregate the TF-IDF scores of the textual tokens. Once the scores have been computed, the process applies one or more trained classifiers to predict whether the new event log activity represents a current attack (operation 608)” [0089-0090].
A trained classifier (‘machine learning engine’) generates predictions to classify an event log activity as an attack (therefore ‘attack’ and ‘not an attack’ are indicator candidates).);
Gopalakrishnan teaches that calculating a TF-IDF score for each token of an event log based on historical event data is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify the automated system of Burns with the TF-IDF scores disclosed by Gopalakrishnan to calculate a TF-IDF score for each token based on historical data. By calculating a TF-IDF score for each token based on historical data, the frequency of a word across several historical documents can be used to determine a word’s importance, thereby filtering out tokens (words) that appear frequently and elevating the importance of rare tokens.
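The TF-IDF computation described in the quoted Gopalakrishnan passages (term frequency in the record, multiplied by the log of total historical entries divided by the number of entries containing the token) can be sketched as below. The sample tokens are invented, term frequency is normalized by record length as one common convention, and treating an unseen token as appearing in one entry is an assumption made here to avoid division by zero; the reference as quoted does not say how that case is handled.

```python
import math
from collections import Counter


def tf_idf(record_tokens, historical_records):
    """Score each token of a record against a list of historical records."""
    tf = Counter(record_tokens)
    if not historical_records:
        return {token: 0.0 for token in tf}
    n_docs = len(historical_records)
    scores = {}
    for token, count in tf.items():
        # Number of historical records that contain the token.
        df = sum(1 for doc in historical_records if token in doc)
        df = max(df, 1)  # assumption: unseen tokens are treated as df = 1
        idf = math.log(n_docs / df)
        scores[token] = (count / len(record_tokens)) * idf
    return scores


# Invented historical event data, already tokenized.
historical = [
    ["fall", "bed"],
    ["fall", "dose"],
    ["dose", "wrong"],
    ["fall", "hall"],
]
scores = tf_idf(["wrong", "dose", "fall"], historical)

# An aggregate score for the whole record, here a simple sum.
record_score = sum(scores.values())
print(scores, record_score)
```

Tokens that are rare in the historical data ("wrong") receive higher scores than tokens that appear in most records ("fall"), which is the weighting effect the rationale relies on.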

With respect to claim 2, the combination of Burns in view of Krishnamurti, further in view of Gopalakrishnan teaches:
the computer-implemented method of claim 1, wherein the one or more feature vectors comprise the one or more features and one or more corresponding scores including the frequency score for each feature of the one or more features (Burns discloses feature vectors are comprised of strings (‘one or more features’), “a feature vector is constructed at 33 by extracting one or more features from the input record. In this example embodiment, elements in the feature vector include each string in the text description” [0042]. 
	Burns discloses corresponding scores as confidence scores that are computed for a feature vector, “each billing code in the listing of possible billing codes is first classified at 34 using a label-embedding attentive model (LEAM). For a given feature vector, LEAM computes a probability or percentage of confidence score for each billing code in the listing of possible billing codes. The billing codes are then ordered highest to lowest based on the corresponding confidence score” [0043].
Gopalakrishnan discloses a TF-IDF score (‘frequency score’) is computed for each token (‘feature’), “ML service 112 includes an NLP engine that converts text-based features into numerical values. For instance, a numerical value for a token may be a score that represents what the word means to the log entry versus what the word means to a list of historical events. An example approach for assigning a score is to compute a term frequency inverse-document frequency (TF-IDF) score. With TF-IDF, the score for a token increases proportionally to the frequency the token appears in a log record offset by the number of logs that include the token” [0057].).

With respect to claim 3, the combination of Burns in view of Krishnamurti, further in view of Gopalakrishnan teaches:
the computer-implemented method of claim 2, wherein the one or more values comprise a plurality of words that describe the medical safety event (Burns discloses feature vectors are comprised of strings (‘one or more features’), “a feature vector is constructed at 33 by extracting one or more features from the input record. In this example embodiment, elements in the feature vector include each string in the text description” [0042]. 
A feature vector is comprised of features (elements). Strings make up the features. Each string is comprised of text (therefore text are values corresponding to features). 
Burns discloses strings/text are comprised of words that describe medical procedures (‘medical safety events’), “to aid in the processing of procedural text and to decrease vocabulary size, the text description from the input record was processed into a standardized form. Because the text description of the medical procedure is typically hand-entered, it is subject to misspellings and frequently contains medical abbreviations and acronyms. First, misspelled words in the text description are corrected … Upon receiving a text description, each string in the text description is parsed and compared to the entries in the dictionary. If a string in the text description matches a given entry in the dictionary, the string in the text description is replaced with the proper spelling from the dictionary” [0030].),
wherein the one or more features are determined from the plurality of words (Burns discloses “each string in the standardized form of the text description is an element in the feature vector” [0026].), 
and wherein the frequency score corresponding to each feature of the one or more features comprises a term frequency-inverse document frequency (TF-IDF) score (Gopalakrishnan discloses a TF-IDF score (‘frequency score’) is computed for each token (‘feature’), “ML service 112 includes an NLP engine that converts text-based features into numerical values. For instance, a numerical value for a token may be a score that represents what the word means to the log entry versus what the word means to a list of historical events. An example approach for assigning a score is to compute a term frequency inverse-document frequency (TF-IDF) score. With TF-IDF, the score for a token increases proportionally to the frequency the token appears in a log record offset by the number of logs that include the token” [0057].). 

With respect to claim 5, the combination of Burns in view of Krishnamurti, further in view of Gopalakrishnan teaches:
The computer-implemented method of claim 1, further comprising: training the machine learning engine with a set of sample event data (Burns discloses a LEAM model (‘machine learning engine’) is used to classify billing codes, “billing codes are classified by the classifier 14 using machine learning. In one example embodiment, the billing codes are classified using a deep neural network, such as Label-Embedding Attentive models (LEAM)” [0037].
Burns discloses “in testing models on the Holdout dataset (data from an institution that was not included in the Train/Test dataset), one is able to show the generalizability of the models. … The Label-Embedding Attentive Model (LEAM) was chosen based on its ability to weight relevant words within a text sequence. While SVM model performed as well as the LEAM for the Train/Test dataset, LEAM outperformed SVM in the Holdout dataset: 95.0% (62.0% of cases) and 96.9% (48.3% of cases). This is expected as the LEAM model embeds labels, containing more information in their assignments, and thus capable of better assessment of untrained data” [0067]. See [0051] describing how the Train/Test and Holdout datasets are patient data records (‘sample event data’).), 
wherein the sample event data are tagged with one or more sample indicators (Burns discloses data records (‘sample event data’) of the Holdout and Train/Test datasets are assigned (‘tagged’) CPT codes (‘sample indicators’), “data records were gathered for all patients undergoing elective or urgent procedures with an assigned valid anesthesiology CPT code and an operative date between Jan. 1, 2014 and Dec. 31, 2016 using the Multicenter Perioperative Group (MPOG) registry. Individual institutions which contributed to this dataset ranged from large academic hospital groups to smaller community based practices. A second dataset was created for external validation and generalization of the models created in this study. This second dataset was created using data from a single institution that was not included in the previous dataset. In this disclosure, the larger multi-institution data set is referred to as “Train/Test” dataset and second single institution dataset is referred to as the “Holdout” dataset. For the Holdout dataset, patients undergoing elective or urgent procedures with an assigned valid anesthesiology CPT code and an operative date between Oct. 1, 2015 and Nov. 1, 2016 were selected” [0051].).
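The dataset arrangement Burns describes in [0051], where the Holdout dataset comes from a single institution excluded from the Train/Test dataset, can be illustrated with a small sketch. The record fields, institution names, and code labels are hypothetical; only the split-by-institution idea comes from the cited passage.

```python
def split_by_institution(records, holdout_institution):
    """Separate tagged records into Train/Test and Holdout sets by institution."""
    train_test = [r for r in records if r["institution"] != holdout_institution]
    holdout = [r for r in records if r["institution"] == holdout_institution]
    return train_test, holdout


# Hypothetical tagged sample records: each carries an assigned code (its tag).
records = [
    {"institution": "A", "text": "laparoscopic appendectomy", "code": "CODE_A"},
    {"institution": "B", "text": "laparoscopic cholecystectomy", "code": "CODE_B"},
    {"institution": "C", "text": "open appendectomy", "code": "CODE_A"},
]
train_test, holdout = split_by_institution(records, holdout_institution="C")
print(len(train_test), len(holdout))
```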

With respect to claim 13, the rejection of claim 1 is incorporated. The difference in scope is:
A non-transitory computer-readable medium storing program instructions that cause a data processing system to perform operations comprising (Burns discloses “The computer programs include processor-executable instructions that are stored on a non-transitory tangible computer readable medium” [0072].).

With respect to claim 14, the claim recites similar limitations corresponding to claim 2, therefore the same rationale of rejection is applicable.

With respect to claim 15, the claim recites similar limitations corresponding to claim 3, therefore the same rationale of rejection is applicable.

With respect to claim 17, the claim recites similar limitations corresponding to claim 5, therefore the same rationale of rejection is applicable.

Claims 4, 11 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Burns in view of Krishnamurti, further in view of Gopalakrishnan and Joyce et al. (US 20210263900 A1), hereinafter Joyce.

With respect to claim 4, the combination of Burns in view of Krishnamurti, further in view of Gopalakrishnan teaches:
The computer-implemented method of claim 2, wherein determining the one or more indicators from the plurality of indicator candidates comprises: determining a probability value corresponding to each indicator candidate based on the one or more features and the one or more corresponding scores (Burns discloses confidence scores (‘corresponding scores’) are probability values, “the classifier 14 can calculate a confidence score for the assigned billing code, where the confidence score quantitates confidence in the assigned billing code” [0022]. 
Burns discloses a confidence score (‘probability value’) is computed for each billing code (‘indicator candidate’) and a given feature vector (‘one or more features’), “each billing code in the listing of possible billing codes is first classified at 34 using a label-embedding attentive model (LEAM). For a given feature vector, LEAM computes a probability or percentage of confidence score for each billing code in the listing of possible billing codes” [0043].); 
However, Burns does not teach comparing a probability value with a probability threshold value and selecting indicators based on the probability threshold value, which is taught by Joyce:
comparing the probability value corresponding to each indicator candidate with a probability threshold value (Joyce discloses identifying label proposals (‘indicator candidate’) for fields (‘features’) based on a calculated confidence level (‘probability value’), “process 1200 includes receiving (1202), by the data processing system, the one or more datasets, with the one or more datasets including a field and with the field storing values. The process 1200 includes profiling (1204), by the data processing system, the values stored in the field included in the one or more datasets. The process includes applying (1206), by the data processing system, one or more classifiers to the profiled values. The process 1200 includes identifying (1208) one or more label proposals identifying a semantic meaning for the field, with each of the one or more label proposals having a calculated confidence level and labeling the field with the label proposal having a calculated confidence level that satisfies a threshold level” [0190].); 
and selecting the one or more indicators whose corresponding probability values are greater than the probability threshold value (Joyce discloses “The process 1200 includes identifying (1208) one or more label proposals identifying a semantic meaning for the field, with each of the one or more label proposals having a calculated confidence level and labeling the field with the label proposal having a calculated confidence level that satisfies a threshold level” [0190].).
Joyce teaches that labeling features with multiple labels based on a confidence threshold is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to combine the automated system of Burns with the technique disclosed by Joyce to perform accurate multi-label classification. By using a confidence threshold, multiple labels can be assigned to features because a threshold can determine which labels have a high enough probability to be considered relevant, thereby leading to an accurate representation of complex data.
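The comparison and selection steps of claim 4 can be illustrated with a short sketch. The candidate names and probability values are invented; the strict greater-than comparison follows the claim wording ("greater than the probability threshold value"), whereas Joyce as quoted speaks of a confidence level that "satisfies a threshold level."

```python
def select_indicators(probabilities, threshold):
    """Return candidates whose probability exceeds the threshold, highest first."""
    selected = [name for name, p in probabilities.items() if p > threshold]
    return sorted(selected, key=lambda name: probabilities[name], reverse=True)


# Hypothetical probability values for three indicator candidates.
probabilities = {"fall": 0.82, "medication": 0.55, "equipment": 0.10}

chosen = select_indicators(probabilities, threshold=0.5)
print(chosen)
```

Because the comparison is strict, a candidate whose probability exactly equals the threshold is not selected under this reading.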

With respect to claim 11, the combination of Burns in view of Krishnamurti, further in view of Gopalakrishnan teaches the computer-implemented method of claim 1, however the combination does not teach creating and storing a new indicator candidate, which is taught by Joyce:
further comprising: creating, by the machine learning engine, a new indicator candidate (Joyce discloses “For a field, the classification module determines whether the field is already associated with a label in the label index. If a field has not yet been labeled, or if no label index exists, the classification module 605 determines that no label was found for the field. If needed, the classification module 605 generates a new label index to populate with semantic labels … the classification module 605 can determine that the field is a numeric field, a string field, or other such data type … The classification module 605 generates classified data to be sent to the testing module 606 as a classification output for finding the semantic meaning. The classified data is tagged with the data type determined by the classification module 605” [0149].
Joyce discloses “the classification module 605 can be updated over multiple iterations using machine learning approaches” [0151].); 
and storing the new indicator candidate by the hardware storage device (Joyce discloses “If a label is found, the classification module generates label data that can be passed through the testing module 606 and the results corroboration module 608. The label data informs the testing module 606 and the results corroboration module 608 that the field has already been labeled. This can be used to weight the classifiers applied to the field or suggest a label. However, the field can be re-classified by the classification module 605 and re-tested by the testing module 606 to confirm that the label is accurate and potentially update the label attributes of that label in the data dictionary database 614. For example, if the testing module 606 finds the existing label to be a poor fit, a new label can be suggested” [0150].).
Joyce teaches using machine learning to update or generate new labels for a field is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to combine the automated system of Burns with the technique disclosed by Joyce to automate data labelling. By automating label creation with machine learning, models can generate new labels or improve existing labels to better describe the data being classified, leading to more interpretable and accurate models. 

With respect to claim 16, the claim recites similar limitations corresponding to claim 4, therefore the same rationale of rejection is applicable.

Claims 6 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Burns in view of Krishnamurti, further in view of Gopalakrishnan and Kabeya et al. (US 20190122144 A1), hereinafter Kabeya.

With respect to claim 6, the combination of Burns in view of Krishnamurti, further in view of Gopalakrishnan teaches the computer-implemented method of claim 5, however the combination does not teach obtaining sample feature vectors from the sample event data, obtaining an indicator vector, and inputting the vectors into a logistic regression classifier which is taught by Kabeya:
wherein training the machine learning engine with the set of sample event data comprises: obtaining one or more sample feature vectors from the sample event data (Kabeya discloses “event record system 150 may collect event information originating from one or more event sources, and record the collected event information to the event collection database 120 together with its timestamp as a data record. Such event sources may include, but not limited to, electronic medical record systems” [0041].
	See Figure 4 (reproduced below) depicting obtaining a set s of timestamps (‘sample feature vector’) from data records (‘sample event data’). See also Figure 5 depicting a regression model using a set of timestamps as inputs.); 
obtaining an indicator vector from the sample event data, wherein the indicator vector comprises a plurality of fields indicating a presence or absence of the plurality of indicator candidates (Kabeya discloses Figure 4 (reproduced below) depicting an input vector (‘indicator vector’) comprised of 1’s and 0’s, with 1 indicating a label (‘indicator candidate’) exists in the input set (‘sample event data’) and 0 indicating a label does not exist in the input set. 

    [Figure 4 of Kabeya: greyscale image (media_image1.png) not reproduced in this text.]

Kabeya discloses input vector as set U “there is the input vector 220 including a plurality of elements that corresponds to the predetermined label set $L = \{L_1, \ldots, L_N\}$. Each element $u_n$ has a value representing at least whether or not a corresponding label $L_n$ is observed at least once actually in the set of the data records 210. The input vector generation module 112 may set the element $u_n$ by one ($u_n = 1$) when the corresponding label $L_n$ exists in the set of the data records $\{l_1, \ldots, l_N\}$. The input vector generation module 112 may set the element $u_n$ by zero ($u_n = 0$) when the corresponding label $L_n$ does not exist in the set of the data records $\{l_1, \ldots, l_N\}$” [0055].);
and inputting the one or more sample feature vectors and the one or more indicator vectors to a logistic regression classifier to obtain a prediction model (Kabeya discloses a regression model receives as input an input vector (‘indicator vector’) and a set of timestamps (‘sample feature vectors’), “the regression model 160 includes an input layer 162 corresponding to the predetermined label set L; and an output layer 168 configured to output the probability for a given target timestamp t*; and a network structure provided therebetween. The input layer 162 is configured to receive the input vector u and the representative timestamps s that are obtained from the set of the data records” [0059-0060]. See Figure 5 depicting the inputs of a regression model.
Kabeya discloses a regression model’s sigmoid function can be used as a classifier (‘prediction model’), “the regression model 160 employed in the described embodiment can be seen as an extension of a binary logistic regression model where a binary dependent variable (the target label exists or does not exist) is used and a sigmoid function is used as the output function, which can be used as classifier” [0067].).
Kabeya teaches using an input vector and a set of features as inputs to a logistic regression model is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to combine the automated system of Burns with the technique disclosed by Kabeya to identify labels that do not exist in a data set. By identifying which labels are present or absent in data, a model can learn to better distinguish between features that indicate the presence of a label and features that indicate the absence of a label, thereby yielding more accurate classifications.
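For illustration only, the presence/absence vector and sigmoid output discussed with respect to Kabeya can be sketched as follows. This is a simplified sketch under stated assumptions: the function names, label names, weights, and bias are hypothetical and untrained, and the timestamp inputs that Kabeya's regression model also receives are omitted.

```python
import math


def indicator_vector(label_set, observed_labels):
    """Return 1 for each label in label_set that appears in observed_labels, else 0."""
    observed = set(observed_labels)
    return [1 if label in observed else 0 for label in label_set]


def logistic_probability(features, weights, bias):
    """Sigmoid of a weighted sum, as in a binary logistic regression model."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical label set and observed records; repeated labels count once.
label_set = ["L1", "L2", "L3", "L4"]
u = indicator_vector(label_set, ["L2", "L4", "L2"])

# Hypothetical, untrained weights chosen only to make the arithmetic easy to follow.
p = logistic_probability(u, weights=[0.5, -1.0, 2.0, 1.0], bias=0.0)
```

Here `u` marks L2 and L4 as present, and the weighted sum is zero, so the sigmoid returns 0.5.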

With respect to claim 18, the claim recites similar limitations corresponding to claim 6, therefore the same rationale of rejection is applicable.

Claims 7-8 and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Burns in view of Krishnamurti, further in view of Gopalakrishnan and Kabeya, and further in view of Hagen et al. (US 20180032917 A1), hereinafter Hagen.

With respect to claim 7, the combination of Burns in view of Krishnamurti, further in view of Gopalakrishnan and Kabeya teaches:
the computer-implemented method of claim 6, wherein training the machine learning engine with the set of sample event data further comprises: obtaining one or more prediction metrics from a set of test event data (Burns discloses confidence parameter and accuracy as prediction metrics, “FIG. 6 is a graph plotting Confidence Parameter (CP) against accuracy for CPT assignment. This graph shows the positive correlation between the confidence parameter assigned to each case and the accuracy of the first assigned CPT code. A testing dataset (Train/Test, solid line) and a true holdout dataset (Holdout, dashed line) are plotted. High (CP>=1.6), Medium (1.6>CP>=1.2), Low (1.2>CP) areas are shown, respectively” [0017]. See Figure 6 depicting a graph with accuracy and confidence parameters obtained from a testing dataset. See also [0043] describing that a confidence parameter is a ratio of the top two highest confidence scores.); 
	However, the combination does not teach determining a probability threshold value for a prediction model, which Hagen does:
and determining a probability threshold value for the prediction model (Hagen discloses “Training module 222 can also use machine learning classification methods (e.g., logistic regression, support vector machines, neural network, etc.) that rely on a prediction threshold for a classifier to determine whether a datum is a member of a class or not. These prediction thresholds can be tuned for recall and precision by training module 222 as described below. For example, selecting a threshold goal for recall for a classifier would specify a recall threshold that is based on predictions of a test dataset 232 for a given candidate class. A target value for the recall threshold is discovered by first sorting all of the prediction values (i.e., values predicted when the candidate class is used to classify datum of the test dataset 232) for the candidate class. The recall goal and the size of the candidate class can then be used to find the prediction value for the datum at the threshold goal size index of the sorted test dataset 232 for the candidate class. In this example, if there are 10 data elements in the test set for a candidate class and a recall goal of 0.9, the test set should be sorted by prediction values and then select the prediction value of the 9th element” [0024 - 0025]).
	Hagen teaches tuning prediction thresholds (‘probability threshold value’) for recall and precision is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to combine the automated system of Burns with the prediction threshold disclosed by Hagen to reduce the number of incorrectly classified data. By using a prediction threshold, machine learning engineers can adjust the rate at which samples are classified based on their needs, thereby allowing engineers to balance the trade-off between the number of false positives and false negatives.  
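For illustration only, the recall-based threshold selection quoted from Hagen (10 elements, recall goal 0.9, 9th element) can be sketched as follows. The quoted passage does not state the sort direction, so descending order is an assumption here, as are the function name and the sample values.

```python
import math


def recall_threshold(prediction_values, recall_goal):
    """Pick the prediction value at the recall-goal position of the sorted values.

    Assumption: values are sorted in descending order, so that classifying every
    item at or above the returned value recovers at least recall_goal of them.
    """
    ranked = sorted(prediction_values, reverse=True)
    # round() guards against floating-point noise before taking the ceiling.
    position = math.ceil(round(recall_goal * len(ranked), 9))  # 1-based position
    position = max(1, min(position, len(ranked)))
    return ranked[position - 1]


# Hypothetical prediction values for 10 test items of one candidate class.
values = [0.95, 0.9, 0.85, 0.8, 0.75, 0.7, 0.65, 0.6, 0.55, 0.1]
threshold = recall_threshold(values, recall_goal=0.9)
```

With these values the 9th ranked element is selected, so nine of the ten class members score at or above the threshold.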

With respect to claim 8, the combination of Burns in view of Krishnamurti, further in view of Gopalakrishnan and Kabeya, and further in view of Hagen teaches:
The computer-implemented method of claim 7, wherein the one or more prediction metrics comprise: a precision threshold value and a recall threshold value (Hagen discloses “Training module 222 tunes classifiers with prediction thresholds. For example, classifiers can be trained as described above with respect to instructions 122 and 124. Training module 222 can grade classifiers based on their F-scores, which is the harmonic mean between precision and recall. Precision is the quotient of (true positives)/(true positives+false positives) and, thus, is maximized when false positives are minimized. Recall is the quotient of (true positives)/(true positives+false negatives) and is maximized when false negatives are minimized. In some cases, the hierarchy of classifiers 230 alternates tuning the classifiers for precision and recall within the classifier hierarchy” [0023]. See [0024-0026] describing how recall and precision threshold goals are set.).
	Hagen teaches tuning recall and precision thresholds is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to combine the automated system of Burns with the recall and precision thresholds disclosed by Hagen to reduce the number of incorrectly classified data. By adjusting recall and precision thresholds, machine learning engineers can adjust the rate at which samples are classified based on their needs, thereby allowing engineers to balance the trade-off between the number of false positives and false negatives and develop a reliable model.
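For illustration only, the precision, recall, and F-score definitions quoted from Hagen can be computed as follows. The function name and the example counts are hypothetical; returning 0.0 when a denominator is zero is a convention assumed here, not something stated in the reference.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and their harmonic mean (F1) from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 8 true positives, 2 false positives, 8 false negatives.
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=8)
```

With these counts precision is 8/10 and recall is 8/16, which shows how reducing false positives and reducing false negatives pull on different metrics.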

With respect to claim 19, the claim recites similar limitations corresponding to claim 7, therefore the same rationale of rejection is applicable.

With respect to claim 20, the claim recites similar limitations corresponding to claim 8, therefore the same rationale of rejection is applicable.

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Burns in view of Krishnamurti, further in view of Gopalakrishnan and Hagen.

With respect to claim 9, the combination of Burns in view of Krishnamurti, further in view of Gopalakrishnan teaches the computer-implemented method of claim 1, however the combination does not teach determining parent and child indicators, which is taught by Hagen:
wherein determining the one or more indicators from the plurality of indicator candidates comprises: determining a parent indicator from the plurality of indicator candidates (Hagen discloses parent indicators as superclasses “class A” and “not class A”, “FIG. 4 is an example hierarchy of classifiers 400 that can be trained and applied to observed data. The hierarchy of classifiers 400 shows classifiers hierarchically arranged into superclasses and subclasses. The root of hierarchy 400 is class A classifier 402A and not class A classifier 402B, which can be associated with, for example, a recall threshold. When observed data is processed at class A classifiers 402A, 402B, the recall threshold is applied to optimize the classification. In this case, the recall threshold minimizes false negatives (i.e., false “not class A′s”), and observed data is classified as either class A 404 or “not class A”” [0035].); 
and determining a child indicator from the plurality of indicator candidates based on the parent indicator (Hagen discloses child indicators as subclasses “class B”, “class C” and “not class A, B, C”, “Data classified as “Not class A” is processed by class B classifier 406A, class C classifier 406B, not class A, B, C classifier 406C. In this example, the “not class A” classifiers 406A, 406B, 406C can be associated a precision threshold. The precision threshold is applied when observed data is classified as class B 408, class C410, or “not class A, B, C” 412” [0036]. See Figure 4 depicting classifying data with superclasses (‘parent indicators’) and subclasses (‘child indicators’).), 
wherein the one or more indicators are determined as at least one of the parent indicator or the child indicator (See Figure 4 depicting “class A” and “not class A” being used (‘determined’) as superclasses (‘parent indicators’). Figure 4 also depicts classes B, C, and “not class A, B, C” as subclasses (‘child indicators’).).
	Hagen teaches classifying data into superclasses and subclasses is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to combine the automated system of Burns with the hierarchical classifier of Hagen to classify data hierarchically. By classifying data hierarchically, data is able to be grouped into broad categories and assigned specific labels, resulting in data that is more structured and interpretable for decision-making. 
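For illustration only, a two-level classification of the kind described with respect to Hagen's Figure 4 can be sketched as follows. This is a simplified sketch, not Hagen's classifier hierarchy: the function name, the score dictionary, and the threshold values are hypothetical, with a low root threshold standing in for a recall-oriented setting and a high child threshold for a precision-oriented one.

```python
def classify_hierarchically(scores, root_threshold, child_threshold):
    """Return a (parent, child) pair for one item.

    scores maps the hypothetical class names 'A', 'B', 'C' to probabilities.
    """
    if scores["A"] >= root_threshold:
        return ("A", None)
    # Parent is "not A": choose the best-scoring subclass, if it is confident enough.
    best_child = max(("B", "C"), key=lambda name: scores[name])
    if scores[best_child] >= child_threshold:
        return ("not A", best_child)
    return ("not A", "not A, B, C")


# Hypothetical thresholds and scores.
first = classify_hierarchically({"A": 0.7, "B": 0.1, "C": 0.2}, 0.3, 0.8)
second = classify_hierarchically({"A": 0.1, "B": 0.9, "C": 0.2}, 0.3, 0.8)
third = classify_hierarchically({"A": 0.1, "B": 0.5, "C": 0.2}, 0.3, 0.8)
```

The returned pair corresponds to the parent indicator and, where one applies, the child indicator determined from it.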

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Burns in view of Krishnamurti, further in view of Gopalakrishnan and Goravar et al. (US 20240079102 A1), hereinafter Goravar.

With respect to claim 10, the combination of Burns in view of Krishnamurti, further in view of Gopalakrishnan teaches the computer-implemented method of claim 1, however the combination does not teach receiving a query and searching for tagged event data which is taught by Goravar:
further comprising: receiving a query that comprises a given event indicator (Goravar discloses a user may enter (‘query’) desired entities (‘given event indicators’), “the extracted instances of the desired entities may be assembled into a data structure, where the data structure may be faster and more efficient for the patient summary system to search than the labeled text content. … For example, the caregiver may enter a set of desired entities into the patient summary system, and the patient summary system may enter each of the desired entities into a respective entity recognition model. Outputs of the respective entity recognition models may be aggregated and refined as described above, to generate the labeled text content. The instances of the desired entities in the labeled text content may be assembled into the data structure. The patient summary system may select, for example, via a stored configuration or preference of the user, a desired format of the patient summary” [0085].
Goravar further discloses “a user of the patient summary system may wish to see a listing of all instances of the entity “cancer” in the text content. The patient summary system may request a list of instances of the entity “cancer” found in the text content from the relational database for which a number of instances is greater than 0” [0087].); 
searching the tagged event data in the hardware storage device (Goravar discloses instances of desired entities are searched for in labeled text content (‘tagged event data’), “Outputs of the respective entity recognition models may be aggregated and refined as described above, to generate the labeled text content. The instances of the desired entities in the labeled text content may be assembled into the data structure … the patient summary system may search for the instances of the desired entities in the data structure, and may generate the patient summary in accordance with the desired format, based at least partially on data retrieved from the data structure. Because the data structure may be searched more quickly and efficiently than the labeled text content, a speed with which the patient summary may be generated may be increased” [0085]. Searching for instances of entities in labeled text content implies stored data which further implies a hardware storage device.); 
and displaying a search result of event data that are tagged with the given event indicator (Goravar discloses “method 400 includes generating a summary of the labeled version of the medical report from the aggregated labeled text data outputted by the model, where the summary summarizes patient information related to the one or more desired entities. To generate the summary, the patient summary system may extract instances of the desired entities, which may be identified by labels as described above, and generate text content based on the entities to display to a caregiver. The text content may include, for example, numbers and types of entities and instances included in the medical report, excerpts of labeled text of the medical report, and/or additional patient data relating to the extracted entities” [0084].).
	Goravar teaches receiving a list of desired entities (labels) and retrieving text content based on the desired labels is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to combine the automated system of Burns with the technique disclosed by Goravar to allow users to retrieve relevant data based on labels. By using labels to organize data, users are able to efficiently look up and retrieve data based on specific criteria or categories, thereby reducing the time and effort required for data retrieval and analysis.
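For illustration only, querying stored data by label can be sketched with a simple in-memory index. This is a minimal sketch under stated assumptions: the function names, event identifiers, and tags are hypothetical, and an in-memory dictionary stands in for the data structure and storage described in Goravar.

```python
from collections import defaultdict


def build_index(tagged_events):
    """Map each tag to the list of event identifiers that carry it."""
    index = defaultdict(list)
    for event_id, tags in tagged_events.items():
        for tag in tags:
            index[tag].append(event_id)
    return index


def search(index, indicator):
    """Return the sorted identifiers of events tagged with the given indicator."""
    return sorted(index.get(indicator, []))


# Hypothetical tagged event data.
tagged_events = {
    "e1": ["fall"],
    "e2": ["fall", "medication_error"],
    "e3": ["equipment_failure"],
}
index = build_index(tagged_events)
result = search(index, "fall")
```

Building the index once and searching it afterward mirrors the point in the cited passage that a derived data structure can be searched faster than the labeled text itself.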

Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Burns in view of Krishnamurti, further in view of Gopalakrishnan and Denney et al. (US 20140015855 A1), hereinafter Denney.

With respect to claim 12, the combination of Burns in view of Krishnamurti, further in view of Gopalakrishnan teaches the computer-implemented method of claim 1, however the combination does not teach performing k-means clustering analysis, which is taught by Denney:
further comprising: performing k-means clustering analysis according to the one or more features (Denney discloses “method includes an initialization phase in block 410, where k distinct labeled descriptors are chosen at random as initial cluster centroids. In some embodiments, the k labeled descriptors are selected to increase the diversity of the initial cluster centroids, for example using k-means++. The initial assigning of a selected labeled descriptor as a cluster centroid may be accomplished by randomly selecting one of the labels assigned to a descriptor from the label assignment binary vector of the descriptor. In order to make the label as specific as possible, if a descriptor has multiple labels and the selected label has one or more child labels assigned to the descriptor (as defined by a label hierarchy), then the cluster label may be changed to a selected one of the child labels. In some embodiments, the child label selection is done at random. This process may be repeated until the selected label has no child labels. This process provides a more specific label as the initial cluster label centroid. Care must be taken to ensure that no two identical centroids (descriptors and selected labels) are repeated in the initial clusters” [0040]. See Figure 4 depicting a method for clustering using k-means.).
	Denney teaches using k-means for clustering descriptors (‘features’) is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to combine the automated system of Burns with the k-means clustering technique disclosed by Denney to increase the interpretability of classification results. By using k-means to produce clusters, classification results are more interpretable which can help detect outliers and identify underlying patterns in the data. 
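For illustration only, the basic k-means procedure (assign each point to its nearest centroid, then recompute centroids) can be sketched as follows. This is plain Lloyd's algorithm on 2-D points with caller-supplied initial centroids, which is an assumption made for determinism; it does not implement the k-means++ initialization or the label-hierarchy handling described in Denney, and the function name and data are hypothetical.

```python
def kmeans(points, centroids, iterations=10):
    """Run Lloyd's algorithm on 2-D points; return (centroids, assignments)."""
    centroids = [tuple(c) for c in centroids]
    assignments = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid by squared distance.
        assignments = [
            min(
                range(len(centroids)),
                key=lambda k: (p[0] - centroids[k][0]) ** 2
                + (p[1] - centroids[k][1]) ** 2,
            )
            for p in points
        ]
        # Update step: each centroid becomes the mean of its assigned points.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                new_centroids.append(
                    (
                        sum(m[0] for m in members) / len(members),
                        sum(m[1] for m in members) / len(members),
                    )
                )
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, assignments


# Hypothetical feature points forming two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
final_centroids, labels = kmeans(points, centroids=[(0, 0), (10, 10)])
```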

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PEDRO J MORALES whose telephone number is (571)272-6106. The examiner can normally be reached 8:30 AM - 6:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, MIRANDA M HUANG can be reached at (571)270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/PEDRO J MORALES/Examiner, Art Unit 2124
/VINCENT GONZALES/Primary Examiner, Art Unit 2124
Prosecution Timeline

Dec 23, 2022: Application Filed
Oct 23, 2025: Non-Final Rejection mailed (§103)
Jan 20, 2026: Examiner Interview Summary
Jan 20, 2026: Applicant Interview (Telephonic)
Jan 22, 2026: Response Filed
Mar 31, 2026: Final Rejection mailed (§103, current)
Precedent Cases

Applications granted by this same examiner with similar technology

17/771,051 (Patent 12639625): BIAS ADJUSTMENT DEVICE, INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND INFORMATION PROCESSING PROGRAM. Granted May 26, 2026; 4y 1m to grant.
17/724,539 (Patent 12591803): SYSTEMS AND METHODS FOR APPLYING MACHINE LEARNING BASED ANOMALY DETECTION IN A CONSTRAINED NETWORK. Granted Mar 31, 2026; 3y 11m to grant.
17/514,297 (Patent 12530412): SEARCH-QUERY SUGGESTIONS USING REINFORCEMENT LEARNING. Granted Jan 20, 2026; 4y 2m to grant.
17/840,851 (Patent 12524673): MULTITASK DISTRIBUTED LEARNING SYSTEM AND METHOD BASED ON LOTTERY TICKET NEURAL NETWORK. Granted Jan 13, 2026; 3y 7m to grant.
Study what changed to get past this examiner. Based on 4 most recent grants.
Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 67%
With Interview (+50.0%): 99%
Median Time to Grant: 3y 9m (~3m remaining)
PTA Risk: Moderate
Based on 9 resolved cases by this examiner. Grant probability derived from career allowance rate.