DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Status of the Claims
Claims 1-10 are pending and under consideration in this action.
Priority
The instant application is a national stage entry under 35 U.S.C. 371 of PCT/CN2021/106860, filed 7/16/2021, which claims priority to Chinese Application Number 202110486200.2, filed 4/30/2021, as reflected in the filing receipt mailed on 3/13/2024. Acknowledgment is made of Applicant's claim for foreign priority under 35 U.S.C. 119(a)-(d). Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55. The claims to the benefit of priority are acknowledged, and the effective filing date of claims 1-10 is 4/30/2021.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 9/9/2022 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the IDS has been considered by the examiner.
Claim Objections
Claims 3-4 and 6-10 are objected to because of the following informalities:
Claim 3 is missing an “and” between the limitation “during the prediction, …then performing the prediction of disease risk by a Softmax classifier;” and “adopting a weighing…”.
Claims 4 and 6-9 recite the phrase “…and removing dirty read”, which should be corrected to “…and removing dirty reads”, for clarity.
Claims 3, 6, and 8-10 recite the phrase “into a/the fully connected dence layer”, which should be corrected to “into a/the fully connected dense layer” for clarity.
Claim 6 is also missing a semicolon between the limitations “the system further comprises…removing dirty read and” and “the system further comprises…prediction results of disease risk”.
Claims 9 and 10 are missing a semicolon after the limitation “the operation of extracting the fusion features comprises … extracting fusion features by using the piecewise pooling operation”.
Appropriate correction is required.
Claim Rejections - 35 USC § 112(b)
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 4 and 7-10 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claims 4 and 7-10 recite the phrase “…and removing dirty read”. The term “dirty” is a relative term which renders the claims indefinite. The term “dirty” is not defined by the claims, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. The Specification (Pg. 10, Lines 7-11) reiterates that the data cleaning module performs operations to remove dirty reads; however, the Specification does not provide, for example, any parameters that define what makes a read “dirty”. This rejection can be overcome by amendment of claims 4 and 7-10 to clarify what renders a read “dirty”.
Claims 9 and 10 recite the phrases “the FCN is used for extracting structured data features”, “the BERT is used for extracting unstructured features”, “connecting the unstructured data features and the structured data features in parallel along the specified dimension, reducing the imbalance rate through the method of analyzing minority class sample data, and newly generating the sample of the class by using the SMOTE, and then extracting fusion features by using the piecewise pooling operation”, “obtaining EHR data of the patient with the known disease risk outcome”, “building the label set based on the known outcome”, “connected with the feature fusion module in series at the decision layer”, and “implemented based on the Pytorch framework” in lines 17-23, 27, 30, and 35-36 (claim 9) or lines 18-23, 28, 31, and 36-37 (claim 10), respectively. There is insufficient antecedent basis for these limitations, since there is no prior mention of these phrases in claim 1, from which these claims depend. This rejection can be overcome by amendment of claims 9 and 10 to recite “an FCN is used for extracting structured data features”, “a BERT is used for extracting unstructured features”, “connecting the unstructured data features and the structured data features in parallel along a specified dimension, reducing an imbalance rate through a method of analyzing minority class sample data, and newly generating the sample of the class by using a SMOTE, and then extracting fusion features by using a piecewise pooling operation”, “obtaining EHR data of a patient with a known disease risk outcome”, “building a label set based on the known outcome”, “connected with the feature fusion module in series at a decision layer”, and “implemented based on a Pytorch framework”.
Applicant is kindly reminded that any amendment must find adequate support in the Specification as originally filed.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-10 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claims recite both (1) mathematical concepts (mathematical relationships, formulas or equations, or mathematical calculations) and (2) mental processes, i.e., concepts performed in the human mind (including observations, evaluations, judgments or opinions) (see MPEP § 2106.04(a)).
Step 1:
In the instant application, claims 1-4 and 7-8 are directed towards a process, claims 5-6 are directed towards a system, and claim 9 is directed towards a machine, which falls into one of the categories of statutory subject matter (Step 1: YES).
Claim 10 is directed towards a computer readable storage medium, which does not fall within one of the categories of statutory subject matter (Step 1: NO). Regarding claim 10, the BRI of computer-readable medium encompasses non-statutory forms of signal transmission and therefore equates to “signals per se”. Claims that equate to “signals per se” are not a statutory category of invention (see MPEP § 2106.03). However, claim 10 could be amended to be statutory subject matter by replacing the phrase “computer readable storage medium” with the phrase “non-transitory computer-readable storage medium”. Nonetheless, this amendment would still result in a rejection of the claim under 35 U.S.C. 101 for recitation of a judicial exception without significantly more. In the interest of compact prosecution, claim 10 has been analyzed using the Alice/Mayo two-part test below.
Step 2A, Prong One:
In accordance with MPEP § 2106, claims found to recite statutory subject matter (Step 1: YES) are then analyzed to determine if the claims recite any concepts that equate to an abstract idea, law of nature or natural phenomenon (Step 2A, Prong One). The following instant claims recite limitations that equate to one or more categories of judicial exceptions:
Claim 1 recites a mathematical concept (i.e., inputting data into a model) in “inputting the EHR data into a disease risk prediction model to obtain a disease risk prediction result”; a mental process (i.e., an evaluation of the data for extraction) in “extracting structured data features and unstructured data features”; a mental process (i.e., combining data and an evaluation of data for extraction) in “fusing the structured data features and the unstructured data features, and extracting fusion features”; and a mental process (i.e., an evaluation of the fusion features) in “decision-making on the fusion features to obtain the disease risk prediction result”.
Claim 3 recites a mathematical concept in “wherein, an operation of extracting the fusion features comprises: connecting the unstructured data features and the structured data features in parallel along a specified dimension, reducing an imbalance rate through a method of analyzing minority class sample data and newly generating a sample of the class by using Synthetic Minority Oversampling Technique (SMOTE), then extracting the fusion features by using a piecewise pooling operation”; a mathematical concept in “performing the prediction of disease risk by a Softmax classifier”; and a mathematical concept in “adopting a weighting of a cross-entropy loss and a hinge loss to jointly constrain the disease risk prediction model”.
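For illustration of why the recited joint constraint is treated as a mathematical concept, a minimal sketch of a weighted cross-entropy/hinge loss follows. This sketch is the examiner's, not the applicant's disclosed implementation; the weighting coefficient alpha and the margin of 1.0 are illustrative assumptions, as the claim recites only “a weighting”:

```python
import math

def softmax(scores):
    # Normalize raw class scores into probabilities
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(scores, label):
    # Negative log-probability of the true class
    return -math.log(softmax(scores)[label])

def hinge(scores, label, margin=1.0):
    # Margin-based penalty between the true-class score and the best other class
    other = max(s for i, s in enumerate(scores) if i != label)
    return max(0.0, margin - (scores[label] - other))

def joint_loss(scores, label, alpha=0.7):
    # alpha is a hypothetical weighting; the claim does not fix its value
    return alpha * cross_entropy(scores, label) + (1 - alpha) * hinge(scores, label)
```

Each operation above is a pure numerical calculation, consistent with the characterization of this limitation as a mathematical concept.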
Claim 4 recites a mental process (i.e., an evaluation of data to remove outliers/dirty reads) and a mathematical concept (i.e., calculating means to replace values) “wherein, the prediction model of disease risk further comprises a step of performing a data cleaning before extracting the structured data features and the unstructured data features; the data cleaning comprises replacing outlier values, completing missing values using mean values, and removing dirty read”; and a mental process (i.e., an observation of the data) in “the unstructured data is a text”.
Claim 5 recites a mental process (i.e., an evaluation of data for extraction) in “a feature extraction module, for extracting features on EHR data to obtain unstructured data features and structured data features”; a mental process (i.e., combining data and an evaluation of data for extraction) in “a feature fusion module, for fusing the unstructured data features and the structured data features to extract and obtain fusion features”; and a mental process (i.e., classifying input data to obtain disease risk) in “a classification module, for obtaining a disease risk prediction result by using the extracted fusion features as an input”.
Claim 6 recites a mental process (i.e., an evaluation of the extraction module) in “wherein, the feature extraction module comprises a structured data feature extraction module and an unstructured data feature extraction module”; a mathematical concept in “the feature fusion module connects the unstructured data features and the structured data features in parallel along a specified dimension, reduces an imbalance rate through a method of analyzing minority class sample data and newly generating the sample of the class by using a SMOTE, and then extracts the fusion features by using a piecewise pooling operation”; a mathematical concept in “predicts an outcome of a patient through a Softmax classifier”; a mental process (i.e., an evaluation of data to remove outliers/dirty reads) and a mathematical concept (i.e., calculating means to replace values) in “data cleaning module for preprocessing the EHR data after obtaining the EHR data and before performing the feature extraction on the EHR data; wherein, the preprocessing comprises the EHR data cleaning module performing operations of replacing outlier values, completing missing values using mean values, and removing dirty reads”.
Claims 7 and 9-10 recite a mental process (i.e., an evaluation of the data to perform processing/cleaning) in “performing data processing on structured data and unstructured data separately, including performing data cleaning to obtain cleaned structured data and cleaned unstructured data”; a mental process (i.e., an evaluation of data for extracting features) in “performing feature extraction to obtain unstructured data features and structured data features”; a mental process (i.e., combining data and evaluating the data to extract features) in “fusing unstructured data features and structured data features, and extracting fused features”; a mental process (i.e., an evaluation of the data for use in medicine) in “using the fused features as data to be identified for medical purposes”; a mental process (i.e., an evaluation of data to remove outliers/dirty reads) and a mathematical concept (i.e., calculating means to replace values) in “the data cleaning comprises replacing outlier values, completing missing values using mean values, and removing dirty reads”; a mental process (i.e., an observation of the unstructured data) in “the unstructured data is text”; and a mathematical concept in “connecting the unstructured data features and the structured data features in parallel along a specified dimension, reducing an imbalance rate through a method of analyzing minority class sample data, and newly generating the sample of the class by using a SMOTE, and then extracting fusion features by using a piecewise pooling operation”.
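The “piecewise pooling operation” referenced in these limitations can be sketched, for illustration only, as splitting the fused feature vector into segments and pooling each segment. The number of pieces and the use of max-pooling are assumptions of this sketch; the claims do not specify them:

```python
def piecewise_pool(features, pieces=3):
    # Split the fused feature vector into equal segments and max-pool each
    # segment; pieces=3 and max-pooling are illustrative choices only.
    # Any remainder elements beyond an even split are dropped.
    size = len(features) // pieces
    return [max(features[i * size:(i + 1) * size] or [0.0]) for i in range(pieces)]
```

So characterized, the operation reduces to selecting maxima over fixed index ranges, i.e., a mathematical calculation.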
Claims 8 and 9-10 recite a mathematical concept (i.e., building a dataset or label set) in “building a dataset based on the obtained EHR data, the dataset comprises a structured dataset and an unstructured dataset; and building a label set based on a known outcome”; a mental process (i.e., an evaluation of the data to extract features or fuse data) in “building a feature extraction module for extracting features of the structured data, a feature extraction module for extracting features of the unstructured data and a feature fusion model”; a mathematical concept (i.e., building the network in parallel or in series) in “connecting the structured data feature extraction module and the feature extraction module unstructured data in parallel and then being connected with the feature fusion module in series at a decision layer”; a mental process (i.e., an observation of the implementation) in “the disease risk prediction network is implemented based on a Pytorch framework”; a mental process (i.e., an evaluation of data to remove outliers/dirty reads) and a mathematical concept (i.e., calculating means to replace values) in “before the dataset being built, further comprising a step of performing the data cleaning on the obtained EHR data, wherein the data cleaning comprises replacing outlier values, completing missing values using mean values, and removing dirty reads”; a mathematical concept in “connecting the unstructured data features and the structured data features in parallel along a specified dimension, reducing an imbalance rate through a method of analyzing minority class sample data and newly generating samples of the class by using a SMOTE, and then extracting the fusion features by using a piecewise pooling operation”; and a mathematical concept in “when using the dataset for training, inputting the fusion features, as an input, into a fully connected dense layer to train a Softmax classifier”.
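The SMOTE step recited in these claims generates synthetic minority-class samples by interpolating between a minority sample and one of its minority-class neighbors. A minimal sketch follows; neighbor selection is omitted, and the use of a single random interpolation factor per sample is an assumption of this sketch rather than a requirement of the claims:

```python
import random

def smote_sample(x, neighbor):
    # Synthetic minority-class sample at a random point on the line
    # segment between sample x and a minority-class neighbor
    gap = random.random()
    return [xi + gap * (ni - xi) for xi, ni in zip(x, neighbor)]
```

The interpolation is a straightforward arithmetic operation, which is why the SMOTE limitation is grouped with the mathematical concepts above.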
These recitations are similar to the concepts of collecting information, and displaying certain results of the collection and analysis in Electric Power Group, LLC, v. Alstom (830 F.3d 1350, 119 USPQ2d 1739 (Fed. Cir. 2016)), comparing information regarding a sample or test to a control or target data in Univ. of Utah Research Found. v. Ambry Genetics Corp. (774 F.3d 755, 113 U.S.P.Q.2d 1241 (Fed. Cir. 2014)) and Association for Molecular Pathology v. USPTO (689 F.3d 1303, 103 U.S.P.Q.2d 1681 (Fed. Cir. 2012)), and organizing and manipulating information through mathematical correlations in Digitech Image Techs., LLC v Electronics for Imaging, Inc. (758 F.3d 1344, 111 U.S.P.Q.2d 1717 (Fed. Cir. 2014)), which the courts have identified as concepts that can be practically performed in the human mind or as mathematical relationships.
The abstract ideas recited in the claims are evaluated under the broadest reasonable interpretation (BRI) of the claim limitations when read in light of and consistent with the specification, and are determined to be directed to mental processes that in the simplest embodiments are not too complex to practically perform in the human mind. Additionally, the recited limitations that are identified as judicial exceptions from the mathematical concepts grouping of abstract ideas are abstract ideas irrespective of whether or not the limitations are practical to perform in the human mind.
Specifically, independent claims 1 and 5 involve nothing more than inputting data into a model, extracting data features, and drawing conclusions based on the features. Since there are no specifics in the methodology recited in claims 1 and 5, the step reciting “into a disease risk prediction model to obtain a disease risk prediction result” is, under the BRI, performed using mathematical operations. Additionally, since there are no specifics recited, the steps of extracting features and performing decision-making on the features are something that, under the BRI, one could perform mentally. Therefore, the claimed steps are not further defined beyond something that reads on performing a calculation using a computer as a tool, and merely looking at data and making a determination.
Independent claim 7 involves nothing more than processing data, cleaning data, extracting data features, and using an algorithm to process the data features. Since there are no specifics recited, the steps of processing data, cleaning data, and extracting data features are something that, under the BRI, one could perform mentally. Additionally, the SMOTE algorithm is a mathematical operation performed using a computer as a tool. Therefore, the claimed steps are not further defined beyond something that reads on analyzing and extracting data, and performing a calculation using a computer as a tool.
Independent claim 8 involves nothing more than building datasets, building a model (including extracting features from data), data cleaning, and using an algorithm to process the data features. Since there are no specifics recited, building the dataset and building the model, including extracting features from the data, are something that, under the BRI, one could perform mentally or using a generic computer as a tool. Additionally, the SMOTE algorithm and Softmax classifier are mathematical operations performed using a computer as a tool. Therefore, the claimed steps are not further defined beyond something that reads on building a dataset, building a model, extracting data, and performing a calculation using a computer as a tool. As such, said steps are directed to judicial exceptions. The instant claims must therefore be examined further to determine whether they integrate the abstract idea into a practical application (Step 2A, Prong One: YES).
Step 2A, Prong Two:
In determining whether a claim is directed to a judicial exception, further examination is performed that analyzes if the claim recites additional elements that when examined as a whole integrates the judicial exception(s) into a practical application (MPEP § 2106.04(d)). A claim that integrates a judicial exception into a practical application will apply, rely on, or use the judicial exception in a manner that imposes a meaningful limit on the judicial exception. The claimed additional elements are analyzed to determine if the abstract idea is integrated into a practical application (MPEP § 2106.04(d)(I)). If the claim contains no additional elements beyond the abstract idea, the claim fails to integrate the abstract idea into a practical application (MPEP § 2106.04(d)(III)). The following independent claims recite limitations that equate to additional elements:
Claim 1 recites “obtaining electronic health record (EHR) data of a patient, comprising structured and unstructured data”; and “outputting the disease risk prediction result”.
Claim 7 recites “obtaining EHR data, the EHR data comprising structured data and unstructured data”; “an FCN is used for extracting structured data features”; and “a BERT is used for extracting unstructured features”.
Claim 8 recites “obtaining EHR data of a patient with a known disease risk outcome, the data comprising structured data and unstructured data”; “training the disease risk prediction network using the datasets (the structured dataset and the unstructured dataset) with the label set as a label to build the disease risk prediction model”; “the structured data feature extraction module is an FCN module”; and “the unstructured data feature extraction module is a BERT module”.
Regarding the above cited limitations in claims 1, 7, and 8 of (i) obtaining electronic health record data, comprising structured and unstructured data (claims 1, 7, and 8); (ii) an FCN is used for extracting structured data features (claims 7 and 8); (iii) a BERT is used for extracting unstructured features (claims 7 and 8); and (iv) training the disease risk prediction network using the datasets (the structured dataset and the unstructured dataset) with the label set as a label to build the disease risk prediction model (claim 8): these limitations equate to insignificant, extra-solution activity of mere data gathering because they gather data before or after the recited judicial exceptions of using a disease risk prediction model to obtain a disease risk prediction result (see MPEP § 2106.04(d)).
Regarding the above cited limitation in claim 1 of (v) outputting the disease risk prediction result: this limitation equates to an extra-solution activity of generally outputting a result, which is incidental to the primary process of using a disease risk prediction model to obtain a disease risk prediction result (see MPEP § 2106.05(g)).
Additionally, none of the recited dependent claims recite additional elements which would integrate the judicial exception into a practical application. Specifically, claim 2 further limits the extraction of structured and unstructured data features; claim 3 further limits the prediction using fusion features as input; claim 6 further limits the structured and unstructured data extraction modules and the prediction using fusion features as input, as well as recites extra-solution activities analogous to claim 1 above; and claims 9-10 recite generic computer components that equate to mere instructions to implement an abstract idea on a generic computer. As such, claims 1-10 are directed to an abstract idea (Step 2A, Prong Two: NO).
Step 2B:
Claims found to be directed to a judicial exception are then further evaluated to determine if the claims recite an inventive concept that provides significantly more than the judicial exception itself (Step 2B). The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The instant independent claims recite the same additional elements described in Step 2A, Prong Two above.
Regarding the above cited limitations in claims 1, 7, and 8 of (i) obtaining electronic health record data, comprising structured and unstructured data: these limitations do not include any specific steps for generating or acquiring the unstructured/structured electronic health record data. Under the BRI, these limitations merely receive data for the subsequent step of using a disease risk prediction model to obtain a disease risk prediction result. Therefore, these limitations equate to receiving/transmitting data over a network, which the courts have established as a well-understood, routine, and conventional (WURC) limitation of a generic computer in buySAFE, Inc. v. Google, Inc., 765 F.3d 1350, 1355, 112 USPQ2d 1093, 1096 (Fed. Cir. 2014).
Regarding the above cited limitations of (ii) an FCN is used for extracting structured data features (claims 7 and 8); (iii) a BERT is used for extracting unstructured features (claims 7 and 8); (iv) training the disease risk prediction network using the datasets (the structured dataset and the unstructured dataset) with the label set as a label to build the disease risk prediction model (claim 8); and (v) outputting the disease risk prediction result (claim 1): these limitations, when viewed individually and in combination, are WURC limitations as taught by Wang et al. (Integrative Analysis of Patient Health Records and Neuroimages via Memory-Based Graph Convolutional Network. 2018 IEEE International Conference on Data Mining (ICDM), pg. 767-776 (2018)), Blinov et al. (Predicting Clinical Diagnosis from Patients Electronic Health Records Using BERT-Based Neural Networks. In: Artificial Intelligence in Medicine. AIME 2020. Lecture Notes in Computer Science, Vol 12299. Springer, Cham. Pg. 111-121 (2020)), and Zhang et al. (Combining structured and unstructured data for predictive models: A deep learning approach. BMC Med Inform Decis Mak. 20(1): 280 (2020)). Wang et al. discloses a method for classifying disease based on electronic health records, and uses an FCN for feature extraction of structured data (limitation (ii)) (Wang et al., Abstract and Pg. 772, Col. 2, Para. 3). Blinov et al. discloses a method of predicting diagnoses from textual electronic health records data using a bidirectional encoder representation from transformers (BERT) (limitation (iii)) (Blinov et al., Abstract). Zhang et al. discloses a deep learning method for combining structured and unstructured data from electronic health records for risk prediction tasks (Zhang et al., Abstract). Zhang et al. further discloses that the deep learning models are trained (limitation (iv)) (Zhang et al., Pg. 6, Col. 2, Para. 5-6). Zhang et al. further discloses the output performance for in-hospital mortality prediction (limitation (v)) (Zhang et al., Pg. 7, Table 2).
These additional elements do not comprise an inventive concept when considered individually or as an ordered combination that transforms the claimed judicial exception into a patent-eligible application of the judicial exception. Therefore, the instant claims do not amount to significantly more than the judicial exception itself (Step 2B: NO). As such, claims 1-10 are not patent eligible.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1, 5, and 9-10 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Zhang et al. (Combining structured and unstructured data for predictive models: A deep learning approach. BMC Med Inform Decis Mak. 20(1): 280 (2020); published 10/29/2020).
Regarding claim 1, Zhang et al. teaches a deep learning method for combining structured and unstructured data from electronic health records for risk prediction tasks (i.e., a method for predicting disease risk based on multimodal fusion) (Abstract). Zhang et al. further teaches the use of Medical Information Mart for Intensive Care III (MIMIC-III), a publicly available critical care database as input for the model. MIMIC-III comprises deidentified health-related data associated with over forty thousand patients who stayed in critical care units. This database includes patient health information such as demographics, vital signs, lab test results, medications, diagnosis codes, as well as clinical notes (i.e., obtaining electronic health record (EHR) data of a patient, comprising structured data and unstructured data) (Pg. 2, Col. 2, Para. 5). Zhang et al. further teaches that one of the benchmark prediction tasks is in-hospital mortality prediction (Pg. 3, Col. 2, Para. 1-2). The unstructured text and structured data are input into the CNN-based fusion-CNN to make predictions (i.e., inputting the EHR data into a disease risk prediction model to obtain a disease risk prediction result) (Pg. 5, Fig. 1). Zhang et al. further teaches that the output layer takes patient representation as input and makes predictions. Table 2 shows the performance of various models on the in-hospital mortality prediction task (i.e., outputting the disease prediction risk) (Pg. 6, Col. 1, Para. 6; Pg. 8, Col. 2, Para. 2; and Pg. 7, Table 2). Zhang et al. further teaches that patient features consist of features from both structured data (static information and temporal signals) and unstructured data (clinical text). For demographic information (i.e., structured data), patient’s age, gender, marital status, ethnicity, and insurance information are considered. For admission-related information (i.e., structured data), admission type is included as features. 
For temporal signals (i.e., structured data), they considered 7 frequently sampled vital signs: heart rate, systolic blood pressure (SysBP), diastolic blood pressure (DiasBP), mean arterial blood pressure (MeanBP), respiratory rate, temperature, SpO2; and 19 common lab tests. After feature selection, they extracted values of these time-series features up to the first 24 hours of each hospital admission. For clinical notes (i.e., unstructured data), they considered Nursing, Nursing/Other, Physician, and Radiology notes, because these kinds of notes are in the majority of clinical notes and are frequently recorded in MIMIC-III database. They only extracted the first 24 hours’ notes for each admission to enable early prediction of outcomes (i.e., extracting structured data features and unstructured data features) (Pg. 3, Col. 1, Para. 1-4). Zhang et al. further teaches that the final patient representation z is obtained by concatenating the representations of clinical text, temporal signals, along with static information. The representation of each patient is
z_p = [z_static; z_temporal; z_text], and the size of this vector is d_static + d_temporal + d_text. The patient representation is then fed to a final output layer to make predictions (i.e., fusing the structured data features and the unstructured data features and extracting fusion features) (Pg. 6, Col. 1, Para. 5 and Pg. 5, Fig. 1-2). Zhang et al. further teaches that the output layer takes patient representation as input and makes predictions. For each patient representation z_p, they have a task-specific target y, where y ∈ {0, 1} is a single binary label indicating whether in-hospital mortality occurs. For each prediction task, the output layer receives an instance of patient representation z_p as input and tries to predict the ground truth y. For binary classification tasks, the output layer is o = σ(Wz_p + b), where the W matrices and b vectors are the trainable parameters, and σ represents a sigmoid activation function (i.e., decision-making on the fusion features to obtain the disease risk prediction result) (Pg. 6, Col. 1, Para. 6 – Col. 2, Para. 2).
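For reference, the concatenation-and-output scheme Zhang et al. describes (z_p formed by concatenating the static, temporal, and text representations, then fed to a sigmoid output layer) can be sketched as follows. The vector and weight values used here are placeholders, not values taken from Zhang et al.:

```python
import math

def fuse(z_static, z_temporal, z_text):
    # z_p = [z_static; z_temporal; z_text]: simple concatenation, so the
    # fused vector has size d_static + d_temporal + d_text
    return z_static + z_temporal + z_text

def output_layer(z_p, W, b):
    # o = sigmoid(W . z_p + b) for a binary task such as
    # in-hospital mortality prediction
    score = sum(w * z for w, z in zip(W, z_p)) + b
    return 1.0 / (1.0 + math.exp(-score))
```

This illustrates why the fusion and decision-making limitations map onto Zhang et al.'s patient representation and output layer in the anticipation analysis above.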
Regarding claim 5, Zhang et al. teaches the limitations of a feature extraction module, for extracting features on EHR data to obtain unstructured data features and structured data features; a feature fusion module, for fusing the unstructured data features and the structured data features to extract and obtain fusion features; and a classification module for obtaining a disease risk prediction result by using the extracted fusion features as input as described for claim 1 above.
Regarding claim 9, Zhang et al. teaches that all experiments were performed on a 32-core Intel(R) Core(TM) i9-9960X CPU @ 3.10GHz machine with an NVIDIA TITAN RTX GPU processor (i.e., a computer device, comprising a memory and a processor, the memory storing a computer program, wherein, when the computer program being executed by the processor, implement steps of a method) (Pg. 10, Col. 1, Para. 4). Zhang et al. further teaches the steps of a method as claimed in claim 1 as described for claim 1 above.
Regarding claim 10, Zhang et al. teaches that all experiments were performed on a 32-core Intel(R) Core(TM) i9-9960X CPU @ 3.10GHz machine with NVIDIA TITAN RTX GPU processor (i.e., a computer readable storage medium having stored thereon computer program instructions, wherein, when the computer program instructions being executed by a processor, implement steps of a method) (Pg 10, Col. 1, Para. 4). Zhang et al. further teaches the steps of a method as claimed in claim 1 as described for claim 1 above.
Therefore, Zhang et al. teaches all the limitations in claims 1, 5, and 9-10.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
1. Claim 2 is rejected under 35 U.S.C. 103 as being unpatentable over Zhang et al. as applied to claims 1, 5, and 9-10 above, and further in view of Wang et al. (Integrative Analysis of Patient Health Records and Neuroimages via Memory-Based Graph Convolutional Network. 2018 IEEE International Conference on Data Mining (ICDM), Singapore, pp. 767-776 (2018); published 12/30/2018) and Blinov et al. (Predicting Clinical Diagnosis from Patients Electronic Health Records Using BERT-Based Neural Networks. In: Artificial Intelligence in Medicine. AIME 2020. Lecture Notes in Computer Science, Vol 12299. Springer, Cham. Pg. 111-121 (2020); published 9/26/2020).
Zhang et al., as applied to claims 1, 5, and 9-10 above, does not teach wherein, using a Fully Convolutional Network (FCN) to extract the structured data features; and using a Bidirectional Encoder Representation from Transformer (BERT) to extract the unstructured features.
Regarding claim 2, Wang et al. teaches a method for classifying disease based on electronic health records and medical images for patients (Abstract). Wang et al. further teaches the use of an FCN for feature extraction for the raw connectivity matrix (i.e., using a Fully Convolutional Network (FCN) to extract the structured data features) (Pg. 772, Col. 2, Para. 3).
Regarding claim 2, Blinov et al. teaches a method of predicting clinical diagnoses from textual Electronic Health Records data. They present a modification of Bidirectional Encoder Representations from Transformers (BERT) model for sequence classification that implements a novel way of Fully-Connected (FC) layer composition and a BERT model pretrained only on domain data (Abstract). The architecture of the model is shown in Fig. 1 (i.e., using a Bidirectional Encoder Representation from Transformer (BERT) to extract the unstructured features) (Pg. 116, Fig. 1).
Therefore, regarding claim 2, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the deep learning method for combining structured and unstructured data from electronic health records of Zhang et al. with the FCN of Wang et al. because the method of Wang et al. improves the classification of Parkinson’s disease patients over healthy controls compared to conventional approaches (Wang et al., Pg. 775, Col. 1, Para. 1). It would have also been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the deep learning method of Zhang et al. with the BERT model of Blinov et al. because using a BERT model with a vocabulary and pretraining dataset tailored to medical texts representation improves performance of the classification tasks, specifically for less prevalent diseases (Blinov et al., Pg. 120, Para. 1). One of ordinary skill in the art would be able to combine the teachings of Zhang et al. with Wang et al. and Blinov et al. with reasonable expectation of success due to the same nature of the problem to be solved, since all three are drawn towards a method for extracting unstructured and/or structured data from electronic health records. Therefore, regarding claim 2, the instant invention is prima facie obvious (MPEP § 2142).
2. Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Zhang et al. as applied to claims 1, 5, and 9-10 above, and further in view of Kim et al. (A Comparison of Oversampling Methods for Constructing a Prognostic Model in the Patient with Heart Failure, 2020 International Conference on Information and Communication Technology Convergence (ICTC) (2020), Pg. 379-83; published 12/21/2020), Wang et al. (Integrative Analysis of Patient Health Records and Neuroimages via Memory-Based Graph Convolutional Network. 2018 IEEE International Conference on Data Mining (ICDM), pp. 767-776 (2018); published 12/30/2018), and Liu et al. (Kernel-blending connection approximated by a neural network for image classification. In Computational Visual Media, 6(4): 467-476, (2020); published 12/2020).
Regarding claim 3, Zhang et al. teaches the incorporation of max-pooling layers in the fusion networks (i.e., extracting the fusion features by using a piecewise pooling operation) (Pg. 6, Col. 1, Para. 3-4). Zhang et al. further teaches that loss functions are binary cross entropy loss (Pg. 6, Col. 2, Para. 2).
Zhang et al., as applied to claims 1, 5, and 9-10 above, does not teach connecting the unstructured data features and the structured data features in parallel along a specified dimension, reducing an imbalance rate through a method of analyzing minority class sample data and newly generating a sample of the class by using Synthetic Minority Oversampling Technique (SMOTE); during the prediction, inputting the fusion features as an input into a fully connected dense layer, and then performing the prediction of disease risk by a Softmax classifier; and adopting a weighting of a cross-entropy loss and a hinge loss to jointly constrain the disease risk prediction model.
Regarding claim 3, Kim et al. teaches a method for increasing the reliability of prognostic models using electronic health record data, by performing a comparative analysis of renowned oversampling methods (Abstract). Kim et al. further teaches that SMOTE is an over-sampling method that generates data on the line portion connecting minority class samples and its k nearest neighbors (KNN). First, random data is selected from the minority samples. Next, the KNN are randomly chosen. In SMOTE, a new synthetic minority class sample (x_new) is generated, which lies on the line segment between x_i and x_k, as x_new = x_i + (x_k - x_i) × δ, wherein x_i is minority class random data, k is the hyper-parameter of KNN, x_k is a KNN of x_i, and δ is a random value between 0 and 1 (i.e., connecting the unstructured data features and the structured data features in parallel along a specified dimension, reducing an imbalance rate through a method of analyzing minority class sample data and newly generating a sample of the class by using Synthetic Minority Oversampling Technique (SMOTE)) (Pg. 380, Col. 1, Para. 1).
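The SMOTE interpolation step described above can be sketched as follows (a minimal illustration using the standard SMOTE formulation x_new = x_i + (x_k - x_i) × δ; the sample vectors are hypothetical, and this is not Kim et al.'s code):

```python
import random

def smote_sample(x_i, x_k, rng=random.random):
    # Generate one synthetic minority sample on the line segment
    # between a minority sample x_i and one of its k nearest
    # neighbors x_k, with delta drawn uniformly from [0, 1].
    delta = rng()
    return [a + (b - a) * delta for a, b in zip(x_i, x_k)]

# Hypothetical 2-dimensional minority sample and neighbor
x_new = smote_sample([1.0, 2.0], [3.0, 4.0])
```

With delta = 0 the synthetic point coincides with x_i, and with delta = 1 it coincides with x_k, so every generated sample stays on the connecting segment.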
Regarding claim 3, Wang et al. teaches that the output component is composed of a fully connected layer and a SoftMax relationship for classification of Parkinson's disease (i.e., inputting the fusion features as an input into a fully connected dense layer, and then performing the prediction of disease risk by a Softmax classifier) (Abstract and Pg. 768, Col. 1, Para. 4).
Regarding claim 3, Liu et al. teaches a novel loss function involving a cross-entropy loss and a hinge loss to improve the generalizability of the neural network (Abstract). Liu et al. further teaches that by combining the cross-entropy loss and the improved hinge loss, the proposed loss function is defined as J = J_s + J_h. When applying the loss function for training, the weights and biases in the feature extraction layers and kernel mapping layer are learned by backpropagating the gradients from the linear classification layer (i.e., adopting a weighting of a cross-entropy loss and a hinge loss to jointly constrain the disease risk prediction model) (Pg. 471, Col. 1, Para. 1).
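The cited combined objective J = J_s + J_h can be illustrated as follows (a minimal sketch only; the weighting parameters alpha and beta are hypothetical additions reflecting the claim's recited "weighting", and this is not Liu et al.'s exact formulation):

```python
import math

def cross_entropy(p, y):
    # J_s: binary cross-entropy between predicted probability p and label y
    eps = 1e-12  # guard against log(0)
    return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))

def hinge(score, y):
    # J_h: hinge loss, with the {0, 1} label mapped to {-1, +1}
    t = 1.0 if y == 1 else -1.0
    return max(0.0, 1.0 - t * score)

def combined_loss(p, score, y, alpha=1.0, beta=1.0):
    # J = alpha * J_s + beta * J_h: jointly constrains the model with
    # both the probabilistic and the margin-based objective.
    return alpha * cross_entropy(p, y) + beta * hinge(score, y)
```

The cross-entropy term penalizes poorly calibrated probabilities (empirical risk), while the hinge term enforces a classification margin (structural risk), matching the stated motivation of minimizing both.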
Therefore, regarding claim 3, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the deep learning method for combining structured and unstructured data from electronic health records of Zhang et al. with the SMOTE method of Kim et al. because the method of Kim et al. incorporates an oversampling method to improve the reliability of prognostic models using electronic health record data (Kim et al., Abstract). One of ordinary skill in the art would be able to combine the teachings of Zhang et al. with Kim et al. with reasonable expectation of success due to the same nature of the problem to be solved, since both incorporate a method for processing electronic health records data.
It would have also been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the deep learning method for combining structured and unstructured data from electronic health records of Zhang et al. with the prediction method of Wang et al. because the method of Wang et al. improves the classification of Parkinson’s disease patients over healthy controls compared to conventional approaches (Wang et al., Pg. 775, Col. 1, Para. 1). One of ordinary skill in the art would be able to combine the teachings of Zhang et al. with Wang et al. with reasonable expectation of success due to the same nature of the problem to be solved, since both are drawn towards a method for extracting unstructured data from electronic health records.
It would have also been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the deep learning method for combining structured and unstructured data from electronic health records of Zhang et al. with the combined loss function of Liu et al. because the combined loss function improves the generalizability of the neural network, minimizing both empirical and structural risks (Liu et al., Abstract and Pg. 468, Col. 2, Para. 1). One of ordinary skill in the art would be able to combine the teachings of Zhang et al. with Liu et al. with reasonable expectation of success due to the same nature of the problem to be solved, since both incorporate a method for minimizing loss. Therefore, regarding claim 3, the instant invention is prima facie obvious (MPEP § 2142).
3. Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Zhang et al. as applied to claims 1, 5, and 9-10 above, and further in view of Shi et al. (DeepDiagnosis: DNN-Based Diagnosis Prediction from Pediatric Big Healthcare Data. 2018 Sixth International Conference on Advanced Cloud and Big Data (CBD). Pg. 287-292 (2018); published 11/11/2018).
Regarding claim 4, Zhang et al. teaches that for temporal signals (i.e., structured data), they use values of these time-series features up to the first 24 hours of each hospital admission. For each temporal signal, the average is used to represent the signal at each timestep (hour). Then, each temporal variable was normalized using min-max normalization. To handle missing values, they simply use “0” to impute. For clinical notes, only the first 24 hours' notes for each admission are used to enable early prediction of outcomes (i.e., the prediction model of disease risk further comprises a step of performing a data cleaning before extracting the structured data features and the unstructured data features) (Pg. 3, Col. 1, Para. 3-4). Though not explicitly disclosed by Zhang et al., it would have been obvious to one of ordinary skill in the art to use the average values calculated during the temporal signal preprocessing as the missing values, since the average temporal signals were already calculated on an hourly basis (i.e., the data cleaning comprises completing missing values using mean values) (Pg. 3, Col. 1, Para. 3). Zhang et al. further teaches the incorporation of sequential unstructured notes, including Nursing, Nursing/Other, Physician, and Radiology notes (i.e., the unstructured data is a text) (Pg. 3, Col. 1, Para. 4).
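The mean-value imputation discussed above can be sketched as follows (a minimal illustration under the assumption that missing hourly values in a temporal signal are represented as None; this is not code from Zhang et al.):

```python
def impute_with_mean(series):
    # Replace missing entries (None) in an hourly temporal signal
    # with the mean of the observed values; an all-missing signal
    # falls back to 0.0, mirroring the cited "0" imputation.
    observed = [v for v in series if v is not None]
    mean = sum(observed) / len(observed) if observed else 0.0
    return [mean if v is None else v for v in series]

cleaned = impute_with_mean([1.0, None, 3.0])  # missing hour filled with 2.0
```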
Zhang et al., as applied to claims 1, 5, and 9-10 above, does not teach the data cleaning comprises replacing outlier values and removing dirty reads.
Regarding claim 4, Shi et al. teaches a deep neural network-based diagnosis prediction algorithm by mining pediatric EHRs (Abstract). Shi et al. further teaches the data preprocessing, which includes removing incomplete and incorrect data. In the outpatient department, when a patient requires a laboratory test only, there is an outpatient record without a diagnosis code, which interferes with their experiment. They also exclude those records which lack key information such as clinical symptoms. In addition, they removed inevasible bias such as systematic errors in the free-text records, which might result in significant bias when EHRs data are used naively for clinical research (i.e., the data cleaning comprises replacing outlier values and removing dirty reads) (Pg. 290, Col. 1, Para. 3 – Col. 2, Para. 1).
Therefore, regarding claim 4, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the deep learning method for combining structured and unstructured data from electronic health records of Zhang et al. with the data cleaning of Shi et al. because the data cleaning of Shi et al. eliminates systemic errors and significant bias in the electronic health record data (Shi et al., Pg. 290, Col. 1, Para. 3 – Col. 2, Para. 1). One of ordinary skill in the art would be able to combine the teachings of Zhang et al. with Shi et al. with reasonable expectation of success due to the same nature of the problem to be solved, since both incorporate a method for data cleaning for electronic health records. Therefore, regarding claim 4, the instant invention is prima facie obvious (MPEP § 2142).
4. Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Zhang et al. as applied to claims 1, 5, and 9-10 above, and further in view of Wang et al. (Integrative Analysis of Patient Health Records and Neuroimages via Memory-Based Graph Convolutional Network. 2018 IEEE International Conference on Data Mining (ICDM), Singapore, pp. 767-776 (2018); published 12/30/2018), Blinov et al. (Predicting Clinical Diagnosis from Patients Electronic Health Records Using BERT-Based Neural Networks. In: Artificial Intelligence in Medicine. AIME 2020. Lecture Notes in Computer Science, Vol 12299. Springer, Cham. Pg. 111-121 (2020); published 9/26/2020), Rasmy et al. (Med-BERT: pre-trained contextualized embeddings on large-scale structured electronic health records for disease prediction. arXiv: 2005.12833 (2020); published 5/22/2020), Kim et al. (A Comparison of Oversampling Methods for Constructing a Prognostic Model in the Patient with Heart Failure, 2020 International Conference on Information and Communication Technology Convergence (ICTC) (2020), Pg. 379-83; published 12/21/2020), and Shi et al. (DeepDiagnosis: DNN-Based Diagnosis Prediction from Pediatric Big Healthcare Data. 2018 Sixth International Conference on Advanced Cloud and Big Data (CBD). Pg. 287-292 (2018); published 11/11/2018).
Regarding claim 6, Zhang et al. teaches that the unstructured text and structured data are processed for features separately before combining as a patient representation (i.e., the feature extraction module comprises a structured data feature extraction module and an unstructured data feature extraction module) (Pg. 5, Fig. 1). Zhang et al. further teaches the limitation of extracts the fusion features by using a piecewise pooling operation as described for claim 3 above. Zhang et al. further teaches the limitations of a data acquisition module for obtaining the EHR data and a result output module for outputting the prediction results of disease risk as described for claim 1 above. Zhang et al. further teaches the limitation of a data cleaning module for preprocessing the EHR data after obtaining the EHR data and before performing the feature extraction on the EHR data; wherein the preprocessing comprises EHR data cleaning module performing operations of completing missing values using mean values as described for claim 4 above.
Zhang et al., as applied to claims 1, 5, and 9-10 above, does not teach wherein, the structured data feature extraction module uses a pre-processed structured data as an input of an FCN, maps the data to each hidden semantic node, and obtains the structured data features; wherein, the unstructured data feature extraction module uses a BERT to extract features of the unstructured data; the BERT comprises a BERT Encoder comprising multiple BERT Layers, and each the BERT Layer is an Encoder Block in a Transformer; each the Encoder Block comprises two layers being a self-attentive mechanism layer and a feed-forward neural network layer, separately; the feature fusion module connects the unstructured data features and the structured data features in parallel along a specified dimension, reduces an imbalance rate through a method of analyzing minority class sample data and newly generating the sample of the class by using a SMOTE; the classification module inputs the fusion features or the structured data as an input into a fully connected dense layer, and then predicts an outcome of a patient through a Softmax classifier; and the EHR data cleaning module performing operations of replacing outlier values and removing dirty reads.
Regarding claim 6, Wang et al. teaches the limitation of wherein the structured data feature extraction module uses pre-processed structured data as an input of an FCN and obtains the structured data features as described for claim 2 above. Though not explicitly taught by Wang et al., it is obvious to one of ordinary skill in the art that fully convolutional neural networks contain hidden layers (i.e., maps the data to each hidden semantic node). Wang et al. further teaches the limitation of the classification module inputs the fusion features or the structured data as an input into a fully connected dense layer, and then predicts an outcome of a patient through a Softmax classifier as described for claim 3 above.
Regarding claim 6, Blinov et al. teaches the architecture of the BERT model in Fig. 1. The model input is single sentence tokens (i.e., the unstructured data), corresponding to text from the symptoms and anamnesis fields of a patient's visit to the doctor (i.e., wherein the unstructured data feature extraction module uses a BERT to extract features of the unstructured data). The model contains multiple layers and encoders (i.e., the BERT comprises a BERT Encoder comprising multiple BERT Layers) (Pg. 116, Fig. 1 and Pg. 115, Para. 5). As a base tokenizer (with the vocabulary of approximately 120k tokens) and a model, they used RuBERT (architecturally the same as base BERT model for the English: 12 transformer block layers, hidden size H=768 and 12 self-attention heads) because it significantly outperformed the multilingual variant of BERT (i.e., each the BERT Layer is an Encoder Block in a Transformer; each the Encoder Block comprises two layers being a self-attentive mechanism layer) (Pg. 116, Para. 2).
Regarding claim 6, Rasmy et al. teaches the addition of a feed-forward layer (FFL) to average the outputs from all of the visits to represent a sequence, instead of using only a single token (i.e., a feed-forward neural network layer) (Pg. 7, Para. 2).
Regarding claim 6, Kim et al. teaches the limitation of connecting the unstructured data features and the structured data features in parallel along a specified dimension, reducing an imbalance rate through a method of analyzing minority class sample data, and newly generating the sample of the class by using a SMOTE as described for claim 3 above.
Regarding claim 6, Shi et al. teaches the limitation of the data cleaning comprises replacing outlier values and removing dirty reads as described for claim 4 above.
Therefore, regarding claim 6, it would have also been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the deep learning method for combining structured and unstructured data from electronic health records of Zhang et al. with the FCN of Wang et al. because the method of Wang et al. improves the classification of Parkinson’s disease patients over healthy controls compared to conventional approaches (Wang et al., Pg. 775, Col. 1, Para. 1). One of ordinary skill in the art would be able to combine the teachings of Zhang et al. with Wang et al. with reasonable expectation of success due to the same nature of the problem to be solved, since both are drawn towards a method for extracting structured data from electronic health records.
It would have also been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the deep learning method for combining structured and unstructured data from electronic health records of Zhang et al. with the BERT model of Blinov et al. because using a BERT model with a vocabulary and pretraining dataset tailored to medical texts representation improves performance of the classification tasks, specifically for less prevalent diseases (Blinov et al., Pg. 120, Para. 1). One of ordinary skill in the art would be able to combine the teachings of Zhang et al. with Blinov et al. with reasonable expectation of success due to the same nature of the problem to be solved, since both are drawn towards a method for extracting unstructured data from electronic health records.
It would have also been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the deep learning method for combining structured and unstructured data from electronic health records of Zhang et al. with BERT model of Rasmy et al. because it enables the pre-training, fine-tuning paradigm to be applied to solve various EHR-based problems, particularly when only a small amount of data is available (Rasmy et al, Abstract). One of ordinary skill in the art would be able to combine the teachings of Zhang et al. with Rasmy et al. with reasonable expectation of success due to the same nature of the problem to be solved, since both are drawn towards a method for extracting unstructured data from electronic health records.
It would have also been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the deep learning method for combining structured and unstructured data from electronic health records of Zhang et al. with the SMOTE method of Kim et al. because the method of Kim et al. incorporates an oversampling method to improve the reliability of prognostic models using electronic health record data (Kim et al., Abstract). One of ordinary skill in the art would be able to combine the teachings of Zhang et al. with Kim et al. with reasonable expectation of success due to the same nature of the problem to be solved, since both incorporate a method for processing electronic health records data.
It would have also been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the deep learning method for combining structured and unstructured data from electronic health records of Zhang et al. with the data cleaning of Shi et al. because the data cleaning of Shi et al. eliminates systemic errors and significant bias in the electronic health record data (Shi et al., Pg. 290, Col. 1, Para. 3 – Col. 2, Para. 1). One of ordinary skill in the art would be able to combine the teachings of Zhang et al. with Shi et al. with reasonable expectation of success due to the same nature of the problem to be solved, since both incorporate a method for data cleaning for electronic health records. Therefore, regarding claim 6, the instant invention is prima facie obvious (MPEP § 2142).
5. Claims 7-8 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang et al. (Combining structured and unstructured data for predictive models: A deep learning approach. BMC Med Inform Decis Mak. 20(1): 280 (2020); published 10/29/2020) in view of Shi et al. (DeepDiagnosis: DNN-Based Diagnosis Prediction from Pediatric Big Healthcare Data. 2018 Sixth International Conference on Advanced Cloud and Big Data (CBD). Pg. 287-292 (2018); published 11/11/2018), Wang et al. (Integrative Analysis of Patient Health Records and Neuroimages via Memory-Based Graph Convolutional Network. 2018 IEEE International Conference on Data Mining (ICDM), pp. 767-776 (2018); published 12/30/2018), Blinov et al. (Predicting Clinical Diagnosis from Patients Electronic Health Records Using BERT-Based Neural Networks. In: Artificial Intelligence in Medicine. AIME 2020. Lecture Notes in Computer Science, Vol 12299. Springer, Cham. Pg. 111-121 (2020); published 9/26/2020) and Kim et al. (A Comparison of Oversampling Methods for Constructing a Prognostic Model in the Patient with Heart Failure, 2020 International Conference on Information and Communication Technology Convergence (ICTC) (2020), Pg. 379-83; published 12/21/2020).
Regarding claim 7, Zhang et al. teaches the limitations of obtaining the EHR data, the EHR data comprising structured and unstructured data; performing feature extraction to obtain unstructured data features and structured data features; fusing unstructured data features and structured data features; and extracting fused features as described for claim 1 above. Zhang et al. further teaches that the unstructured text and structured data (temporal signals and static information) are processed separately (i.e., performing data processing on structured data and unstructured data separately) (Pg. 5, Fig. 1). Zhang et al. further teaches that the data cleaning was performed on both the sequential notes (i.e., unstructured data) and temporal data (i.e., structured data) as described for claim 4 above (i.e., including performing data cleaning to obtain cleaned structured data and cleaned unstructured data). Zhang et al. further teaches that the patient representation (i.e., the fused features) are used in the output layers to make predictions. The predictive tasks include in-hospital mortality prediction, length of hospital stay, and hospital readmission prediction (i.e., using the fused features as data to be identified for medical purposes) (Pg. 5, Fig. 1 and Pg. 3, Col. 2, Para. 1-4). Zhang et al. further teaches the limitations of the data cleaning comprises completing missing values using mean values and the unstructured data is text as described for claim 4 above. Zhang et al. further teaches the limitation of extracting the fusion features by using a piecewise pooling operation as described for claim 3 above.
Regarding claim 8, Zhang et al. teaches a deep learning method for combining structured and unstructured data from electronic health records for risk prediction tasks (i.e., a method of constructing a disease risk prediction model) (Abstract). Zhang et al. further teaches the limitation of obtaining EHR data of a patient with a known disease risk outcome, the data comprising structured data and unstructured data as described for claim 1 above. Zhang et al. further teaches that based on the MIMIC-III dataset, they evaluated their proposed models on 3 predictive tasks (i.e., in-hospital mortality prediction, 30-day readmission prediction, and long length of stay prediction). To build corresponding cohorts, they first removed all patients whose age < 18 years old and all hospital admissions whose length of stay is less than 1 day. Besides, patients without any records of required temporal signals and clinical notes were removed. In total, 39,429 unique admissions are eligible for prediction tasks (i.e., building a dataset based on the obtained EHR data, the dataset comprises a structured dataset and an unstructured dataset) (Pg. 6, Col. 2, Para. 3). Zhang et al. further teaches that label statistics and characteristics of 3 prediction tasks are provided in Table 1 (i.e., building a label set based on a known outcome) (Pg. 6, Col. 2, Para. 3 and Pg. 4, Table 1). Zhang et al. further teaches the architecture of the network in Fig. 1. The features from the unstructured text or structured data are extracted and subsequently combined into a patient representation.
Then the final patient representation is passed to the output layers to make predictions (i.e., building a disease risk prediction network, comprising: building a feature extraction module for extracting features of the structured data, a feature extraction module for extracting features of the unstructured data and a feature fusion module, then connecting the structured data feature extraction module and the feature extraction module unstructured data in parallel and then being connected with the feature fusion module in series at a decision layer) (Pg. 5, Fig. 1). Zhang et al. further teaches that deep learning models are implemented in PyTorch (i.e., the disease risk prediction network is implemented based on a Pytorch framework) (Pg. 6, Col. 2, Para. 5). Zhang et al. further teaches that deep learning models are trained with Adam optimizer with a learning rate of 0.0001 and ReLU as the activation function. The batch size is chosen as 64 and the max epoch number is set to 50. For evaluation, 70% of the data are used for training, and 10% for validation, 20% for testing. For binary classification tasks, AUROC is used as the main metric (i.e., training the disease risk prediction network using the datasets (the structured dataset and the unstructured dataset) with the label set as a label to build the disease risk prediction model) (Pg. 6, Col. 2, Para. 5-6). Zhang et al. further teaches the limitations of the data cleaning comprises completing missing values using mean values and the unstructured data is text as described for claim 4 above. Zhang et al. further teaches the limitation of extracting the fusion features by using a piecewise pooling operation as described for claim 3 above.
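The cited 70%/10%/20% train/validation/test protocol can be sketched as follows (a minimal illustration using the Python standard library; this is not Zhang et al.'s PyTorch code, and the seed parameter is a hypothetical addition for reproducibility):

```python
import random

def split_dataset(records, seed=0):
    # Shuffle once, then partition into 70% training, 10% validation,
    # and 20% testing, as in the cited evaluation setup.
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(0.7 * n)
    n_val = int(0.1 * n)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

train, val, test = split_dataset(list(range(100)))
```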
Zhang et al. does not teach the data cleaning comprises replacing outlier values and removing dirty reads (claims 7 and 8); an FCN is used for extracting structured data features (claim 7); a BERT is used for extracting unstructured features (claim 7); connecting the unstructured data features and the structured data features in parallel along a specified dimension, reducing an imbalance rate through a method of analyzing minority class sample data, and newly generating the sample of the class by using a SMOTE (claims 7 and 8); the structured data feature extraction module is an FCN module (claim 8); the unstructured data feature extraction module is a BERT module (claim 8); and when using the dataset for training, inputting the fusion features, as an input, into a fully connected dense layer to train a Softmax classifier (claim 8).
Regarding claims 7 and 8, Shi et al. teaches the limitation of the data cleaning comprises replacing outlier values and removing dirty reads as described for claim 4 above.
Regarding claim 7, Wang et al. teaches the limitation of an FCN is used for extracting structured data features as described for claim 2 above.
Regarding claim 7, Blinov et al. teaches the limitation of a BERT is used for extracting unstructured features as described for claim 2 above.
Regarding claims 7 and 8, Kim et al. teaches the limitation of connecting the unstructured data features and the structured data features in parallel along a specified dimension, reducing an imbalance rate through a method of analyzing minority class sample data, and newly generating the sample of the class by using a SMOTE as described for claim 3 above.
Regarding claim 8, Wang et al. teaches the limitation of the structured data feature extraction module is an FCN module as described for claim 2 above. Wang et al. further teaches the training of the model (Pg. 771, Col. 2, Para. 1-2). Wang et al. further teaches the limitation of inputting the fusion features, as an input, into a fully connected dense layer to train a Softmax classifier as described for claim 3 above (i.e., when using the dataset for training, inputting the fusion features, as an input, into a fully connected dense layer to train a Softmax classifier).
Regarding claim 8, Blinov et al. teaches the limitation of the unstructured data feature extraction module is a BERT module as described for claim 2 above.
Therefore, regarding claims 7-8, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the deep learning method for combining structured and unstructured data from electronic health records of Zhang et al. with the data cleaning of Shi et al. because the data cleaning of Shi et al. eliminates systemic errors and significant bias in the electronic health record data (Shi et al., Pg. 290, Col. 1, Para. 3 – Col. 2, Para. 1). One of ordinary skill in the art would be able to combine the teachings of Zhang et al. with Shi et al. with reasonable expectation of success due to the same nature of the problem to be solved, since both incorporate a method for data cleaning for electronic health records.
It would have also been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the deep learning method for combining structured and unstructured data from electronic health records of Zhang et al. with the FCN of Wang et al. because the method of Wang et al. improves the classification of Parkinson’s disease patients over healthy controls compared to conventional approaches (Wang et al., Pg. 775, Col. 1, Para. 1). One of ordinary skill in the art would be able to combine the teachings of Zhang et al. with Wang et al. with reasonable expectation of success due to the same nature of the problem to be solved, since both are drawn towards a method for extracting structured data from electronic health records.
It would have also been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the deep learning method for combining structured and unstructured data from electronic health records of Zhang et al. with the BERT model of Blinov et al. because using a BERT model with a vocabulary and pretraining dataset tailored to medical texts representation improves performance of the classification tasks, specifically for less prevalent diseases (Blinov et al., Pg. 120, Para. 1). One of ordinary skill in the art would be able to combine the teachings of Zhang et al. with Blinov et al. with reasonable expectation of success due to the same nature of the problem to be solved, since both are drawn towards a method for extracting unstructured data from electronic health records.
It would have also been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the deep learning method for combining structured and unstructured data from electronic health records of Zhang et al. with the SMOTE method of Kim et al. because the method of Kim et al. incorporates an oversampling method to improve the reliability of prognostic models using electronic health record data (Kim et al., Abstract). One of ordinary skill in the art would be able to combine the teachings of Zhang et al. with Kim et al. with reasonable expectation of success due to the same nature of the problem to be solved, since both incorporate a method for processing electronic health records data. Therefore, regarding claims 7-8, the instant invention is prima facie obvious (MPEP § 2142).
Conclusion
No claims allowed.
Inquiries
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DIANA P SANFORD whose telephone number is (571)272-6504. The examiner can normally be reached Mon-Fri 8am-5pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Karlheinz Skowronek can be reached at (571)272-9047. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/D.P.S./Examiner, Art Unit 1687
/Karlheinz R. Skowronek/Supervisory Patent Examiner, Art Unit 1687