Prosecution Insights
Last updated: April 19, 2026
Application No. 17/580,748

MACHINE LEARNING MODELS FOR AUTOMATED PROCESSING OF AUDIO WAVEFORM DATABASE ENTRIES

Non-Final OA: §103, §112
Filed: Jan 21, 2022
Examiner: PADDA, ARI SINGH KANE
Art Unit: 3791
Tech Center: 3700 (Mechanical Engineering & Manufacturing)
Assignee: Evernorth Strategic Development Inc.
OA Round: 3 (Non-Final)

Grant Probability: 17% (At Risk)
Projected OA Rounds: 3-4
Projected Time to Grant: 4y 1m
Grant Probability with Interview: 32%

Examiner Intelligence

Career Allow Rate: 17% (7 granted / 42 resolved; -53.3% vs TC avg). This examiner grants only 17% of cases.
Interview Lift: +15.6% among resolved cases with interview (a strong lift).
Typical Timeline: 4y 1m average prosecution; 50 applications currently pending.
Career History: 92 total applications across all art units.

Statute-Specific Performance

§101: 13.3% (-26.7% vs TC avg)
§102: 10.7% (-29.3% vs TC avg)
§103: 44.4% (+4.4% vs TC avg)
§112: 31.4% (-8.6% vs TC avg)

Tech Center averages are estimates. Based on career data from 42 resolved cases.
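The headline figures above follow directly from the raw counts. A minimal sketch of the arithmetic; the Tech Center average is an assumed placeholder, chosen as 70.0% because that value reproduces the -53.3% delta shown:

```python
# Reproducing the dashboard arithmetic from raw counts.
# tc_avg is an assumption: ~70.0% reproduces the "-53.3% vs TC avg" figure.

def allow_rate(granted: int, resolved: int) -> float:
    """Allow rate as a percentage of resolved cases."""
    return 100.0 * granted / resolved

career_rate = allow_rate(7, 42)     # 7 granted out of 42 resolved -> ~16.7%
tc_avg = 70.0                       # assumed Tech Center average, in percent
delta_vs_tc = career_rate - tc_avg  # ~ -53.3, matching the dashboard

# prints "career allow rate: 16.7%, delta vs TC: -53.3%"
print(f"career allow rate: {career_rate:.1f}%, delta vs TC: {delta_vs_tc:+.1f}%")
```

The displayed 17% is the rounded career rate; the signed delta falls out of the same subtraction.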

Office Action (§103, §112)
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 01/30/2026 has been entered. (Examiner's Note: While the applicant has underlined the claim 12 limitation "obtaining claims data associated with the database entity, wherein the claims data is separate from the audio data", this limitation was previously presented within the claims filed 07/14/2025 and is NOT new to claim 12.)

Claims Pending

Applicant's cancellation of claims 9 and 11, addition of claims 21 and 22 in the response filed 01/30/2026, and previous withdrawal of claims 1-8 and 10 are acknowledged. Claims 12-22 are currently under examination.

Claim Objections - Withdrawn

Applicant's amendments, filed 07/14/2025, have been fully considered, and the previous objection is withdrawn.

Claim Interpretation

The following is a quotation of 35 U.S.C. 112(f):

(f) Element in Claim for a Combination. - An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:

An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.

As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph: (A) the claim limitation uses the term "means" or "step" or a term used as a substitute for "means" that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; (B) the term "means" or "step" or the generic placeholder is modified by functional language, typically, but not always, linked by the transition word "for" (e.g., "means for") or another linking word or phrase, such as "configured to" or "so that"; and (C) the term "means" or "step" or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.

Use of the word "means" (or "step") in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. Absence of the word "means" (or "step") in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material, or acts to entirely perform the recited function.

Claim limitations in this application that use the word "means" (or "step") are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word "means" (or "step") are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.

This application includes one or more claim limitations that do not use the word "means," but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.
Such claim limitation(s) is/are:

Claim 12: The claim limitation "the machine learning model is configured to generate the condition likelihood output according to the defined individual words indicating at least one of a slur in the voice-based user input, a mispronunciation in the voice-based user input, a pause in the voice-based user input, a forgotten word in the voice-based user input, a repetition of a word in the voice-based user input, or a specified phrase being present in the voice-based user input" has been interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because it uses a generic placeholder, "model," coupled with functional language, "configured to generate the condition likelihood output according to the defined individual words indicating at least one of a slur in the voice-based user input, a mispronunciation in the voice-based user input, a pause in the voice-based user input, a forgotten word in the voice-based user input, a repetition of a word in the voice-based user input, or a specified phrase being present in the voice-based user input," without reciting sufficient structure to achieve the function. Furthermore, the generic placeholder "model" is not preceded by a structural modifier that has a known structural meaning.

Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof. A review of the specification shows that the following appears to be the corresponding structure described in the specification for the 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, limitation: "Any suitable machine learning model may be used, as described further below. For example, the machine learning model module 122 may train a historic model that utilizes generated features to predict the occurrence of a medical condition, may train an artificial neural network (ANN) to predict the occurrence of a medical condition using a time-series raw waveform audio data input, may train an ANN to predict the occurrence of a medical condition using a time-series of pre-processed audio data, etc.", or equivalents thereof, as described in Par. 52 of the disclosure filed on 01/21/2022. Because the applicant's specification lacks sufficient detail regarding the structure of the machine learning model, the limitation will be interpreted as any generic algorithm capable of the indicated functionality.

If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 112

The following is a quotation of the first paragraph of 35 U.S.C. 112(a):

(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:

The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 12-22 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
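For illustration of the "any generic algorithm capable of the indicated functionality" interpretation above: the claimed word-level markers (repetition of a word, a specified phrase being present, etc.) admit many trivial realizations. A purely hypothetical sketch follows; the phrase list echoes the spec's Par. 94 examples, but the function, its name, and its weights are invented here and appear nowhere in the application:

```python
# Hypothetical sketch of a "generic algorithm" of the kind the 112(f)
# interpretation above would cover: scoring a condition likelihood from
# defined individual words that indicate a repetition of a word or a
# specified phrase. Weights are invented for illustration only.

from collections import Counter

SPECIFIED_PHRASES = {"i need a fix", "getting well"}  # example phrases per Par. 94

def condition_likelihood(words: list[str]) -> float:
    """Toy likelihood in [0, 1] computed from word-level markers."""
    lowered = [w.lower() for w in words]
    counts = Counter(lowered)
    repetitions = sum(c - 1 for c in counts.values() if c > 1)  # repeated words
    text = " ".join(lowered)
    phrase_hits = sum(phrase in text for phrase in SPECIFIED_PHRASES)
    return min(0.1 * repetitions + 0.4 * phrase_hits, 1.0)  # hypothetical weights

print(condition_likelihood("I I need a fix".split()))  # prints 0.5
```

The point of the sketch is the examiner's: reciting the markers alone does not fix any particular weights, biases, or structure, since countless functions of this shape satisfy the claim language.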
Claim 12 recites the limitation "wherein the machine learning model is configured to generate the condition likelihood output according to the defined individual words indicating at least one of a slur in the voice-based user input, a mispronunciation in the voice-based user input, a pause in the voice-based user input, a forgotten word in the voice-based user input, a repetition of a word in the voice-based user input, or a specified phrase being present in the voice-based user input", where the applicant's specification lacks sufficient detail regarding the manner in which the machine learning model generates a condition likelihood output. The applicant does indicate:

- "Any suitable machine learning model may be used, as described further below. For example, the machine learning model module 122 may train a historic model that utilizes generated features to predict the occurrence of a medical condition, may train an artificial neural network (ANN) to predict the occurrence of a medical condition using a time-series raw waveform audio data input…" (Par. 52 of applicant's spec.);
- the overall training process (Par. 60-61 of applicant's spec.);
- "The number of neurons can be optimized. At the beginning of training, a network configuration is more likely to have excess nodes. Some of the nodes may be removed from the network during training that would not noticeably affect network performance. For example, nodes with weights approaching zero after training can be removed (this process is called pruning). The number of neurons can cause under-fitting (inability to adequately capture signals in dataset) or over-fitting (insufficient information to train all neurons; network performs well on training dataset but not on test dataset)." (Par. 69 of applicant's spec.);
- "In various implementations, each input node in the input layer may be associated with a numerical value, which can be any real number..." (Par. 77 of applicant's spec.);
- "phrases or slang that are specific to using certain drugs, such as 'I need a fix,' or 'getting well,' etc., may be identified by a machine learning model. Such phrases may be particularly useful if they are not typically spoken by non-addicts. If audio inputs to the model include the specified phrases, the machine learning model may output a higher likelihood that the speaker is addicted to drugs" (Par. 94 of applicant's spec.); and
- "a machine learning model may be trained to predict a likelihood of a depression condition. In this case, the model may be trained to look for specific tone in audio data of a member's speech, for a specific choice of words or phrases (such as cussing, slang, changes in word choice over time, etc.), pauses…" (Par. 95 of applicant's spec.).

However, this lacks sufficient detail regarding the specific weights or biases present within the model itself that lead to the claimed condition likelihood output, as simply reciting the existence of nodes and the presence of some amount of weight does not provide sufficient support as to the exact weights and biases utilized for the machine learning model. As such, the claim is rejected.

Claim 12 also recites the limitations "training a machine learning model with historical feature vector inputs to generate a condition likelihood output" and "processing, by the machine learning model, the feature vector input to generate the condition likelihood output", where the applicant's specification lacks sufficient detail regarding the manner in which the machine learning model processes the input to generate a condition likelihood output. The applicant points to the same disclosures quoted above (Par. 52, 60-61, 69, 77, 94, and 95 of applicant's spec.). However, for the same reasons, these passages lack sufficient detail regarding the specific weights or biases that lead to the claimed condition likelihood output. As such, the claim is rejected.

Claims 13-22 are dependent on claim 12 and as such are also rejected.

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 12-22 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.

The claim 12 limitation "wherein the machine learning model is configured to generate the condition likelihood output according to the defined individual words indicating at least one of a slur in the voice-based user input, a mispronunciation in the voice-based user input, a pause in the voice-based user input, a forgotten word in the voice-based user input, a repetition of a word in the voice-based user input, or a specified phrase being present in the voice-based user input" invokes 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.
However, the written description fails to disclose the corresponding structure, material, or acts for performing the entire claimed function and to clearly link the structure, material, or acts to the function. The corresponding structure within the applicant's specification is a machine learning model (Par. 52, 60-61, 69, 77, 94, and 95 of applicant's spec.), which fails to effectively define the metes and bounds of the claim, as it is unclear what actual weights or biases the applicant's machine learning model uses in generating the condition likelihood output. For examination purposes, this will be interpreted as any generic algorithm capable of the indicated function. Therefore, the claim is indefinite and is rejected under 35 U.S.C. 112(b) or pre-AIA 35 U.S.C. 112, second paragraph.

Applicant may: (a) Amend the claim so that the claim limitation will no longer be interpreted as a limitation under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph; (b) Amend the written description of the specification such that it expressly recites what structure, material, or acts perform the entire claimed function, without introducing any new matter (35 U.S.C. 132(a)); or (c) Amend the written description of the specification such that it clearly links the structure, material, or acts disclosed therein to the function recited in the claim, without introducing any new matter (35 U.S.C. 132(a)).

If applicant is of the opinion that the written description of the specification already implicitly or inherently discloses the corresponding structure, material, or acts and clearly links them to the function so that one of ordinary skill in the art would recognize what structure, material, or acts perform the claimed function, applicant should clarify the record by either: (a) Amending the written description of the specification such that it expressly recites the corresponding structure, material, or acts for performing the claimed function and clearly links or associates the structure, material, or acts to the claimed function, without introducing any new matter (35 U.S.C. 132(a)); or (b) Stating on the record what the corresponding structure, material, or acts, which are implicitly or inherently set forth in the written description of the specification, perform the claimed function. For more information, see 37 CFR 1.75(d) and MPEP §§ 608.01(o) and 2181.

Claim 21 recites the limitation "wherein processing the feature vector input to generate the condition likelihood output includes" in lines 1-2. There is insufficient antecedent basis for this limitation in the claim. Are the steps of this limitation also performed "by the machine learning model" of claim 12? For examination purposes, this will be interpreted as the limitations that follow the above limitation being included in "processing, by the machine learning model, the feature vector input to generate the condition likelihood output" of claim 12.

Claims 13-22 are dependent on claim 12, and as such are also rejected.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C.
103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The claims are generally directed towards a method of processing data using a machine learning model. The method comprises training a machine learning model with vector inputs that are specific to historical database entities to generate a condition likelihood output. The method further includes obtaining audio data and claims data associated with each database entity, generating a feature vector based on the data, processing the feature vector to generate a likelihood output, and outputting a condition likelihood.

Claim(s) 12-15 and 17-19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Stamatopoulos (US Pub. No. 20190088367), hereinafter Stamatopoulos, and further in view of Anushiravani (US Pub. No. 20200388287), hereinafter Anushiravani, Rosenbek (US Pub. No. 20150265205), hereinafter Rosenbek, and Jain (US Pat. No. 11504011), hereinafter Jain.

Regarding claim 12, Stamatopoulos discloses a computerized method for automated processing of audio waveform database entries using a machine learning model, the method comprising (Fig. 34 (machine learning model)) (Fig. 38 (repeated process)): training a machine learning model (Par. 421 (ANN)) with historical feature inputs to generate a condition output (Par. 421, "In one embodiment of the present invention, an artificial neural network (ANN) can be trained and evaluated to determine lung pathology, disease type and severity. The ANN system for determining lung pathology comprises a training module (shown in FIG. 34) and an evaluation module (shown in FIG. 35).") (Fig. 34) (Par. 422, "FIG. 34 illustrates a block diagram providing an overview of the manner in which an artificial neural network can be trained to ascertain lung pathologies in accordance with an embodiment of the present invention") (Par. 429, "The PDFs that correspond to a specific health…" "…category or to a category indicating disease by employing a Binary Hypothesis Likelihood Ratio Test."), wherein the historical feature inputs include historical data structures specific to multiple historical database entities (Par. 425, "respiratory recordings at block 3401 that the training system uses may be annotated by specialists regarding health status, disease, pathology and severity and can include references from other diagnostic tests such auscultation, spirometry, CT scans, blood and sputum inflammatory and genetic markers, etc. The metadata used to annotate the respiratory recordings at block 3401 may comprise respiratory measurements and diagnostics 3411 (spirometry, plethysmography, inflammatory markers, ventilation, CT scans, auscultation, etc.), medication 3412, patient symptoms 3413, and doctor's diagnoses 3414."), wherein the historical data structures include multiple audio data entries and multiple claims data entries (Par. 425 (respiratory recordings annotated with metadata)), and wherein the condition output is indicative of a specified condition associated with one of the multiple historical database entities (Fig. 38, step 3812 (data trained to determine a pathology and severity)); obtaining a set of multiple database entities (Fig. 38, step 3802); for each database entity in the set of multiple database entities: processing, by the machine learning model (Par. 451 (deep learning model of block 3410)), the input (Fig. 38, step 3810) to generate the condition output (Fig. 38, step 3812).

Stamatopoulos fails to explicitly disclose feature vector inputs and a condition likelihood output. However, Anushiravani teaches feature vector inputs (Par. 99, 101-102 (feature vector input)) (Fig. 4) to generate a condition likelihood output (Par. 98, 109 (disease probabilities output by classifier 600)). Therefore, it would have been obvious to a person of ordinary skill in the art to modify the method of Stamatopoulos with that of Anushiravani to include feature vector inputs to generate a condition likelihood output, as the combination would have yielded the predictable result of explicitly providing the probability of a disease (Anushiravani (Par. 98)) and organizing the data for input.

Modified Stamatopoulos fails to explicitly disclose obtaining audio data associated with the database entity, wherein the audio data comprises voice-based user input; processing the audio data using at least one of keyword processing and natural language processing to generate processed audio features; parsing the processed audio features to define individual words; processing the defined individual words to generate processed audio input features; and obtaining claims data associated with the database entity, wherein the claims data is separate from the audio data. However, Stamatopoulos does disclose obtaining audio data associated with the database entity (Fig.
38, step 3802), wherein the audio data comprises voice-based user input (Par. 448, 450 (wheeze and crackle are determined from the audio file)); and obtaining claims data associated with the database entity (Fig. 38, step 3804) (Par. 449 (recordings annotated with metadata)), wherein the claims data is separate from the audio data (Fig. 38, step 3804) (Par. 449 (recordings annotated with metadata, where the claims data is a separate type of data)).

However, Rosenbek teaches obtaining audio data associated with the database entity, wherein the audio data comprises voice-based user input (Par. 63, "Referring to FIG. 5, a person's voice is input through a telephone or mobile communication device 501a or microphone 501b and transmitted to a server 503, such as an ASP, via a network 502. The voice signal can be transmitted via internet, phone, VoIP, satellite, cable, cellular or other networks. Accordingly, mass screening can be accomplished for users of the network provider. The server 503 may include a database, memory or other storage device 504 that can retain previous voice samples of the same user, voice samples of other users connected to the network, and/or data related to the user(s). Accordingly, it is possible to obtain, analyze and monitor biomarkers in speech/language over long periods of time.") (Par. 64 (voice sample)); processing the audio data using at least one of keyword processing and natural language processing to generate processed audio features (Par. 68 (language module 515)); parsing the processed audio features to define individual words (Par. 68, "The language marker module 515 can include an automatic speech recognition (ASR) module 507 and a language module 508. As shown in FIG. 7, according to one embodiment, the user's language is transcripted via the ASR module 507, which can incorporate large vocabulary systems, word spotting, and phoneme recognition"); processing the defined individual words to generate processed audio input features (Par. 68, "Then, once the words (language) are determined by ASR, recognized words (and phrases and sentences) can be classified into syntactical categories in the language module 508. For example, recognized words can be classified as nouns, verbs, and adjectives. Then, phrase and/or sentence complexity can be determined by, for example, evaluating the number and order of various syntactical categories that occur in someone's speech. In one embodiment, a primary analysis 509 for syntax coding can be performed to classify the recognized words/language."); and obtaining claims data associated with the database entity, wherein the claims data is separate from the audio data (Par. 77, "information obtained related to cough behavior and changes in voice quality can then be combined with other information and data such as meteorological information (e.g. temperature and humidity), incidence of diseases in the population, the speaker's age, gender, ethnic/racial background, socio-economic status, predisposition to specific diseases, and geographical or location information (e.g., location and address), etc., to further improve the accuracy of screening for infectious diseases and/or determine a likelihood of a particular disease."), as well as voice monitoring for disease conditions (Par. 18-19, "According to certain embodiments of the invention, speech and/or language changes can be used as biomarkers for neurological diseases…" "… provide one or more biomarkers indicative of a likelihood of disease onset and/or stage of degeneration") (Par. 29, "likelihood of one or more neurological/neurodegenerative or other disease, such as infectious and/or respiratory disease, condition(s).").

Therefore, it would have been obvious to a person of ordinary skill in the art to modify the method of Stamatopoulos and Anushiravani with that of Rosenbek to include obtaining audio data associated with the database entity, wherein the audio data comprises voice-based user input; processing the audio data using at least one of keyword processing and natural language processing to generate processed audio features; parsing the processed audio features to define individual words; processing the defined individual words to generate processed audio input features; and obtaining claims data associated with the database entity, wherein the claims data is separate from the audio data, as the combination would have yielded the predictable result of allowing for early detection and identification of diseases (Rosenbek (Par. 68, 74-75, 78)).

Modified Stamatopoulos fails to explicitly disclose generating a feature vector input according to the processed audio input features and the claims data. However, Rosenbek does further teach generating an input according to the processed audio input features and the claims data (Par. 77, "information obtained related to cough behavior and changes in voice quality can then be combined with other information and data such as meteorological information (e.g. temperature and humidity), incidence of diseases in the population, the speaker's age, gender, ethnic/racial background…" (combination of information)). Anushiravani further teaches generating a feature vector input according to the processed audio input features and the claims data (Par. 88-89 (vector 306 generated from audio and user data)) (Par. 99, Fig. 4).
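The mapping above describes combining processed audio input features with separate claims data into one feature vector that a model maps to a condition likelihood. As a purely illustrative sketch of that shape of pipeline; every feature name, weight, and the logistic form itself are invented here and are not taken from Stamatopoulos, Anushiravani, or Rosenbek:

```python
# Hedged sketch: concatenate audio-derived features with separate claims-data
# fields into one feature vector, then map it to a condition-likelihood
# probability with a toy logistic model. All names and weights are hypothetical.

import math

def build_feature_vector(audio_features: dict, claims_data: dict) -> list[float]:
    """Concatenate audio features and claims-data fields in a fixed order."""
    audio_part = [audio_features[k] for k in ("pause_rate", "repetition_rate")]
    claims_part = [claims_data[k] for k in ("age", "prior_visits")]
    return audio_part + claims_part

def condition_probability(vector: list[float], weights: list[float], bias: float) -> float:
    """Toy logistic model: probability that the entity has the condition."""
    z = bias + sum(w * x for w, x in zip(weights, vector))
    return 1.0 / (1.0 + math.exp(-z))

vec = build_feature_vector({"pause_rate": 0.3, "repetition_rate": 0.1},
                           {"age": 0.62, "prior_visits": 0.2})
p = condition_probability(vec, weights=[1.5, 2.0, 0.5, 0.8], bias=-1.0)
print(f"condition likelihood: {p:.3f}")
```

The salient structural point is that the claims data enters the vector as fields distinct from the audio-derived features, which is the "separate from the audio data" aspect of the claim language.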
Therefore, it would have been obvious to a person of ordinary skill in the art to modify the method of Stamatopoulos, Anushiravani, and Rosenbek with that of Rosenbek and Anushiravani to include generating a feature vector input according to the processed audio input features and the claims data of Rosenbek through the combination of references as it would have yielded the predictable result of mathematically representing the data and organizing the data for input (Anushiravani (Par. 89)) and improving model accuracy (Rosenbek (Par. 77)). Modified Stamatopoulos fails to explicitly disclose processing, by the machine learning model, the feature vector input to generate the condition likelihood output, the condition likelihood output indicative of a likelihood that a patient associated with the database entry has a specified behavioral health condition, wherein the machine learning model is configured to generate the condition likelihood output according to the defined individual words indicating at least one of a slur in the voice-based user input, a mispronunciation in the voice-based user input, a pause in the voice-based user input, a forgotten word in the voice-based user input, a repetition of a word in the voice-based user input, or a specified phrase being present in the voice-based user input. However, Rosenbek further teaches processing, by the machine learning model, the input to generate the condition likelihood output (Par. 71, “Once the information from the speech and/or language analysis is obtained, comparators 512 can be used to reach a diagnostic decision…” “…The diagnostic decision provides information indicative of a likelihood and type of disease and may be stored in a database associated with the system”) (Par. 
70, “after performing the speech and/or language analysis, modeling and coding can be performed by the coding module 511 via statistical approaches, machine learning, pattern recognition, or other algorithms to combine information from various biomarkers before reaching a diagnostic decision”), the condition likelihood output indicative of a likelihood that a patient associated with the database entry has a specified behavioral health condition (Par. 71, “Once the information from the speech and/or language analysis is obtained, comparators 512 can be used to reach a diagnostic decision…” “…The diagnostic decision provides information indicative of a likelihood and type of disease and may be stored in a database associated with the system”) (Par. 43, “Similarly to PD, the biomarkers for Alzheimer's disease may include the measures described above as well as detailed analyses of the speaker's language characteristics.”) (Par. 44, “Biomarkers for respiratory diseases…”) (Par. 68, “A reduction in sentence complexity can be an indicator of a neurological disease. In addition, certain neurological diseases, such as Alzheimer's, cause particular language patterns to emerge. Such language patterns can be determined via the secondary analysis.”) (Par. 18-19, “According to certain embodiments of the invention, speech and/or language changes can be used as biomarkers for neurological diseases…” “… provide one or more biomarkers indicative of a likelihood of disease onset and/or stage of degeneration”) (Par. 29, “likelihood of one or more neurological/neurodegenerative or other disease, such as infectious and/or respiratory disease, condition(s).”), wherein the machine learning model (Par. 
71, “Once the information from the speech and/or language analysis is obtained, comparators 512 can be used to reach a diagnostic decision…” “…The diagnostic decision provides information indicative of a likelihood and type of disease and may be stored in a database associated with the system”) (Par. 70, “after performing the speech and/or language analysis, modeling and coding can be performed by the coding module 511 via statistical approaches, machine learning, pattern recognition, or other algorithms to combine information from various biomarkers before reaching a diagnostic decision”) is configured to generate the condition likelihood output according to the defined individual words indicating at least one of a slur in the voice-based user input, a mispronunciation in the voice-based user input, a pause in the voice-based user input, a forgotten word in the voice-based user input, a repetition of a word in the voice-based user input, or a specified phrase being present in the voice-based user input (Par. 63, “Referring to FIG. 5, a person's voice is input through a telephone or mobile communication device 501a or microphone 501b and transmitted to a server 503, such as an ASP, via a network 502. The voice signal can be transmitted via internet, phone, VoIP, satellite, cable, cellular or other networks....”) (Par. 68, “A reduction in sentence complexity can be an indicator of a neurological disease. In addition, certain neurological diseases, such as Alzheimer's, cause particular language patterns to emerge. Such language patterns can be determined via the secondary analysis”) (Par. 71, “Once the information from the speech and/or language analysis is obtained, comparators 512 can be used to reach a diagnostic decision…” “…The diagnostic decision provides information indicative of a likelihood and type of disease and may be stored in a database associated with the system”) (Par. 
33, “… The analyses may also include, but is not limited to, analyses of the number of words spoken, the types of words (e.g. nouns, verbs, adjectives, articles, etc.) grammatical complexity of the phrases and/or sentence, the number of occurrence of specific words/phrases in conversation, or instances of dysfluencies such as pauses, hesitations or repetitions of words or part-words. The analysis may also evaluate, as an alternative or in addition, the frequency (i.e. the number of occurrences), the intensity (i.e. the strength), or other characteristics of cough during a conversation”)(Par. 42, “In accordance with an embodiment of the invention, one or more acoustic measures for Parkinson's disease can include, but are not limited to, fundamental frequency (F.sub.0), voice onset time, pause duration, and/or changes in F.sub.0; voice onset time, and/or pause duration...”) (Par. 43, “Similarly to PD, the biomarkers for Alzheimer's disease may include the measures described above as well as detailed analyses of the speaker's language characteristics…”). 
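The word-level indicators recited in the claim (repetitions, pauses, a specified phrase) parallel the dysfluency analyses Rosenbek's Par. 33 describes. A minimal sketch of such indicators follows; the token format, the one-second pause threshold, and the example phrase are all assumptions for illustration, not taken from the cited references.

```python
# Hypothetical sketch: word-level dysfluency indicators of the kind the claim
# recites, computed from (word, start_time) tokens. Thresholds are assumed.

def dysfluency_indicators(words, pause_gaps, phrase="forget where"):
    text = " ".join(w for w, _ in words)
    return {
        "repetition": any(a == b for (a, _), (b, _) in zip(words, words[1:])),
        "long_pause": any(gap > 1.0 for gap in pause_gaps),  # gap > 1 s
        "specified_phrase": phrase in text,
    }

tokens = [("i", 0.0), ("i", 0.4), ("always", 1.9), ("forget", 2.3), ("where", 2.6)]
gaps = [t2 - t1 for (_, t1), (_, t2) in zip(tokens, tokens[1:])]
print(dysfluency_indicators(tokens, gaps))
```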
Therefore, it would have been obvious to a person of ordinary skill in the art to modify the method of Stamatopoulos, Anushiravani, and Rosenbek with that of Rosenbek to include processing, by the machine learning model of Stamatopoulos, the feature vector input of Anushiravani to generate the condition likelihood output, the condition likelihood output indicative of a likelihood that a patient associated with the database entry has a specified behavioral health condition, wherein the machine learning model is configured to generate the condition likelihood output according to the defined individual words indicating at least one of a slur in the voice-based user input, a mispronunciation in the voice-based user input, a pause in the voice-based user input, a forgotten word in the voice-based user input, a repetition of a word in the voice-based user input, or a specified phrase being present in the voice-based user input through the combination of references as it would have yielded the predictable result of allowing for early detection and identification of diseases (Rosenbek (Par. 68, 74-75, 78)). Modified Stamatopoulos fails to explicitly disclose determining whether the condition likelihood output is greater than a specified likelihood threshold. However, Jain teaches determining whether the condition likelihood output is greater than a specified likelihood threshold (Col. 41, lines 43-59 (determination made after likelihood exceeds a certain level)). Stamatopoulos, Anushiravani, Rosenbek, and Jain are considered to be analogous art to the claimed invention as they are involved with machine learning and biological related data. 
Therefore, it would have been obvious to a person of ordinary skill in the art to modify the method of Stamatopoulos, Anushiravani, and Rosenbek with that of Jain to include determining whether the condition likelihood output is greater than a specified likelihood threshold through the combination of references as it would have yielded the predictable result of improving the overall accuracy and directly alerting the user regarding their health. Modified Stamatopoulos fails to explicitly disclose assigning the database entity to an identified condition subset of the multiple database entities in response to determining that the condition likelihood output is greater than the specified likelihood threshold. However, Stamatopoulos does teach assigning the database entity to an identified condition subset of the multiple database entities in response to determining (Par. 439-440 (binary hypothesis test))(Par. 442 (health subject identification is stored)) (Par. 446 (classified session stored in response to a classification being made)) that the output is greater than the threshold (Par. 439-440 (Binary hypothesis test)(threshold of 0)). Jain further teaches in response to determining that the condition likelihood output is greater than the specified likelihood threshold (Col. 41, lines 43-59 (determination made after likelihood exceeds a certain level)). Therefore, it would have been obvious to a person of ordinary skill in the art to modify the method of Stamatopoulos, Anushiravani, Rosenbek, and Jain with that of Jain to include assigning the database entity to an identified condition subset of the multiple database entities in response to determining that the condition likelihood output is greater than the specified likelihood threshold through the combination of references as it would have yielded the predictable result of improving the accuracy of the model and informing the user regarding their condition (Jain (Col. 41, lines 43-59)). 
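The threshold determination and subset assignment addressed above reduce to a simple comparison, sketched here with purely illustrative entity identifiers and likelihood values:

```python
# Illustrative sketch: entities whose condition likelihood output exceeds a
# specified likelihood threshold are assigned to an identified-condition subset.

def assign_condition_subset(entities, threshold=0.8):
    """Return IDs of database entities whose likelihood exceeds the threshold."""
    return [eid for eid, likelihood in entities.items() if likelihood > threshold]

outputs = {"entity-1": 0.93, "entity-2": 0.41, "entity-3": 0.86}
subset = assign_condition_subset(outputs)
print(subset)  # the entities above the 0.8 threshold
```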
Modified Stamatopoulos fails to explicitly disclose for each database entity in the identified condition subset, transforming a user interface to display the condition likelihood output associated with the database entity. However, Jain further teaches for each database entity in the identified condition subset, transforming a user interface to display the condition likelihood output associated with the database entity (Col. 41, lines 43-59). Therefore, it would have been obvious to a person of ordinary skill in the art to modify the method of Stamatopoulos, Anushiravani, Rosenbek, and Jain with that of Jain to include for each database entity in the identified condition subset, transforming a user interface to display the condition likelihood output associated with the database entity through the combination of references as it would have yielded the predictable result of directly informing the patient regarding their condition (Jain (Col. 41, lines 43-59)). Regarding claim 13, modified Stamatopoulos further discloses the memory hardware is configured to store multiple machine learning models each associated with a different one of multiple condition classification types (Stamatopoulos (Par. 456 (software capable of the indicated function)) (Par. 421 (ANN)) (Par. 426-427 (multiple machine learning algorithms))); the method includes identifying one of the multiple machine learning models according to a specified condition prediction type (Stamatopoulos (Par. 439-440 (binary hypothesis test))(Par. 442 (pathology detected)) (Par. 444 (data fed into ANN))); and processing the feature input includes processing the feature input using the selected machine learning model (Stamatopoulos (Par. 444 (process input))). Modified Stamatopoulos fails to explicitly disclose feature vector inputs. However, Anushiravani further teaches feature vector inputs (Anushiravani (Par. 99, 101-102 (feature vector input)) (Fig. 4)). 
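Claim 13's limitation of storing multiple models keyed to condition classification types and selecting one by the specified prediction type can be sketched as a simple lookup. The model callables and scores below are hypothetical stand-ins, not representations of any cited reference's system.

```python
# Hypothetical sketch: multiple machine learning models, each tied to a
# condition classification type; one is identified by the specified prediction
# type and used to process the feature input. Lambdas stand in for trained models.

models = {
    "neurological": lambda vec: 0.7,   # a real system would load trained models
    "respiratory": lambda vec: 0.2,
}

def predict(condition_type, feature_vector):
    model = models[condition_type]   # identify the model by prediction type
    return model(feature_vector)     # process the feature input with it

print(predict("neurological", [0.62, 118.0]))
```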
Therefore, it would have been obvious to a person of ordinary skill in the art to modify the method of Stamatopoulos, Anushiravani, Rosenbek, and Jain with that of Anushiravani to explicitly include feature vector inputs for the reasoning as indicated above. Regarding claim 14, modified Stamatopoulos further discloses wherein the machine learning model includes a historic machine learning model (Stamatopoulos (Par. 421 (training module in ANN system))), and training the historic machine learning model includes (Stamatopoulos (Fig. 34 (training for ANN))). Stamatopoulos fails to explicitly disclose obtaining call transcription data associated with the multiple audio data entries; processing the call transcription data using at least one of keyword and natural language processing to generate processed transcription input features. However, Rosenbek further teaches obtaining call transcription data associated with the multiple audio data entries (Rosenbek (Par. 63 (voice samples from a phone))); processing the call transcription data using at least one of keyword and natural language processing to generate processed transcription input features (Rosenbek (Par. 68 (language module - 515))). Therefore, it would have been obvious to a person of ordinary skill in the art to modify the method of Stamatopoulos, Anushiravani, Rosenbek, and Jain with that of Rosenbek to include obtaining call transcription data associated with the multiple audio data entries; processing the call transcription data using at least one of keyword and natural language processing to generate processed transcription input features through the combination of references as it would have yielded the predictable result of allowing for the detection of early stages of additional diseases (Rosenbek (Par. 75)) from a phone (Rosenbek (Par. 63)). 
Modified Stamatopoulos fails to explicitly disclose supplying a training feature vector input to the machine learning model to train the machine learning model, wherein the training feature vector input includes the processed transcription input features and the processed audio input features. However, Stamatopoulos does teach supplying a training feature input to the machine learning model to train the machine learning model (Stamatopoulos (Fig. 34 (training input ANN))). Rosenbek further teaches wherein the input includes the processed transcription input features and the processed audio input features (Rosenbek (Par. 68, “recognized words (and phrases and sentences) can be classified into syntactical categories in the language module 508. For example, recognized words can be classified as nouns, verbs, and adjectives”) (Par. 70)). Anushiravani further teaches feature vector inputs (As indicated above). Therefore, it would have been obvious to a person of ordinary skill in the art to modify the method of Stamatopoulos, Anushiravani, Rosenbek, and Jain with that of Rosenbek to include supplying a training feature vector input to the machine learning model to train the machine learning model of Stamatopoulos, wherein the training feature vector input includes the processed transcription input features and the processed audio input features of Rosenbek through the combination of references as it would have yielded the predictable result of allowing for the detection of early stages of additional diseases (Rosenbek (Par. 75)) from a phone (Rosenbek (Par. 63)). Regarding claim 15, modified Stamatopoulos fails to explicitly disclose the limitations of the claim. However, Rosenbek further teaches wherein the processed audio input features include at least one of an intensity of the audio waveform, a fundamental frequency (Rosenbek (Par. 
67, “For speech analysis, the user's speech is analyzed according to predetermined metrics (acoustic measures) in a speech metrics module 506. For example, acoustic analysis can be performed to quantify metrics including, but not limited to fundamental frequency characteristics, intensity, articulatory characteristics, speech/voice quality, prosodic characteristics, and speaking rate”)), a formant frequency, Mel Frequency Cepstrum Coefficients (MFCCs), a glottal flow, a jitter value, a zero crossing value, a trailing intensity, and a white space length. Therefore, it would have been obvious to a person of ordinary skill in the art to modify the method of Stamatopoulos, Anushiravani, Rosenbek, and Jain with that of Rosenbek to include wherein the processed audio input features include at least one of an intensity of the audio waveform, a fundamental frequency, a formant frequency, Mel Frequency Cepstrum Coefficients (MFCCs), a glottal flow, a jitter value, a zero crossing value, a trailing intensity, and a white space length through the combination of references as it would have yielded the predictable result of providing additional analysis regarding the measured audio data. Regarding claim 17, modified Stamatopoulos further discloses wherein the machine learning model comprises a processed audio model (Stamatopoulos (Par. 427, “The ANN algorithms initially analyze the recordings contained in the training set by employing the frame-based analysis of wheeze module 2700 and crackle module 2750 in order to tune the ANN algorithms that will later evaluate new incoming recordings to determine whether they are associated with healthy lungs, and if not, then to determine lung pathology and disease type (e.g., asthma, COPD, etc.) and severity (mild, moderate, severe).”)), and training the processed audio model (Stamatopoulos (Par. 427 (frame based analysis))) includes: separating the multiple audio data entries into temporal frames with overlap (Stamatopoulos (Par. 
428 (overlapping frames))); for each temporal frame, obtaining multiple processed audio features (Stamatopoulos (Par. 429-430, Fig. 34 (descriptors extracted))), and supplying a training feature vector input to the processed audio model to train the processed audio model (Stamatopoulos (Fig. 34, (recordings provided from block 3401 for training)), wherein the training feature vector input includes the multiple processed audio features (Stamatopoulos (Fig. 34 (descriptors)) (Fig. 38, step - 3804))). Modified Stamatopoulos fails to explicitly disclose wherein the multiple processed audio features include at least one of log-Mel bank features associated with the frame, Mel Frequency Cepstrum Coefficients associated with the frame, MFCC summary statistics associated with the frame, and MFCC difference values between a current temporal index and a prior temporal index. However, Anushiravani further teaches wherein the multiple processed audio features include at least one of log-Mel bank features associated with the frame, Mel Frequency Cepstrum Coefficients associated with the frame (Anushiravani (Par. 95 (MFCC))), MFCC summary statistics associated with the frame, and MFCC difference values between a current temporal index and a prior temporal index. Therefore, it would have been obvious to a person of ordinary skill in the art to modify the method of Stamatopoulos, Anushiravani, Rosenbek, and Jain with that of Anushiravani to include wherein the multiple processed audio features include at least one of log-Mel bank features associated with the frame, Mel Frequency Cepstrum Coefficients associated with the frame, MFCC summary statistics associated with the frame, and MFCC difference values between a current temporal index and a prior temporal index through the combination of references as it would have yielded the predictable result of improving the quality of the data. 
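The frame-based analysis recited for claim 17 (overlapping temporal frames, per-frame features, and difference values between a current and a prior temporal index) can be sketched with a toy signal. A per-frame energy statistic stands in for real MFCC extraction, which is not shown; frame length, hop size, and sample values are assumptions.

```python
# Illustrative sketch: separate samples into overlapping temporal frames,
# compute a per-frame feature, and take frame-to-frame differences (the role
# MFCC difference values play in the claim).

def frame_signal(samples, frame_len, hop):
    """Overlapping frames: hop < frame_len yields the recited overlap."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def frame_energy(frame):
    return sum(s * s for s in frame) / len(frame)

samples = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
frames = frame_signal(samples, frame_len=4, hop=2)   # 50% overlap
feats = [frame_energy(f) for f in frames]
deltas = [b - a for a, b in zip(feats, feats[1:])]   # current minus prior frame
print(len(frames), feats, deltas)
```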
Regarding claim 18, modified Stamatopoulos further fails to explicitly disclose the limitation of the claim. However, Rosenbek further teaches wherein the specified condition associated with one of the multiple historical database entities includes at least one of a postpartum depression medical condition, an anxiety medical condition, a drug addiction medical condition, or a Parkinson's disease medical condition (Rosenbek (Par. 63, “Referring to FIG. 5, a person's voice is input through a telephone or mobile communication device 501a or microphone 501b and transmitted to a server 503, such as an ASP, via a network 502. The voice signal can be transmitted via internet, phone, VoIP, satellite, cable, cellular or other networks....”) (Par. 68, “A reduction in sentence complexity can be an indicator of a neurological disease. In addition, certain neurological diseases, such as Alzheimer's, cause particular language patterns to emerge. Such language patterns can be determined via the secondary analysis”) (Par. 71, “Once the information from the speech and/or language analysis is obtained, comparators 512 can be used to reach a diagnostic decision…” “…The diagnostic decision provides information indicative of a likelihood and type of disease and may be stored in a database associated with the system”) (Par. 33, “… The analyses may also include, but is not limited to, analyses of the number of words spoken, the types of words (e.g. nouns, verbs, adjectives, articles, etc.) grammatical complexity of the phrases and/or sentence, the number of occurrence of specific words/phrases in conversation, or instances of dysfluencies such as pauses, hesitations or repetitions of words or part-words. The analysis may also evaluate, as an alternative or in addition, the frequency (i.e. the number of occurrences), the intensity (i.e. the strength), or other characteristics of cough during a conversation”)(Par. 
42, “In accordance with an embodiment of the invention, one or more acoustic measures for Parkinson's disease can include, but are not limited to, fundamental frequency (F.sub.0), voice onset time, pause duration, and/or changes in F.sub.0; voice onset time, and/or pause duration...”) (Par. 43, “Similarly to PD, the biomarkers for Alzheimer's disease may include the measures described above as well as detailed analyses of the speaker's language characteristics…”)). Therefore, it would have been obvious to a person of ordinary skill in the art to modify the method of Stamatopoulos, Anushiravani, Rosenbek, and Jain with that of Rosenbek to include wherein the specified condition associated with one of the multiple historical database entities includes at least one of a postpartum depression medical condition, an anxiety medical condition, a drug addiction medical condition, or a Parkinson's disease medical condition for the reasoning as indicated in claim 12 above. Regarding claim 19, modified Stamatopoulos fails to explicitly disclose the limitations of the claim. However, Stamatopoulos does teach building a training dataset for training the machine learning model (Stamatopoulos (Par. 446 (training database updated with each session))). However, Anushiravani further teaches identifying a date associated with each of the multiple audio data entries and the multiple claims data entries (Anushiravani (Par. 99 (timestamp))); determining a date associated with a condition of each of the multiple historical database entities, based at least in part on the dates associated with the multiple claims data entries (Anushiravani (Par. 99 (timestamp))(Par. 109 (disease at each timestamp))); and wherein each audio data entry included in the training dataset has a date that is within a specified time window of the determined date associated with the condition of a corresponding one of the multiple historical database entities (Anushiravani (Par. 99-100 (timestamp))(Par. 
109 (disease at each timestamp))). Therefore, it would have been obvious to a person of ordinary skill in the art to modify the method of Stamatopoulos, Anushiravani, Rosenbek, and Jain with that of Stamatopoulos and Anushiravani to include identifying a date associated with each of the multiple audio data entries and the multiple claims data entries; determining a date associated with a condition of each of the multiple historical database entities, based at least in part on the dates associated with the multiple claims data entries; and building a training dataset for training the machine learning model, wherein each audio data entry included in the training dataset has a date that is within a specified time window of the determined date associated with the condition of a corresponding one of the multiple historical database entities through the combination of references as it would have yielded the predictable result of organizing the data. Claim(s) 16 is/are rejected under 35 U.S.C. 103 as being unpatentable over Stamatopoulos in view of Anushiravani, Rosenbek, and Jain as applied to claim 12 above, and further in view of Yu (US Pub. No. 20210375260) hereinafter Yu. Stamatopoulos, Anushiravani, Rosenbek, and Jain teach the method of claim 12 above. Regarding claim 16, modified Stamatopoulos further discloses wherein the machine learning model comprises a raw audio model (Stamatopoulos (Par. 427, “The ANN algorithms initially analyze the recordings contained in the training set by employing the frame-based analysis of wheeze module 2700 and crackle module 2750 in order to tune the ANN algorithms that will later evaluate new incoming recordings to determine whether they are associated with healthy lungs, and if not, then to determine lung pathology and disease type (e.g., asthma, COPD, etc.) and severity (mild, moderate, severe).”)), and the raw audio model (Stamatopoulos (Par. 
427, (frame based analysis))) comprises: Modified Stamatopoulos fails to explicitly disclose a one-dimensional convolution layer which receives an input of multiple frames, wherein the convolution layer includes multiple filters; a recurrent layer, including a plurality of weighted nodes, which uses the output from the convolution layer to identify temporal dependence through use of a Long Short-Term Memory (LSTM); and a final layer that maps the convolution layer and the recurrent layer to a final output. However, Yu teaches a one-dimensional convolution layer which receives an input of multiple frames (Par. 70 (CNN layer)) (Fig. 9, (CNN layer)) (Claims 9-10 (multiple features from the signal)), wherein the convolution layer includes multiple filters (Claim 10 (convolutional filters)); a recurrent layer, including a plurality of weighted nodes (Par. 77, Fig. 9, (LSB with LSTM tubes)), which uses the output from the convolution layer to identify temporal dependence through use of a Long Short-Term Memory (LSTM) (Fig. 9 (LSB)) (Claim 10 (LSTM temporal modeling)); and a final layer that maps the convolution layer and the recurrent layer to a final output (Claim 10 (LSTM applied to LSB))(Fig. 9 (LSTM layer)). Stamatopoulos, Anushiravani, Rosenbek, Jain, and Yu are considered to be analogous art to the claimed invention as they are involved with machine learning and biological related data. 
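The Yu architecture as recited is a one-dimensional convolution layer with multiple filters, a recurrent LSTM layer over its output, and a final mapping layer. This sketch only traces the tensor shapes through such a stack; the layer sizes are hypothetical and no claim is made about Yu's actual parameters.

```python
# Illustrative shape trace for a 1-D conv -> LSTM -> final-layer stack.
# Standard output-length formula for a 1-D convolution over the frame axis.

def conv1d_out_len(n, kernel, stride=1, padding=0):
    return (n + 2 * padding - kernel) // stride + 1

frames, n_filters, hidden, classes = 100, 32, 64, 2  # assumed sizes

t = conv1d_out_len(frames, kernel=5, stride=2)  # conv over multiple frames
conv_shape = (t, n_filters)                     # one output channel per filter
lstm_shape = (t, hidden)                        # LSTM keeps the temporal axis
final_shape = (classes,)                        # final layer maps to the output
print(t, conv_shape, lstm_shape, final_shape)
```

The shrinking time axis (100 frames in, `t` steps out) is the temporal dependence the LSTM layer then models, per the claim 16 mapping above.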
Therefore, it would have been obvious to a person of ordinary skill in the art to modify the method of Stamatopoulos, Anushiravani, Rosenbek, and Jain with that of Yu to include a one-dimensional convolution layer which receives an input of multiple frames, wherein the convolution layer includes multiple filters; a recurrent layer, including a plurality of weighted nodes, which uses the output from the convolution layer to identify temporal dependence through use of a Long Short-Term Memory (LSTM); and a final layer that maps the convolution layer and the recurrent layer to a final output through the combination of references as it would have yielded the predictable result of improving the accuracy of the model. Claim(s) 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Stamatopoulos in view of Anushiravani, Rosenbek, and Jain as applied to claim 12 above, and further in view of Edwards (US Pub. No. 20200381130) hereinafter Edwards. Stamatopoulos, Anushiravani, Rosenbek, and Jain teach the method of claim 12 above. Regarding claim 20, modified Stamatopoulos fails to explicitly disclose the limitations of the claim. However, Edwards teaches obtaining multiple social media data entries each associated with one of the multiple historical database entities (Par. 30 (social media data)); and generating one or more social media input features based on the multiple social media data entries (Par. 30 (social media)) (Par. 33, “As will be discussed in greater detail below, the voice signal 122 could be obtained from a wide variety of sources, such as pre-recorded voice samples (e.g., from a person's voice mail box, from a recording specifically obtained from the person, or from some other source, including social media postings, videos, etc.).”), wherein the inputs include the one or more social media input features (Par. 30, (social media input)) (Par. 
33, “As will be discussed in greater detail below, the voice signal 122 could be obtained from a wide variety of sources, such as pre-recorded voice samples (e.g., from a person's voice mail box, from a recording specifically obtained from the person, or from some other source, including social media postings, videos, etc.). Next, in step 124, an audio pre-processing step is performed on the voice signal 122. This step can involve digital signal processing (DSP) of the signal 122, audio segmentation, and speaker diarization.”). Anushiravani teaches historical feature vector (as indicated above). Stamatopoulos, Anushiravani, Rosenbek, Jain, and Edwards are considered to be analogous art to the claimed invention as they are involved with machine learning and biological related data. Therefore, it would have been obvious to a person of ordinary skill in the art to modify the method of Stamatopoulos, Anushiravani, Rosenbek, and Jain with that of Edwards to include obtaining multiple social media data entries each associated with one of the multiple historical database entities; and generating one or more social media input features based on the multiple social media data entries, wherein the historical feature vector inputs of Anushiravani include the one or more social media input features through the combination of references as it would have yielded the predictable result of providing additional data and improving the overall quality of the model. Claim(s) 21-22 is/are rejected under 35 U.S.C. 103 as being unpatentable over Stamatopoulos in view of Anushiravani, Rosenbek, and Jain as applied to claim 12 above, and further in view of Davis (US Pub. No. 20210345925) hereinafter Davis. Stamatopoulos, Anushiravani, Rosenbek, and Jain teach the method of claim 12 above. Regarding claim 21, modified Stamatopoulos fails to explicitly disclose the limitations of the claim. 
However, Davis teaches wherein processing the feature vector input to generate the condition likelihood output includes (Par. 43, “FIG. 2 shows an example of a data processing system 200. The data processing system 200 in this example includes the detection device 110 and the data processing device 120 of FIG. 1.”) (Par. 47, “Returning to FIG. 2, the input data 205 are transformed into features of a feature vector 220 by feature vector generation engine 210 (such as using the NLP models described in relation to FIG. 3A).”) (Par. 54, “The classifier data 260 is sent to a prediction engine 270. The prediction engine 270 is configured to assign probabilities to one or more health risks as being present for the patient. The prediction data 280 shows the likelihood that each of one or more health risks is present for the patient. The collection of health risks and their associated probabilities of the probabilities data 280 can together be used to determine if the patient has a disease or other health condition.”): identifying a baby delivery date associated with a patient of the database entry (Par. 76, “131 women in the postpartum period. Women were asked open-ended questions…”) (Par. 50, “As previously stated, feature data can be found by asking patients (e.g., pregnant and postpartum women, in this case) to describe their recent activities, interactions, and feeling”); determining whether a depression event occurs within a specified time window after the baby delivery date (Par. 46, “the data processing device 120 is configured to map the topics and sentiments conveyed in natural language journal entries to measures of psychosocial risk using three distinct natural language processing algorithmic approaches…”)(Par. 50, “As previously stated, feature data can be found by asking patients (e.g., pregnant and postpartum women, in this case) to describe their recent activities, interactions, and feeling”) (Par. 76, “…131 women in the postpartum period. 
Women were asked open-ended questions, e.g., “What events have most impacted your mood in the past 24 hours?” and multiple-choice questions, e.g., “How would you describe your mood in the past 24 hours (very poor=1 to very good=5)?” as well as established psychometric measures of wellbeing, including the EPDS. To predict EPDS scores from our sample's open-ended responses, the methods described above in relation to FIGS. 1-6C were used. By running two of these algorithm types on the same data set, a set of unique scores were generated from the open-ended text and were entered into a penalized logistic regression model of depression, using a threshold of EPDS score >13. Table 3 presents initial results.”) (Par. 81, “the data processing system 200 is configured to perform (702) a Natural Language Processing (NLP) on input data received from one or more input sources. In some implementations, the NLP of the input data is performed by the input source. The input source can include a detection device (e.g., detection device 110 of FIG. 1) configured to receive text data…”); and generating a post-partum depression condition output in response to the depression event occurring within the specified time window after the baby delivery date (Fig. 2, Par. 54, “The classifier data 260 is sent to a prediction engine 270. The prediction engine 270 is configured to assign probabilities to one or more health risks as being present for the patient. The prediction data 280 shows the likelihood that each of one or more health risks is present for the patient….”) (Par. 43, “FIG. 2 shows an example of a data processing system 200. The data processing system 200 in this example includes the detection device 110 and the data processing device 120 of FIG. 1.”) (Par. 47, “Returning to FIG. 2, the input data 205 are transformed into features of a feature vector 220 by feature vector generation engine 210 (such as using the NLP models described in relation to FIG. 3A).”) (Par. 
76, “131 women in the postpartum period. Women were asked open-ended questions…”) (Par. 50, “As previously stated, feature data can be found by asking patients (e.g., pregnant and postpartum women, in this case) to describe their recent activities, interactions, and feeling”) (Par. 77, “Table 3 shows R2 and Area Under the ROC curve (AUROC) for depression by each of the NLP approaches across U.S. reproductive-aged women. EPDS >13 indicates meaningful possibility to high probability of clinical depression.”). Stamatopoulos, Anushiravani, Rosenbek, Jain, and Davis are considered to be analogous art to the claimed invention as they all involve machine learning and biology-related data. Therefore, it would have been obvious to a person of ordinary skill in the art to modify the method of Stamatopoulos, Anushiravani, Rosenbek, and Jain with that of Davis to include wherein processing the feature vector input to generate the condition likelihood output includes: identifying a baby delivery date associated with a patient of the database entry; determining whether a depression event occurs within a specified time window after the baby delivery date; and generating a post-partum depression condition output in response to the depression event occurring within the specified time window after the baby delivery date, through the substitution of disease types, as postpartum depression is a known condition within one year after delivery (Davis (Par. 5)) and it would have yielded the predictable result of evaluating the mental health of individuals after delivery (Davis (Par. 4-6)). Regarding claim 22, modified Stamatopoulos fails to explicitly disclose the limitations of the claim. However, Davis further teaches wherein the specified time window is one year (Davis (Par.
5, “Generally, depression, mental health risks, and other risks during pregnancy and the postpartum period (including early postpartum—6-8 weeks after delivery to late postpartum—up to one year after delivery)…”)). Therefore, it would have been obvious to a person of ordinary skill in the art to modify the method of Stamatopoulos, Anushiravani, Rosenbek, Jain, and Davis with that of Davis to include wherein the specified time window is one year for the reasoning as indicated in claim 21 above.

Response to Arguments

Applicant’s arguments, filed 01/30/2026, regarding the previous 112 rejections, have been fully considered and deemed not persuasive. The applicant’s argument regarding the added limitations to the claim has been fully considered and deemed not persuasive. As the limitations were not previously addressed, the limitations have been addressed in the 112 rejections as indicated above. The applicant’s argument that there is sufficient support for the machine learning model within the specification, and that the specification is not required to provide the actual weight values or biases present within the model, has been fully considered and deemed not persuasive. As indicated in the 112 rejection above, “The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.” The applicant specifically focuses on Par. 93-95, which state “a machine learning model may be trained to predict a likelihood that a person is currently experiencing addiction.
In this case, the model may be trained to identify one or more features in audio data of the member's speech that is associated with addiction, such as slurs in the speech, mispronunciation, pauses, forgetting basic words, difficulty in completing sentences, repetition of words or phrases, etc. The model may compare similar statements of one person against their own prior statements, or against similar statements by other people.” (Par. 93 of applicant’s spec.), “In various implementations, a machine learning model may be trained to identify phrases that are specific to members that may be experiencing drug addiction…” (Par. 94 of applicant’s spec.), and “As another example, a machine learning model may be trained to predict a likelihood of a depression condition. In this case, the model may be trained to look for specific tone in audio data of a member's speech, for a specific choice of words or phrases (such as cussing, slang, changes in word choice over time, etc.), pauses when speaking, a speed of the speech, or any other suitable features that have been highlighted as indicative of depression in published research studies…” (Par. 95 of applicant’s spec.). However, these passages do not provide sufficient detail regarding the structure of the machine learning model itself, and merely focus on the result of the generated algorithm. The applicant further argues that there is sufficient support for the training of the machine learning model as indicated in Par. 57-78 of the provided specification and Fig. 3-5 of the provided drawings. While the applicant has provided an overview of the training process within the indicated paragraphs and figures, the specification lacks sufficient detail regarding the structure of the machine learning model itself and does not provide sufficient support as to the exact weights and biases utilized for the model.
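The specification passages quoted above describe the model only in terms of observable speech features (pauses, speech rate, repeated words) rather than model internals, which is the crux of the examiner's written-description position. Purely as an illustrative sketch of that feature level of description, the function below assembles such features into a vector; every name, feature choice, and value here is hypothetical and is not drawn from the application:

```python
from collections import Counter

def speech_feature_vector(words, pause_durations_s, total_time_s):
    """Illustrative only: a feature vector of the kind Par. 93-95 describe
    (speech rate, pauses, repetition). Feature choices are assumptions."""
    n_words = len(words)
    counts = Counter(w.lower() for w in words)
    # Count every occurrence of a word beyond its first as a repetition.
    repeated = sum(c - 1 for c in counts.values())
    return [
        n_words / total_time_s,                                    # speech rate (words/sec)
        len(pause_durations_s),                                    # number of pauses
        sum(pause_durations_s) / max(len(pause_durations_s), 1),   # mean pause length (sec)
        repeated / max(n_words, 1),                                # repetition ratio
    ]

vec = speech_feature_vector(
    words=["i", "i", "feel", "feel", "tired", "today"],
    pause_durations_s=[0.8, 1.2],
    total_time_s=6.0,
)
print(vec)
```

A sketch like this only specifies the inputs to a model; it says nothing about the weights, biases, or architecture that would consume the vector, which mirrors the gap the rejection identifies.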
The applicant further argues that there is sufficient support for the claimed machine learning model, specifically focusing on Par. 67, Par. 69 and Par. 77 of the provided specification. The corresponding paragraphs of the applicant’s specification state “FIG. 4A shows a fully connected neural network, where each neuron in a given layer is connected to each neuron in a next layer. In the input layer, each input node is associated with a numerical value, which can be any real number. In each layer, each connection that departs from an input node has a weight associated with it, which can also be any real number (see FIG. 4B). In the input layer, the number of neurons equals number of features (columns) in a dataset. The output layer may have multiple continuous outputs.” (Par. 67 of applicant’s spec.), “The number of neurons can be optimized. At the beginning of training, a network configuration is more likely to have excess nodes. Some of the nodes may be removed from the network during training that would not noticeably affect network performance. For example, nodes with weights approaching zero after training can be removed (this process is called pruning). The number of neurons can cause under-fitting (inability to adequately capture signals in dataset) or over-fitting (insufficient information to train all neurons; network performs well on training dataset but not on test dataset).” (Par. 69 of applicant’s spec.) and “In various implementations, each input node in the input layer may be associated with a numerical value, which can be any real number. In each layer, each connection that departs from an input node has a weight associated with it, which can also be any real number. In the input layer, the number of neurons equals number of features (columns) in a dataset. The output layer may have multiple continuous outputs.” (Par. 77 of applicant’s spec.).
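The pruning process quoted above (Par. 69 of the applicant's specification) is a generic, well-known technique: units whose weights approach zero after training are removed. As a minimal generic sketch of that idea only (the threshold and representation are assumed values, not taken from the application):

```python
def prune_near_zero_units(weight_columns, threshold=1e-3):
    """Illustrative pruning sketch: drop units whose entire incoming
    weight vector is near zero. The threshold is an assumed value."""
    return [col for col in weight_columns
            if max(abs(w) for w in col) >= threshold]

# Three units; the middle unit's incoming weights are all near zero.
units = [[0.5, 0.1], [1e-5, -1e-6], [-0.2, 0.7]]
pruned = prune_near_zero_units(units)
print(len(pruned))
```

Because any network with any weights can be pruned this way, reciting the technique describes a class of models rather than the particular trained weights and biases, which is the distinction the examiner draws next.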
However, the indicated paragraphs do not provide sufficient support, as simply reciting the existence of nodes and weights does not provide sufficient support as to the exact weights and biases utilized for the machine learning model. As such, the applicant has not provided sufficient detail regarding the structure of the machine learning model itself, and the rejection is maintained. Applicant’s arguments, filed 01/30/2026, regarding the previous 103 rejection, have been fully considered and deemed not persuasive. The applicant’s argument that the prior art does not teach the added limitations to the claim has been fully considered and deemed not persuasive. As the limitations were not previously addressed, the limitations have been addressed in the 103 rejection as indicated above.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ARI SINGH KANE PADDA, whose telephone number is (571) 272-7228. The examiner can normally be reached Monday - Friday, 8:00 am - 5:00 pm. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jason Sims, can be reached at (571) 272-7540. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov.
Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /ARI S PADDA/ Examiner, Art Unit 3791 /JASON M SIMS/ Supervisory Patent Examiner, Art Unit 3791

Prosecution Timeline

Jan 21, 2022
Application Filed
Apr 05, 2025
Non-Final Rejection — §103, §112
Jul 14, 2025
Response Filed
Oct 17, 2025
Final Rejection — §103, §112
Jan 30, 2026
Request for Continued Examination
Feb 20, 2026
Response after Non-Final Action
Mar 16, 2026
Non-Final Rejection — §103, §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12588839
Component Concentration Measuring Device
2y 5m to grant Granted Mar 31, 2026
Patent 12564351
PERSONAL APPARATUS FOR CONDUCTING ELECTROENCEPHALOGRAPHY
2y 5m to grant Granted Mar 03, 2026
Patent 12558189
METHODS AND APPARATUS FOR DIRECT MARKING
2y 5m to grant Granted Feb 24, 2026
Patent 12029548
DEVICE FOR SELECTIVE COLLECTION AND CONDENSATION OF EXHALED BREATH
2y 5m to grant Granted Jul 09, 2024
Patent 11850049
APPARATUS FOR AUTOMATICALLY MEASURING URINE VOLUME AND SYSTEM FOR AUTOMATICALLY MEASURING URINE VOLUME
2y 5m to grant Granted Dec 26, 2023
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

3-4
Expected OA Rounds
17%
Grant Probability
32%
With Interview (+15.6%)
4y 1m
Median Time to Grant
High
PTA Risk
Based on 42 resolved cases by this examiner. Grant probability derived from career allow rate.
