DETAILED ACTION
This communication is in response to the Amendments and Arguments filed on 11/14/2025.
Claims 1-35 are pending and have been examined.
All previous objections and rejections not mentioned in this Office Action have been withdrawn by the examiner.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments and Amendments
Applicant has amended independent claims 1, 11, and 21, and has added new claims 33-35.
Regarding Applicant’s arguments against the rejections under 35 U.S.C. § 101, Applicant has amended independent claims 1, 11, and 21. The rejection of claim 11 has been withdrawn. Regarding claims 1 and 21, Applicant asserts that the claims as recited cannot be performed in the mind and disclose a specific technological improvement, and therefore are not directed to an abstract idea. Examiner respectfully disagrees. During patent examination, pending claims must be “given their broadest reasonable interpretation consistent with the specification.” MPEP 2111. Claims must also “particularly point out and distinctly claim the invention.” MPEP 2173. First, broadly interpreted, the human mind can convert speech into text and measure semantic relevancy. Second, the use of automated speech recognition and signal processing circuitry reads on generic computer components that amount to mere instructions to apply an exception using a generic computer. Third, the machine learning algorithm or model recited in claims 1 and 21 can be interpreted as rules or instructions that can be followed mentally. Lastly, a broad reading of the claim language does not reveal additional elements that constitute a specific technological improvement. Broadly interpreted, the semantic relevance score can be computed in the human mind and does not reflect a specific technological improvement. The claims do not describe any specific improvement to technical features that would demonstrate integration of the abstract idea into a practical application. Therefore, the claims as currently recited do not overcome the 35 U.S.C. § 101 abstract idea rejection.
Regarding Applicant’s arguments against the rejections under 35 U.S.C. § 102, Applicant has amended independent claims 1, 11, and 21. Because Applicant’s arguments are directed to the new amendments, the arguments are moot in view of the new grounds of rejection.
Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.
The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.
Claim 33 is rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention. The specification does not teach “iteratively determine a semantic relevance score for the subject across multiple timepoints”.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
(A) the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;
(B) the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and
(C) the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function.
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitations are: “audio input circuitry configured to receive” in claim 1, “signal processing circuitry configured to” in claims 1, 6, 8, 9, 10, 16, and 33, “classifier configured to” in claims 10 and 20, “model configured to” in claims 21 and 32, “device is configured to” in claim 27, and “application is configured to” in claim 31.
Because these claim limitations are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, they are being interpreted to cover the corresponding structure described in the specification as performing the claimed functions, and equivalents thereof.
If applicant does not intend to have these limitations interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitations to avoid interpretation under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitations recite sufficient structure to perform the claimed function so as to avoid interpretation under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-10 and 21-35 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Regarding claim 1, the limitations of “receive an audio signal provided by a subject”, “process the input signal to detect one or more metrics of speech of the subject”, and “analyze the one or more metrics of speech using a speech assessment algorithm to generate an evaluation of a cognitive function of the subject”, as drafted, are processes that, under broadest reasonable interpretation, cover performance of the limitations in the mind but for the recitation of generic computer components. More specifically, the limitations cover the mental process of a human hearing speech, analyzing the speech in the mind, and thinking of the cognitive function of the person speaking. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the --Mental Processes-- grouping of abstract ideas. Accordingly, the claims recite an abstract idea.
Regarding claim 21, the limitations of “receiving input signal comprising speech audio for a plurality of subjects”, “processing the input signal to detect one or more metrics of speech in the speech audio for the plurality of subjects”, “identifying classifications corresponding to cognitive function for the speech audio for the plurality of subjects”, and “training a model using machine learning based on a training data set comprising the one or more metrics of speech and the classifications identified in the speech audio, thereby generating a machine learning predictive model configured to generate an evaluation of cognitive function based on speech”, as drafted, are processes that, under broadest reasonable interpretation, cover performance of the limitations in the mind but for the recitation of generic computer components. More specifically, the limitations cover the mental process of a human hearing speech, analyzing the speech in the mind, identifying classifications corresponding to the cognitive function of the speaker in the mind, and learning a set of rules by hearing speech examples and their associated classifications. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the --Mental Processes-- grouping of abstract ideas. Accordingly, the claims recite an abstract idea.
This judicial exception is not integrated into a practical application because the recitation of a device, audio input circuitry, and signal processing circuitry in claim 1 reads on generic computer components, based upon the claim interpretation wherein the structure is interpreted using P0041, P0042, and P0083-P0089 of the specification. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claims are directed to an abstract idea.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the additional element of using generic computer components, in claims 1 and 11, to hear speech, analyze the speech in the mind, and think of the cognitive function of the person speaking amounts to no more than mere instructions to apply the exception using a generic computer component. Also, the additional element of using generic computer components, in claim 21, to hear speech, analyze the speech in the mind, identify classifications corresponding to the cognitive function of the speaker in the mind, and learn a set of rules by hearing speech examples and associated classifications, amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claims are not patent eligible.
With respect to claims 2 and 22, the claims recite “wherein the evaluation of the cognitive function comprises detection or prediction of future cognitive decline”, which reads on a human thinking of future cognitive decline from speech heard in the mind. No additional limitations are present.
With respect to claims 3 and 23, the claims recite “wherein the evaluation of the cognitive function comprises a prediction or classification of normal cognition, early mild cognitive impairment, mild cognitive impairment, or dementia”, which reads on a human thinking of normal cognition, early mild cognitive impairment, mild cognitive impairment, or dementia in the mind after hearing speech. No additional limitations are present.
With respect to claims 4 and 24, the claims recite “wherein the one or more metrics of speech of the subject comprises a metric of semantic relevance, word count, ratio of unique words to total number of words (MATTR), pronoun-to-noun ratio, propositional density, number of pauses during an audio speech recording within the input signal, or any combination thereof”, which reads on a human thinking of metrics of speech after hearing speech in the mind. No additional limitations are present.
With respect to claims 5 and 25, the claims recite “wherein the metric of semantic relevance measures a degree of overlap between a content of a picture and a description of the picture detected from the speech in the input signal”, which reads on a human thinking of the semantic relevance of speech heard and a picture in the mind. No additional limitations are present.
With respect to claims 6 and 27, the claims recite “display an output comprising the evaluation”, which reads on a human writing the cognitive function on paper using a pen or pencil. No additional limitations are present.
With respect to claim 7, the claim recites “wherein the notification element comprises a display”, which reads on a human writing the cognitive function on paper using a pen or pencil. No additional limitations are present.
With respect to claim 8, the claim recites “prompt the subject to provide a speech sample from which the input signal is derived”, which reads on a human writing a prompt on paper using a pen or pencil to ask a person to speak. No additional limitations are present.
With respect to claim 9, the claim recites “utilize at least one machine learning classifier to generate the evaluation of the cognitive function of the subject”, which reads on a human using a set of rules to think of an evaluation of cognitive function. No additional limitations are present.
With respect to claim 10, the claim recites “utilize a plurality of machine learning classifiers comprising a first classifier configured to evaluate the subject for a first cognitive function or condition and a second classifier configured to evaluate the subject for a second cognitive function or condition”, which reads on a human using two sets of rules to think of an evaluation of cognitive function. No additional limitations are present.
With respect to claim 26, the claim recites “configuring a computing device with executable instructions for analyzing the one or more metrics of speech using the machine learning predictive model to generate an evaluation of a cognitive function of a subject based on the input speech sample”, which reads on a human using a set of rules to think of an evaluation of cognitive function. No additional limitations are present.
With respect to claim 28, the claim recites “wherein the computing device is a desktop computer, a laptop, a smartphone, a tablet, or a smartwatch”, which reads on a human utilizing the mind for evaluating cognitive function. No additional limitations are present.
With respect to claim 29, the claim recites “wherein the configuring the computing device with executable instructions comprises providing a software application for installation on the computing device”, which reads on a human utilizing the mind for evaluating cognitive function. No additional limitations are present.
With respect to claim 30, the claim recites “wherein the computing device is a smartphone, a tablet, or a smartwatch; and wherein the software application is a mobile application”, which reads on a human utilizing the mind for evaluating cognitive function. No additional limitations are present.
With respect to claim 31, the claim recites “wherein the mobile application is configured to prompt the subject to provide the input speech sample”, which reads on a human writing a prompt on paper using a pen or pencil to ask a person to speak. No additional limitations are present.
With respect to claim 32, the claim recites “wherein the input speech sample is processed by one or more machine learning models to generate the one or more metrics of speech; wherein the machine learning predictive model is configured to the evaluation of cognitive function as a composite metric based on the one or more metrics of speech”, which reads on a human thinking of one or more metrics of speech. No additional limitations are present.
With respect to claim 33, the claim recites “wherein the signal processing circuitry is further configured to iteratively determine a semantic relevance score for the subject across multiple timepoints to track longitudinal changes in language of the subject as an indicator of cognitive decline progression”, which reads on a human hearing speech, utilizing a set of rules to generate metrics of speech, and utilizing another set of rules to evaluate cognitive function from the metrics of speech. No additional limitations are present.
With respect to claim 34, the claim recites “wherein to detect the one or more metrics of speech the signal processing circuitry executes computer-implemented algorithms that process the speech audio and transcript to identify linguistic and acoustic elements and to compute quantitative measures within the audio signal that relate to vocabulary including a ratio of unique words to total number of words (MATTR), a propositional density, a type-to-token ratio (TTR), and mean word length”, which reads on a human hearing speech, thinking of the corresponding text, and determining quantitative measures in the text. No additional limitations are present.
With respect to claim 35, the claim recites “wherein the predictors comprise at least one of: ratio of unique words to total number of words (MATTR), pronoun-to-noun ratio, propositional density, parse tree height, mean length of word, type-to-token ratio, proportion of relevant details correctly identified in a picture description, duration of pauses relative to total speaking duration, and word count”, which reads on a human hearing speech and utilizing a set of metrics of speech in the mind. No additional limitations are present.
These claims do not remedy the failure to integrate the judicial exception into a practical application and fail to include additional elements that are sufficient to amount to significantly more than the judicial exception.
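For illustration only, and not as a characterization of the claimed method or the specification: the lexical measures recited in claims 34 and 35, such as the type-to-token ratio (TTR), the moving-average type-token ratio (MATTR), and mean word length, are conventionally computable with simple rules. The sketch below assumes whitespace tokenization and a fixed MATTR window size; all names and inputs are hypothetical.

```python
# Illustrative sketch of standard lexical measures (assumptions: whitespace
# tokenization, a fixed sliding-window size for MATTR).

def ttr(tokens):
    """Type-to-token ratio: unique words divided by total words."""
    return len(set(tokens)) / len(tokens)

def mattr(tokens, window=10):
    """Moving-average TTR: mean TTR over sliding windows of fixed size."""
    if len(tokens) <= window:
        return ttr(tokens)
    windows = [tokens[i:i + window] for i in range(len(tokens) - window + 1)]
    return sum(ttr(w) for w in windows) / len(windows)

def mean_word_length(tokens):
    """Average number of characters per token."""
    return sum(len(t) for t in tokens) / len(tokens)

tokens = "the cat sat on the mat and the dog sat by the door".lower().split()
sample_ttr = ttr(tokens)              # 9 unique words / 13 tokens
sample_mattr = mattr(tokens, window=5)
sample_mwl = mean_word_length(tokens)
```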
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1-31 and 34-35 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Zaldua et al. (U.S. PG Pub No. 20240023877), hereinafter Zaldua.
Regarding claim 1, Zaldua teaches:
A device for evaluating cognitive function based on speech, the device comprising: (P0045, System comprising a mobile device and a data processing apparatus for carrying out the method of detecting cognitive impairment.)
audio input circuitry configured to receive an audio signal provided by a subject; and (P0049, Receiving audio data representing recorded utterances.; P0155, The mobile device may also record the utterances made by the patient P and generate audio data. For this purpose, the mobile device may be provided with a microphone or an array of microphones.)
signal processing circuitry configured to: process the input signal to detect one or more metrics of speech of the subject, the one or more metrics of speech including a measure of semantic relevancy that defines a degree of overlap between reference content units of a picture and a description of the picture from the audio signal, (P0052, The received audio data may then be processed by a speech-to-text engine to produce a text transcription of the recorded utterances. The speech-to-text engine may be implemented locally or remotely in an external server.; P0055, Next, the text transcription of the recorded utterances may be processed to calculate a plurality of test variables.; P0116, Image description test.; P0123, Number of correct F-words.; P0151, In an image description text, the predetermined visual information may include an image, and text-based instruction asking the patient to describe as many things as they can see in the image within a time limit.)
wherein to detect the measure of semantic relevancy the signal processing circuitry generates a transcript of the description from the audio signal using automated speech recognition (ASR), and algorithmically computes a semantic relevance score via a scan of the transcript to identify the content units of the picture in the description, wherein the semantic relevance score reflects a proportion of spoken words from the description related to the content units and provides an automated and more accurate insight into cognition; and (P0052, The received audio data may then be processed by a speech-to-text engine to produce a text transcription of the recorded utterances. The speech-to-text engine may be implemented locally or remotely in an external server.; P0055, Next, the text transcription of the recorded utterances may be processed to calculate a plurality of test variables.; P0018, Test variables may comprise one, two, or all of: mean number of correct words, mean percentage of incorrect words, and mean average correct word closeness.; P0089, Following list of test variables are found to be suitable: [Examples P0090-P0125].; P0116, Image description test.; P0123, Number of correct F-words.; P0151, In an image description text, the predetermined visual information may include an image, and text-based instruction asking the patient to describe as many things as they can see in the image within a time limit.)
analyze, with the signal processing circuitry, the one or more metrics of speech using a machine-learning speech assessment algorithm comprising executable instructions and trained to generate an evaluation of a cognitive function of the subject from the one or more metrics of speech. (P0056, The plurality of test variables may be passed on to a trained detection model. The trained detection model, taking the plurality of test variables as input, may calculate an impairment possibility indicating a likelihood that the patient suffers from the cognitive impairment.; P0146, Each of the detection models and the final detection model may be implemented using any suitable Machine Learning algorithm.)
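For illustration only: the semantic relevance computation recited in the claim, a proportion reflecting how many reference content units of a picture appear in an ASR transcript of its description, could in principle be sketched as below. The function name, the whitespace tokenization, and the example content units are hypothetical and are not drawn from the claims or from Zaldua.

```python
# Hypothetical sketch: semantic relevance as the fraction of a picture's
# reference content units that are mentioned in a transcript of its
# description. Tokenization and content units are illustrative assumptions.

def semantic_relevance(transcript, content_units):
    """Return the fraction of reference content units found in the transcript."""
    words = set(transcript.lower().split())
    mentioned = {unit for unit in content_units if unit in words}
    return len(mentioned) / len(content_units) if content_units else 0.0

units = {"boy", "cookie", "jar", "stool", "sink"}
score = semantic_relevance(
    "the boy reaches into the cookie jar while standing on a stool", units
)
# score == 0.8 (4 of the 5 content units are mentioned; "sink" is not)
```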
Regarding claim 11, Zaldua teaches:
A computer-implemented method for evaluating cognitive function based on speech, the method comprising: (P0045, System comprising a mobile device and a data processing apparatus for carrying out the method of detecting cognitive impairment.)
receiving, with audio input circuitry, an input signal provided by a subject; (P0049, Receiving audio data representing recorded utterances.; P0155, The mobile device may also record the utterances made by the patient P and generate audio data. For this purpose, the mobile device may be provided with a microphone or an array of microphones.)
generating, via signal processing circuitry, a transcript from the input signal; (P0052, The received audio data may then be processed by a speech-to-text engine to produce a text transcription of the recorded utterances. The speech-to-text engine may be implemented locally or remotely in an external server.; P0055, Next, the text transcription of the recorded utterances may be processed to calculate a plurality of test variables.)
automatically extracting, via the signal processing circuitry, a plurality of speech features defining digital speech-based measures from the transcript; (P0018, Test variables may comprise one, two, or all of: mean number of correct words, mean percentage of incorrect words, and mean average correct word closeness.; P0089, Following list of test variables are found to be suitable: [Examples P0090-P0125].)
selecting, via signal processing circuitry, a restricted number of the plurality of speech features as predictors for model input, the predictors previously shown to be representative of vocabulary, language processing, and ability to convey relevant picture details; and (P0089, Following list of test variables are found to be suitable: [Examples P0090-P0125].; P0088, It may be desirable to choose test variables which, individually, provides significant predictive power.; P0116, Image description test.; P0123, Number of correct F-words.; P0151, In an image description text, the predetermined visual information may include an image, and text-based instruction asking the patient to describe as many things as they can see in the image within a time limit.)
applying, by the signal processing circuitry, the model input to a trained machine learning (ML) model to output at least one of a classification or prediction for evaluation of cognitive function for the subject, the trained predictive model having previously been fit with the plurality of speech features and trained on training data labeled to separate healthy subjects from other subjects according to cognitive function. (P0056, The plurality of test variables may be passed on to a trained detection model. The trained detection model, taking the plurality of test variables as input, may calculate an impairment possibility indicating a likelihood that the patient suffers from the cognitive impairment.; P0146, Each of the detection models and the final detection model may be implemented using any suitable Machine Learning algorithm.; P0157, The detection models require prior training with reference data. Each of the detection models may be trained separately with its own set of reference data.; P0158, As in an actual performance of the method of detecting cognitive impairment disclosed in the present application, for the purpose of preparing training data, the neuropsychological test may also be presented to the patient P1, P2 by displaying predetermined visual information and/or providing predetermined audible information to the patient P1, P2 prompting them to make utterances.)
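For illustration only: the kind of trained detection model described in the mapping above, fit on labeled speech-feature vectors to separate healthy subjects from impaired subjects and then applied to a new subject, could be sketched with a simple nearest-centroid rule. The centroid rule stands in for whatever machine learning algorithm is actually used; the feature names, values, and labels below are invented and are not drawn from Zaldua.

```python
# Hypothetical sketch of a trained detection model: fit class centroids on
# labeled speech-feature vectors, then classify a new subject by the nearest
# centroid. Feature values and labels are invented for illustration.

def fit_centroids(features, labels):
    """Average the feature vectors for each label to form class centroids."""
    centroids = {}
    for label in set(labels):
        rows = [f for f, l in zip(features, labels) if l == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def classify(centroids, x):
    """Assign x to the label whose centroid is nearest (squared Euclidean)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: dist2(centroids[label]))

# Feature vectors: [semantic relevance, MATTR, pause ratio] (invented values)
train_x = [[0.8, 0.9, 0.1], [0.75, 0.85, 0.15], [0.4, 0.6, 0.4], [0.35, 0.55, 0.45]]
train_y = ["healthy", "healthy", "impaired", "impaired"]
model = fit_centroids(train_x, train_y)
prediction = classify(model, [0.7, 0.8, 0.2])
# prediction == "healthy"
```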
Regarding claims 2 and 12, Zaldua teaches claims 1 and 11 and further teaches:
wherein the evaluation of the cognitive function comprises detection or prediction of future cognitive decline. (P0044, Present disclosure may enable scalable and accurate screening of cognitive impairment, which may be useful for large-scale screening for early signs of cognitive impairment.; P0081, By dividing the final impairment probability into three ranges using two predetermined thresholds, the indication may indicate an absence of impairment or one of two degrees of impairment. It should be understood that a finer division of the ranges of final impairment probability may be used. That is, three or more predetermined thresholds may be defined and applied to the final impairment probability, so that the indication may indicate four or more possible outcomes corresponding to varying degrees of cognitive impairment or non-impairment.)
Regarding claims 3 and 13, Zaldua teaches claims 1 and 11 and further teaches:
wherein the evaluation of the cognitive function comprises a prediction or classification of normal cognition, early mild cognitive impairment, mild cognitive impairment, or dementia. (P0082, It may be useful, from a health policy point of view, to provide fast and effective screening of mild cognitive impairment or early stage dementia. In this case, it may be enough to use two predetermined thresholds resulting in three possible outcomes, namely 1) no cognitive impairment, 2) mild cognitive impairment, and 3) dementia.)
Regarding claims 4 and 14, Zaldua teaches claims 1 and 11 and further teaches:
wherein the one or more metrics of speech of the subject comprises a metric of semantic relevance, word count, ratio of unique words to total number of words (MATTR), pronoun-to-noun ratio, propositional density, number of pauses during an audio speech recording within the input signal, or any combination thereof. (P0089, Following list of test variables are found to be suitable: [Examples P0090-P0125].; P0118, Number of nouns.; P0119, Number of verbs.; P0120, Ratio nouns/pronouns.)
Regarding claims 5 and 15, Zaldua teaches claims 4 and 14 and further teaches:
wherein the metric of semantic relevance measures a degree of overlap between a content of a picture and a description of the picture detected from the speech in the input signal. (P0116, Image description test.; P0123, Number of correct F-words.; P0151, In an image description text, the predetermined visual information may include an image, and text-based instruction asking the patient to describe as many things as they can see in the image within a time limit.)
Regarding claims 6 and 16, Zaldua teaches claims 1 and 11 and further teaches:
display an output comprising the evaluation. (P0186, Results of a method of the invention may be displayed to a user or stored in any suitable storage medium.)
Regarding claims 7 and 17, Zaldua teaches claims 1 and 11 and further teaches:
generates a notification element, wherein the notification element comprises a display. (P0155, With reference to FIG. 8, the presentation of predetermined visual information and/or predetermined audio information may be achieved by a mobile device, which may be provided with a display and/or loudspeakers.)
Regarding claim 8, Zaldua teaches claim 7 and further teaches:
cause the display to prompt the subject to provide a speech sample from which the input signal is derived. (P0147, The patient may be asked to make utterances in accordance with different neuropsychological tests, which utterances may be recorded to generate the audio data used in subsequent processing. In order to prompt the patient to make these utterances, the present method may further comprise displaying predetermined visual information.)
Regarding claim 18, Zaldua teaches claim 11 and further teaches:
prompting the subject to provide a speech sample from which the input signal is derived. (P0147, The patient may be asked to make utterances in accordance with different neuropsychological tests, which utterances may be recorded to generate the audio data used in subsequent processing. In order to prompt the patient to make these utterances, the present method may further comprise displaying predetermined visual information.)
Regarding claims 9 and 19, Zaldua teaches claims 1 and 11 and further teaches:
utilize at least one machine learning classifier to generate the evaluation of the cognitive function of the subject. (P0145, The calculation of the final impairment probability may be performed using the trained final detection model.; P0146, Detection model may be implemented using any suitable Machine Learning algorithm.)
Regarding claim 10, Zaldua teaches claim 9 and further teaches:
utilize a plurality of machine learning classifiers comprising a first classifier configured to evaluate the subject for a first cognitive function or condition and a second classifier configured to evaluate the subject for a second cognitive function or condition. (P0073, The text description may be processed to calculate first and second pluralities of test variables associated with first and second neuropsychological tests, and first and second impairment probabilities may be calculated by first and second trained detection models. … The final impairment probability may be calculated based on the first, second.)
Regarding claim 20, Zaldua teaches claim 19 and further teaches:
the at least one machine learning classifier comprises a first classifier configured to evaluate the subject for a first cognitive function or condition and a second classifier configured to evaluate the subject for a second cognitive function or condition. (P0073, The text description may be processed to calculate first and second pluralities of test variables associated with first and second neuropsychological tests, and first and second impairment probabilities may be calculated by first and second trained detection models. … The final impairment probability may be calculated based on the first, second.)
Regarding claim 21, Zaldua teaches:
A computer-implemented method for generating a speech assessment algorithm comprising a machine learning predictive model for evaluating cognitive function based on speech, the method comprising: (P0045, System comprising a mobile device and a data processing apparatus for carrying out the method of detecting cognitive impairment.; P0146, Each of the detection models and the final detection model may be implemented using any suitable Machine Learning algorithm.; P0157, One or more detection models may be used in the present method. The detection models require prior training with reference data. Each of the detection models may be trained separately with its own set of reference data. With reference to FIG. 5, there is disclosed a method of training a detection model for use with a neuropsychological test.)
receiving an input signal comprising speech audio for a plurality of subjects; (P0158, FIG. 5, for a detection model corresponding to a given neuropsychological test, a plurality of patients P1, P2 are required to complete the neuropsychological test. As in an actual performance of the method of detecting cognitive impairment disclosed in the present application, for the purpose of preparing training data, the neuropsychological test may also be presented to the patient P1, P2 by displaying predetermined visual information and/or providing predetermined audible information to the patient P1, P2 prompting them to make utterances.; P0159, The utterances made by the patient P1, P2 during the test may be recorded, such as by the computing device.)
processing the input signal to detect one or more metrics of speech in the speech audio for the plurality of subjects; (P0161, The audio data may be processed by a speech-to-text engine to produce respective text transcription of the recorded utterances made by the respective patient P1, P2. Analogous to the method of detecting cognitive impairment, the text transcription may be processed to calculate a plurality of test variables associated with the respective neuropsychological test for which the detection model is to be trained.)
identifying classifications corresponding to cognitive function for the speech audio for the plurality of subjects; and (P0160, The audio data may be given to a clinical practitioner C, who may listen to the recorded utterances represented by the audio data. The clinical practitioner C may, based on the audio data, make diagnoses indicating whether the respective patient suffers from the cognitive impairment.; P0161, Each plurality of test variables may be associated with the respective diagnosis made by the clinical practitioner C. Each plurality of test variables may also be associated with the respective patient P1, P2.)
training a model using machine learning based on a training data set comprising the one or more metrics of speech and the classifications identified in the speech audio, thereby generating a machine learning predictive model configured to generate an evaluation of cognitive function based on speech. (P0162, By repeating the above process for a plurality of patients P1, P2, a collection of pluralities of test variables and a corresponding collection of diagnoses may be obtained and may serve as reference data for training the detection model. The collection of diagnoses may be taken as ground truth for the purpose of training the detection model. Using the collection of pluralities of test variables and the corresponding collection of diagnoses, the detection model may be trained.)
Regarding claim 22, Zaldua teaches claim 21 and further teaches:
wherein the evaluation of the cognitive function comprises detection or prediction of future cognitive decline. (P0044, Present disclosure may enable scalable and accurate screening of cognitive impairment, which may be useful for large-scale screening for early signs of cognitive impairment.; P0081, By dividing the final impairment probability into three ranges using two predetermined thresholds, the indication may indicate an absence of impairment or one of two degrees of impairment. It should be understood that a finer division of the ranges of final impairment probability may be used. That is, three or more predetermined thresholds may be defined and applied to the final impairment probability, so that the indication may indicate four or more possible outcomes corresponding to varying degrees of cognitive impairment or non-impairment.)
Regarding claim 23, Zaldua teaches claim 21 and further teaches:
wherein the evaluation of the cognitive function comprises a prediction or classification of normal cognition, early mild cognitive impairment, mild cognitive impairment, or dementia. (P0082, It may be useful, from a health policy point of view, to provide fast and effective screening of mild cognitive impairment or early stage dementia. In this case, it may be enough to use two predetermined thresholds resulting in three possible outcomes, namely 1) no cognitive impairment, 2) mild cognitive impairment, and 3) dementia.)
Regarding claim 24, Zaldua teaches claim 21 and further teaches:
wherein the one or more metrics of speech of the subject comprises a metric of semantic relevance, word count, ratio of unique words to total number of words (MATTR), pronoun-to-noun ratio, propositional density, number of pauses during an audio speech recording within the input signal, or any combination thereof. (P0089, Following list of test variables are found to be suitable: [Examples P0090-P0125].; P0118, Number of nouns.; P0119, Number of verbs.; P0120, Ratio nouns/pronouns.)
Regarding claim 25, Zaldua teaches claim 24 and further teaches:
wherein the metric of semantic relevance measures a degree of overlap between a content of a picture and a description of the picture detected from the speech in the input signal. (P0116, Image description test.; P0123, Number of correct F-words.; P0151, In an image description test, the predetermined visual information may include an image, and text-based instruction asking the patient to describe as many things as they can see in the image within a time limit.)
Regarding claim 26, Zaldua teaches claim 21 and further teaches:
configuring a computing device with executable instructions for analyzing the one or more metrics of speech using the machine learning predictive model to generate an evaluation of a cognitive function of a subject based on the input signal. (P0056, The plurality of test variables may be passed on to a trained detection model. The trained detection model, taking the plurality of test variables as input, may calculate an impairment possibility indicating a likelihood that the patient suffers from the cognitive impairment.; P0145, The calculation of the final impairment probability may be performed using the trained final detection model.; P0146, Detection model may be implemented using any suitable Machine Learning algorithm.)
Regarding claim 27, Zaldua teaches claim 26 and further teaches:
display an output comprising the evaluation. (P0186, Results of a method of the invention may be displayed to a user or stored in any suitable storage medium.)
Regarding claim 28, Zaldua teaches claim 26 and further teaches:
wherein the computing device is a desktop computer, a laptop, a smartphone, a tablet, or a smartwatch. (P0045, System comprising a mobile device and a data processing apparatus for carrying out the method of detecting cognitive impairment.; P0156, Mobile device may instead be a desktop computer.)
Regarding claim 29, Zaldua teaches claim 26 and further teaches:
wherein the configuring the computing device with executable instructions comprises providing a software application for installation on the computing device. (P0186, The present invention may be embodied in a non-transitory computer-readable storage medium that stores instructions to carry out a method of the invention. The present invention may be embodied in a computer system comprising one or more processors and memory or storage storing instructions to carry out a method of the invention.)
Regarding claim 30, Zaldua teaches claim 29 and further teaches:
wherein the computing device is a smartphone, a tablet, or a smartwatch; and wherein the software application is a mobile application. (P0045, System comprising a mobile device and a data processing apparatus for carrying out the method of detecting cognitive impairment. [A person of ordinary skill in the art will know a smartphone is a mobile device.]; P0028, Computer program product comprising instructions which, when the program is executed by one or more processors, cause the one or more processors to receive a selection of one or more neuropsychological tests from an operator, and, based on the selection, selectively carry out the above method.)
Regarding claim 31, Zaldua teaches claim 30 and further teaches:
wherein the mobile application is configured to prompt the subject to provide the input signal. (P0045, System comprising a mobile device and a data processing apparatus for carrying out the method of detecting cognitive impairment.; P0147, The patient may be asked to make utterances in accordance with different neuropsychological tests, which utterances may be recorded to generate the audio data used in subsequent processing. In order to prompt the patient to make these utterances, the present method may further comprise displaying predetermined visual information.)
Regarding claim 34, Zaldua teaches claim 1 and further teaches:
wherein to detect the one or more metrics of speech the signal processing circuitry executes computer-implemented algorithms that process the speech audio and transcript to identify linguistic and acoustic elements and to compute quantitative measures within the audio signal that relate to vocabulary including a ratio of unique words to total number of words (MATTR), a propositional density, a type-to-token ratio (TTR), and mean word length. (P0089, Following list of test variables are found to be suitable: [Examples P0090-P0125].; P0118, Number of nouns.; P0119, Number of verbs.; P0120, Ratio nouns/pronouns.; P0006, The language features in question are speaking rate, number of pause fillers (e.g. “ums” and “ahs”), the difficulty of words, or the parts of speech of words following the pause fillers.)
Regarding claim 35, Zaldua teaches claim 11 and further teaches:
wherein the predictors comprise at least one of: ratio of unique words to total number of words (MATTR), pronoun-to-noun ratio, propositional density, parse tree height, mean length of word, type-to-token ratio, proportion of relevant details correctly identified in a picture description, duration of pauses relative to total speaking duration, and word count. (P0089, Following list of test variables are found to be suitable: [Examples P0090-P0125].; P0118, Number of nouns.; P0119, Number of verbs.; P0120, Ratio nouns/pronouns.; P0006, The language features in question are speaking rate, number of pause fillers (e.g. “ums” and “ahs”), the difficulty of words, or the parts of speech of words following the pause fillers.)
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim 32 is rejected under 35 U.S.C. 103 as being unpatentable over Zaldua in view of Senior et al. (U.S. PG Pub No. 20150039301), hereinafter Senior.
Regarding claim 32, Zaldua teaches claim 21.
Zaldua further teaches:
wherein the input signal is processed by one or more machine learning models to generate the one or more metrics of speech; wherein the machine learning predictive model is configured to generate the evaluation of cognitive function as a composite metric based on the one or more metrics of speech. (P0052, The received audio data may then be processed by a speech-to-text engine to produce a text transcription of the recorded utterances.; P0055, the text transcription of the recorded utterances may be processed to calculate a plurality of test variables.; P0056, The plurality of test variables may be passed on to a trained detection model. The trained detection model, taking the plurality of test variables as input, may calculate an impairment possibility indicating a likelihood that the patient suffers from the cognitive impairment.; P0145, The calculation of the final impairment probability may be performed using the trained final detection model.; P0146, Detection model may be implemented using any suitable Machine Learning algorithm.)
Zaldua does not specifically teach:
wherein the input signal is processed by one or more machine learning models to generate the one or more metrics of speech; wherein the machine learning predictive model is configured to generate the evaluation of cognitive function as a composite metric based on the one or more metrics of speech.
Senior, however, teaches:
wherein the input signal is processed by one or more machine learning models to generate the one or more metrics of speech; wherein the machine learning predictive model is configured to generate the evaluation of cognitive function as a composite metric based on the one or more metrics of speech. (P0017, FIG. 1 is a block diagram that illustrates an example of a system for speech recognition using neural networks. … computing system uses output from the neural network 140 to identify a transcription for the utterance.)
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to process input speech by one or more machine learning models. It would have been obvious to combine the references because the use of a neural network improves speech recognition by measuring audio characteristics that are independent of what words were spoken to better isolate or filter out the distracting audio characteristics and better identify words or components of words that were spoken. (Senior P0004)
Claim 33 is rejected under 35 U.S.C. 103 as being unpatentable over Zaldua in view of "Cognitive impairment screening using m-health: an android implementation of the mini-mental state examination (MMSE) using speech recognition" by Devos et al., hereinafter Devos.
Regarding claim 33, Zaldua teaches claim 1.
Zaldua does not specifically teach:
wherein the signal processing circuitry is further configured to iteratively determine a semantic relevance score for the subject across multiple timepoints to track longitudinal changes in language of the subject as an indicator of cognitive decline progression.
Devos, however, teaches:
wherein the signal processing circuitry is further configured to iteratively determine a semantic relevance score for the subject across multiple timepoints to track longitudinal changes in language of the subject as an indicator of cognitive decline progression. (Introduction, Assessing the cognitive impairment of a person on a periodic basis. … A score is linked to these results and an indication of the person’s cognitive impairment (for instance, the state of dementia, if appropriate) is given.)
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to track cognitive decline over time. It would have been obvious to combine the references because assessing the cognitive impairment of a person on a periodic basis is needed to adapt the care to the level of impairment. (Devos, Introduction)
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Vairavan et al. (U.S. PG Pub No. 20200327882): System and method for detecting cognitive decline using speech analysis.
Kim et al. (U.S. Patent No. 11759145): Technique for identifying dementia based on voice data.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DANIEL WONSUK CHUNG whose telephone number is (571)272-1345. The examiner can normally be reached Monday - Friday (7am-4pm)[PT].
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, PIERRE-LOUIS DESIR can be reached at (571)272-7799. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/DANIEL W CHUNG/Examiner, Art Unit 2659
/PIERRE LOUIS DESIR/Supervisory Patent Examiner, Art Unit 2659