DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention, considering all claim elements both individually and in combination as a whole, does not amount to significantly more than a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea).

Claim 1 is a claim to a process, machine, manufacture, or composition of matter and therefore meets one of the categorical limitations of 35 U.S.C. 101. However, claim 1 meets the first prong of the Step 2A analysis because it is directed to an abstract idea, as evidenced by the claim language of “obtaining one or more linguistic representations that each encode a sub-word, word, or multiple word sequence of the audio speech data;”, “obtaining one or more audio representations that each encode audio content of a segment of the audio speech data;”, “combining the linguistic representations and audio representations into an input sequence comprising: linguistic representations of a sequence of one or more words or sub-words of the audio speech data; and audio representations of segments of the audio speech data, where the segments together contain the sequence of the one or more words or sub-words;”, and “training a machine learning model using unsupervised learning to learn combined audio-linguistic representations of the input sequence for use in speech analysis for monitoring or diagnosis of a health condition.”. This claim language, under the broadest reasonable interpretation, encompasses subject matter that may be performed by a human using mental steps, or with pen and paper, involving basic critical thinking, which are types of activities that the courts have found to represent abstract ideas (e.g., the mental comparison in Ambry Genetics, or diagnosing an abnormal condition by performing clinical tests and thinking about the results in Grams).

The claim language also meets prong 2 of the Step 2A analysis because the above-recited claim language does not integrate the abstract idea into a practical application. The disclosed technologies do not improve a technical field (see MPEP 2106.05(a)), effect a particular treatment for a disease or medical condition (see MPEP 2106.04(d)(2)), effect a transformation or reduction of a particular article to a different state or thing (see MPEP 2106.04(d)(2)), apply the judicial exception with, or by use of, a particular machine (see MPEP 2106.05(b)), or apply the judicial exception in some meaningful way beyond generally linking the use of the abstract idea to a particular technological environment (see MPEP 2106.04(d)(2) and 2106.05(e)). As a result, Step 2A is satisfied and the second step, Step 2B, must be considered.

With regard to the second step, the claim does not appear to recite additional elements that amount to significantly more. Regarding claims 1 and 20, a generic computer structure such as a “processor” is not significantly more according to Alice v. CLS.
Therefore, these elements do not add significantly more, and thus the claim as a whole does not amount to significantly more than a judicial exception. Additionally, the ordered combination of elements does not add anything significantly more to the claimed subject matter. Specifically, the ordered combination of elements does not have any function that is not already supplied by each element individually; that is, the whole is not greater than the sum of its parts. In view of the above, independent claim 1 fails to recite patent-eligible subject matter under 35 U.S.C. 101. Dependent claims 2-12 and 20 fail to cure the deficiencies of independent claim 1 by merely reciting additional abstract ideas, further limitations on abstract ideas already recited, and/or additional elements that are not significantly more. Thus, claims 1-12 and 20 are rejected under 35 U.S.C. 101.

Claim 13 is a claim to a process, machine, manufacture, or composition of matter and therefore meets one of the categorical limitations of 35 U.S.C. 101. However, claim 13 meets the first prong of the Step 2A analysis because it is directed to an abstract idea, as evidenced by the claim language of “obtaining one or more linguistic representations that each encode a sub-word, word, or multiple word sequence of the audio speech data;”, “obtaining one or more audio representations that each encode audio content of a segment of the audio speech data;”, “combining the linguistic representations and audio representations into an input sequence comprising: linguistic representations of a sequence of one or more words or sub-words of the audio speech data; and audio representations of segments of the audio speech data, where the segments together contain the sequence of the one or more words or sub-words;”, and “inputting the input sequence into a machine learning model trained to map the input sequence to combined audio-linguistic representations of the audio speech data to provide an output associated with a health monitoring or diagnosis task.”. This claim language, under the broadest reasonable interpretation, encompasses subject matter that may be performed by a human using mental steps, or with pen and paper, involving basic critical thinking, which are types of activities that the courts have found to represent abstract ideas (e.g., the mental comparison in Ambry Genetics, or diagnosing an abnormal condition by performing clinical tests and thinking about the results in Grams).

The claim language also meets prong 2 of the Step 2A analysis because the above-recited claim language does not integrate the abstract idea into a practical application. The disclosed technologies do not improve a technical field (see MPEP 2106.05(a)), effect a particular treatment for a disease or medical condition (see MPEP 2106.04(d)(2)), effect a transformation or reduction of a particular article to a different state or thing (see MPEP 2106.04(d)(2)), apply the judicial exception with, or by use of, a particular machine (see MPEP 2106.05(b)), or apply the judicial exception in some meaningful way beyond generally linking the use of the abstract idea to a particular technological environment (see MPEP 2106.04(d)(2) and 2106.05(e)). As a result, Step 2A is satisfied and the second step, Step 2B, must be considered.

With regard to the second step, the claim does not appear to recite additional elements that amount to significantly more.
Regarding claims 13 and 19, a generic computer structure such as a “processor” is not significantly more according to Alice v. CLS. Therefore, these elements do not add significantly more, and thus the claim as a whole does not amount to significantly more than a judicial exception. Additionally, the ordered combination of elements does not add anything significantly more to the claimed subject matter. Specifically, the ordered combination of elements does not have any function that is not already supplied by each element individually; that is, the whole is not greater than the sum of its parts. In view of the above, independent claim 13 fails to recite patent-eligible subject matter under 35 U.S.C. 101. Dependent claims 14-19 fail to cure the deficiencies of independent claim 13 by merely reciting additional abstract ideas, further limitations on abstract ideas already recited, and/or additional elements that are not significantly more. Thus, claims 13-19 are rejected under 35 U.S.C. 101.

Claim Rejections - 35 USC § 102

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1 and 4-20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Georgiou et al. (US 20200335092 A1), hereinafter Georgiou.
Regarding claim 1, Georgiou discloses a computer-implemented method of training a machine learning model for performing speech analysis for monitoring or diagnosis of a health condition ([0002]: “fusion in machine learning applications, including sentiment analysis applications from signals including speech signals.”; [0030]: “is trained using any of a variety of neural network weight estimation techniques”), the method using training data comprising audio speech data ([0002]: “speech signals”), the method comprising: obtaining one or more linguistic representations that each encode a sub-word, word, or multiple word sequence of the audio speech data ([0027]: “For example, the input in mode 1 is text input, for example, represented by words, or alternatively tokenized representations derived from the words or subword unit”); obtaining one or more audio representations that each encode audio content of a segment of the audio speech data ([0027]: “Mode 2 is speech input, represented as an acoustic waveform, spectrogram, of other signal analysis of the audio signal”); combining the linguistic representations and audio representations into an input sequence ([0006]: “A fused output is generated based on the output of a series of fused processing stages”; Fig. 1 and Fig. 2, fused modes comprising the audio and text input) comprising: linguistic representations of a sequence of one or more words or sub-words of the audio speech data (Fig. 1, elements 111 and 112; Fig. 2, word and sentence level fusion); and audio representations of segments of the audio speech data, where the segments together contain the sequence of the one or more words or sub-words (Fig. 1, elements 121 and 122; Fig. 2, word and sentence level fusion); the method further comprising: training a machine learning model using unsupervised learning to learn combined audio-linguistic representations of the input sequence for use in speech analysis for monitoring or diagnosis of a health condition ([0063]: “These layers can be trained in an unsupervised manner to learn hierarchical representations as is customary in deep learning models.”; [0052]: “Estimating and tracking human emotional states and behaviors from human generated signals”, wherein monitoring emotion is correlated with an affective disorder or psychological state, consistent with applicant's specification, page 44, line 18).

Regarding claim 4, Georgiou further discloses wherein combining the linguistic representations and audio representations comprises: forming a linguistic sequence comprising linguistic representations of a sequence of one or more words or sub-words of the audio speech data ([0027]: “For example, the input in mode 1 is text input, for example, represented by words, or alternatively tokenized representations derived from the words or subword unit”); forming an audio sequence comprising audio representations of segments of the audio speech data, where the segments together contain the sequence of the one or more words or sub-words; and combining the linguistic sequence and audio sequence by one or more of: concatenating the linguistic sequence and audio sequence along any dimension; summing the linguistic sequence and audio sequence; performing a linear or non-linear transformation on one or both of the audio sequence and linguistic sequence ([0029]: “The second mode is an audio mode such that mode 2 stage 1 (121) processes audio samples, or alternatively spectrogram representations or other features extracted from the audio samples, in a manner aligned with the text tokens processed in the mode 1 stage 1 processor”); and combining the linguistic sequence and audio sequence by inputting to an initial neural network layer (Fig. 1, fused mode) by one or more of: concatenating the linguistic sequence and audio sequence along any dimension ([0015]: “The fusion of the information signal representations can be achieved via a variety of techniques including but not limited to concatenation, averaging, pooling, conditioning, product, transformation by forward or recursive neural network layers.”); summing the linguistic sequence and audio sequence; performing a linear or non-linear transformation on one or both of the audio sequence and linguistic sequence; and combining the linguistic sequence and audio sequence by inputting to an initial neural network layer.

Regarding claim 5, Georgiou further discloses wherein combining the linguistic sequence and audio sequence comprises: training a neural network layer to align the audio sequence with the linguistic sequence by, for each linguistic representation, selecting one or more relevant audio representations using temporal alignment information, where the model obtains the temporal alignment information from the audio sequence by determining the time delays between the linguistic representation and each audio representation ([0029]: “The second mode is an audio mode such that mode 2 stage 1 (121) processes audio samples, or alternatively spectrogram representations or other features extracted from the audio samples, in a manner aligned with the text tokens processed in the mode 1 stage 1 processor”; Fig. 2).

Regarding claim 6, Georgiou further discloses wherein training the machine learning model comprises: pre-training the machine learning model using unsupervised learning on a first training data set to learn combined audio-linguistic representations of the input sequence ([0063]: “These layers can be trained in an unsupervised manner to learn hierarchical representations as is customary in deep learning models.”); and adding a task-specific network layer after the pre-trained machine learning model and performing task-specific training using a second training data set comprising task-specific training data, associated with a specific health monitoring or diagnosis task (Fig. 2: “high level fusion”).

Regarding claim 7, Georgiou further discloses training the pre-trained machine learning model and the task-specific layer together to map an input sequence to a target output associated with a health condition ([0052]: “Estimating and tracking human emotional states and behaviors from human generated signals”).

Regarding claim 8, Georgiou further discloses wherein the health condition is related to one or more of a cognitive or neurodegenerative disease, motor disorder, affective disorder, neurobehavioral condition, head injury or stroke ([0052]: “Estimating and tracking human emotional states and behaviors from human generated signals”, wherein emotional states may be considered a neurobehavioral condition).

Regarding claim 9, Georgiou further discloses wherein the linguistic representations each encode a character or phoneme of the audio speech data ([0022]: “correspond to phone, syllable, word, and utterance levels”).
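As a purely illustrative aid (this sketch is not part of the record, the claims, or Georgiou; all array shapes and values below are assumed), the combining operations recited in claim 4, such as concatenation along a dimension, summation, or a linear transformation of aligned linguistic and audio sequences, may be pictured as follows:

```python
# Illustrative sketch only; shapes and values are assumed, not taken from the
# application or from Georgiou. It shows combining a linguistic sequence and a
# time-aligned audio sequence by concatenation, summation, or a linear
# transformation, as recited as alternatives in claim 4.
import numpy as np

T, d_ling, d_audio = 8, 16, 16           # 8 word positions, assumed feature sizes
ling_seq = np.random.randn(T, d_ling)    # linguistic representations (one per word)
audio_seq = np.random.randn(T, d_audio)  # audio representations aligned to the same words

# Option 1: concatenate along the feature dimension.
combined_concat = np.concatenate([ling_seq, audio_seq], axis=-1)   # shape (8, 32)

# Option 2: sum the two sequences (requires equal dimensionality).
combined_sum = ling_seq + audio_seq                                # shape (8, 16)

# Option 3: linearly transform one sequence before combining.
W = np.random.randn(d_audio, d_ling)
combined_proj = ling_seq + audio_seq @ W                           # shape (8, 16)

print(combined_concat.shape, combined_sum.shape, combined_proj.shape)
```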
Regarding claim 10, Georgiou further discloses wherein the audio representations comprise prosodic representations that each encode non-linguistic content of a segment of the audio speech data ([0043]: “As for the acoustic input, useful features such as MFCCs, pitch tracking and voiced/unvoiced segmenting [6] are used.”, wherein pitch is a prosodic representation).

Regarding claim 11, Georgiou further discloses wherein obtaining a prosodic representation comprises inputting a segment of audio data into a prosody encoder trained to map an audio speech data segment to a prosodic representation encoding non-linguistic content of the audio speech data segment (Fig. 2, audio encoder); wherein the prosody encoder is trained by: training a sequence-to-sequence autoencoder comprising an encoder for mapping input audio data to a reduced dimension representation and a decoder for reconstructing the input audio data from the reduced dimension representation ([0043]: “As for the acoustic input, useful features such as MFCCs, pitch tracking and voiced/unvoiced segmenting [6] are used. All acoustic features (72-dimensional vectors) are provided by mmsdk-tool, which uses COVAREP [7] framework”); conditioning the autoencoder by providing information on the linguistic content of the audio data during training such that the autoencoder learns representations which encode the non-linguistic content of the input audio data ([0043]: “Word-alignment is also performed with mmsdk tool through P2FA [8] to get the exact time-stamp for every word. The alignment is completed by obtaining the average acoustic vector over every spoken word.”, wherein the data is processed in relation to the linguistic content); and using the trained encoder of the autoencoder as the prosody encoder (Fig. 2).

Regarding claim 12, Georgiou further discloses wherein each linguistic representation comprises a text token indicating a subword, word or multi-word sequence from a fixed-size unified vocabulary and wherein each audio representation comprises an audio token indicating a vector quantized audio representation encoding the audio content of a segment of audio data containing one or more words or subwords ([0027]: “tokenized representations derived from the words or subword units, with the text in some examples being derived by speech recognition from an audio input”); wherein together the text tokens and audio tokens form a fixed-size audio-linguistic vocabulary, such that any input segment of audio speech data can be represented by a sequence of text tokens and audio tokens ([0027]).
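As a purely illustrative aid (the codebook, vocabulary sizes, and dimensions below are assumed and are not taken from the application or from Georgiou), the fixed-size audio-linguistic vocabulary of claim 12, in which vector-quantized audio tokens and text tokens share one vocabulary, may be sketched as follows:

```python
# Illustrative sketch only; the codebook, vocabulary sizes, and feature
# dimensions are assumed for illustration. It shows how vector-quantizing audio
# frames against a fixed codebook yields audio tokens that, together with text
# tokens, form a fixed-size audio-linguistic vocabulary (cf. claim 12).
import numpy as np

rng = np.random.default_rng(0)
text_vocab_size = 1000                             # assumed fixed-size text (sub)word vocabulary
audio_codebook = rng.standard_normal((256, 32))    # 256 assumed audio code vectors of dimension 32

def audio_to_tokens(frames):
    """Map each audio feature frame to the index of its nearest codebook entry."""
    dists = np.linalg.norm(frames[:, None, :] - audio_codebook[None, :, :], axis=-1)
    # Offset by the text vocabulary size so text and audio tokens share one vocabulary.
    return text_vocab_size + np.argmin(dists, axis=1)

frames = rng.standard_normal((5, 32))              # 5 audio segments containing the spoken words
audio_tokens = audio_to_tokens(frames)
text_tokens = np.array([17, 452, 9, 301, 88])      # assumed text token ids for the same words

input_sequence = np.concatenate([text_tokens, audio_tokens])
print(input_sequence)   # one sequence drawn from a fixed-size unified vocabulary
```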
Regarding claim 13, Georgiou discloses a computer-implemented method of training a machine learning model for performing speech analysis for monitoring or diagnosis of a health condition ([0002]: “fusion in machine learning applications, including sentiment analysis applications from signals including speech signals.”; [0030]: “is trained using any of a variety of neural network weight estimation techniques”), the method using training data comprising audio speech data ([0002]: “speech signals”), the method comprising: obtaining one or more linguistic representations that each encode a sub-word, word, or multiple word sequence of the audio speech data ([0027]: “For example, the input in mode 1 is text input, for example, represented by words, or alternatively tokenized representations derived from the words or subword unit”); obtaining one or more audio representations that each encode audio content of a segment of the audio speech data ([0027]: “Mode 2 is speech input, represented as an acoustic waveform, spectrogram, of other signal analysis of the audio signal”); combining the linguistic representations and audio representations into an input sequence ([0006]: “A fused output is generated based on the output of a series of fused processing stages”; Fig. 1 and Fig. 2, fused modes comprising the audio and text input) comprising: linguistic representations of a sequence of one or more words or sub-words of the audio speech data (Fig. 1, elements 111 and 112; Fig. 2, word and sentence level fusion); and audio representations of segments of the audio speech data, where the segments together contain the sequence of the one or more words or sub-words (Fig. 1, elements 121 and 122; Fig. 2, word and sentence level fusion); and inputting the input sequence into a machine learning model trained to map the input sequence to combined audio-linguistic representations of the audio speech data to provide an output associated with a health monitoring or diagnosis task ([0063]: “These layers can be trained in an unsupervised manner to learn hierarchical representations as is customary in deep learning models.”; [0052]: “Estimating and tracking human emotional states and behaviors from human generated signals”, wherein monitoring emotion is correlated with an affective disorder or psychological state, consistent with applicant's specification, page 44, line 18).
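As a purely illustrative aid (the architecture, sizes, and choice of a masked-prediction objective below are assumptions; masking is only one unsupervised strategy, and is recited separately in claim 3), training a model without labels to learn combined audio-linguistic representations of such a token sequence may be sketched as follows:

```python
# Illustrative sketch only; the embedding sizes, vocabulary, and the
# masked-prediction objective are assumptions made for illustration, not the
# applicant's or Georgiou's specific implementation.
import torch
import torch.nn as nn

vocab_size, d_model, mask_id = 1256, 64, 0       # unified text+audio vocabulary, assumed sizes
embed = nn.Embedding(vocab_size, d_model)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True), num_layers=2)
to_vocab = nn.Linear(d_model, vocab_size)

tokens = torch.randint(1, vocab_size, (1, 10))   # one input sequence of text and audio tokens
masked = tokens.clone()
masked[0, 3] = mask_id                           # withhold one position

logits = to_vocab(encoder(embed(masked)))        # combined audio-linguistic representations -> predictions
loss = nn.functional.cross_entropy(logits[0, 3][None, :], tokens[0, 3][None])
loss.backward()                                  # one unsupervised training step (optimizer omitted)
print(float(loss))
```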
Regarding claim 14, Georgiou further discloses a computer-implemented method of training a machine learning model for performing speech analysis for monitoring or diagnosis of a health condition ([0002]: “fusion in machine learning applications, including sentiment analysis applications from signals including speech signals.”; [0030]: “is trained using any of a variety of neural network weight estimation techniques”), the method using training data comprising audio speech data ([0002]: “speech signals”), the method comprising: obtaining one or more linguistic representations that each encode a sub-word, word, or multiple word sequence of the audio speech data ([0027]: “For example, the input in mode 1 is text input, for example, represented by words, or alternatively tokenized representations derived from the words or subword unit”); obtaining one or more audio representations that each encode audio content of a segment of the audio speech data ([0027]: “Mode 2 is speech input, represented as an acoustic waveform, spectrogram, of other signal analysis of the audio signal”); combining the linguistic representations and audio representations into an input sequence ([0006]: “A fused output is generated based on the output of a series of fused processing stages”; Fig. 1 and Fig. 2, fused modes comprising the audio and text input) comprising: linguistic representations of a sequence of one or more words or sub-words of the audio speech data (Fig. 1, elements 111 and 112; Fig. 2, word and sentence level fusion); and audio representations of segments of the audio speech data, where the segments together contain the sequence of the one or more words or sub-words (Fig. 1, elements 121 and 122; Fig. 2, word and sentence level fusion); the method further comprising: training a machine learning model using unsupervised learning to learn combined audio-linguistic representations of the input sequence for use in speech analysis for monitoring or diagnosis of a health condition ([0063]: “These layers can be trained in an unsupervised manner to learn hierarchical representations as is customary in deep learning models.”; [0052]: “Estimating and tracking human emotional states and behaviors from human generated signals”).

Regarding claim 15, Georgiou further discloses wherein the machine learning model comprises a pre-trained multimodal sequence encoder which maps the input sequence to combined audio-linguistic representations of the audio speech data and a task-specific decoder which maps the combined audio-linguistic representations to an output associated with a health monitoring or diagnosis task (Fig. 2, encoders; [0030]: “each of the modes is first pretrained independently,”).

Regarding claim 16, Georgiou further discloses wherein the pre-trained multimodal sequence encoder comprises a transformer encoder ([0011]: “transformation of the same modality”).
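As a purely illustrative aid (the pooling, head, and binary task below are assumptions, not the applicant's or Georgiou's design), a task-specific decoder placed after a pre-trained transformer sequence encoder, as recited in claims 15 and 16, may be sketched as follows:

```python
# Illustrative sketch only; shapes, pooling, and the binary health task are
# assumed. It shows a task-specific decoder after a pre-trained sequence
# encoder (cf. claims 15-16): the encoder output is pooled and mapped to a
# score for a health monitoring or diagnosis task.
import torch
import torch.nn as nn

d_model = 64
encoder_layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
pretrained_encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)   # stands in for a pre-trained encoder
task_head = nn.Linear(d_model, 1)                                         # task-specific decoder (assumed binary task)

input_sequence = torch.randn(1, 10, d_model)            # combined audio-linguistic input representations
combined = pretrained_encoder(input_sequence)           # combined audio-linguistic representations
score = torch.sigmoid(task_head(combined.mean(dim=1)))  # pooled representation -> task output
print(float(score))                                     # e.g., probability associated with the health task
```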
Regarding claim 17, Georgiou further discloses wherein the audio representations comprise prosodic representations that each encode non-linguistic content of a segment of the audio speech data ([0043]: “As for the acoustic input, useful features such as MFCCs, pitch tracking and voiced/unvoiced segmenting [6] are used.”, wherein pitch is a prosodic representation).

Regarding claim 18, Georgiou further discloses wherein obtaining a prosodic representation comprises inputting a segment of audio data into a prosody encoder trained to map an audio speech data segment to a prosodic representation encoding non-linguistic content of the audio speech data segment (Fig. 2, audio encoder); wherein the prosody encoder is trained by: training a sequence-to-sequence autoencoder comprising an encoder for mapping input audio data to a reduced dimension representation and a decoder for reconstructing the input audio data from the reduced dimension representation ([0043]: “As for the acoustic input, useful features such as MFCCs, pitch tracking and voiced/unvoiced segmenting [6] are used. All acoustic features (72-dimensional vectors) are provided by mmsdk-tool, which uses COVAREP [7] framework”); conditioning the autoencoder by providing information on the linguistic content of the audio data during training such that the autoencoder learns representations which encode the non-linguistic content of the input audio data ([0043]: “Word-alignment is also performed with mmsdk tool through P2FA [8] to get the exact time-stamp for every word. The alignment is completed by obtaining the average acoustic vector over every spoken word.”, wherein the data is processed in relation to the linguistic content); and using the trained encoder of the autoencoder as the prosody encoder (Fig. 2).

Regarding claim 19, Georgiou further discloses a system for performing a speech processing task comprising data processing means configured to perform the method of claim 13 ([0068]: “may be implemented in software, with processor instructions being stored on a non-transitory machine-readable medium and executed by one or more processing systems.”).

Regarding claim 20, Georgiou further discloses a system for performing a speech processing task comprising data processing means configured to perform the method of claim 1 ([0068]: “may be implemented in software, with processor instructions being stored on a non-transitory machine-readable medium and executed by one or more processing systems.”).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim 2 is rejected under 35 U.S.C. 103 as being unpatentable over Georgiou in view of Trask et al. (US 20160247061 A1), hereinafter Trask.
Regarding claim 2, Georgiou discloses the method of claim 1 but fails to disclose wherein training the machine learning model using unsupervised learning comprises training the machine learning model to predict a withheld part or property of the input sequence or audio speech data. Trask discloses a machine learning model including training the machine learning model to predict a withheld part or property of the input sequence or audio speech data ([0031]: “the neural network is configured to take the ordered list of linguistic units, with a linguistic unit omitted, and predict the omitted linguistic unit. This omitted linguistic unit is referred to as a ‘focus term.’ For example, FIG. 3 depicts a neural network analyzing the phrase ‘SEE SPOT RUN.’ The input nodes for ‘SEE’ and ‘RUN’ are activated, to predict the focus term ‘SPOT.’ The neural network then appropriately predicts that the missing word is ‘SPOT’ by returning a 100% at the ‘SPOT’ output node.”). It would have been obvious to a person of ordinary skill in the art prior to the effective filing date to modify the method disclosed by Georgiou with the method of predicting a withheld property disclosed by Trask in order to improve the accuracy of speech prediction.

Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Georgiou in view of Larson et al. (“Telephonetic: Making Neural Language Models Robust to ASR and Semantic Noise”), hereinafter Larson.

Regarding claim 3, Georgiou discloses the method of claim 1 but fails to disclose wherein training the machine learning model comprises masking or corrupting one or more of the linguistic and/or audio representations in an input sequence and training the machine learning model to predict the masked or corrupted linguistic and/or audio representations. Larson discloses a machine learning model (title) wherein training the machine learning model comprises masking or corrupting one or more of the linguistic and/or audio representations in an input sequence and training the machine learning model to predict the masked or corrupted linguistic and/or audio representations (page 1, col. 1, para. 2: “Letting w_i denote the i-th word in the text sequence w = [w_0, ..., w_{T-1}] of length T, the masked LM training strategy is to randomly replace a word w_i with a mask token and then attempt to predict the masked word.”). It would have been obvious to a person of ordinary skill in the art prior to the effective filing date to modify the method disclosed by Georgiou with the masking disclosed by Larson in order to improve the accuracy of the trained model (Larson, page 1, col. 1, para. 1: “used to fine-tune language models on written text data so that they are better equipped to handle ASR errors”).

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Gosztolya et al. (US 20220039741 A1) – detection of neurocognitive impairment based on speech samples.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to KAVYA SHOBANA BALAJI whose telephone number is (703) 756-5368. The examiner can normally be reached Monday - Friday, 8:30 - 5:30 ET.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jaqueline Cheng, can be reached at 571-272-5596. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/KAVYA SHOBANA BALAJI/
Examiner, Art Unit 3791

/DANIEL L CERIONI/
Primary Examiner, Art Unit 3791