Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
The present Office Action is in response to the Request for Continued Examination dated 27 January 2026.
Request for Continued Examination
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 27 January 2026 has been entered.
DETAILED ACTION
In the RCE filed 27 January 2026:
Claims 2, 8-13, 16-17, 19-20 are cancelled
Claims 21-30 are new
Claims 1, 3, 14 and 18 are amended
Claims 1, 3-7, 14-15, 18 and 21-30 are pending.
Information Disclosure Statement
The Information Disclosure Statement(s) (IDS) submitted on 15 April 2026 is/are in compliance with the provisions of 37 CFR 1.97 and has/have been fully considered by the Examiner.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1, 3-7, 14-15, 18 and 21-30 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more.
Claims 1, 14, 18 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1
The claims recite methods and a system, which fall within statutory categories.
Step 2A1
Claims 1, 14 and 18 (Claim 14 being representative)
The limitations of:
with at least a portion of the processed training data, having a depth of at least 2 and not more than 7, wherein the gradient-boosted tree machine-learning model comprises at least 200 weighted decision trees and not more than 600 weighted decision trees, wherein each decision tree of the weighted decision trees is assigned a probability-weighted association with one or more classifications of data in a dataset,
evaluate the data associated with the subject by transforming the data associated with the subject into discrete numerical vectors, wherein the discrete numerical vectors are input to the ASD model, and operates on the discrete numerical vectors, wherein the data associated with the subject that is evaluated consists of two or more of demographic data, comorbidity data, observational assessment and interview data, and medication data, is configured to:
wherein the subject corresponds to a leaf node of a weighted decision tree, wherein the model follows an optimal branch direction from the node, wherein the model learned the optimal branch direction during model training, wherein the weighted decision trees provide for one or more probability-weighted associations between inputs and outputs, wherein the model outputs a score and compares the score to a tunable threshold selected to provide a sensitivity of at least about 0.75 for ASD versus non-ASD classification, wherein evaluation of the data associated with the subject yields an evaluation result, wherein the evaluation result indicates the presence or absence of the ASD,
as drafted, is a process that under broadest reasonable interpretation covers a mathematical concept that includes mathematical relationships, mathematical formulas or equations, and mathematical calculations but for the recitation of generic computer component language. See Specification, e.g., at para. 0022. That is, other than reciting the generic computer component language, the claim recites training and evaluating a machine learning model that encompasses a mathematical concept. For example, but for the generic computer component language, the claim encompasses diagnosing various mental disorders. The Examiner notes that the mathematical concept need not be expressed in mathematical symbols. MPEP § 2106.04(a)(2)(I). If a claim limitation, under its broadest reasonable interpretation, encompasses a mathematical concept but for the recitation of generic computer component language, then it falls within the “Mathematical Concepts” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
The limitations of:
Claims 1, 14 and 18 (Claim 14 being representative)
acquiring training data, wherein the acquired training data comprises structured data associated with a plurality of subjects including at least about 50% data associated with non-ASD subjects;
subjecting the training data to exploratory data analysis;
filtering the training data to remove subjects with data having a high degree of missing values, subjects with data having outlier values that deviate from a statistical distribution of the training data, or both;
and processing the filtered training data to input missing values, remove outliers, remove unidentified characters, remove highly correlated features, remove features with a high degree of missing values, remove features that are not important, or combinations thereof;
receive data associated with a subject, at least a portion of the data associated with the subject being structured data;
(i) evaluate the data associated with the subject to determine the presence or absence of an ASD and
(ii) only in response to a determination of the presence of an ASD, perform a machine-learning multiclassification to classify the ASD,
as drafted, is a process that, under the broadest reasonable interpretation, covers certain methods of organizing human activity (i.e., managing personal behavior including following rules or instructions) but for recitation of generic computer components. The claims encompass a series of rules or instructions for a person or persons to follow, with or without the aid of a computer, to diagnose mental disorders in the manner described in the identified abstract idea, supra. The rules or instructions are the claimed steps of “receiving, acquiring, filtering, processing, training, and evaluating” as indicated supra.
Other than reciting generic computer components (discussed infra), i.e., a system implemented by a data processor (computer), the claimed invention amounts to managing personal behavior or interaction between people. If a claim limitation, under its broadest reasonable interpretation, covers managing personal behavior or interactions between people but for the recitation of generic computer components, then it falls within the “certain methods of organizing human activity” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
The claim further recites “generating training data for a machine learning model.” When given its broadest reasonable interpretation in light of the disclosure, the generation of training data for a machine learning model for diagnosing mental disorders using gradient-boosted tree learning represents the creation of mathematical interrelationships between data. As such, the generation of training data for the machine learning model represents a mathematical concept that is interpreted to be part of the identified abstract idea, supra. The types of identified abstract ideas are considered together as a single abstract idea for analysis purposes.
The Examiner notes that the training of a machine learning model is recited in the claim. The type of training utilized by the claimed invention is not described by the Applicant. As such, the Examiner is required to analyze the training step given the broadest reasonable interpretation. The step(s) performed to train the model/algorithm is/are considered to be part of the abstract idea because it/they fall(s) under data manipulations that humans perform (i.e., fitting a model to data) and thus are interpreted to be part of the abstraction--the rules or instructions that fall under Certain Methods of Organizing Human Activity. When given its broadest reasonable interpretation in light of the disclosure, the training of a machine learning model by gradient-boosted tree learning represents the creation of mathematical interrelationships between data. See, e.g., Example 47, Claim 2. As such, the training of the machine learning model represents a mathematical concept that is interpreted to be part of the identified abstract idea, supra. The types of identified abstract ideas are considered together as a single abstract idea for analysis purposes.
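For illustration only, the mathematical character of gradient-boosted tree training can be seen in a standard textbook formulation. The notation below is an assumption introduced here and is not taken from the specification or the claims: L is a differentiable loss, h_m is the m-th regression tree, ν is a learning rate, M is the number of trees, σ is the logistic function, and τ is a score threshold.

```latex
\begin{aligned}
F_0(x) &= \arg\min_{c} \sum_{i=1}^{n} L(y_i, c), \\
r_{im} &= -\left[\frac{\partial L(y_i, F(x_i))}{\partial F(x_i)}\right]_{F = F_{m-1}}, \qquad i = 1, \dots, n, \\
h_m &= \arg\min_{h} \sum_{i=1}^{n} \bigl(r_{im} - h(x_i)\bigr)^2, \\
F_m(x) &= F_{m-1}(x) + \nu\, h_m(x), \qquad m = 1, \dots, M, \\
\hat{y}(x) &= \mathbb{1}\bigl[\sigma(F_M(x)) \ge \tau\bigr].
\end{aligned}
```

Under this formulation, each training round is a fit of a tree to residuals followed by an addition, and evaluation is a sum compared against a threshold.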
Step 2A2
This judicial exception is not integrated into a practical application. In particular, the claims recite the additional element of a computing system comprising a computing device, processor and non-transitory computer readable medium that implements the identified abstract idea. The computing system comprising a computing device, processor and non-transitory computer readable medium is not described by the applicant and is recited at a high level of generality (i.e., a generic computer performing generic computer functions) such that it amounts to no more than mere instructions to apply the exception using a generic computer component. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
The claim further recites the additional element of using a trained machine learning model to diagnose mental disorders. This represents mere instructions to implement the abstract idea on a generic computer. Implementing an abstract idea using a generic computer or components thereof does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. See, e.g., Recentive Analytics, Inc. v. Fox Corp., No. 2023-2437 at 10 (Fed. Cir. April 18, 2025) (finding that claims that do no more than apply established methods of machine learning to a new data environment are ineligible). Alternatively, or in addition, the implementation of the trained machine learning model to diagnose mental disorders merely confines the use of the abstract idea (i.e., the trained model) to a particular technological environment or field of use (gradient-boosted tree machine learning) and thus fails to add an inventive concept to the claims.
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using a computing system comprising a computing device, processor and non-transitory computer readable medium to perform the noted steps amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept (“significantly more”).
As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using the trained machine learning model to diagnose mental disorders was found to represent mere instructions to implement the abstract idea on a generic computer and/or confine the use of the abstract idea (i.e., the trained model) to a particular technological environment or field of use (gradient-boosted tree machine learning). This has been re-evaluated under the “significantly more” analysis and determined to be insufficient to provide significantly more. MPEP 2106.05(I) indicates that mere instructions to implement the abstract idea on a generic computer and/or confining the use of the abstract idea to a particular technological environment or field of use cannot provide significantly more. See also Recentive Analytics, Inc. v. Fox Corp., No. 2023-2437 at 17 (Fed. Cir. April 18, 2025) (finding that applying machine learning to an abstract idea does not transform a claim into something significantly more).
Claims 3-7, 15, and 21-30 are similarly rejected because they either further define/narrow the abstract idea and/or do not further limit the claim to a practical application or provide an inventive concept such that the claims are subject matter eligible even when considered individually or as an ordered combination.
Claim(s) 3, 15, 22 merely describe(s) the list of classifications of the ASD, which further defines the abstract idea.
Claim(s) 4 merely describe(s) the demographic data, which further defines the abstract idea.
Claim(s) 5 merely describe(s) the comorbidity data, which further defines the abstract idea.
Claim(s) 6 merely describe(s) the observational assessment, which further defines the abstract idea.
Claim(s) 7 merely describe(s) the medication data, which further defines the abstract idea.
Claim(s) 21, 27 merely describe(s) capability of the ASD model, which further defines the abstract idea.
Claim(s) 23 merely describe(s) identifying data, which further defines the abstract idea.
Claim(s) 24 merely describe(s) using data from multiple age groups, which further defines the abstract idea.
Claim(s) 25 merely describe(s) model sensitivity, which further defines the abstract idea.
Claim(s) 26 merely describe(s) model depth, which further defines the abstract idea.
Claim(s) 28 merely describe(s) outputting the evaluation result, which further defines the abstract idea.
Claim(s) 28 also includes the additional elements of “a display, remote device and database” which generally links the abstract idea to a particular technological environment or field of use. MPEP 2106.04(d)(I) and MPEP 2106.05(A) indicate that merely “generally linking” the abstract idea to a particular technological environment or field of use cannot provide a practical application or significantly more.
Claim(s) 29 merely describe(s) the confidence score, which further defines the abstract idea.
Claim(s) 30 merely describe(s) data associated with the subject, which further defines the abstract idea.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The Examiner notes that the rejection will reference the translated documents (attached) corresponding to any foreign documents recited in the rejection.
Claims 1, 3-6, 14-15, 18, 21-23, 25, and 27-30 is/are rejected under 35 U.S.C. 103(a) as being unpatentable over WALL et al. (US Publication No. 20210133509) in view of Feczko et al. (US Publication No. 20200219619) in view of Dawer et al. (“Generating Compact Tree Ensembles via Annealing”).
Regarding Claim 1
WALL teaches a method implemented via a computing device, the method comprising:
receiving, by a processor of the computing device, data associated with a subject, at least a portion of the data associated with the subject being structured data, wherein an autism spectrum disorder (ASD) model is associated with the computing device, wherein a non-transitory machine-readable medium of the computing device causes the processor to implement the ASD model, and wherein the ASD model has been trained via a method comprising [WALL at Para. 0287 teaches on a computer system having a processor and a memory storing a computer program for execution by the processor. The computer program may comprise instructions for: 1) receiving data of the subject related to the cognitive function attribute; 2) evaluating the data of the subject using a machine learning model; and 3) providing an evaluation for the subject. The evaluation may be selected from the group consisting of an inconclusive determination and a categorical determination in response to the data. The machine learning model may comprise a selected subset of a plurality of machine learning assessment models. The categorical determination may comprise a presence of the cognitive function attribute and an absence of the cognitive function attribute; WALL at Para. 0299 teaches the diagnostic module may comprise a diagnostic machine learning classifier trained on a subject population. The therapeutic module may comprise a therapeutic machine learning classifier trained on at least a portion of the subject population. The diagnostic module may be configured to provide feedback to the therapeutic module based on performance of the personal therapeutic treatment plan. The data from the subject may comprise at least one of the subject and caregiver video, audio, responses to questions or activities, and active or passive data streams from user interaction with activities, games or software features of the system. 
The subject may have a risk selected from the group consisting of a behavioral disorder, neurological disorder and mental health disorder. The behavioral, neurological or mental health disorder may be selected from the group consisting of autism, autistic spectrum, attention deficit disorder, depression, obsessive compulsive disorder, schizophrenia, Alzheimer's disease, dementia, attention deficit hyperactive disorder and speech and learning disability. The diagnostic module may be configured for an adult to perform an assessment or provide data for an assessment of a child or juvenile. The diagnostic module may be configured for a caregiver or family member to perform an assessment or provide data for an assessment of the subject]:
and training the ASD model with at least a portion of the processed training data, wherein training the ASD model comprises training a gradient-boosted tree machine-learning model on discrete numerical vectors obtained by transforming the training data [WALL at Para. 0632 teaches the machine learning algorithmic framework is GBDT (Gradient Boosted Decision Trees), which, upon training on the data in the training set, produces a set of automatically-created decision trees, each using some of the input features in the training set, and each producing a scalar output when run on new feature data pertaining to a new patient submission], … [ … ]
and evaluating, by the ASD model associated with the computing device, the data associated with the subject [WALL at Para. 0511 teaches these models or classifiers can be implemented in any of the systems or devices disclosed herein such as smartphones, mobile computing devices, or wearable devices], wherein the data associated with the subject that is evaluated via the ASD model consists of two or more of demographic data, comorbidity data, observational assessment and interview data, and medication data [WALL at Para. 0168 teaches data from the subject may comprise at least one of a sample of a diagnostic instrument, wherein the diagnostic instrument comprises a set of diagnostic questions and corresponding selectable answers, and demographic data], wherein evaluating the data associated with the subject comprises transforming the data into discrete numerical vectors [WALL at Para. 0407 teaches samples can be grouped according to subject-specific dimensions and sample weights can be computed and assigned to balance one group of samples against every other group of samples to mirror the expected distribution of subjects in an intended setting (dimensions interpreted as discrete numerical vectors)], wherein the ASD model is configured to evaluate the data associated with the subject to determine the presence or absence of an ASD by performing a machine-learning ASD versus non-ASD binary classification and [WALL at Para. 0401 teaches while the assessment model of the data processing module described with respect to FIGS. 9-10 was constructed and trained to classify subjects as having autism or no autism, a similar approach may be used to build an assessment model that can classify a subject as having one or more of a plurality of developmental disorders, as described herein], … [ … ] …, wherein the subject corresponds to a leaf node of a weighted decision tree of the gradient-boosted tree machine-learning model [WALL at Para. 
0350 teaches when the dataset being queried in the assessment model reaches a “leaf”, or a final prediction node with no further downstream splits, the output values of the leaf can be output as the votes for the particular decision tree], wherein the model follows an optimal branch direction from the node, wherein the model learned the optimal branch direction during model training, wherein the weighted decision trees provide for one or more probability-weighted associations between model inputs and outputs [ WALL at Para. 0348 teaches an ensemble of decision trees can be constructed using a random subset of features at each split or decision node. The Gini criterion may be employed to choose the best partition, wherein decision nodes having the lowest calculated Gini impurity index are selected. At prediction time, a “vote” can be taken over all of the decision trees, and the majority vote (or mode of the predicted classifications) can be output as the predicted classification; WALL at Para. 0350 teaches when the dataset being queried in the assessment model reaches a “leaf”, or a final prediction node with no further downstream splits, the output values of the leaf can be output as the votes for the particular decision tree. Since the Random Forest model comprises a plurality of decision trees, the final votes across all trees in the forest can be summed to yield the final votes and the corresponding classification of the subject. While only two decision trees are shown in FIG. 3, the model can comprise any number of decision trees. A large number of decision trees can help reduce overfitting of the assessment model to the training data, by reducing the variance of each individual decision tree. 
For example, the assessment model can comprise at least about 10 decision trees, for example at least about 100 individual decision trees or more], wherein the model outputs a score and compares the score to a tunable threshold selected to provide a sensitivity of at least about 0.75 for ASD versus non-ASD classification [WALL at Para. 0167 teaches the inclusion rate may be no less than 70% and the categorical determination may result in a sensitivity of at least 80 with a corresponding specificity of at least 80], wherein evaluation of the data associated with the subject by the ASD model yields an evaluation result, wherein the evaluation result indicates the presence or absence of the ASD [WALL at Para. 0164 teaches the categorical determination may comprise a presence of the cognitive function attribute and an absence of the cognitive function attribute].
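For illustration only, the recited comparison of a score to a tunable threshold selected for a sensitivity of at least about 0.75 can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of WALL or of the Applicant: the function names, the scores, and the labels are hypothetical, and the word "about" in the claim is not modeled.

```python
def sensitivity(scores, labels, threshold):
    """True-positive rate: share of ASD-labeled (1) subjects scoring >= threshold."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    if not positives:
        raise ValueError("at least one ASD-labeled subject is required")
    return sum(s >= threshold for s in positives) / len(positives)


def highest_threshold_meeting(scores, labels, target=0.75):
    """Return the largest observed score usable as a threshold with sensitivity >= target."""
    for t in sorted(set(scores), reverse=True):
        if sensitivity(scores, labels, t) >= target:
            return t
    # Unreachable for target <= 1.0: the lowest score always gives sensitivity 1.0.
    raise ValueError("no threshold meets the target")


# Hypothetical validation scores and labels (1 = ASD, 0 = non-ASD).
scores = [0.91, 0.80, 0.62, 0.55, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

threshold = highest_threshold_meeting(scores, labels)
print(threshold)                               # 0.55
print(sensitivity(scores, labels, threshold))  # 0.75
```

Lowering the threshold raises sensitivity at the cost of specificity, which is the sense in which the threshold is tunable.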
WALL does not teach acquiring training data, wherein the acquired training data comprises structured data associated with a plurality of subjects including at least about 50% data associated with non-ASD subjects;
subjecting the training data to exploratory data analysis;
filtering the training data to remove subjects with data having a high degree of missing values, subjects with data having outlier values that deviate from a statistical distribution of the training data, or both;
processing the filtered training data to input missing values, remove outliers, remove unidentified characters, remove highly correlated features, remove features with a high degree of missing values, remove features that are not important, or combinations thereof;
[ … ] … wherein the gradient-boosted tree machine-learning model has a depth of at least 2 and no more than 7, and wherein the gradient-boosted tree machine-learning model comprises from about 200 to about 600 weighted decision trees;
[ … ] … only in response to a determination of the presence of an ASD, the ASD model is configured to perform a machine-learning multiclassification of the ASD … [ … ]
Feczko teaches acquiring training data, wherein the acquired training data comprises structured data associated with a plurality of subjects including at least about 50% data associated with non-ASD subjects [Feczko at Para. 0221 teaches the model was trained on measures from seven tasks that reflect multiple levels of information processing. 47 ASD diagnosed and 58 typically developing (TD) children between the ages of 9 and 13 participated in this study. (typically developing interpreted as non-ASD; interpreted as at least 50 percent non-ASD)];
subjecting the training data to exploratory data analysis [Feczko at Para. 0285 teaches prior to the exploratory data analysis, there were a total of 143 subjects (73 ASD, 70 TD) with partially completed data. After eliminating subjects with more than 15 percent missing data, the subject list was finalized down to 105 subjects (47 ASD, 58 TD). In the final dataset, less than 3 percent of all possible data was missing. An inspection of the missing data was unable to find any patterns that distinguish the missing ASD data from the remaining cases];
filtering the training data to remove subjects with data having a high degree of missing values, subjects with data having outlier values that deviate from a statistical distribution of the training data, or both [Feczko at Para. 0285 teaches therefore, any measures and participants that were missing more than 15 percent of data were excluded];
processing the filtered training data to input missing values, remove outliers, remove unidentified characters, remove highly correlated features, remove features with a high degree of missing values, remove features that are not important, or combinations thereof [Feczko at Para. 0392 teaches the MRI ages were excluded because those would not factor into the RF model itself, for MRI data were analyzed independently from the RF model];
[ … ] … only in response to a determination of the presence of an ASD, the ASD model is configured to perform a machine-learning multiclassification of the ASD [Feczko at Para. 0164 teaches identifying a plurality of Autism Spectrum Disorder (ASD) subgroups based at least in part on the plurality of decision trees; Feczko at Para. 0198 teaches receiving, by the clinical device from the predictive system, an indication of an Autism Spectrum Disorder (ASD) subgroup of the human subject]… [ … ]
It would have been prima facie obvious to one of ordinary skill in the art, at the time of effective filing, to combine the model of WALL with the data processing of Feczko, with the motivation to improve the technical field by identifying more clinically relevant subtypes than previous techniques.
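For illustration only, the arithmetic behind the interpretation of Feczko's subject counts, and the 15 percent missing-data exclusion quoted from Para. 0285, can be sketched as follows. The function names and the example records are hypothetical and are not taken from Feczko; only the counts (47 ASD, 58 TD) and the 15 percent cutoff come from the cited paragraphs.

```python
def missing_fraction(record):
    """Fraction of fields in a subject record whose value is None."""
    values = list(record.values())
    return sum(v is None for v in values) / len(values)


def filter_subjects(records, max_missing=0.15):
    """Keep subjects whose missing fraction does not exceed max_missing."""
    return [r for r in records if missing_fraction(r) <= max_missing]


def non_asd_share(n_asd, n_td):
    """Share of non-ASD (typically developing) subjects in the cohort."""
    return n_td / (n_asd + n_td)


# Counts quoted from Feczko at Para. 0221 and 0285.
print(round(non_asd_share(47, 58), 3))  # 0.552

# Hypothetical records with ten fields each; not data from Feczko.
complete = {f"task_{i}": 1.0 for i in range(10)}
sparse = {**complete, "task_0": None, "task_1": None}  # 20% missing
print(len(filter_subjects([complete, sparse])))  # 1
```

On these counts, 58 of 105 subjects are typically developing, which is about 55 percent.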
WALL/Feczko do not teach [ … ] … wherein the gradient-boosted tree machine-learning model has a depth of at least 2 and no more than 7, and wherein the gradient-boosted tree machine-learning model comprises from about 200 to about 600 weighted decision trees;
Dawer teaches [ … ] … wherein the gradient-boosted tree machine-learning model has a depth of at least 2 and no more than 7, and wherein the gradient-boosted tree machine-learning model comprises from about 200 to about 600 weighted decision trees [Dawer at Page 5 teaches RET : We use Single Chain Single Depth (SCSD) pool generation approach as described in II-C to obtain an initial pool of M = 400 trees of depth 2. We then invoke FSA on leaves as detailed in II-E to select just one tree];
It would have been prima facie obvious to one of ordinary skill in the art, at the time of effective filing, to combine the references of WALL and Feczko with the tree depth of Dawer, with the motivation to lower misclassification errors in the obtained models.
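For illustration only, the comparison between the values quoted from Dawer (an initial pool of M = 400 trees of depth 2) and the claimed ranges can be sketched as a simple range check. The class and function names are hypothetical, the word "about" in the claimed tree count is not modeled, and the check covers only the initial pool values quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnsembleConfig:
    """Hypothetical container for the two hyperparameters recited in the claim."""
    n_trees: int
    max_depth: int


def within_claimed_ranges(cfg, trees=(200, 600), depth=(2, 7)):
    """True when both the tree count and the depth fall inside the inclusive ranges."""
    return (trees[0] <= cfg.n_trees <= trees[1]
            and depth[0] <= cfg.max_depth <= depth[1])


# Values quoted from Dawer at Page 5 for the initial pool.
dawer_pool = EnsembleConfig(n_trees=400, max_depth=2)
print(within_claimed_ranges(dawer_pool))  # True
```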
Regarding Claim 3
WALL/Feczko/Dawer teach the method of claim 1,
WALL/Feczko/Dawer further teach wherein the classification of the ASD is one of autistic disorder, Asperger syndrome, or pervasive developmental disorder - not otherwise specified (PDD-NOS) [WALL at Para. 0339 teaches FIGS. 1A and 1B show some developmental disorders that may be evaluated using the assessment procedure as described herein. The assessment procedure can be configured to evaluate a subject's risk for having one or more developmental disorders, such as two or more related developmental disorders. The developmental disorders may have at least some overlap in symptoms or features of the subject. Such developmental disorders may include pervasive development disorder (PDD), autism spectrum disorder (ASD), social communication disorder, restricted repetitive behaviors, interests, and activities (RRBs), autism (“classical autism”), Asperger's Syndrome (“high functioning autism”), PDD—not otherwise specified (PDD-NOS, “atypical autism”), attention deficit and hyperactivity disorder (ADHD), speech and language delay, obsessive compulsive disorder (OCD), intellectual disability, learning disability, or any other relevant development disorder, such as disorders defined in any edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM) (includes Asperger syndrome)].
Regarding Claim 4
WALL/Feczko/Dawer teach the method of claim 1,
WALL/Feczko/Dawer further teach wherein the data associated with the subject comprises the demographic data, wherein the demographic data comprises age data, intelligence quotient (IQ) data, sex data, handedness data, or combinations thereof [WALL at Para. 0170 teaches the subject-specific dimensions may comprise a subject's gender, the geographic region where a subject resides, and a subject's age].
Regarding Claim 5
WALL/Feczko/Dawer teach the method of claim 1,
WALL/Feczko/Dawer further teach wherein the data associated with the subject comprises the comorbidity data, wherein the comorbidity data comprises an indication of the presence or absence of attention deficit hyperactivity disorder (ADHD), a phobia, oppositional defiant disorder (ODD), obsessive-compulsive disorder (OCD), anxiety, a language disorder, generalized anxiety disorder (GAD), or combinations thereof [WALL at Para. 0339 (includes obsessive compulsive disorder)].
Regarding Claim 6
WALL/Feczko/Dawer teach the method of claim 1,
WALL/Feczko/Dawer further teach wherein the data associated with the subject comprises the observational assessment and interview data, wherein the observational assessment and interview data comprises Autism Diagnostic Instrument-Revised (ADI-R) data, Autism Diagnostic Observation Schedule (ADOS) 1st and/or 2nd Edition (ADOS and/or ADOS-2) data, Social Responsiveness Scale (SRS) data, Social Communication Questionnaire (SCQ) data, Autism Screening Questionnaire (ASQ) data, Vineland Adaptive Behavior Scale (VABS) data, Behavior Rating Inventory of Executive Function (BRIEF) data, or combinations thereof [WALL at Para. 0342 teaches the training data may comprise datasets available from large data repositories, such as Autism Diagnostic Interview-Revised (ADI-R) data and/or Autism Diagnostic Observation Schedule (ADOS) data available from the Autism Genetic Resource Exchange (AGRE), or any datasets available from any other suitable repository of data (e.g., Boston Autism Consortium (AC), Simons Foundation, National Database for Autism Research, etc.)].
Regarding Claim 14
WALL teaches a computing system for evaluating a subject with respect to autism spectrum disorder (ASD), the system comprising:
a computing device, the computing device comprising a processor and a non-transitory computer-readable medium, wherein the non-transitory computer-readable medium includes instructions configured to cause the processor to implement an ASD model, wherein the ASD model has been trained via a method comprising [WALL at Para. 0032 teaches described herein is a platform for assessing and providing treatment to an individual with respect to a behavioral disorder, a developmental delay, or a neurologic impairment, said platform comprising a computing device comprising: a processor; a non-transitory computer-readable medium that stores a computer program configured to cause said processor to]:
[ … ] …, wherein each decision tree of the weighted decision trees is assigned a probability- weighted association with one or more classifications of data in a dataset [WALL at Para. 0346 teaches for example, in the aforementioned example of a feature comprising the ability of the subject to engage in imaginative or pretend play, the feature value of “3” or “no variety of pretend play” may have a high predictive utility for classifying autism, while the same feature value may have low predictive utility for classifying ADHD. Accordingly, for each feature value, a probability distribution may be extracted that describes the probability of the specific feature value for predicting each of the plurality of developmental disorders to be screened by the assessment procedure. The machine learning algorithm can be used to extract these statistical relationships from the training data and build an assessment model that can yield an accurate prediction of a developmental disorder when a dataset comprising one or more feature values is fitted to the model],
and wherein the ASD model, when implemented via the processor, causes the computing device to:
receive data associated with a subject, at least a portion of the data associated with the subject being structured data [WALL at Para. 0287, 0299 (see Claim 1 for explanation)];
evaluate the data associated with the subject via the ASD model by transforming the data associated with the subject into discrete numerical vectors, wherein the discrete numerical vectors are input to the ASD model, and wherein the ASD model operates on the discrete numerical vectors, wherein the data associated with the subject that is evaluated via the ASD model consists of two or more of demographic data, comorbidity data, observational assessment and interview data, and medication data, wherein the ASD model is configured to [WALL at Para. 0168, 0164, 0407 (see Claim 1 for explanation)]:
(i) evaluate the data associated with the subject to determine the presence or absence of an ASD and [WALL at Para. 0511 (see Claim 1 for explanation)] … [ … ]
[ … ] … wherein the subject corresponds to a leaf node of a weighted decision tree of the gradient-boosted tree machine-learning model [WALL at Para. 0632 (see Claim 1 for explanation)], wherein the model follows an optimal branch direction from the node, wherein the model learned the optimal branch direction during model training, wherein the weighted decision trees provide for one or more probability-weighted associations between model inputs and outputs [WALL at Para. 0348, 0350 (see Claim 1 for explanation)], wherein the model outputs a score and compares the score to a tunable threshold selected to provide a sensitivity of at least about 0.75 for ASD versus non-ASD classification [WALL at Para. 0167 (see Claim 1 for explanation)], wherein evaluation of the data associated with the subject by the ASD model yields an evaluation result, wherein the evaluation result indicates the presence or absence of the ASD [WALL at Para. 0164 (see Claim 1 for explanation)].
WALL does not teach acquiring training data, wherein the acquired training data comprises structured data associated with a plurality of subjects including at least about 50% data associated with non-ASD subjects;
subjecting the training data to exploratory data analysis;
filtering the training data to remove subjects with data having a high degree of missing values, subjects with data having outlier values that deviate from a statistical distribution of the training data, or both;
and processing the filtered training data to input missing values, remove outliers, remove unidentified characters, remove highly correlated features, remove features with a high degree of missing values, remove features that are not important, or combinations thereof;
and training the ASD model with at least a portion of the processed training data, wherein the ASD model is a gradient-boosted tree machine-learning model having a depth of at least 2 and not more than 7, wherein the gradient-boosted tree machine-learning model comprises at least 200 weighted decision trees and not more than 600 weighted decision trees
(ii) only in response to a determination of the presence of an ASD, perform a machine-learning multiclassification to classify the ASD,
Feczko teaches acquiring training data, wherein the acquired training data comprises structured data associated with a plurality of subjects including at least about 50% data associated with non-ASD subjects [Feczko at Para. 0285 (see Claim 1 for explanation)];
subjecting the training data to exploratory data analysis [Feczko at Para. 0285 (see Claim 1 for explanation)];
filtering the training data to remove subjects with data having a high degree of missing values, subjects with data having outlier values that deviate from a statistical distribution of the training data, or both [Feczko at Para. 0285 (see Claim 1 for explanation)];
and processing the filtered training data to input missing values, remove outliers, remove unidentified characters, remove highly correlated features, remove features with a high degree of missing values, remove features that are not important, or combinations thereof [Feczko at Para. 0392 (see Claim 1 for explanation)];
(ii) only in response to a determination of the presence of an ASD, perform a machine-learning multiclassification to classify the ASD [Feczko at Para. 0164 (see Claim 1 for explanation)], … [ … ]
It would have been prima facie obvious to one of ordinary skill in the art, at the time of effective filing, to combine the model of WALL with the data processing of Feczko with the motivation to improve the technical field by identifying more clinically relevant subtypes than previous techniques.
WALL/Feczko do not teach and training the ASD model with at least a portion of the processed training data, wherein the ASD model is a gradient-boosted tree machine-learning model having a depth of at least 2 and not more than 7, wherein the gradient-boosted tree machine-learning model comprises at least 200 weighted decision trees and not more than 600 weighted decision trees
Dawer teaches and training the ASD model with at least a portion of the processed training data, wherein the ASD model is a gradient-boosted tree machine-learning model having a depth of at least 2 and not more than 7, wherein the gradient-boosted tree machine-learning model comprises at least 200 weighted decision trees and not more than 600 weighted decision trees [Dawer at Page 5 (see Claim 1 for explanation)]
It would have been prima facie obvious to one of ordinary skill in the art, at the time of effective filing, to combine the references of WALL and Feczko with the tree depth of Dawer with the motivation to lower misclassification errors in obtained models.
Regarding Claim 15
Claim(s) 15 is/are analogous to Claim(s) 3, thus Claim(s) 15 is/are similarly analyzed and rejected in a manner consistent with the rejection of Claim(s) 3.
Regarding Claim 18
WALL teaches a method implemented via a computing device, the method comprising:
receiving, by a processor of an autism spectrum disorder (ASD) model associated with the computing device, wherein a non-transitory machine-readable medium of the computing device causes the processor to implement the ASD model, data associated with a subject, at least a portion of the data associated with the subject being structured data [WALL at Para. 0032, 0287, 0299 (see Claims 1 and 14 for explanation)], wherein the ASD model has been trained via a method comprising:
and evaluating, by the ASD model associated with the computing device, the data associated with the subject by transforming the data associated with the subject into discrete numerical vectors, wherein the discrete numerical vectors are input to the ASD model, and wherein the ASD model operates on the discrete numerical vectors, wherein the data associated with the subject that is evaluated via the ASD model consists of two or more of demographic data, comorbidity data, observational assessment and interview data, and medication data, wherein the ASD model is configured to [WALL at Para. 0032, 0168, 0164, 0407 (see Claim 1 for explanation)]:
(i) evaluate the data associated with the subject to determine the presence or absence of an ASD [WALL at Para. 0511 (see Claim 1 for explanation)];
[ … ] … wherein the subject corresponds to a leaf node of a weighted decision tree of the gradient-boosted tree machine-learning model [WALL at Para. 0632 (see Claim 1 for explanation)], wherein the model follows an optimal branch direction from the node, wherein the model learned the optimal branch direction during model training [WALL at Para. 0348, 0350 (see Claim 1 for explanation)], wherein the weighted decision trees provide for one or more probability-weighted associations between model inputs and outputs, wherein the model outputs a score and compares the score to a tunable threshold selected to provide a sensitivity of at least about 0.75 for ASD versus non-ASD classification [WALL at Para. 0167 (see Claim 1 for explanation)], wherein the evaluation of the data associated with the subject yields an evaluation result [WALL at Para. 0164 (see Claim 1 for explanation)], and wherein the evaluation result indicates a finding of non-ASD, a finding of autistic disorder, a finding of Asperger syndrome, or a finding of pervasive developmental disorder - not otherwise specified (PDD-NOS) for the subject [WALL at Para. 0339 (see Claim 1 for explanation)].
WALL does not teach acquiring training data, wherein the acquired training data comprises structured data associated with a plurality of subjects including at least about 50% data associated with non-ASD subjects;
subjecting the training data to exploratory data analysis;
filtering the training data to remove subjects with data having a high degree of missing values, subjects with data having outlier values that deviate from a statistical distribution of the training data, or both;
and processing the filtered training data to input missing values, remove outliers, remove unidentified characters, remove highly correlated features, remove features with a high degree of missing values, remove features that are not important, or combinations thereof;
and training the ASD model with at least a portion of the processed training data, wherein the ASD model is a gradient-boosted tree machine-learning model having a depth of at least 2 and not more than 7, wherein the gradient-boosted tree machine-learning model comprises at least 200 weighted decision trees and not more than 600 weighted decision trees, wherein each decision tree of the weighted decision trees is assigned a probability-weighted association with one or more classifications of data in a dataset;
and (ii) only in response to a determination of the presence of an ASD, perform a machine-learning multiclassification to classify the ASD, … [ … ]
Feczko teaches acquiring training data, wherein the acquired training data comprises structured data associated with a plurality of subjects including at least about 50% data associated with non-ASD subjects [Feczko at Para. 0285 (see Claim 1 for explanation)];
subjecting the training data to exploratory data analysis [Feczko at Para. 0285 (see Claim 1 for explanation)];
filtering the training data to remove subjects with data having a high degree of missing values, subjects with data having outlier values that deviate from a statistical distribution of the training data, or both [Feczko at Para. 0285 (see Claim 1 for explanation)];
and processing the filtered training data to input missing values, remove outliers, remove unidentified characters, remove highly correlated features, remove features with a high degree of missing values, remove features that are not important, or combinations thereof [Feczko at Para. 0392 (see Claim 1 for explanation)];
and (ii) only in response to a determination of the presence of an ASD, perform a machine-learning multiclassification to classify the ASD [Feczko at Para. 0164 (see Claim 1 for explanation)], … [ … ]
It would have been prima facie obvious to one of ordinary skill in the art, at the time of effective filing, to combine the model of WALL with the data processing of Feczko with the motivation to improve the technical field by identifying more clinically relevant subtypes than previous techniques.
WALL/Feczko do not teach and training the ASD model with at least a portion of the processed training data, wherein the ASD model is a gradient-boosted tree machine-learning model having a depth of at least 2 and not more than 7, wherein the gradient-boosted tree machine-learning model comprises at least 200 weighted decision trees and not more than 600 weighted decision trees, wherein each decision tree of the weighted decision trees is assigned a probability-weighted association with one or more classifications of data in a dataset
Dawer teaches and training the ASD model with at least a portion of the processed training data, wherein the ASD model is a gradient-boosted tree machine-learning model having a depth of at least 2 and not more than 7, wherein the gradient-boosted tree machine-learning model comprises at least 200 weighted decision trees and not more than 600 weighted decision trees, wherein each decision tree of the weighted decision trees is assigned a probability-weighted association with one or more classifications of data in a dataset [Dawer at Page 5 (see Claim 1 for explanation)]
It would have been prima facie obvious to one of ordinary skill in the art, at the time of effective filing, to combine the references of WALL and Feczko with the tree depth of Dawer with the motivation to lower misclassification errors in obtained models.
Regarding Claim 21
WALL/Feczko/Dawer teach the method of claim 1,
WALL/Feczko/Dawer further teach wherein the ASD model is capable of providing an inconclusive evaluation result when data quality is insufficient [WALL at Para. 0618 teaches for every question, the analysts have the option of selecting: “The footage doesn't provide enough opportunity to assess reliably.” In addition, analysts may deem a submission un-scorable if one or more videos are unhelpful for any reason such as: poor lighting, poor video or audio quality, bad vantage point, child not present or identifiable within a group, insufficient interaction with the child. If un-scorable, caregivers will be notified and requested to upload additional video].
Regarding Claim 22
Claim(s) 22 is/are analogous to Claim(s) 3, thus Claim(s) 22 is/are similarly analyzed and rejected in a manner consistent with the rejection of Claim(s) 3.
Regarding Claim 23
WALL/Feczko/Dawer teach the method of claim 1,
WALL/Feczko/Dawer further teach wherein the exploratory data analysis comprises identifying missing data patterns, identifying outlier values, identifying data distributions, or combinations thereof [Feczko at Para. 0285 (see Claim 1 for explanation)].
Regarding Claim 25
WALL/Feczko/Dawer teach the method of claim 1,
WALL/Feczko/Dawer further teach wherein the sensitivity of at least about 0.75 comprises a sensitivity of from about 0.75 to about 0.95 [WALL at Para. 0167 (see Claim 1 for explanation)].
Regarding Claim 27
Claim(s) 27 is/are analogous to Claim(s) 21, thus Claim(s) 27 is/are similarly analyzed and rejected in a manner consistent with the rejection of Claim(s) 21.
Regarding Claim 28
WALL/Feczko/Dawer teach the system of claim 14,
WALL/Feczko/Dawer further teach wherein the computing device is further configured to output the evaluation result to a display, transmit the evaluation result to a remote device, store the evaluation result in a database, or combinations thereof [WALL at Para. 0380 teaches If yes, as shown at step 740, the feature recommendation module may select the next feature to be presented to the user, and steps 705-725 may be repeated until a final prediction (e.g., a specific developmental disorder or “no diagnosis”) can be displayed to the subject. If no additional features can be presented to the subject, “no diagnosis” may be displayed to the subject, as shown at step 745].
Regarding Claim 29
WALL/Feczko/Dawer teach the method of claim 18,
WALL/Feczko/Dawer further teach wherein the evaluation result further comprises a confidence score associated with the classification of the ASD [WALL at Para. 0380 teaches at step 725, a check can be performed to determine whether the fitting of the data can generate a prediction of a specific developmental disorder (e.g., autism, ADHD, etc.) with sufficient confidence (e.g., within at least a 90% confidence interval). If so, as shown at step 730, the predicted developmental disorder can be displayed to the user].
Regarding Claim 30
WALL/Feczko/Dawer teach the method of claim 18,
WALL/Feczko/Dawer further teach wherein the data associated with the subject comprises demographic data, comorbidity data, and observational assessment and interview data [WALL at Para. 0170 teaches the subject-specific dimensions may comprise a subject's gender, the geographic region where a subject resides, and a subject's age].
Claim 7 is rejected under 35 U.S.C. 103(a) as being unpatentable over WALL, Feczko, and Dawer as applied to claims 1, 14, and 18 above, and further in view of RAJAN et al (US Publication No. 20210383924).
Regarding Claim 7
WALL/Feczko/Dawer teach the method of claim 1,
WALL/Feczko/Dawer do not teach wherein the data associated with the subject comprises the medication data, wherein the medication data comprises an indication of any medications used by the subject.
RAJAN teaches wherein the data associated with the subject comprises the medication data, wherein the medication data comprises an indication of any medications used by the subject [RAJAN at Para. 0056 teaches patient data may include, but is not limited to, the following: age, sex, region, ethnicity, birth age, birth weight, perinatal complications, current weight, body mass index, oropharyngeal status (e.g. allergic rhinitis), dietary restrictions, medications, chronic medical issues, immunization status, medical allergies, early intervention services, surgical history, and family psychiatric history. Given the prevalence of attention deficit hyperactivity disorder (ADHD) and gastrointestinal (GI) disturbance among children with ASD, for purposes of the embodiment directed to ASD, survey questions were included to identify these two common medical co-morbidities].
It would have been prima facie obvious to one of ordinary skill in the art, at the time of effective filing, to combine the references of WALL, Feczko, and Dawer with the medication data of RAJAN with the motivation to more accurately predict a medical condition in a patient characterized by feature values.
Claim 24 is rejected under 35 U.S.C. 103(a) as being unpatentable over WALL, Feczko, and Dawer as applied to claims 1, 14, and 18 above, and further in view of Küpper et al (“Identifying Predictive Features of Autism Spectrum Disorders in a Clinical Sample of Adolescents and Adults Using Machine Learning.”).
Regarding Claim 24
WALL/Feczko/Dawer teach the method of claim 1,
WALL/Feczko/Dawer do not teach wherein the training data comprises data associated with subjects across multiple age groups, wherein the multiple age groups include pediatric subjects and adult subjects, and wherein the ASD model is configured to evaluate subjects across the multiple age groups.
Küpper teaches wherein the training data comprises data associated with subjects across multiple age groups, wherein the multiple age groups include pediatric subjects and adult subjects, and wherein the ASD model is configured to evaluate subjects across the multiple age groups [Küpper at Discussion Section Para 2 teaches using an SVM-based approach, we identified a reduced subset of 5 behavioral features from the ADOS Module 4 that showed good specificity (83%) and sensitivity (71%) on our whole sample (SVM interpreted as machine learning model; whole sample interpreted to include pediatric and adult subjects)].
It would have been prima facie obvious to one of ordinary skill in the art, at the time of effective filing, to combine the references of WALL, Feczko, and Dawer with the age groups of Küpper with the motivation to improve ASD model diagnosis.
Claim 26 is rejected under 35 U.S.C. 103(a) as being unpatentable over WALL, Feczko, and Dawer as applied to claims 1, 14, and 18 above, and further in view of Shoaran et al (US Publication No. 20200388397).
Regarding Claim 26
WALL/Feczko/Dawer teach the method of claim 1,
WALL/Feczko/Dawer do not teach wherein the gradient-boosted tree machine-learning model has a depth of from about 3 to about 5.
Shoaran teaches wherein the gradient-boosted tree machine-learning model has a depth of from about 3 to about 5 [Shoaran at Para. 0007 teaches in some embodiments, the selecting, the converting, the identifying, and the determining is performed for the one or more input channels that are selected without buffering data from the plurality of input channels other than the one or more input channels. In some embodiments, a number of the plurality of gradient boosted decision trees is up to eight, and each gradient boosted decision tree has a maximum pre-determined depth of four].
It would have been prima facie obvious to one of ordinary skill in the art, at the time of effective filing, to combine the references of WALL, Feczko, and Dawer with the depth of Shoaran with the motivation to improve predictive performance.
Response to Arguments
Rejection under 35 U.S.C. § 101
Regarding the rejection of Claims 1, 3-7, 14-15, 18, 21-30, the Examiner has considered the Applicant’s arguments; however, the arguments are not persuasive. Any arguments inadvertently not addressed are unpersuasive for at least the following reasons. Applicant argues:
(a) Applicant respectfully submits that these additional elements integrate machine-learning principles into a specific practical application with concrete technical parameters. Applicant respectfully submits that the above-noted amendments define a specific machine-learning architecture with concrete technical parameters that directly impact computational performance and model accuracy. The claims recite how the model operates at a technical level - through leaf nodes, branch directions, weighted decision trees, and score-based threshold comparisons - not merely what result is achieved.
(b) Also, the two-stage classification architecture represents a specific technical solution that minimizes false negatives by separating binary detection from multiclassification. This is not an abstract idea applied to a generic computer, but rather a particularized machine-learning implementation with defined architectural constraints. As disclosed in the specification, "the ASD model 120 may be configured to evaluate the data associated with the subject to determine the presence or absence of an ASD and, based upon a determination of the presence of an ASD, classify the ASD." Application [0032].
(c) The sensitivity threshold of at least 0.75 represents a measurable technical performance parameter that is achieved through the specific model architecture claimed. The specification demonstrates this technical improvement through confusion matrices showing minimized false negatives. See Application at [0073]-[0078] and Figures 7A-7B (confusion matrices demonstrating model performance with minimized false negatives for ASD vs. non-ASD classification).
Regarding (a), the Examiner respectfully disagrees. The machine learning model is performing its normal functions. Training a machine-learning model with new data is not an improvement to the model. See Recentive Analytics, Inc. v. Fox Corp., which states that “[i]terative training using selected training material…are incident to the very nature of machine learning” and thus does not provide for an improvement. Recentive at 12. Technical parameters merely describe what the model is but do not functionally improve the model architecture; neither computational performance nor model accuracy is claimed to be improved, nor would a person having skill in the art recognize any such improvement.
Regarding (b), the Examiner respectfully disagrees. Performing binary classification and multiclassification as two separate steps means only that two well-known methods are used separately. No improvement is made to the classification models. They are performing as they normally would function. Combining well-known models does not improve or change the function of either model. Therefore, no improvement is found.
Regarding (c), the Examiner respectfully disagrees. The sensitivity threshold does not provide an improvement in the functioning of the model. The model architecture is not doing anything that a generic model wouldn’t normally do, which is to receive, process, and output data.
Rejection under 35 U.S.C. § 102/103
Regarding the rejection of Claims 1, 3-7, 14-15, 18, 21-30, the Examiner has considered the Applicant’s arguments; however, the arguments are not persuasive. Applicant argues:
(a) With respect to determining the presence or absence of an ASD and, only in response to a determination of the presence of an ASD, classifying the ASD, the Final Office Action asserts that Wall inherently teaches this limitation because "[c]lassification of a disease cannot occur if a detection of the disease does not occur." Office Action at 22. However, Wall teaches a simultaneous multiclassification between four determinations, including non-ASD as one determination. See Wall at [0164]. That is, Wall's approach appears to proceed directly to classification without any conditional determination, as claimed. Wall states that "the categorical determination may comprise a presence of the cognitive function attribute and an absence of the cognitive function attribute" and discusses classification of developmental disorders, but does not teach the hierarchical decision-making process where classification is conditional upon first determining ASD presence. Wall at [0164]. Therefore, Wall does not disclose the claimed sequential two-stage architecture where binary classification occurs first and multiclassification occurs conditionally only upon positive ASD determination.
(b) Feczko, which the Final Office Action cites as disclosing "acquiring training data," discusses in pertinent part exploratory data analysis in which, before exploratory data analysis, there were 143 subjects (73 ASD, 70 typically developing (TD)), representing 49% non-ASD subjects. Feczko [0285]. After filtering to remove subjects with greater than 15% missing data, Feczko had 105 subjects (47 ASD, 58 TD), representing 55% non-ASD subjects. Id. The claimed method requires that the acquired training data (that is, when acquired, before filtering) comprises at least about 50% non-ASD subjects. Feczko's training data composition of at least 50% non-ASD subjects was achieved only after filtering, not at acquisition. As such, Feczko cannot disclose these elements.
Regarding (a), the Examiner respectfully disagrees. To clarify, even if Wall at Para. 0164 does not separate the two steps of binary classification and multiclassification, Feczko at Para. 0164 and Para. 0198 teaches the multiclassification step, and Wall at Para. 0401 teaches the binary classification. Therefore, the combination of Wall and Feczko teaches the limitation. Examiner respectfully points to the 101 Section of this Office Action for the full explanation.
Regarding (b), the Examiner respectfully disagrees. Even if Feczko at Para. 0285 does not teach the claimed limitation, Feczko at Para. 0221 discloses a study in which over 50 percent of the subjects in the study were typically developing, which is interpreted as non-ASD. See basis of rejection for further explanation.
Regarding the rejection of Claims 1, 3-7, 14-15, 18, 21-30, the Examiner has considered the Applicant’s arguments with regard to the amended portions of the Claims; however, these arguments are moot given the new grounds of rejection as afforded by the present RCE.
Conclusion
The prior art made of record and not relied upon in the present basis of rejection is noted in the attached PTO 892 and includes:
Ironside et al (US Publication No. 10977737) discloses an apparatus provided for generating a generalized linear model structure definition.
TIRANOFF et al (US Publication No. 20180279936) discloses a digital media platform application including a video-based natural histories database and other tools for the study of rare or difficult-to-diagnose neurodevelopmental disorders.
WANG et al (Foreign Publication CN-114819186-A) discloses a method and a device for constructing a gradient boosting decision tree GBDT model.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JONATHAN C EDOUARD whose telephone number is (571)270-0107. The examiner can normally be reached M-F 730 - 430.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Robert Morgan can be reached on (571) 272 - 6773. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/JONATHAN C EDOUARD/Examiner, Art Unit 3683
/JASON S TIEDEMAN/Primary Examiner, Art Unit 3683