DETAILED ACTION
This action is in response to the application filed 12/08/2021. Claims 1, 3-8, 10-15, and 17-20 are pending and have been examined.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 01/16/2026 has been entered.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Interpretation
Claims 8, 10-15, and 17-20 refer to a “computer readable storage medium”. Paragraph [0063] of the instant Specification states, “A computer-readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire”. Accordingly, the computer readable storage medium is not interpreted to include transitory signals per se.
Claim Objections
Claim 1 is objected to because of the following informalities:
Limitation 3: “generating an outlier detection data subset from the labeled data by down sampling a plurality of rows of one class label within the labeled data” was present (with the exception of using ‘down sampling’ in lieu of ‘downsampling’) in the claims previously submitted on 08/28/2025, but is underlined in the instant amendments submitted on 12/15/2026.
Limitation 7: “generating a set of data metafeatures for the set of data subsets a set of pipeline metafeatures for the set of unsupervised machine learning pipelines” is grammatically improper.
Limitation 9: In “the metalearner using the input training dataset”, “the metalearner” is newly added in the instant amendments, but is not underlined to reflect this change.
Limitation 9: “the metalearner using the input training dataset to computer metrics” contains improper grammar (“computer” appears intended to be “compute”).
Appropriate correction is required.
Claim 8 is objected to because of the following informalities:
Limitation 5: “generating an outlier detection data subset from the labeled data by down sampling a plurality of rows of one class label within the labeled data” was present in the claims previously submitted on 08/28/2025, but is underlined in the instant amendments submitted on 12/15/2026.
Limitation 9: “generating a set of data metafeatures for the set of data subsets a set of pipeline metafeatures for the set of unsupervised machine learning pipelines” is grammatically improper.
Limitation 11: In “the metalearner using the input training dataset”, “the metalearner” is newly added in the instant amendments, but is not underlined to reflect this change.
Limitation 11: “the metalearner using the input training dataset to computer metrics” contains improper grammar (“computer” appears intended to be “compute”).
Appropriate correction is required.
Claim 15 is objected to because of the following informalities:
Limitation 3: “generating an outlier detection data subset from the labeled data by down sampling a plurality of rows of one class label within the labeled data” was present in the claims previously submitted on 08/28/2025, but is underlined in the instant amendments submitted on 12/15/2026.
Limitation 7: “generating a set of data metafeatures for the set of data subsets a set of pipeline metafeatures for the set of unsupervised machine learning pipelines” is grammatically improper.
Limitation 9: In “the metalearner using the input training dataset”, “the metalearner” is newly added in the instant amendments, but is not underlined to reflect this change.
Limitation 9: “the metalearner using the input training dataset to computer metrics” contains improper grammar (“computer” appears intended to be “compute”).
Appropriate correction is required.
Claim 12 is objected to because of the following informalities: “training an unsupervised machine learning pipeline for any pair of outlier detection data subsets” was present in the claims previously submitted on 08/28/2025, but previous changes in these claims are still marked in the instant amendments submitted on 12/15/2026. Appropriate correction is required.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1, 3-8, 10-15, and 17-20 are rejected under 35 U.S.C. 101 because the claimed inventions are directed to non-statutory subject matter without significantly more.
Claim 1
Step 1: The claim recites “A computer-implemented method”, and is therefore directed to the statutory category of process
Step 2A Prong 1: The claim recites the following judicial exception(s)
generating a set of data subsets from the labeled data set to use as input training dataset by selecting a random pair of class labels in the labeled data set, wherein a plurality of rows in a labeled data subset may be less than or equal to a number of rows within the labeled data set can be performed as a mental process. One can merely imagine the labeled data set including only rows from two randomly selected classes.
generating an outlier detection data subset from the labeled data by downsampling a plurality of rows of one class label within the labeled data can be performed as a mental process. One can merely imagine a subset of the rows of one class label within the labeled data.
generating a training set from the set of data subsets and the set of unsupervised machine learning pipelines can be performed as a mental process. One can merely associate data subsets with machine learning pipelines arbitrarily.
generating a set of data metafeatures for the set of data subsets a set of pipeline metafeatures for the set of unsupervised machine learning pipelines can be performed as a mental process. One can merely identify aggregate features for each data subset for data metafeatures and identify model features for each pipeline for pipeline metafeatures.
combining data metafeatures of the set of data metafeatures, pipeline metafeatures of the set of pipeline metafeatures, and a pipeline performance metric to create a labeled training data set for the metalearner can be performed as a mental process. One can merely pair pipeline metafeatures and data metafeatures, then add a variable with a value proportional to the performance of the corresponding model to each pipeline metafeature record.
generating output using the metalearner, the metalearner using the input training dataset to computer metrics used for performance of unsupervised pipelines on the outlier detection data subset can be performed as a mental process. One can merely assign a performance score to each unsupervised pipeline after witnessing its performance processing the outlier detection data subset.
Step 2A Prong 2: The judicial exception(s) are not integrated into a practical application through the following additional element(s)
receiving a labeled data set as an input amounts to mere reception of data and is insignificant extra-solution activity (MPEP 2106.05(g)).
generating a set of unsupervised machine learning pipelines using the set of data subsets and outlier detection data subset is mere instruction to apply judicial exceptions to a generation process in a generic manner (MPEP 2106.05(f)).
training a metalearner for unsupervised tasks based on the training set is mere instruction to apply a judicial exception to a generic data structure (metalearner) (MPEP 2106.05(f)).
wherein the metalearner is trained for ranking the set of unsupervised pipelines given a plurality of input training datasets: This is mere instruction to train a machine learning component in a generic manner (MPEP 2106.05(f)).
generating output using the metalearner, the metalearner using the input training dataset to computer metrics used for performance of unsupervised pipelines on the outlier detection data subset: This is directed to mere output and is insignificant extra-solution activity (MPEP 2106.05(g)).
using the input training dataset to the metalearner to enable computation of performance metrics of unsupervised pipelines on the outlier detection data subset: This is mere instruction to apply generic computer components to a judicial exception (MPEP 2106.05(f)).
Step 2B: The following additional element(s) of the claim, taken alone or in combination, do not amount to significantly more than the recited judicial exception(s)
receiving a labeled data set as an input is an instance of retrieving information from memory, a limitation considered well-understood, routine, and conventional (MPEP 2106.05(d) II. iv.).
generating a set of unsupervised machine learning pipelines using the set of data subsets and outlier detection data subset is mere instruction to apply judicial exceptions to a generation process in a generic manner (MPEP 2106.05(f)).
training a metalearner for unsupervised tasks based on the training set is mere instruction to apply a judicial exception to a generic data structure (metalearner) (MPEP 2106.05(f)).
wherein the metalearner is trained for ranking the set of unsupervised pipelines given a plurality of input training datasets: This is mere instruction to train a machine learning component in a generic manner (MPEP 2106.05(f)).
generating output using the metalearner, the metalearner using the input training dataset to computer metrics used for performance of unsupervised pipelines on the outlier detection data subset: This is an instance of storing and/or retrieving information in memory, a limitation known to be well-understood, routine, and conventional (MPEP 2106.05(d) II. iv.).
using the input training dataset to the metalearner to enable computation of performance metrics of unsupervised pipelines on the outlier detection data subset: This is mere instruction to apply generic computer components to a judicial exception (MPEP 2106.05(f)).
Claim 3
Step 1: The claim recites a process, as in claim 1
Step 2A Prong 1: The claim recites the following further judicial exception(s)
generating a labeled data subset from the labeled data set can still be performed as a mental process. One can merely specify arbitrary subsets from the labeled data set, such that each record of these subsets contains the label variable.
generating an outlier detection data subset from the labeled data set can be performed as a mental process. One can merely specify arbitrary subsets containing outlier data from the labeled data set.
Step 2A Prong 2: The judicial exception(s) are not integrated into a practical application through the additional element(s).
Step 2B: The additional element(s) of the claim, taken alone or in combination, do not amount to significantly more than the recited judicial exception(s).
Claim 4
Step 1: The claim recites a process, as in claim 3
Step 2A Prong 1: The claim recites no further judicial exception(s)
Step 2A Prong 2: The judicial exception(s) are not integrated into a practical application through the further additional element(s)
an outlier detection data subset is generated for each unsupervised machine learning pipeline: This merely links a judicial exception to a particular field of use (outlier detection) (MPEP 2106.05(h)).
Step 2B: The further additional element(s) of the claim, taken alone or in combination, do not amount to significantly more than the recited judicial exception(s)
an outlier detection data subset is generated for each unsupervised machine learning pipeline: This merely links a judicial exception to a particular field of use (outlier detection) (MPEP 2106.05(h)).
Claim 5
Step 1: The claim recites a process, as in claim 4
Step 2A Prong 1: The claim recites no further judicial exception(s)
Step 2A Prong 2: The judicial exception(s) are not integrated into a practical application through the further additional element(s)
training an unsupervised machine learning pipeline for any pair of outlier detection data subsets is mere instruction to apply a judicial exception to a generic data structure (unsupervised machine learning pipeline) (MPEP 2106.05(f)).
Step 2B: The further additional element(s) of the claim, taken alone or in combination, do not amount to significantly more than the recited judicial exception(s)
training an unsupervised machine learning pipeline for any pair of outlier detection data subsets is mere instruction to apply a judicial exception to a generic data structure (unsupervised machine learning pipeline) (MPEP 2106.05(f)).
Claim 6
Step 1: The claim recites a process, as in claim 5
Step 2A Prong 1: The claim recites the following further judicial exception(s)
separating the training set into a training data subset and an evaluation data subset can be performed as a mental process. One can merely partition the training set into two subsets.
Step 2A Prong 2: The judicial exception(s) are not integrated into a practical application through the further additional element(s)
training the metalearner based on the training data subset is mere instruction to apply a judicial exception to a generic data structure (metalearner) (MPEP 2106.05(f)).
evaluating the metalearner based on the evaluation data subset is mere instruction to apply a judicial exception to a generic data structure (metalearner) (MPEP 2106.05(f)).
Step 2B: The further additional element(s) of the claim, taken alone or in combination, do not amount to significantly more than the recited judicial exception(s)
training the metalearner based on the training data subset is mere instruction to apply a judicial exception to a generic data structure (metalearner) (MPEP 2106.05(f)).
evaluating the metalearner based on the evaluation data subset is mere instruction to apply a judicial exception to a generic data structure (metalearner) (MPEP 2106.05(f)).
Claim 7
Step 1: The claim recites a process, as in claim 1
Step 2A Prong 1: The claim recites the following further judicial exception(s)
generating data set metafeatures for the labeled data set can be performed as a mental process. One can merely identify aggregate features for each data subset.
generating pipeline metafeatures for the set of unsupervised machine learning pipelines can be performed as a mental process. One can merely identify model features for each pipeline.
identifying … a subset of unsupervised machine learning pipelines can be performed as a mental process. One can arbitrarily partition the set of pipelines into subsets.
Step 2A Prong 2: The judicial exception(s) are not integrated into a practical application through the further additional element(s)
applying the metalearner on the data set metafeatures and the pipeline metafeatures in the training set is mere instruction to apply a judicial exception to a generic data structure (metalearner) (MPEP 2106.05(f)).
identifying a subset of unsupervised machine learning pipelines using the metalearner is mere instruction to apply a generic data structure (metalearner) to a judicial exception (MPEP 2106.05(f)).
Step 2B: The further additional element(s) of the claim, taken alone or in combination, do not amount to significantly more than the recited judicial exception(s)
applying the metalearner on the data set metafeatures and the pipeline metafeatures in the training set is mere instruction to apply a judicial exception to a generic data structure (metalearner) (MPEP 2106.05(f)).
identifying a subset of unsupervised machine learning pipelines using the metalearner is mere instruction to apply a generic data structure (metalearner) to a judicial exception (MPEP 2106.05(f)).
Claim 8
Step 1: The claim recites “A system”, and is therefore directed to the statutory category of machine
Step 2A Prong 1: The claim recites the following judicial exception(s)
generating a set of data subsets from the labeled data set to use as input training dataset by selecting a random pair of class labels in the labeled data set, wherein a plurality of rows in a labeled data subset may be less than or equal to a number of rows within the labeled data set can be performed as a mental process. One can merely imagine the labeled data set including only rows from two randomly selected classes.
generating an outlier detection data subset from the labeled data by downsampling a plurality of rows of one class label within the labeled data can be performed as a mental process. One can merely imagine a subset of the rows of one class label within the labeled data.
generating a training set from the set of data subsets and the set of unsupervised machine learning pipelines can be performed as a mental process. One can merely associate data subsets with machine learning pipelines arbitrarily.
generating a set of data metafeatures for the set of data subsets a set of pipeline metafeatures for the set of unsupervised machine learning pipelines can be performed as a mental process. One can merely identify aggregate features for each data subset for data metafeatures and identify model features for each pipeline for pipeline metafeatures.
combining data metafeatures of the set of data metafeatures, pipeline metafeatures of the set of pipeline metafeatures, and a pipeline performance metric to create a labeled training data set for the metalearner can be performed as a mental process. One can merely pair pipeline metafeatures and data metafeatures, then add a variable with a value proportional to the performance of the corresponding model to each pipeline metafeature record.
generating output using the metalearner, the metalearner using the input training dataset to computer metrics used for performance of unsupervised pipelines on the outlier detection data subset can be performed as a mental process. One can merely assign a performance score to each unsupervised pipeline after witnessing its performance processing the outlier detection data subset.
Step 2A Prong 2: The judicial exception(s) are not integrated into a practical application through the following additional element(s)
one or more processors; and a computer-readable storage medium, coupled to the one or more processors, storing program instructions that, when executed by the one or more processors, cause the one or more processors to perform operations: This is mere instruction to execute the judicial exceptions with a generic computing device (MPEP 2106.05(f)).
receiving a labeled data set as an input amounts to mere reception of data and is insignificant extra-solution activity (MPEP 2106.05(g)).
generating a set of unsupervised machine learning pipelines using the set of data subsets and outlier detection data subset is mere instruction to apply judicial exceptions to a generation process in a generic manner (MPEP 2106.05(f)).
training a metalearner for unsupervised tasks based on the training set is mere instruction to apply a judicial exception to a generic data structure (metalearner) (MPEP 2106.05(f)).
wherein the metalearner is trained for ranking the set of unsupervised pipelines given a plurality of input training datasets: This is mere instruction to train a machine learning component in a generic manner (MPEP 2106.05(f)).
generating output using the metalearner, the metalearner using the input training dataset to computer metrics used for performance of unsupervised pipelines on the outlier detection data subset: This is directed to mere output and is insignificant extra-solution activity (MPEP 2106.05(g)).
using the input training dataset to the metalearner to enable computation of performance metrics of unsupervised pipelines on the outlier detection data subset: This is mere instruction to apply generic computer components to a judicial exception (MPEP 2106.05(f)).
Step 2B: The following additional element(s) of the claim, taken alone or in combination, do not amount to significantly more than the recited judicial exception(s)
one or more processors; and a computer-readable storage medium, coupled to the one or more processors, storing program instructions that, when executed by the one or more processors, cause the one or more processors to perform operations: This is mere instruction to execute the judicial exceptions with a generic computing device (MPEP 2106.05(f)).
receiving a labeled data set as an input is an instance of retrieving information from memory, a limitation considered well-understood, routine, and conventional (MPEP 2106.05(d) II. iv.).
generating a set of unsupervised machine learning pipelines using the set of data subsets and outlier detection data subset is mere instruction to apply judicial exceptions to a generation process in a generic manner (MPEP 2106.05(f)).
training a metalearner for unsupervised tasks based on the training set is mere instruction to apply a judicial exception to a generic data structure (metalearner) (MPEP 2106.05(f)).
wherein the metalearner is trained for ranking the set of unsupervised pipelines given a plurality of input training datasets: This is mere instruction to train a machine learning component in a generic manner (MPEP 2106.05(f)).
generating output using the metalearner, the metalearner using the input training dataset to computer metrics used for performance of unsupervised pipelines on the outlier detection data subset: This is an instance of storing and/or retrieving information in memory, a limitation known to be well-understood, routine, and conventional (MPEP 2106.05(d) II. iv.).
using the input training dataset to the metalearner to enable computation of performance metrics of unsupervised pipelines on the outlier detection data subset: This is mere instruction to apply generic computer components to a judicial exception (MPEP 2106.05(f)).
Claims 10-14
Step 1: Claims 10-14 recite a machine, as in claim 8.
Step 2A Prong 1: Claims 10-14 recite the same judicial exception(s) as claims 3-7, respectively.
Step 2A Prong 2: The judicial exception(s) are not integrated into a practical application through any additional elements. The analysis of claims 10-14 at this step mirrors that of claims 3-7, respectively, with the exception that claims 10-14 are directed to “A system, comprising: one or more processors; and a computer-readable storage medium, coupled to the one or more processors, storing program instructions that, when executed by the one or more processors, cause the one or more processors to perform operations”, said operations mirroring those of claims 3-7. This is a mere instruction to apply the exceptions using generic computer equipment (MPEP 2106.05(f)).
Step 2B: The additional element(s) of the claim, taken alone or in combination, do not amount to significantly more than the recited judicial exception(s). The analysis of claims 10-14 at this step mirrors that of claims 3-7, with the exception that claims 10-14 are directed to “A system, comprising: one or more processors; and a computer-readable storage medium, coupled to the one or more processors, storing program instructions that, when executed by the one or more processors, cause the one or more processors to perform operations”, said operations mirroring those of claims 3-7. This is mere instruction to apply the exceptions using generic computer equipment (MPEP 2106.05(f)).
Claim 15
Step 1: The claim recites “A computer program product”, and is therefore directed to the statutory category of article of manufacture
Step 2A Prong 1: The claim recites the following judicial exception(s)
generating a set of data subsets from the labeled data set to use as input training dataset by selecting a random pair of class labels in the labeled data set, wherein a plurality of rows in a labeled data subset may be less than or equal to a number of rows within the labeled data set can be performed as a mental process. One can merely imagine the labeled data set including only rows from two randomly selected classes.
generating an outlier detection data subset from the labeled data by downsampling a plurality of rows of one class label within the labeled data can be performed as a mental process. One can merely imagine a subset of the rows of one class label within the labeled data.
generating a training set from the set of data subsets and the set of unsupervised machine learning pipelines can be performed as a mental process. One can merely associate data subsets with machine learning pipelines arbitrarily.
generating a set of data metafeatures for the set of data subsets a set of pipeline metafeatures for the set of unsupervised machine learning pipelines can be performed as a mental process. One can merely identify aggregate features for each data subset for data metafeatures and identify model features for each pipeline for pipeline metafeatures.
combining data metafeatures of the set of data metafeatures, pipeline metafeatures of the set of pipeline metafeatures, and a pipeline performance metric to create a labeled training data set for the metalearner can be performed as a mental process. One can merely pair pipeline metafeatures and data metafeatures, then add a variable with a value proportional to the performance of the corresponding model to each pipeline metafeature record.
generating output using the metalearner, the metalearner using the input training dataset to computer metrics used for performance of unsupervised pipelines on the outlier detection data subset can be performed as a mental process. One can merely assign a performance score to each unsupervised pipeline after witnessing its performance processing the outlier detection data subset.
Step 2A Prong 2: The judicial exception(s) are not integrated into a practical application through the following additional element(s)
a computer readable storage medium having program instructions embodied therewith, the program instructions being executable by one or more processors to cause the one or more processors to perform operations: This is mere instruction to execute the judicial exceptions with a generic computing device (MPEP 2106.05(f)).
receiving a labeled data set as an input amounts to mere reception of data and is thus insignificant extra-solution activity (MPEP 2106.05(g)).
generating a set of unsupervised machine learning pipelines using the set of data subsets and outlier detection data subset is mere instruction to apply judicial exceptions to a generation process in a generic manner (MPEP 2106.05(f)).
training a metalearner for unsupervised tasks based on the training set is mere instruction to apply a judicial exception to a generic data structure (metalearner) (MPEP 2106.05(f)).
wherein the metalearner is trained for ranking the set of unsupervised pipelines given a plurality of input training datasets: This is mere instruction to train a machine learning component in a generic manner (MPEP 2106.05(f)).
generating output using the metalearner, the metalearner using the input training dataset to computer metrics used for performance of unsupervised pipelines on the outlier detection data subset: This is directed to mere output and is insignificant extra-solution activity (MPEP 2106.05(g)).
using the input training dataset to the metalearner to enable computation of performance metrics of unsupervised pipelines on the outlier detection data subset: This is mere instruction to apply generic computer components to a judicial exception (MPEP 2106.05(f)).
Step 2B: The following additional element(s) of the claim, taken alone or in combination, do not amount to significantly more than the recited judicial exception(s)
a computer readable storage medium having program instructions embodied therewith, the program instructions being executable by one or more processors to cause the one or more processors to perform operations: This is mere instruction to execute the judicial exceptions with a generic computing device (MPEP 2106.05(f)).
receiving a labeled data set as an input is an instance of retrieving information from memory, a limitation considered well-understood, routine, and conventional (MPEP 2106.05(d) II. iv.).
generating a set of unsupervised machine learning pipelines using the set of data subsets and outlier detection data subset is mere instruction to apply judicial exceptions to a generation process in a generic manner (MPEP 2106.05(f)).
training a metalearner for unsupervised tasks based on the training set is mere instruction to apply a judicial exception to a generic data structure (metalearner) (MPEP 2106.05(f)).
wherein the metalearner is trained for ranking the set of unsupervised pipelines given a plurality of input training datasets: This is mere instruction to train a machine learning component in a generic manner (MPEP 2106.05(f)).
generating output using the metalearner, the metalearner using the input training dataset to computer metrics used for performance of unsupervised pipelines on the outlier detection data subset: This is an instance of storing and/or retrieving information in memory, a limitation known to be well-understood, routine, and conventional (MPEP 2106.05(d) II. iv.).
using the input training dataset to the metalearner to enable computation of performance metrics of unsupervised pipelines on the outlier detection data subset: This is mere instruction to apply generic computer components to a judicial exception (MPEP 2106.05(f)).
Claims 17-20
Step 1: Claims 17-20 recite an article of manufacture, as in claim 15.
Step 2A Prong 1: Claims 18-20 recite the same judicial exception(s) as claims 5-7, respectively. Claim 17 recites the judicial exception(s) of both claims 3 and 4.
Step 2A Prong 2: The judicial exception(s) are not integrated into a practical application through any additional elements. The analysis of claims 17-20 at this step mirrors that of claims 3-7, with the exception that claims 17-20 are directed to “A computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions being executable by one or more processors to cause the one or more processors to perform operations”, said operations mirroring those of claims 3-7. This is a mere instruction to apply the exceptions using generic computer equipment (MPEP 2106.05(f)).
Step 2B: The additional element(s) of the claim, taken alone or in combination, do not amount to significantly more than the recited judicial exception(s). The analysis of claims 17-20 at this step mirrors that of claims 3-7, with the exception that claims 17-20 are directed to “A computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions being executable by one or more processors to cause the one or more processors to perform operations”, said operations mirroring those of claims 3-7. This is mere instruction to apply the exceptions using generic computer equipment (MPEP 2106.05(f)).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claim(s) 1 & 3-7 are rejected under 35 U.S.C. 103 as being unpatentable over Zhao et al. (Automatic Unsupervised Outlier Model Selection, October 2021, 35th Conference on Neural Information Processing Systems (NeurIPS 2021)) in view of Panda et al. (NASTransfer: Analyzing Architecture Transferability in Large Scale Neural Architecture Search, published 2/12/2021, arXiv:2006.13314v2), and further in view of Wu et al. (MACHINE LEARNING PROCESSING PIPELINE OPTIMIZATION, filed 4/6/2020, US 20220180066 A1).
Regarding claim 1, Zhao teaches [a] computer-implemented method, comprising:
receiving a labeled data set as an input: “METAOD relies on a collection of historical outlier detection datasets Dtrain = {D1, …, Dn} (labeled data set), namely, a meta-train database with ground truth labels, i.e., {Di = Xi, yi} for i = 1, …, n” (Zhao, page 4, paragraph 2); “To capture prior experience, METAOD first constructs the performance matrix P by running/building and evaluating all the m models in our defined model space M on all the n meta-train datasets Dtrain” (Zhao, page 4, paragraph 7).
generating a set of data subsets from the labeled data set to use as input training dataset by selecting a random pair of class labels in the labeled data set, wherein a plurality of rows in the labeled data subset may be less than or equal to a number of rows within the labeled data set: “METAOD relies on a collection of historical outlier detection datasets Dtrain = {D1, …, Dn} (data subsets), namely, a meta-train database with ground truth labels, i.e., {Di = Xi, yi} for i = 1, …, n (labeled data set)” (Zhao, page 4, paragraph 2).
generating an outlier detection data subset from the labeled data set by down sampling a plurality of rows of one class label within the labeled data: “METAOD relies on a collection of historical outlier detection datasets Dtrain = {D1, …, Dn} (labeled data subset[s]), namely, a meta-train database with ground truth labels, i.e., {Di = Xi, yi} for i = 1, …, n (labeled data set)” (Zhao, page 4, paragraph 2).
generating a set of unsupervised machine learning pipelines using the set of data subsets and outlier detection data subset: “We consider the model selection problem for unsupervised outlier detection, which we refer to as UOMS (unsupervised outlier model selection) hereafter. Given a new dataset, without any labels, the problem is to select both (i) a detector/algorithm and (ii) its associated hyperparameter(s) (HP)” (Zhao, page 3, paragraph 7); “we discretize the HP space for each candidate detector to make the search space tractable, which induces a finite pool of models denoted M = {M1, …, Mm} (machine learning pipelines). Each model M ∈ M can be seen as a {detector, configuration} pair, where the configuration depicts a specific set of values for the detector’s HP(s)” (Zhao, page 3, paragraph 8); “our METAOD relies on a collection of historical outlier detection datasets Dtrain (data subsets / outlier detection data subset)” (Zhao, page 4, paragraph 2).
generating a training set from the set of data subsets and the set of unsupervised machine learning pipelines: “To capture prior experience, METAOD first constructs the performance matrix P by running/building and evaluating all the m models in our defined model space M (set of unsupervised machine learning pipelines) on all the n meta-train datasets Dtrain (set of data subsets)” (Zhao, page 4, paragraph 7). A pairing of a machine learning pipeline Mk and a subset of the training data Dk comprises a training set.
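For illustration only (this sketch is not of record and is not code from Zhao; all names are hypothetical), the performance-matrix construction quoted above amounts to evaluating every candidate pipeline on every data subset:

```python
def performance_matrix(models, datasets, evaluate):
    # Build the n-by-m performance matrix P, where P[i][j] is the
    # j-th model's score on the i-th meta-train dataset; `evaluate`
    # is a caller-supplied scoring function.
    return [[evaluate(model, data) for model in models] for data in datasets]
```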
training a metalearner for unsupervised tasks based on the training set: “Our METAOD consists of two-phases: offline (meta-)training of the meta-learner on Dtrain” (Zhao, page 4, paragraph 4); “3.2.1 (Meta-)Training (Offline)” (Zhao, page 4, below paragraph 4); “To capture prior experience, METAOD first constructs the performance matrix P by running/building and evaluating all the m models in our defined model space M on all the n meta-train datasets Dtrain (training set[s])” (Zhao, page 4, paragraph 7).
wherein the metalearner is trained for ranking the set of unsupervised pipelines given a plurality of input training datasets: “our goal is to rank the models (unsupervised pipelines) for each dataset row-wise, as model selection concerns with picking the best possible model to employ. Therefore, we use a rank-based criterion called DCG from the information retrieval literature” (Zhao, page 6, paragraph 3); “Overall we optimize the smoothed criterion, sDCG, over all meta-train datasets Dtrain” (Zhao, page 7, paragraph 2).
generating
a set of data metafeatures for the set of data subsets: “To capture task similarity, it then extracts a set of d meta-features from each meta-train dataset, denoted by M = ψ(X1, …, Xn) ∈ Rn×d where ψ(·) depicts the feature extraction module” (Zhao, page 4, paragraph 7); “To this end, we extract meta-features that can be organized into two categories: (1) statistical features (data set metafeatures), and (2) landmarker features. Broadly speaking, the former captures statistical properties of the underlying data distributions; e.g., min, max, variance, skewness, covariance, etc. of the features and feature combinations. These kinds of meta-features have been commonly used in the AutoML literature.” (Zhao, page 5, paragraph 7 to page 6, paragraph 1).
a set of pipeline metafeatures for the set of unsupervised machine learning pipelines: “To this end, we extract meta-features that can be organized into two categories: (1) statistical features, and (2) landmarker features (pipeline metafeatures)” (Zhao, page 5, paragraph 7); “perhaps more important are the landmarker features, which are problem-specific, and aim to capture the outlying characteristics of a dataset. The idea is to apply a few of the fast, easy-to-construct OD models on a dataset and extract features from (i) the structure of the estimated OD model, and (ii) its output outlier scores” (Zhao, page 6, paragraph 2). The {detector, configuration} pairs described by each model M in Zhao are pipelines.
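For illustration only (an illustrative reading of the cited passage, not code of record or from Zhao), the statistical meta-features quoted above (min, max, variance, skewness) can be sketched for a single data column as:

```python
import statistics

def statistical_metafeatures(column):
    # Illustrative statistical meta-features of one data column:
    # min, max, variance, and skewness, of the kind Zhao describes.
    mean = statistics.fmean(column)
    var = statistics.pvariance(column)
    std = var ** 0.5
    # Population skewness; zero-variance columns are assigned zero skew.
    skew = (sum((x - mean) ** 3 for x in column) / len(column) / std ** 3
            if std > 0 else 0.0)
    return {"min": min(column), "max": max(column),
            "variance": var, "skewness": skew}
```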
combining…
data metafeatures of the set of data metafeatures, pipeline metafeatures of the set of pipeline metafeatures: “To this end, we extract meta-features that can be organized into two categories: (1) statistical features (data metafeatures), and (2) landmarker features (pipeline metafeatures)” (Zhao, page 5, paragraph 7); “we discard least squares and instead optimize the rank-based (row- or dataset-wise) discounted cumulative gain (DCG), [equation reproduced as media_image1.png]” (Zhao, page 4, paragraph 10 to page 5, paragraph 1); “We find that initializing U, denoted U(0), based on meta-features facilitates stable training” (Zhao, page 5, paragraph 2).
, and a pipeline performance metric: “our METAOD relies on … the historical performances of the pool of candidate models, M, on the meta-train datasets. We denote by P ∈ Rn×m the performance matrix (pipeline performance metric), where Pij corresponds to the j-th model Mj's performance on the i-th meta-train dataset Di” (Zhao, page 4, paragraph 2).
to create a labeled training data set for the metalearner: “our goal is to rank the models for each dataset row-wise, as model selection concerns with picking the best possible model to employ. Therefore, we use a rank-based criterion called DCG from the information retrieval literature” (Zhao, page 6, paragraph 3). By assigning a rank to each model-dataset pair (this pairing mapped to ‘training data’), each set of training data is being labeled.
generating output using the metalearner, the metalearner using the input training dataset to computer metrics used for performance of unsupervised pipelines on the outlier detection data subset: “our goal is to rank the models (unsupervised pipelines) for each dataset row-wise, as model selection concerns with picking the best possible model to employ. Therefore, we use a rank-based criterion called DCG (performance metric) from the information retrieval literature” (Zhao, page 6, paragraph 3); “Overall we optimize the smoothed criterion, sDCG, over all meta-train datasets Dtrain (input training dataset)” (Zhao, page 7, paragraph 2); “Finally, the model with the largest predicted performance is outputted as the selected model” (Zhao, page 5, paragraph 5).
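For reference, one common information-retrieval formulation of the DCG criterion discussed in the cited passages can be sketched as follows (a generic illustration only; Zhao's smoothed variant, sDCG, differs in its exact form):

```python
import math

def dcg(relevances):
    # Discounted cumulative gain of a ranked list: each relevance
    # score is discounted by the log of its (1-based) rank, so
    # placing high-relevance items earlier yields a higher score.
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))
```

A better-ordered ranking of the same relevance scores produces a strictly higher DCG, which is why Zhao uses it as a row-wise (dataset-wise) model-ranking criterion.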
Zhao relates to automatically generating unsupervised outlier detection pipelines and is analogous to the claimed invention.
While Zhao fails to disclose the further limitations of the claim, Panda discloses a method of generating a set of data subsets from the labeled data set to use as input training dataset by selecting a random pair of class labels in the labeled data set, wherein a plurality of rows in the labeled data subset may be less than or equal to a number of rows within the labeled data set:
“For proxy sets directly sampled from the target set, the random sampling of a subset of classes (subsets from the labeled data set) maintaining the same number of images per class is more beneficial than trying to keep all the classes from the target dataset and the reducing the number of examples per class” (Panda, page 7, right column, paragraph 1). By definition, a subset of a dataset has a number of rows less than or equal to the full dataset.
“we investigated the proxies listed in Table 1, which are of two types: randomly selected and uniformly selected. For random selection, we picked a list of N classes and used all of their images. This is particularly important when designing a proxy set for a non-uniform, imbalanced distribution such as the one of ImageNet22K. For example ImageNet22K Proxy 2 was designed to have the same overall distribution of the full dataset, but the same number of images of ImageNet22K Proxy 1 … We split each of those datasets (subsets) into a training, validation and testing subsets with proportions 40/40/20 and use standard data pre-processing and augmentation techniques.” (Panda, page 3, left column, paragraph 2).
[Table 1 reproduced as media_image2.png] “Proxy sets used in our experiments” (Panda, page 3, right column, Table 1). Each proxy set has a number of classes greater than two. Thus, each contains at least one pair of class labels.
Panda relates to machine learning architecture searches and is analogous to the claimed invention. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Zhao to use class subsets on candidate architectures instead of full datasets, as disclosed by Panda. Doing so would increase the efficiency of the architecture search. In particular, forming subsets of classes maintains practical search times better than forming subsets of class members across all classes. See Panda, page 1, right column, paragraph 1 and page 5, left column, paragraph 2.
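For illustration only (hypothetical names; not code from Panda or of record), the claimed selection of a random pair of class labels to form a data subset can be sketched as:

```python
import random

def class_pair_subset(rows, labels, seed=0):
    # Randomly select a pair of class labels and keep only the rows
    # bearing those labels; by construction the subset has at most
    # as many rows as the full labeled data set.
    rng = random.Random(seed)
    pair = rng.sample(sorted(set(labels)), 2)
    subset = [(row, y) for row, y in zip(rows, labels) if y in pair]
    return pair, subset
```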
While Zhao and Panda fail to disclose the further limitations of the claim, Wu discloses a method of generating an outlier detection data subset from the labeled data set by down sampling a plurality of rows of one class label within the labeled data: “Using document processing as an example, the informative down-sampling approach may determine major classes and minor classes based on counts of samples in different classes, and then down-sample the majority class(es) by detecting and keeping the most informative samples (rows)” (Wu, [0025])
Wu relates to machine learning pipeline searches and is analogous to the claimed invention. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Zhao and Panda to downsample majority class(es), as disclosed by Wu. This procedure can balance imbalanced datasets, which can otherwise cause poor model performance on minority class classification. See Wu, [0024].
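A minimal sketch of the majority-class down-sampling attributed to Wu (simplified for illustration: Wu's "informative" sample selection is replaced here by plain random sampling, and all names are hypothetical):

```python
import random
from collections import Counter

def downsample_majority(rows, labels, seed=0):
    # Down-sample every majority class to the minority-class count,
    # yielding a class-balanced data subset of labeled rows.
    rng = random.Random(seed)
    counts = Counter(labels)
    minority = min(counts.values())
    balanced = []
    for label in counts:
        members = [(row, y) for row, y in zip(rows, labels) if y == label]
        if len(members) > minority:
            members = rng.sample(members, minority)
        balanced.extend(members)
    return balanced
```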
Regarding claim 3, the rejection of claim 1 in view of Zhao, Panda, and Wu is incorporated. Zhao also teaches a method, wherein generating the set of data subsets further comprises: generating a labeled data subset from the labeled data set; and generating an outlier detection data subset from the labeled data set: “METAOD relies on a collection of historical outlier detection datasets Dtrain = {D1, …, Dn} (labeled data subset[s]), namely, a meta-train database with ground truth labels, i.e., {Di = Xi, yi} for i = 1, …, n (labeled data set)” (Zhao, page 4, paragraph 2).
Regarding claim 4, the rejection of claim 3 in view of Zhao, Panda, and Wu is incorporated. Zhao also teaches a method, wherein an outlier detection data subset is generated for each unsupervised machine learning pipeline: “METAOD relies on a collection of historical outlier detection datasets Dtrain = {D1, …, Dn} (data subset[s]), namely, a meta-train database with ground truth labels, i.e., {Di = Xi, yi} for i = 1, …, n” (Zhao, page 4, paragraph 2); “To capture prior experience, METAOD first constructs the performance matrix P by running/building and evaluating all the m models in our defined model space M on all the n meta-train datasets Dtrain” (Zhao, page 4, paragraph 7). Each of the m models (pipelines) is evaluated with each outlier data subset.
Regarding claim 5, the rejection of claim 4 in view of Zhao, Panda, and Wu is incorporated. Zhao also teaches a method, wherein generating the training set further comprises: training an unsupervised machine learning pipeline for any pair of outlier detection data subsets: “METAOD relies on a collection of historical outlier detection datasets Dtrain = {D1, …, Dn} (data subset[s]), namely, a meta-train database with ground truth labels, i.e., {Di = Xi, yi} for i = 1, …, n” (Zhao, page 4, paragraph 2); “To capture prior experience, METAOD first constructs the performance matrix P by running/building and evaluating all the m models in our defined model space M (unsupervised machine learning pipeline[s]) on all the n meta-train datasets Dtrain (pair[s] of outlier detection data subsets)” (Zhao, page 4, paragraph 7).
Regarding claim 6, the rejection of claim 5 in view of Zhao, Panda, and Wu is incorporated. Zhao also teaches a method, wherein training the metalearner further comprises:
separating the training set into a training data subset and an evaluation data subset: “Our METAOD consists of two-phases: offline (meta-)training of the meta-learner on Dtrain (training data subset), and online prediction that enables unsupervised model selection at test time for Dtest (evaluation data subset)” (Zhao, page 4, paragraph 4).
training the metalearner based on the training data subset: “3.2.1 (Meta-)Training (Offline)” (Zhao, page 4, below paragraph 4); “To capture prior experience, METAOD first constructs the performance matrix P by running/building and evaluating all the m models in our defined model space M on all the n meta-train datasets Dtrain” (Zhao, page 4, paragraph 7).
evaluating the metalearner based on the evaluation data subset: “We evaluate METAOD and the baselines on 2 testbeds introduced below, resp. with 100 and 62 datasets, via cross-validation where datasets are split into meta-train/test in each fold” (Zhao, page 7, paragraph 4).
Regarding claim 7, the rejection of claim 1 in view of Zhao, Panda, and Wu is incorporated. Zhao also teaches a method, comprising:
generating data set metafeatures for the labeled data set: “To capture task similarity, it then extracts a set of d meta-features from each meta-train dataset, denoted by M = ψ(X1, …, Xn) ∈ Rn×d where ψ(·) depicts the feature extraction module” (Zhao, page 4, paragraph 7); “To this end, we extract meta-features that can be organized into two categories: (1) statistical features (data set metafeatures), and (2) landmarker features. Broadly speaking, the former captures statistical properties of the underlying data distributions; e.g., min, max, variance, skewness, covariance, etc. of the features and feature combinations. These kinds of meta-features have been commonly used in the AutoML literature.” (Zhao, page 5, paragraph 7 to page 6, paragraph 1).
generating pipeline metafeatures for the set of unsupervised machine learning pipelines: “To capture task similarity, it then extracts a set of d meta-features from each meta-train dataset, denoted by M = ψ(X1, …, Xn) ∈ Rn×d where ψ(·) depicts the feature extraction module” (Zhao, page 4, paragraph 7); “perhaps more important are the landmarker features, which are problem-specific, and aim to capture the outlying characteristics of a dataset. The idea is to apply a few of the fast, easy-to-construct OD models on a dataset and extract features from (i) the structure of the estimated OD model, and (ii) its output outlier scores” (Zhao, page 6, paragraph 2). The {detector, configuration} pairs described by each model M in Zhao are pipelines.
applying the metalearner on the data set metafeatures and the pipeline metafeatures in the training set:
“To capture prior experience, METAOD first constructs the performance matrix P by running/building and evaluating all the m models in our defined model space M on all the n meta-train datasets Dtrain. To capture task similarity, it then extracts a set of d meta-features from each meta-train dataset, denoted by M = ψ(X1, …, Xn) ∈ Rn×d where ψ(·) depicts the feature extraction module” (Zhao, page 4, paragraph 7)
“we discard least squares and instead optimize the rank-based (row- or dataset-wise) discounted cumulative gain (DCG), [equation reproduced as media_image1.png]” (Zhao, page 4, paragraph 10 to page 5, paragraph 1)
“We find that initializing U, denoted U(0), based on meta-features facilitates stable training” (Zhao, page 5, paragraph 2). Metafeatures during training are used to construct DCG.
“our goal is to rank the models for each dataset row-wise, as model selection concerns with picking the best possible model to employ. Therefore, we use a rank-based criterion called DCG from the information retrieval literature” (Zhao, page 6, paragraph 3). The metalearner uses DCG to rank models during training.
identifying, using the metalearner, a subset of unsupervised machine learning pipelines: “the model with the largest predicted performance is outputted as the selected model, that is, [equation (2) reproduced as media_image3.png]” (Zhao, page 5, paragraph 6); “Notice that model selection by Eq. (2) for a newcoming dataset is solely based on its meta-features and other pre-trained components from meta-learning” (Zhao, page 5, paragraph 7).
Claim(s) 8, 10-15, and 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over Zhao et al. (Automatic Unsupervised Outlier Model Selection, October 2021, 35th Conference on Neural Information Processing Systems (NeurIPS 2021)) in view of Laszczuk (US 20240078473 A1), and further in view of Panda et al. (NASTransfer: Analyzing Architecture Transferability in Large Scale Neural Architecture Search, published 2/12/2021, arXiv:2006.13314v2), and Wu et al. (MACHINE LEARNING PROCESSING PIPELINE OPTIMIZATION, filed 4/6/2020, US 20220180066 A1).
Regarding claim 8, Zhao teaches [a] system, comprising:
receiving a labeled data set as an input: “METAOD relies on a collection of historical outlier detection datasets Dtrain = {D1, …, Dn} (labeled data set), namely, a meta-train database with ground truth labels, i.e., {Di = Xi, yi} for i = 1, …, n” (Zhao, page 4, paragraph 2); “To capture prior experience, METAOD first constructs the performance matrix P by running/building and evaluating all the m models in our defined model space M on all the n meta-train datasets Dtrain” (Zhao, page 4, paragraph 7).
generating a set of data subsets from the labeled data set to use as input training dataset by selecting a random pair of class labels in the labeled data set, wherein a plurality of rows in the labeled data subset may be less than or equal to a number of rows within the labeled data set: “METAOD relies on a collection of historical outlier detection datasets Dtrain = {D1, …, Dn} (data subsets), namely, a meta-train database with ground truth labels, i.e., {Di = Xi, yi} for i = 1, …, n (labeled data set)” (Zhao, page 4, paragraph 2).
generating an outlier detection data subset from the labeled data set by down sampling a plurality of rows of one class label within the labeled data: “METAOD relies on a collection of historical outlier detection datasets Dtrain = {D1, …, Dn} (labeled data subset[s]), namely, a meta-train database with ground truth labels, i.e., {Di = Xi, yi} for i = 1, …, n (labeled data set)” (Zhao, page 4, paragraph 2).
generating a set of unsupervised machine learning pipelines using the set of data subsets and outlier detection data subset: “We consider the model selection problem for unsupervised outlier detection, which we refer to as UOMS (unsupervised outlier model selection) hereafter. Given a new dataset, without any labels, the problem is to select both (i) a detector/algorithm and (ii) its associated hyperparameter(s) (HP)” (Zhao, page 3, paragraph 7); “we discretize the HP space for each candidate detector to make the search space tractable, which induces a finite pool of models denoted M = {M1, …, Mm} (machine learning pipelines). Each model M ∈ M can be seen as a {detector, configuration} pair, where the configuration depicts a specific set of values for the detector’s HP(s)” (Zhao, page 3, paragraph 8); “our METAOD relies on a collection of historical outlier detection datasets Dtrain (data subsets / outlier detection data subset)” (Zhao, page 4, paragraph 2).
generating a training set from the set of data subsets and the set of unsupervised machine learning pipelines: “To capture prior experience, METAOD first constructs the performance matrix P by running/building and evaluating all the m models in our defined model space M (set of unsupervised machine learning pipelines) on all the n meta-train datasets Dtrain (set of data subsets)” (Zhao, page 4, paragraph 7). A pairing of a machine learning pipeline Mk and a subset of the training data Dk comprises a training set.
training a metalearner for unsupervised tasks based on the training set: “Our METAOD consists of two-phases: offline (meta-)training of the meta-learner on Dtrain” (Zhao, page 4, paragraph 4); “3.2.1 (Meta-)Training (Offline)” (Zhao, page 4, below paragraph 4); “To capture prior experience, METAOD first constructs the performance matrix P by running/building and evaluating all the m models in our defined model space M on all the n meta-train datasets Dtrain (training set[s])” (Zhao, page 4, paragraph 7).
wherein the metalearner is trained for ranking the set of unsupervised pipelines given a plurality of input training datasets: “our goal is to rank the models (unsupervised pipelines) for each dataset row-wise, as model selection concerns with picking the best possible model to employ. Therefore, we use a rank-based criterion called DCG from the information retrieval literature” (Zhao, page 6, paragraph 3); “Overall we optimize the smoothed criterion, sDCG, over all meta-train datasets Dtrain” (Zhao, page 7, paragraph 2).
generating
a set of data metafeatures for the set of data subsets: “To capture task similarity, it then extracts a set of d meta-features from each meta-train dataset, denoted by M = ψ(X1, …, Xn) ∈ Rn×d where ψ(·) depicts the feature extraction module” (Zhao, page 4, paragraph 7); “To this end, we extract meta-features that can be organized into two categories: (1) statistical features (data set metafeatures), and (2) landmarker features. Broadly speaking, the former captures statistical properties of the underlying data distributions; e.g., min, max, variance, skewness, covariance, etc. of the features and feature combinations. These kinds of meta-features have been commonly used in the AutoML literature.” (Zhao, page 5, paragraph 7 to page 6, paragraph 1).
a set of pipeline metafeatures for the set of unsupervised machine learning pipelines: “To this end, we extract meta-features that can be organized into two categories: (1) statistical features, and (2) landmarker features (pipeline metafeatures)” (Zhao, page 5, paragraph 7); “perhaps more important are the landmarker features, which are problem-specific, and aim to capture the outlying characteristics of a dataset. The idea is to apply a few of the fast, easy-to-construct OD models on a dataset and extract features from (i) the structure of the estimated OD model, and (ii) its output outlier scores” (Zhao, page 6, paragraph 2). The {detector, configuration} pairs described by each model M in Zhao are pipelines.
combining…
data metafeatures of the set of data metafeatures, pipeline metafeatures of the set of pipeline metafeatures: “To this end, we extract meta-features that can be organized into two categories: (1) statistical features (data metafeatures), and (2) landmarker features (pipeline metafeatures)” (Zhao, page 5, paragraph 7); “we discard least squares and instead optimize the rank-based (row- or dataset-wise) discounted cumulative gain (DCG), [equation reproduced as media_image1.png]” (Zhao, page 4, paragraph 10 to page 5, paragraph 1); “We find that initializing U, denoted U(0), based on meta-features facilitates stable training” (Zhao, page 5, paragraph 2).
, and a pipeline performance metric: “our METAOD relies on … the historical performances of the pool of candidate models, M, on the meta-train datasets. We denote by P ∈ Rn×m the performance matrix (pipeline performance metric), where Pij corresponds to the j-th model Mj's performance on the i-th meta-train dataset Di” (Zhao, page 4, paragraph 2).
to create a labeled training data set for the metalearner: “our goal is to rank the models for each dataset row-wise, as model selection concerns with picking the best possible model to employ. Therefore, we use a rank-based criterion called DCG from the information retrieval literature” (Zhao, page 6, paragraph 3). By assigning a rank to each model-dataset pair (this pairing mapped to ‘training data’), each set of training data is being labeled.
generating output using the metalearner, the metalearner using the input training dataset to computer metrics used for performance of unsupervised pipelines on the outlier detection data subset: “our goal is to rank the models (unsupervised pipelines) for each dataset row-wise, as model selection concerns with picking the best possible model to employ. Therefore, we use a rank-based criterion called DCG (performance metric) from the information retrieval literature” (Zhao, page 6, paragraph 3); “Overall we optimize the smoothed criterion, sDCG, over all meta-train datasets Dtrain (input training dataset)” (Zhao, page 7, paragraph 2); “Finally, the model with the largest predicted performance is outputted as the selected model” (Zhao, page 5, paragraph 5).
Zhao relates to automatically generating unsupervised outlier detection pipelines and is analogous to the claimed invention.
While Zhao fails to disclose the further limitations of the claim, Laszczuk teaches [a] system, comprising: one or more processors; and a computer-readable storage medium, coupled to the one or more processors, storing program instructions that, when executed by the one or more processors, cause the one or more processors to perform operations: “In an aspect, provided is a computer-implemented method for end-to-end machine learning, comprising: … (c) (i) generating and training a model using an Automated Machine Learning (AutoML) algorithm” (Laszczuk, [0004]); “The non-transitory computer-readable medium comprises machine-executable code (instructions) that, upon execution by the one or more computer processors, implements any of the methods described above or elsewhere herein” (Laszczuk, [0010]).
Laszczuk relates to AutoML learning systems and is analogous to the claimed invention. Zhao teaches a method of training and testing an AutoML system for unsupervised learning. The claimed invention improves upon this method by storing it in the form of instructions on computer hardware. Laszczuk teaches a method of training and testing an AutoML system that can be stored in the form of instructions on computer hardware, applicable to Zhao. A person of ordinary skill in the art would have recognized that storing Zhao’s method as computer instructions on Laszczuk’s hardware would lead to the predictable result of the method being executable by a computing system, and would improve the known device by allowing it to be performed with real data (MPEP 2143 I. (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results).
While Zhao and Laszczuk fail to disclose the further limitations of the claim, Panda discloses a method of generating a set of data subsets from the labeled data set to use as input training dataset by selecting a random pair of class labels in the labeled data set, wherein a plurality of rows in the labeled data subset may be less than or equal to a number of rows within the labeled data set:
“For proxy sets directly sampled from the target set, the random sampling of a subset of classes (subsets from the labeled data set) maintaining the same number of images per class is more beneficial than trying to keep all the classes from the target dataset and the reducing the number of examples per class” (Panda, page 7, right column, paragraph 1). By definition, a subset of a dataset has a number of rows less than or equal to the full dataset.
“we investigated the proxies listed in Table 1, which are of two types: randomly selected and uniformly selected. For random selection, we picked a list of N classes and used all of their images. This is particularly important when designing a proxy set for a non-uniform, imbalanced distribution such as the one of ImageNet22K. For example ImageNet22K Proxy 2 was designed to have the same overall distribution of the full dataset, but the same number of images of ImageNet22K Proxy 1 … We split each of those datasets (subsets) into a training, validation and testing subsets with proportions 40/40/20 and use standard data pre-processing and augmentation techniques.” (Panda, page 3, left column, paragraph 2).
[Table 1 reproduced as media_image2.png] “Proxy sets used in our experiments” (Panda, page 3, right column, Table 1). Each proxy set has a number of classes greater than two. Thus, each contains at least one pair of class labels.
Panda relates to machine learning architecture searches and is analogous to the claimed invention. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Zhao and Laszczuk to use class subsets on candidate architectures instead of full datasets, as disclosed by Panda. Doing so would increase the efficiency of the architecture search. In particular, forming subsets of classes maintains practical search times better than forming subsets of class members across all classes. See Panda, page 1, right column, paragraph 1 and page 5, left column, paragraph 2.
While Zhao, Laszczuk, and Panda fail to disclose the further limitations of the claim, Wu discloses a method of generating an outlier detection data subset from the labeled data set by down sampling a plurality of rows of one class label within the labeled data: “Using document processing as an example, the informative down-sampling approach may determine major classes and minor classes based on counts of samples in different classes, and then down-sample the majority class(es) by detecting and keeping the most informative samples (rows)” (Wu, [0025]).
Wu relates to machine learning pipeline searches and is analogous to the claimed invention. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Zhao, Laszczuk, and Panda to downsample majority class(es), as disclosed by Wu. This procedure can balance imbalanced datasets, which can otherwise cause poor model performance on minority class classification. See Wu, [0024].
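For illustration only, the majority-class down-sampling Wu describes can be sketched as the Python fragment below. The simple random selection here stands in for Wu's informative-sample detection, and every identifier is hypothetical rather than drawn from any cited reference:

```python
from collections import Counter
import random

def downsample_majority(rows, label_of, seed=0):
    """Down-sample every over-represented class to the size of the
    smallest class, by simple random sampling of its rows."""
    random.seed(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(label_of(row), []).append(row)
    target = min(len(members) for members in by_class.values())
    balanced = []
    for members in by_class.values():
        balanced.extend(random.sample(members, target))
    return balanced

# Example: six majority-class rows ("a") and two minority-class rows ("b").
data = [("a", i) for i in range(6)] + [("b", i) for i in range(2)]
balanced = downsample_majority(data, label_of=lambda row: row[0])
counts = Counter(label for label, _ in balanced)
# counts == Counter({"a": 2, "b": 2})
```

The balanced output retains all minority-class rows while keeping only a sample of the majority class, which is the effect Wu attributes to down-sampling in [0024]-[0025].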
The analysis of claims 10-14 mirrors that of claims 3-7, with the exception that claims 10-14 are directed to generic computer hardware which executes the methods of claims 3-7. This generic hardware is taught by Laszczuk, as discussed regarding claim 8. Thus, claims 10-14 are rejected under the same rationales used for claims 3-7, respectively.
Regarding claim 15, Zhao teaches operations, comprising:
receiving a labeled data set as an input: “METAOD relies on a collection of historical outlier detection datasets D_train = {D_1, …, D_n} (labeled data set), namely, a meta-train database with ground truth labels, i.e., {D_i = (X_i, y_i)}_{i=1}^{n}” (Zhao, page 4, paragraph 2); “To capture prior experience, METAOD first constructs the performance matrix P by running/building and evaluating all the m models in our defined model space M on all the n meta-train datasets D_train” (Zhao, page 4, paragraph 7).
generating a set of data subsets from the labeled data set to use as input training dataset by selecting a random pair of class labels in the labeled data set, wherein a plurality of rows in the labeled data subset may be less than or equal to a number of rows within the labeled data set: “METAOD relies on a collection of historical outlier detection datasets D_train = {D_1, …, D_n} (data subsets), namely, a meta-train database with ground truth labels, i.e., {D_i = (X_i, y_i)}_{i=1}^{n} (labeled data set)” (Zhao, page 4, paragraph 2).
generating an outlier detection data subset from the labeled data set by down sampling a plurality of rows of one class label within the labeled data: “METAOD relies on a collection of historical outlier detection datasets D_train = {D_1, …, D_n} (labeled data subset[s]), namely, a meta-train database with ground truth labels, i.e., {D_i = (X_i, y_i)}_{i=1}^{n} (labeled data set)” (Zhao, page 4, paragraph 2).
generating a set of unsupervised machine learning pipelines using the set of data subsets and outlier detection data subset: “We consider the model selection problem for unsupervised outlier detection, which we refer to as UOMS (unsupervised outlier model selection) hereafter. Given a new dataset, without any labels, the problem is to select both (i) a detector/algorithm and (ii) its associated hyperparameter(s) (HP)” (Zhao, page 3, paragraph 7); “we discretize the HP space for each candidate detector to make the search space tractable, which induces a finite pool of models denoted M = {M_1, …, M_m} (machine learning pipelines). Each model M ∈ M can be seen as a {detector, configuration} pair, where the configuration depicts a specific set of values for the detector’s HP(s)” (Zhao, page 3, paragraph 8); “our METAOD relies on a collection of historical outlier detection datasets D_train (data subsets / outlier detection data subset)” (Zhao, page 4, paragraph 2).
generating a training set from the set of data subsets and the set of unsupervised machine learning pipelines: “To capture prior experience, METAOD first constructs the performance matrix P by running/building and evaluating all the m models in our defined model space M (set of unsupervised machine learning pipelines) on all the n meta-train datasets D_train (set of data subsets)” (Zhao, page 4, paragraph 7). A pairing of a machine learning pipeline M_k and a subset of the training data D_k comprises a training set.
training a metalearner for unsupervised tasks based on the training set: “Our METAOD consists of two-phases: offline (meta-)training of the meta-learner on D_train” (Zhao, page 4, paragraph 4); “3.2.1 (Meta-)Training (Offline)” (Zhao, page 4, below paragraph 4); “To capture prior experience, METAOD first constructs the performance matrix P by running/building and evaluating all the m models in our defined model space M on all the n meta-train datasets D_train (training set[s])” (Zhao, page 4, paragraph 7).
wherein the metalearner is trained for ranking the set of unsupervised pipelines given a plurality of input training datasets: “our goal is to rank the models (unsupervised pipelines) for each dataset row-wise, as model selection concerns with picking the best possible model to employ. Therefore, we use a rank-based criterion called DCG from the information retrieval literature” (Zhao, page 6, paragraph 3); “Overall we optimize the smoothed criterion, sDCG, over all meta-train datasets D_train” (Zhao, page 7, paragraph 2).
generating
a set of data metafeatures for the set of data subsets: “To capture task similarity, it then extracts a set of d meta-features from each meta-train dataset, denoted by M = ψ(X_1, …, X_n) ∈ R^{n×d}, where ψ(·) depicts the feature extraction module” (Zhao, page 4, paragraph 7); “To this end, we extract meta-features that can be organized into two categories: (1) statistical features (data set metafeatures), and (2) landmarker features. Broadly speaking, the former captures statistical properties of the underlying data distributions; e.g., min, max, variance, skewness, covariance, etc. of the features and feature combinations. These kinds of meta-features have been commonly used in the AutoML literature.” (Zhao, page 5, paragraph 7 to page 6, paragraph 1).
a set of pipeline metafeatures for the set of unsupervised machine learning pipelines: “To this end, we extract meta-features that can be organized into two categories: (1) statistical features, and (2) landmarker features (pipeline metafeatures)” (Zhao, page 5, paragraph 7); “perhaps more important are the landmarker features, which are problem-specific, and aim to capture the outlying characteristics of a dataset. The idea is to apply a few of the fast, easy-to-construct OD models on a dataset and extract features from (i) the structure of the estimated OD model, and (ii) its output outlier scores” (Zhao, page 6, paragraph 2). The {detector, configuration} pairs described by each model M in Zhao are pipelines.
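As an illustrative sketch only, the statistical metafeatures Zhao enumerates (min, max, variance, skewness) might be computed as below. This is not Zhao's feature extraction module ψ, and all identifiers are hypothetical:

```python
import statistics

def statistical_metafeatures(columns):
    """Per-column statistical metafeatures: min, max, variance, skewness."""
    feats = {}
    for name, values in columns.items():
        mean = statistics.fmean(values)
        sd = statistics.pstdev(values)
        feats[f"{name}_min"] = min(values)
        feats[f"{name}_max"] = max(values)
        feats[f"{name}_var"] = statistics.pvariance(values)
        # Population skewness E[(x - mean)^3] / sd^3; 0 for a constant column.
        feats[f"{name}_skew"] = (
            sum((v - mean) ** 3 for v in values) / (len(values) * sd**3)
            if sd else 0.0
        )
    return feats

mf = statistical_metafeatures({"x": [1.0, 2.0, 3.0, 4.0]})
# A symmetric column has zero skewness; its population variance is 1.25.
```

Such per-dataset feature vectors are what allow a metalearner to compare a new dataset against the meta-train datasets, as Zhao describes.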
combining…
data metafeatures of the set of data metafeatures, pipeline metafeatures of the set of pipeline metafeatures: “To this end, we extract meta-features that can be organized into two categories: (1) statistical features (data metafeatures), and (2) landmarker features (pipeline metafeatures)” (Zhao, page 5, paragraph 7); “we discard least squares and instead optimize the rank-based (row- or dataset-wise) discounted cumulative gain (DCG) [equation presented in Zhao as an image]” (Zhao, page 4, paragraph 10 to page 5, paragraph 1); “We find that initializing U, denoted U^(0), based on meta-features facilitates stable training” (Zhao, page 5, paragraph 2).
, and a pipeline performance metric: “our METAOD relies on … the historical performances of the pool of candidate models, M, on the meta-train datasets. We denote by P ∈ R^{n×m} the performance matrix (pipeline performance metric), where P_ij corresponds to the j-th model M_j's performance on the i-th meta-train dataset D_i” (Zhao, page 4, paragraph 2).
to create a labeled training data set for the metalearner: “our goal is to rank the models for each dataset row-wise, as model selection concerns with picking the best possible model to employ. Therefore, we use a rank-based criterion called DCG from the information retrieval literature” (Zhao, page 6, paragraph 3). By assigning a rank to each model-dataset pair (this pairing mapped to ‘training data’), each set of training data is being labeled.
generating output using the metalearner, the metalearner using the input training dataset to compute metrics used for performance of unsupervised pipelines on the outlier detection data subset: “our goal is to rank the models (unsupervised pipelines) for each dataset row-wise, as model selection concerns with picking the best possible model to employ. Therefore, we use a rank-based criterion called DCG (performance metric) from the information retrieval literature” (Zhao, page 6, paragraph 3); “Overall we optimize the smoothed criterion, sDCG, over all meta-train datasets D_train (input training dataset)” (Zhao, page 7, paragraph 2); “Finally, the model with the largest predicted performance is outputted as the selected model” (Zhao, page 5, paragraph 5).
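The row-wise ranking idea can be illustrated with a toy performance matrix P. The DCG computation below follows the standard information-retrieval definition rather than Zhao's exact smoothed variant, and the matrix values and names are invented for illustration:

```python
import math

def dcg(scores_in_rank_order):
    """Discounted cumulative gain of scores listed best-rank first."""
    return sum(s / math.log2(i + 2) for i, s in enumerate(scores_in_rank_order))

def rank_models(perf_row):
    """Indices of one dataset row of P, ordered by performance, best first."""
    return sorted(range(len(perf_row)), key=lambda j: perf_row[j], reverse=True)

# Toy performance matrix P (rows: meta-train datasets, columns: models).
P = [
    [0.62, 0.80, 0.55],
    [0.70, 0.40, 0.90],
]
ranking = rank_models(P[0])               # model 1 ranks best on dataset 0
best = dcg(sorted(P[0], reverse=True))
worst = dcg(sorted(P[0]))
# Placing better-performing models earlier always yields the larger DCG.
```

The per-row rankings play the role of the labels Zhao's metalearner is trained against; the best-ranked model for a new dataset is the one output at selection time.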
Zhao relates to automatically generating unsupervised outlier detection pipelines and is analogous to the claimed invention.
While Zhao fails to disclose the further limitations of the claim, Laszczuk teaches [a] computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions being executable by one or more processors to cause the one or more processors to perform operations: “In an aspect, provided is a computer-implemented method for end-to-end machine learning, comprising: … (c) (i) generating and training a model using an Automated Machine Learning (AutoML) algorithm” (Laszczuk, [0004]); “Another aspect of the present disclosure provides a computer system comprising one or more computer processors and a non-transitory computer-readable medium coupled thereto. The non-transitory computer-readable medium comprises machine-executable code (instructions) that, upon execution by the one or more computer processors, implements any of the methods described above or elsewhere herein” (Laszczuk, [0010]).
Laszczuk relates to AutoML learning systems and is analogous to the claimed invention. Zhao teaches a method of training and testing an AutoML system for unsupervised learning. The claimed invention improves upon this method by storing it in the form of instructions on computer hardware. Laszczuk teaches a method of training and testing an AutoML system that can be stored in the form of instructions on computer hardware, applicable to Zhao. A person of ordinary skill in the art would have recognized that storing Zhao’s method as computer instructions on Laszczuk’s hardware would lead to the predictable result of the method being executable by a computing system, and would improve the known device by allowing it to be performed with real data (MPEP 2143 I. (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results).
While Zhao and Laszczuk fail to disclose the further limitations of the claim, Panda discloses a method of generating a set of data subsets from the labeled data set to use as input training dataset by selecting a random pair of class labels in the labeled data set, wherein a plurality of rows in the labeled data subset may be less than or equal to a number of rows within the labeled data set:
“For proxy sets directly sampled from the target set, the random sampling of a subset of classes (subsets from the labeled data set) maintaining the same number of images per class is more beneficial than trying to keep all the classes from the target dataset and the reducing the number of examples per class” (Panda, page 7, right column, paragraph 1). By definition, a subset of a dataset has a number of rows less than or equal to the full dataset.
“we investigated the proxies listed in Table 1, which are of two types: randomly selected and uniformly selected. For random selection, we picked a list of N classes and used all of their images. This is particularly important when designing a proxy set for a non-uniform, imbalanced distribution such as the one of ImageNet22K. For example ImageNet22K Proxy 2 was designed to have the same overall distribution of the full dataset, but the same number of images of ImageNet22K Proxy 1 … We split each of those datasets (subsets) into a training, validation and testing subsets with proportions 40/40/20 and use standard data pre-processing and augmentation techniques.” (Panda, page 3, left column, paragraph 2).
“Proxy sets used in our experiments” (Panda, page 3, right column, Table 1). Each proxy set has a number of classes greater than two. Thus, each contains at least one pair of class labels.
Panda relates to machine learning architecture searches and is analogous to the claimed invention. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Zhao and Laszczuk to use class subsets on candidate architectures instead of full datasets, as disclosed by Panda. Doing so would increase the efficiency of the architecture search; in particular, forming subsets of classes maintains practical search times better than forming subsets of class members across all classes. See Panda, page 1, right column, paragraph 1 and page 5, left column, paragraph 2.
While Zhao, Laszczuk, and Panda fail to disclose the further limitations of the claim, Wu discloses a method of generating an outlier detection data subset from the labeled data set by down sampling a plurality of rows of one class label within the labeled data: “Using document processing as an example, the informative down-sampling approach may determine major classes and minor classes based on counts of samples in different classes, and then down-sample the majority class(es) by detecting and keeping the most informative samples (rows)” (Wu, [0025]).
Wu relates to machine learning pipeline searches and is analogous to the claimed invention. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Zhao, Laszczuk, and Panda to downsample majority class(es), as disclosed by Wu. This procedure can balance imbalanced datasets, which can otherwise cause poor model performance on minority class classification. See Wu, [0024].
The analysis of claims 17-20 mirrors that of claims 3-7, with the exception that claims 17-20 are directed to generic computer hardware which executes the methods of claims 5-7. This generic hardware is taught by Laszczuk, as discussed regarding claim 15. Thus, claims 17-20 are rejected under the same rationales used for claims 3-7.
Response to Arguments
The following responses address arguments and remarks made in the instant remarks dated 12/15/2025.
Objections
Previous objections to the claims have been withdrawn in light of the instant amendments. However, new objections to the claims have been made.
112 Rejections
Previous claim rejections under 35 U.S.C. 112 have been withdrawn in light of the instant amendments.
101 Rejections
On page 9 of the instant remarks, the Applicant argues that the claimed invention improves upon issues related to testing, using, and training machine learning systems:
“This invention solves amongst other things the issues related to testing, using and training machine learning systems. Machine learning systems and methods have proliferated in recent years. Supervised machine learning uses labeled data sets to train machine learning algorithms. Unsupervised machine learning uses unlabeled data sets to train machine learning algorithms. Supervised machine learning algorithms and unsupervised machine learning algorithms are often tested by predict labels on unlabeled test data sets for which suitable labels are known but not provided to the machine learning algorithm under test. Automated machine learning (AutoML) systems automate tasks of generating and testing machine learning algorithms to apply machine learning to real world problems with fewer user interactions. However, teaching and completing tasks using machine learning systems are lengthy and costly.

The present amended claims provide techniques for providing optimal pipelines for unsupervised data sets that allow processes be resolved in order of seconds, rather than hours as in existing evaluation-based approaches.”
In response to the Applicant’s arguments above, the Examiner respectfully disagrees. The improvement of a claimed invention must be sufficiently detailed, as noted in MPEP 2106.05(a): “If it is asserted that the invention improves upon conventional functioning of a computer, or upon conventional technology or technological processes, a technical explanation as to how to implement the invention should be present in the specification. That is, the disclosure must provide sufficient details such that one of ordinary skill in the art would recognize the claimed invention as providing an improvement. The specification need not explicitly set forth the improvement, but it must describe the invention such that the improvement would be apparent to one of ordinary skill in the art … After the examiner has consulted the specification and determined that the disclosed invention improves technology, the claim must be evaluated to ensure the claim itself reflects the disclosed improvement in technology. Intellectual Ventures I LLC v. Symantec Corp., 838 F.3d 1307, 1316, 120 USPQ2d 1353, 1359 (Fed. Cir. 2016) (patent owner argued that the claimed email filtering system improved technology by shrinking the protection gap and mooting the volume problem, but the court disagreed because the claims themselves did not have any limitations that addressed these issues). That is, the claim must include the components or steps of the invention that provide the improvement described in the specification.”
The Applicant’s argument lacks specificity regarding what issues in testing, using, and training machine learning are being addressed, what processes are being sped up, and how specific aspects of the invention address these issues. The assertion that the claims “allow processes be resolved in order of seconds, rather than hours as in existing evaluation-based approaches” does not provide this clarity.
On pages 9-10 of the instant remarks, the Applicant argues that the claimed invention cannot be executed as an abstract idea:
“The amended claims provide techniques that leverages supervised data sets to build a meta learner for unsupervised data sets. None of this can be accomplished in an abstract manner. The huge volume of data and the dynamic nature of having to come up with a different methodology depending on circumstances and the nature of unsupervised data sets does not allow for an abstract method. In fact, the methodology suggested on a case by case basis not only varies depending on optimization but has to also be easily incorporated into the meta learner using unsupervised methods that are scalable.

In one or more embodiments case by case optimization solutions have to be provided where a set of unsupervised pipelines are selected for metalearner training and where the metalearner is trained for ranking the unsupervised pipelines in l(a) given a plurality of input training datasets. The meta learner will then outputs such as a top-k ranked pipeline from a given input that is going to be different each time. This can never be performed on an abstract bases.

Training the meta learner is even more complicated. A set of labelled datasets are collected that are very different in each case. A set of unsupervised pipelines are then chosen on a case by case basis. The data sets are going to have different metafeatures each time that are extracted and according to their features are executed accordingly. The performance of these very differing results are then measured using the labels. The metafeatures are used as covariates and the performance from are used as target for training meta learning model that predicts the performance. This is the reason that this very dynamic and each time different input and output cannot be performed using an abstract idea. This is reflected in claim 1”
In regards to the Applicant’s arguments above, the Examiner respectfully disagrees that the claimed invention, as amended, recites no abstract ideas. As stated in MPEP 2106.04(a)(2)(III), “The courts do not distinguish between mental processes that are performed entirely in the human mind and mental processes that require a human to use a physical aid (e.g., pen and paper or a slide rule) to perform the claim limitation. See, e.g., Benson, 409 U.S. at 67, 65, 175 USPQ at 674-75, 674 … Nor do the courts distinguish between claims that recite mental processes performed by humans and claims that recite mental processes performed on a computer. As the Federal Circuit has explained, ‘[c]ourts have examined claims that required the use of a computer and still found that the underlying, patent-ineligible invention could be performed via pen and paper or in a person’s mind.’ Versata Dev. Group v. SAP Am., Inc., 793 F.3d 1306, 1335, 115 USPQ2d 1681, 1702 (Fed. Cir. 2015). See also Intellectual Ventures I LLC v. Symantec Corp., 838 F.3d 1307, 1318, 120 USPQ2d 1353, 1360 (Fed. Cir. 2016) (‘‘[W]ith the exception of generic computer-implemented steps, there is nothing in the claims themselves that foreclose them from being performed by a human, mentally or with pen and paper.’’); Mortgage Grader, Inc. v. First Choice Loan Servs. Inc., 811 F.3d 1314, 1324, 117 USPQ2d 1693, 1699 (Fed. Cir. 2016) (holding that computer-implemented method for ‘anonymous loan shopping’ was an abstract idea because it could be ‘performed by humans without a computer’).”
Amended claim 1 recites limitations amounting to mental processes performed with generic computing machines. The preamble of this claim specifies that its method is “computer-implemented”. A mental process, such as “generating an outlier detection data subset from the labeled data by downsampling a plurality of rows of one class label within the labeled data”, performed on a computer is still considered a mental process.
The Examiner asserts that claim 1, as amended, recites mental processes, and maintains its rejection on the basis of the Alice/Mayo tests performed (See 101 rejections). Similar arguments are applicable to all other claims.
Thus, no rejections are withdrawn on these grounds.
On pages 14-16 of the instant remarks, the Applicant argues that the claimed invention improves on medical technology, and thus is practically integrated into a technical solution, is not directed to a judicial exception, and amounts to significantly more than any recited judicial exceptions:
“Step 2A, Prong 1: The Claimed Invention is not Directed to an Abstract Idea

Applicant's claimed invention is not simply directed to an abstract idea falling within the category of ‘Certain Methods of Organizing Human Activity.’ When the recitations of the claimed invention are viewed as a whole in light of the specification, it is clear that the claimed invention is directed to provide a practical application that includes a technical solution to overcome issues associated with overuse of computer resources when automatically mapping medical codes to extracted information from text in a narrative form.

National and private healthcare systems around the world are supporting increasingly complex and expensive treatments. In any case, the insurance companies bear the burden of paying the amount paid to hospitals and doctors. Technically, however, there is no closed information supply chain from diagnosis through one or more treatments, which are usually part of a frequently handwritten medical record, to insurance companies. Natural language processing (NLP) has been used to try to automate medical coding. The medical codes are typically organized hierarchically, i.e., as a sequence of characters comprising a main code and a respective sub-code. Although NLP technology may help with identifying some main codes, it often lacks accuracy to gain the more detailed sub-code. This is due to lack of data to train the NLP engine, and lack of context beyond sentence and/or paragraphs that the NLP engine looks at. In addition, individual hospitals and/or individual doctors may have their own abbreviations for specific treatments. In enabling that computer functionality, a system that performs this hierarchical and artificial intelligence type teaching provides great improvements for patients in their treatment. The current amended claims provide includes understanding and applying of an amount of data beyond what may be comprehensible by a single person. (See paragraph [0022] of Applicant's specification). Therefore, the claimed invention is not directed to a judicial exception and based on the first prong of the Alice framework, the claimed invention is directed to patent eligible subject matter. Furthermore, beside abstract ideas, there are no mathematical formula involved in the present invention as reflected by the amended claims.

Step 2A, Prong 2: The Claimed Invention Integrates the Alleged Exception into a Practical Application of the Exception

Without conceding that Applicant's claimed invention recites an abstract idea, Applicant submits that the claimed invention is integrated into a practical application of the alleged mental process by including additional elements that apply or use the judicial exception in some other meaningful way (described by the 2019 Guidance as an example limitation indicative of integration into a practical application). Applicant submits that the steps of the claimed invention have been narrowly tailored to illustrate elements which apply and use the judicial exception in a meaningful way.

Additionally, Federal Circuit court decisions and USPTO direction have provided further guidance regarding the rejection of claims under 35 U.S.C. § 101. Specifically, McRO, Inc. dba Planet Blue v. Bandai Namco Games America Inc., 120 USPQ2d 1091 (Fed. Cir. 2016) held the claimed methods of automatic lip synchronization and facial expression animation using computer-implemented rules patent eligible under 35 U.S.C. § 101, because they were not directed to an abstract idea (Step 2A of the USPTO's SME guidance). The McRO court relied on how the claimed rules within the McRO invention enabled the automation of specific animation tasks that previously could not be automated when determining that the claims were directed to improvements in computer animation instead of an abstract idea.

Specifically, the claims in McRO were deemed patent eligible under 35 U.S.C. § 101 based on the fact that they outlined a specific way of improving computer technology which ‘allow[ed] for the improvement realized by the invention.’ Similarly, Applicant's claimed method is similar in that it improves a method to obtain medical data which allows for how information can be used from a plurality of sources to build a complex network of nodes and relationships, thereby delivering a sorted list of potential paths of medical diagnosis codes and related procedural codes - in particular, main and/or secondary diagnosis codes, as well as, main procedure codes, as well as, secondary procedure codes - as a result of a query. Thus, ‘[a]n “improvement in computer-related technology” is not limited to improvements in the operation of a computer or a computer network per se, but may also be claimed as a set of “rules” (basically mathematical relationships) that improve computer-related technology by allowing computer performance of a function not previously performable by a computer.’ (Memorandum Regarding Recent Subject Matter Eligibility Decisions, issued November 2, 2016, pp. 2-3)

…

As such, the claims are directed to patent-eligible subject matter under McRO. Accordingly, Applicant respectfully submits that the claimed invention should be considered a practical application of the alleged abstract idea, and therefore is patent eligible.

Step 2B: The Claimed Invention Amounts to Significantly More than the Alleged Judicial Exception

As held in the BASCOM Global Internet Services, Inc. v. AT&T Mobility LLC, Fed. Cir., No 2015-1763, 6/27/16 decision, when the patent claim seeks to cover a judicial exception to patent eligibility, the final question asks whether the inventive concept covered in the claimed invention was ‘significantly more’ than merely the judicial exception. In this case, the question was whether the claim added significantly more, such that more than a mere abstract idea would be captured. The Federal Circuit ruled that the claims did add significantly more and, therefore, the claims are patent eligible and stated, ‘[a]s is the case here, an inventive concept can be found in the non-conventional and non-generic arrangement of known, conventional pieces.’ Applying BASCOM to amended claims, the claimed subject matter improves the technology of medical technology. Therefore, for at least the above reasons, Applicant respectfully requests that the rejection under 35 U.S.C. § 101 be reconsidered and withdrawn.”
Regarding the Applicant’s argument that the claimed invention provides a technical solution to problems in medical technology, it is noted that the NLP-based medical technology upon which the Applicant relies is neither recited in the rejected claim(s) nor present in the instant specification. The improvement of a claimed invention must be sufficiently detailed, as noted in MPEP 2106.05(a): “Conversely, if the specification explicitly sets forth an improvement but in a conclusory manner (i.e., a bare assertion of an improvement without the detail necessary to be apparent to a person of ordinary skill in the art), the examiner should not determine the claim improves technology. An indication that the claimed invention provides an improvement can include a discussion in the specification that identifies a technical problem and explains the details of an unconventional technical solution expressed in the claim, or identifies technical improvements realized by the claim over the prior art.”
It's insufficient to merely claim an invention improves on prior technology. The Applicant must identify what specific problem(s) are being addressed, identify how specific aspects of the claimed invention address those problems, and must ensure the aspects of the invention purported to address the problem(s) are represented in the claim(s) for judicial exceptions to be practically integrated on this basis. The Applicant’s argument lacks specificity regarding how the claimed invention, which lacks any mention of natural language processing, medical technology, or hierarchical coding, improves on specific problems in medical technology. No rejections are withdrawn on these grounds.
Regarding the Applicant’s argument that the claimed invention is not directed to a judicial exception, the Examiner respectfully disagrees, and points to arguments previously made regarding mental processes recited by claim 1 and other claims. No rejections are withdrawn on these grounds.
103 Rejections
On pages 16-19 of the instant remarks, the Applicant argues that Zhao, Panda, and Wu fail to disclose the limitations of amended claim 1:
“
V. Rejections under 35 U.S.C. 103
Claim(s) 1-7 are rejected under 35 U.S.C. 102 as being anticipated by Zhao et al. (Automatic Unsupervised Outlier Model Selection, October 2021, 35th Conference on Neural Information Processing Systems (NeurIPS 2021), hereinafter Zhao) in view of Panda et al. (NASTransfer: Analyzing Architecture Transferability in Large Scale Neural Architecture Search, published 2/12/2021) and further in view of Wu (Machine Learning Processing Pipeline Optimization, US 2022/018066 A1).
In addition, Claim(s) 8-20 are rejected under 35 U.S.C. 103 as being unpatentable over Zhao et al. (Automatic Unsupervised Outlier Model Selection, October 2021, 35th Conference on Neural Information Processing Systems (NeurIPS 2021)) in view of Laszczuk (US 2024/0078473 A1) and in further view of Wu.
Applicant respectfully traverses the rejections under 103 in view of the arguments
extended below and the amendments made to the claims.
Zhao is a general publication about how an unsupervised outlier detection task on a new dataset can be used to automatically select a good outlier detection algorithm. Their task is to tackle the unsupervised outlier model selection (UOMS) problem and they suggest metalearning. While they discuss meta-learning, they do not discuss anything further that is within the scope of novelty of the present amended claims. They discuss how METAOD capitalizes on the performances of a large body of detection models on historical outlier detection benchmark datasets. They then discuss how the outlying characteristics of a dataset can be used to capture task similarity within the meta-learning framework.
Nowhere in Zhao is there even a mention of the following language of claim 1, which provides:
receiving a labeled data as an input;
generating a set of data subsets from the labeled data set to use as input training
dataset by selecting a random pair of class labels in the labeled data set, wherein a
plurality of rows in the labeled data subset may be less than or equal to a number of rows
within the labeled data set;
generating an outlier detection data subset from the labeled data by
downsampling a plurality of rows of one class label within the labeled data;
generating a set of unsupervised machine learning pipelines using the set of data
subsets and outlier detection data subset;
generating a training set from the set of data subsets and the set of unsupervised
machine learning pipelines;
training a metalearner for unsupervised tasks based on the training set, wherein the metalearner is trained for ranking the set of unsupervised pipelines given a plurality of input training datasets; and
generating output using the metalearner, using the input training dataset to the metalearner to enable computation of performance metrics of unsupervised pipelines on the outlier detection data subset.
Claims 8 and 15 recite similar language.
Wu does not cure the deficiencies of Zhao as it is very different. Wu deals with a system and method for machine learning training that very specifically provides a master AI subsystem to process an input document. Each of at least two of the candidate machine learning components is provided with at least two candidate implementations, and the master AI subsystem is to train the machine learning processing pipeline by selectively deploying the at least two candidate implementations for each of the at least two of the machine learning components. None of these limitations is present in the amended claims, which have a different objective to achieve.
Panda does not cure the deficiencies of Wu or Zhao as it concentrates on the problem and subsequent experimentation of a series of large scale benchmark samples such as ImageNet1K and ImageNet22K to provide improvements that are very narrowly concentrated on the transfer performance of architectures, so as to improve performance metrics for a large dataset through providing empirical analysis for the future design of NAS algorithms. This has nothing to do with the solutions of the amended claims.
Laszczuk does not cure the deficiencies of Panda, Wu or Zhao as it seemingly provides systems and methods for machine learning, but its teachings are limited to operations of data ingestion and preparation for feature storage and model building. The technique may use an Automated Machine Learning (AutoML) algorithm and explainable Artificial Intelligence (XAI), but it does not train or handle the data as provided in Claim 1 below:
receiving a labeled data as an input;
generating a set of data subsets from the labeled data set to use as input training
dataset by selecting a random pair of class labels in the labeled data set, wherein a
plurality of rows in the labeled data subset may be less than or equal to a number of rows
within the labeled data set;
generating an outlier detection data subset from the labeled data by
downsampling a plurality of rows of one class label within the labeled data;
generating a set of unsupervised machine learning pipelines using the set of data
subsets and outlier detection data subset;
generating a training set from the set of data subsets and the set of unsupervised
machine learning pipelines;
training a metalearner for unsupervised tasks based on the training set, wherein the metalearner is trained for ranking the set of unsupervised pipelines given a plurality of input training datasets; and
generating output using the metalearner, using the input training dataset to the metalearner to enable computation of performance metrics of unsupervised pipelines on the outlier detection data subset.
Claims 8 and 15 recite similar language.
Consequently, none of Zhao, Wu, Panda, or Laszczuk anticipates, makes obvious, or teaches the amended claims. Applicant respectfully requests the allowance of the pending claims and withdrawal of the rejection.
VI. DEPENDENT CLAIMS
The argument also extends to the dependent claims that depend upon one of the amended independent claims discussed above, for at least the same reasons. Since each dependent claim is also deemed to define an additional aspect of the invention, however, the individual reconsideration of the patentability of each on its own merits is respectfully requested.
Applicant maintains that all claims are allowable for at least the reasons presented
hereinabove. However, in the interests of brevity, this response does not comment on each and
every comment made by the Examiner in the Office Action. This should not be taken as
acquiescence of the substance of those comments, and Applicant reserves the right to address
such comments.”
The Examiner notes that while the above argument quotes claim 1’s limitations as they appeared in a previous set of amendments, the response will address the relevance of Zhao, Wu, Panda, and Laszczuk to the claims as amended.
Regarding the Applicant’s arguments above, the Examiner respectfully disagrees. Amended claim 1 brings in limitations of previous claims 2 and 5, which have already been found to be obvious over the prior art. The limitations of amended claim 1 are disclosed as follows:
Regarding “receiving a labeled data as an input”, Zhao discloses a method of receiving labeled data, which is used as input for models (Zhao, page 4, paragraph 7).
Regarding “generating a set of data subsets from the labeled data set to use as input training dataset by selecting a random pair of class labels in the labeled data set, wherein a plurality of rows in a labeled data subset may be less than or equal to a number of rows within the labeled data set”, Zhao discloses a method of generating an input training dataset from subsets of the labeled data (Zhao, page 4, paragraph 2). While Zhao fails to disclose the rest of this limitation, Panda discloses a method of randomly sampling a subset of classes from labeled data, with at least a pair of classes and number of rows less than or equal to the number of rows in the original labeled dataset (Panda, page 7, right column, paragraph 1; page 3, left column, paragraph 2; page 3, right column, Table 1).
Regarding “generating an outlier detection data subset from the labeled data by down sampling a plurality of rows of one class label within the labeled data”, Zhao discloses a method of generating an outlier detection dataset from a labeled data set (Zhao, page 4, paragraph 2). While Zhao fails to disclose the rest of this limitation, Wu teaches a method of generating a data subset from labeled data by downsampling it (Wu, [0025]).
Regarding “generating a set of unsupervised machine learning pipelines using the set of data subsets and outlier detection data subset”, Zhao discloses a method of generating a set of unsupervised machine learning pipelines using the outlier detection data subsets (Zhao, page 3, paragraphs 7-8).
Regarding “generating a training set from the set of data subsets and the set of unsupervised machine learning pipelines”, Zhao discloses a method of forming training pairs of models and data subsets (Zhao, page 4, paragraph 7).
Regarding “training a metalearner for unsupervised tasks based on the training set, wherein the metalearner is trained for ranking the set of unsupervised pipelines given a plurality of input training datasets”, Zhao teaches a method of training a metalearner for unsupervised tasks with training data (Zhao, page 4, paragraphs 4 & 7), wherein unsupervised pipelines are ranked (Zhao, page 6, paragraph 3 & page 7, paragraph 2).
Regarding “generating a set of data metafeatures for the set of data subsets a set of pipeline metafeatures for the set of unsupervised machine learning pipelines”, Zhao discloses a method of generating two types of metafeatures from the data subsets (Zhao, page 5, paragraph 7).
Regarding “combining data metafeatures of the set of data metafeatures, pipeline metafeatures of the set of pipeline metafeatures, and a pipeline performance metric to create a labeled training data set for the metalearner”, Zhao discloses a method of using data and pipeline metafeatures, in addition to a performance matrix indicating pipeline performance, to create labeled training data (Zhao, page 4, paragraph 1; page 5, paragraphs 1-2 & 7; page 6, paragraph 3).
Regarding “generating output using the metalearner, the metalearner using the input training dataset to compute metrics used for performance of unsupervised pipelines on the outlier detection data subset”, Zhao discloses a method of ranking pipelines using DCG performance metrics, after optimizing DCG with an input training dataset (Zhao, page 6, paragraph 3). Model performance is output for the best predicted model (Zhao, page 5, paragraph 5).
Thus, the combination of Zhao, Panda, and Wu discloses the entirety of amended claim 1, and no rejections are withdrawn on these grounds. No rejections of the substantially similar independent claims 8 and 15, or of any dependent claims, are withdrawn on this basis, either. See the 103 rejections section for more detail.
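For illustration only, the down-sampling and pipeline-ranking operations discussed in the limitations above can be sketched in Python. All function names, parameters, and data below are hypothetical and are not drawn from the claims or the cited references; the sketch merely shows one conventional way such operations are commonly implemented.

```python
# Hypothetical sketch: create an outlier-detection subset by down-sampling
# one class label, then rank candidate pipelines by a performance metric.
import random

def downsample_one_class(rows, class_label, keep_fraction=0.05, seed=0):
    """Keep only a fraction of rows bearing `class_label`, so that the
    down-sampled class becomes a rare, outlier-like class."""
    rng = random.Random(seed)
    kept = []
    for row in rows:
        if row["label"] == class_label:
            if rng.random() < keep_fraction:
                kept.append(row)
        else:
            kept.append(row)
    return kept

def rank_pipelines(pipelines, score_fn, dataset):
    """Rank pipelines best-first by a computed performance metric."""
    scored = [(score_fn(p, dataset), name) for name, p in pipelines.items()]
    return [name for _, name in sorted(scored, reverse=True)]
```

Under this sketch, the metric computed by `score_fn` stands in for whatever pipeline performance metric a metalearner would be trained to predict; the ranking step is an ordinary sort over those scores.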
Art Made of Record
The Examiner has made the following art of record to demonstrate the state of the art pertaining to the present application at the time of its effective filing date. These references are not necessarily deemed prior art.
Vu et al. (Instance-Level Metalearning for Outlier Detection, 2024, Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence (IJCAI-24), pp. 2379-2387), published after the effective filing date of the claimed invention, describes a very similar metalearning system that automatically performs model selection for unsupervised outlier detection. It cites many instances of prior art relevant to the claimed invention.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
(Zhao et al., Supplementary Material: Automatic Unsupervised Outlier Model Selection, October 2021, 35th Conference on Neural Information Processing Systems (NeurIPS 2021)) teaches further details about the methods of Zhao cited elsewhere in this Office action.
(Burnaev et al., Model Selection for Anomaly Detection, 2017, arXiv:1707.03909v1) teaches a method of model selection for outlier detection.
(Aggarwal, Outlier Ensembles, 2013, ACM SIGKDD Explorations Newsletter, Volume 14, Issue 2, Pages 49-58, https://doi.org/10.1145/2481244.2481252) teaches methods of combining different models for outlier detection.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Aaron P Gormley whose telephone number is (571)272-1372. The examiner can normally be reached Monday - Friday 12:00 PM - 8:00 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michelle T Bechtold can be reached at (571) 431-0762. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/AG/Examiner, Art Unit 2148 /MICHELLE T BECHTOLD/Supervisory Patent Examiner, Art Unit 2148