Notice of Pre-AIA or AIA Status
The present application, filed on or after April 15, 2022, is being examined under the first inventor to file provisions of the AIA.
Examiner notes the entry of the following papers:
Amended claims filed 4/10/2025.
Applicant’s arguments/remarks made in amendment filed 4/10/2025.
Claims 1-7 are amended. Claims 1-7 are presented for examination.
Response to Arguments
Applicant presents several arguments. Each is addressed.
The rejections of the claims under 35 U.S.C. § 112(b) are hereby withdrawn in view of applicant’s amendments.
Applicant argues “When considering claim 1 as a whole, as required, the claim sets forth an improvement to the field of machine learning.” (Remarks, page 10, paragraph 3, line 1).
However, MPEP 2106.05(a) states: “it is important to note, the judicial exception alone cannot provide the improvement. The improvement can be provided by one or more additional elements.” Here, Applicant has not provided objective evidence of a technical improvement attributable to the additional elements. Instead, Applicant relies entirely on the judicial exception to provide the asserted technical improvement.
Applicant argues that “The combination of Sazvar, James and Shipe does not render claim 1 obvious. Sazvar discloses deriving data from statistics of confirmed cases of COVID-19 (p. 108) and using data to develop an ANN for detecting the trend of COVID-19 virus outbreak in China (p. 108).” (Remarks, page 17, paragraph 1). The argument is moot in view of the new ground of rejection set forth below.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-7 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Subject Matter Eligibility Analysis Step 1:
Claim 1 recites a data analysis apparatus (comprising a processor and a storage device), which is an article of manufacture, one of the four statutory categories of patentable subject matter.
Subject Matter Eligibility Analysis Step 2A Prong 1:
Claim 1 further recites that the processor is configured to execute 1) …acquiring a first statistical model which is generated based on curve fitting of first histogram data of a distribution of actual measurement results of a group and a second statistical model which is generated based on curve fitting of second histogram data of a distribution of a first actual measurement result of first samples having a smaller number of samples than a number of samples of the group,… 2) …calculating correction information indicating a difference between the first statistical model and the second statistical model acquired by the acquisition processing,… and 3) …correcting a first prediction result output by inputting second feature amount data of second samples different from the first samples to the first prediction model generated by the learning processing using the correction information calculated by the calculation processing,… which are mental processes that can practically be performed in the human mind and therefore fall within the mental process grouping of abstract ideas. One can acquire or generate statistical models based on distributions using mathematics by defining a problem, collecting data, and building a model that can be evaluated. One can calculate correction information indicating a difference between the models acquired by the acquisition processing by observing and comparing the models. One can also correct or adjust an output prediction result, using pen and paper, by adjusting values based upon the correction information.
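For illustration only, the recited acquisition, calculation, and correction operations can be expressed as a short, conventional computation (a minimal Python sketch assuming NumPy and SciPy; the data, function names, and parameters below are hypothetical and are not drawn from the claims or the record):

    import numpy as np
    from scipy.optimize import curve_fit

    # All names and values below are hypothetical illustrations, not claim terms.
    def model(age_id, a, b):
        # A simple exponential rate curve fit to histogram data (rate vs. age class ID).
        return a * np.exp(b * age_id)

    age_ids = np.arange(10)                       # age class IDs
    group_rates = 5.0 * np.exp(0.3 * age_ids)     # first histogram data (large group)
    sample_rates = 4.0 * np.exp(0.28 * age_ids)   # second histogram data (smaller sample)

    # "Acquisition processing": two statistical models generated by curve fitting.
    params1, _ = curve_fit(model, age_ids, group_rates, p0=(1.0, 0.1))
    params2, _ = curve_fit(model, age_ids, sample_rates, p0=(1.0, 0.1))

    # "Calculation processing": correction information as the difference between
    # the two models' values at the same age class ID.
    correction = model(age_ids, *params1) - model(age_ids, *params2)

    # "Correction processing": adjust a first prediction result to output a second one.
    first_prediction = model(age_ids, *params2)
    second_prediction = first_prediction + correction
    print(second_prediction)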
Subject Matter Eligibility Analysis Step 2A Prong 2:
a storage device that stores the program, wherein the processor is configured to execute… [the claimed method] (implements the abstract idea on a computer or “other machinery” to carry out the abstract method (using a computer to correct the output result of a prediction model) (see MPEP 2106.05(f) (“Apply It”)))
…generating a first prediction model by performing machine learning using the first actual measurement result and first feature amount data corresponding to the first actual measurement result (implements the abstract idea on a computer or “other machinery” to carry out an abstract idea (using a computer to make predictions on some data (a human can make predictions on a given set of data)) (see MPEP 2106.05(f) (“Apply It”)))
outputting a second prediction result; (insignificant extra-solution activity (outputting a generated result) (see MPEP 2106.05(g) (“Insignificant Extra-Solution Activity”)))
Subject Matter Eligibility Analysis Step 2B:
The additional elements of claim 1 do not provide significantly more than the abstract idea itself. Therefore, claim 1 is subject matter ineligible.
Claim 2, dependent upon claim 1, recites mental processes and additional elements. The claim recites wherein the processor is configured to execute additional learning processing of updating the first prediction model… (implements the abstract idea on a computer or “other machinery” to carry out an abstract idea (using a computer to update/fine-tune a prediction model) (see MPEP 2106.05(f) (“Apply It”))), wherein the processor updates the model by performing additional learning using a loss function based on the second prediction result output by the correction processing and a second actual measurement result relating to the second samples (which is a mathematical process, calculating a loss), correct, in the correction processing, using the correction information, a third prediction result output… (values of an output result can be corrected/adjusted using pen and paper), and output a fourth prediction result (insignificant extra-solution activity (outputting a generated result) (see MPEP 2106.05(g) (“Insignificant Extra-Solution Activity”))). Thus, the claim remains an abstract idea, and the additional elements of the claim do not integrate the abstract idea into a practical application or amount to significantly more than the abstract idea itself.
Claim 3, dependent upon claim 1, recites mental processes and an additional element. The claim recites wherein in the acquisition processing, the processor acquires an updated second statistical model based on a distribution of the first actual measurement result and a distribution of a second actual measurement result regarding the second samples (a human can acquire/generate updated statistical models based on distributions using math), wherein in the calculation processing, the processor calculates updated correction information indicating a difference between the first statistical model and the updated second statistical model (a human can calculate updated correction information through comparison/observation using pen and paper), wherein in the correction processing, the processor corrects a third prediction result output… (values of an output result can be corrected/adjusted using the correction information by pen and paper) and outputs a fourth prediction result (insignificant extra-solution activity (outputting a generated result) (see MPEP 2106.05(g) (“Insignificant Extra-Solution Activity”))). Thus, the claim remains an abstract idea, and the additional elements of the claim do not integrate the abstract idea into a practical application or amount to significantly more than the abstract idea itself.
Claim 4, dependent upon claim 1, recites mental processes and additional elements. The claim recites wherein in the acquisition processing, the processor acquires an updated second statistical model based on a distribution of the first actual measurement result and a distribution of a second actual measurement result regarding the second samples (a human can acquire/generate updated statistical models based on distributions using math), wherein in the learning processing, the processor generates an updated first prediction model… (a human can perform fine-tuning of a prediction model by inputting data to it), wherein in the correction processing, the processor corrects a third prediction result output… (values of an output result can be corrected/adjusted using the correction information by pen and paper) and outputs a fourth prediction result (insignificant extra-solution activity (outputting a generated result) (see MPEP 2106.05(g) (“Insignificant Extra-Solution Activity”))). Thus, the claim remains an abstract idea, and the additional elements of the claim do not integrate the abstract idea into a practical application or amount to significantly more than the abstract idea itself.
Claim 5, dependent upon claim 3, recites mental processes and additional elements. The claim recites wherein in the learning processing, the processor generates an updated first prediction model… (implements the abstract idea on a computer or “other machinery” to carry out an abstract idea (using a computer to update/fine-tune a prediction model) (see MPEP 2106.05(f) (“Apply It”))), wherein in the correction processing, the processor corrects a third prediction result output… (values of an output result can be corrected/adjusted using the correction information by pen and paper), and outputs the fourth prediction result (insignificant extra-solution activity (outputting a generated result) (see MPEP 2106.05(g) (“Insignificant Extra-Solution Activity”))). Thus, the claim remains an abstract idea, and the additional elements of the claim do not integrate the abstract idea into a practical application or amount to significantly more than the abstract idea itself.
Independent claim 6 recites a method, which is a process, one of the four statutory categories of patentable subject matter. Claim 6 further recites a data analysis method that, when performed by a processor, includes the same steps recited in claim 1, including the same mental processes and additional elements (which is no more than implementing the abstract idea on a computer or “other machinery” to carry out the abstract method, which per MPEP 2106.05(f) (“Apply It”) does not integrate the abstract idea into a practical application and does not amount to significantly more than the abstract idea itself).
Independent claim 7 recites a non-transitory computer readable medium, which is an article of manufacture, one of the four statutory categories of patentable subject matter. Claim 7 further recites that the non-transitory computer readable medium stores a data analysis program that, when executed by a processor, performs the same steps recited in claim 1, including the same mental processes and additional elements (which is no more than implementing the abstract idea on a computer or “other machinery” to carry out the abstract method, which per MPEP 2106.05(f) (“Apply It”) does not integrate the abstract idea into a practical application and does not amount to significantly more than the abstract idea itself).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The following are the references relied upon in the rejections below:
Brody et al. “The age specific incidence anomaly suggests that cancers originate during development.” (2013) (hereinafter Brody)
Seoane et al. “A scaling approach to estimate the age-dependent COVID-19 infection fatality ratio from incomplete data” (2021) (hereinafter Seoane)
Zhou et al. “Do not forget interaction: Predicting fatality of COVID-19 patients using logistic regression” (2020) (hereinafter Zhou)
Zhang et al. “Coping with Label Shift via Distributionally Robust Optimisation” (2020) (hereinafter Zhang)
Li et al. “Learning Without Forgetting” (2017) (hereinafter Li)
Claims 1 and 6-7 are rejected under 35 U.S.C. 103 as being unpatentable over Brody in view of Seoane, Zhou, and Zhang.
Claim 1:
Regarding claim 1, Brody discloses: A data analysis apparatus comprising:
a processor that executes a program; and
a storage device that stores the program, wherein the processor is configured to execute
Brody, pg. 2, Paragraph 4 “…First, we record the age of all patients diagnosed with a specific cancer within a large geographic area in one year. Second, we record the age of each member in the entire population....”
Brody, pg. 5, Understanding the age-specific incidence data, Paragraph 4 “…We computed the best fit theoretical shape to the 0-84 year old data in two cases.....”
Discloses collecting population age data and computing a best fit theoretical shape to the observed age data. It is implicit that performing these computations and curve fitting would be implemented by a processor executing stored program instructions.
an acquisition processing of acquiring a first statistical model which is generated based on curve fitting of first histogram data of a distribution of actual measurement results of a group and a second statistical model which is generated based on curve fitting of second histogram data of a distribution of a first actual measurement result of first samples having a smaller number of samples than a number of samples of the group, the first histogram being a plot of age class identifications (IDs) of age groups and affection rates of diseases, and the second histogram being a plot of age class identifications of age groups and affection rates of diseases,
Brody, pg.1-2, Thought Experiments, Paragraph 1 “We can begin with a gedanken experiment. Take 100,000 newborn human infants and put them in an isolated box. …When a subject is first diagnosed with a colon cancer, the exact age is recorded. The experiment runs for hundreds of years, then we make a histogram of the number of colon cancers diagnosed as a function of years since the beginning of the experiment.”
Brody, pg. 2, Paragraph 4 “…First, we record the age of all patients diagnosed with a specific cancer within a large geographic area in one year. Second, we record the age of each member in the entire population. Finally, for each age group, we divide the number of patients who had a tumor diagnosed in that year by the total number of people with that age in the population. By convention, these numbers are multiplied by 100,000 and are called the age-specific incidence.”
Discloses first histogram data representing a distribution of actual measurement results of a group (population), where the distribution is expressed by age group and the “affection rate” is the age-specific incidence (a disease rate computed for each group).
Brody, pg. 2, Paragraph 3 “A second gedanken experiment is also performed. In this case, the population of 100,000 infants is composed of two apparently indistinguishable sub populations, one of which can develop colon cancer (20%), and the other of which cannot (80%). The same process is followed. At the end of the experiment, the integral of the histogram is calculated and it will equal 20,000.”
Brody, pg. 1, Abstract “…This analysis leads to the interpretation that for colon cancer two sub-populations exist in the general population: a susceptible population and an immune population…”
Discloses first samples having a smaller number of samples than the group (a sub-population within the full population), including an example where the susceptible population corresponds to 20,000 out of 100,000 subjects (fewer samples than the group), which supports the “second histogram data” being derived from a smaller-sample distribution than the group (population) distribution.
Brody, pg. 5, Understanding the age-specific incidence data, A theory of the age-specific incidence curve, Paragraph 3 “We compared the theoretical shape of the age-specific incidence data with observational data collected by the SEER-17 registries in 2000 for colon cancer. We computed the best fit theoretical shape to the 0-84 year old data in two cases...”
Discloses generating two fitted curve models (the best fit theoretical shape…in two cases) from the age-specific incidence data (curve fitting the age-binned distribution to produce a first and a second statistical model).
a calculation processing of calculating correction information [indicating a difference between the first statistical model and the second statistical model acquired by the acquisition processing, the correction information being a difference between a value of the first statistical model and a value of the second statistical model at a same age class ID],
Brody, pg. 4, Birth cohort effects on age-specific incidence data., Paragraph 1 “…The expected value, if 100% of the population were susceptible, for age 99 is about 850 per 100,000, as shown in data from 2000 shown in Figure 4. The observed value is about 227 with a 95% confidence interval of 96 to 357. The observed value is about one quarter of the expected value of colon carcinoma incidence, if 100% were susceptible to colorectal carcinoma....”
Discloses correction information (indicating a difference between two values evaluated at the same age class (age 99)).
[a learning processing of generating a first prediction model by performing machine learning using the first actual measurement result and first feature amount data corresponding to the first actual measurement result, and]
[a correction processing of correcting a first prediction result output by inputting second feature amount data of second samples different from the first samples to the first prediction model generated by the learning processing using the correction information calculated by the calculation processing, and outputting a second prediction result.]
Thus far, Brody does not explicitly teach a calculation processing of calculating correction information indicating a difference between the first statistical model and the second statistical model acquired by the acquisition processing, the correction information being a difference between a value of the first statistical model and a value of the second statistical model at a same age class ID
Seoane teaches calculating correction information indicating a difference between the first statistical model and the second statistical model acquired by the acquisition processing, the correction information being a difference between a value of the first statistical model and a value of the second statistical model at a same age class ID.
Seoane, pg. 8, Paragraph 1 “…In other words, this is the “apparent” fatality since it weights how deadly the virus is (statistically) for a patient in an age group, with the relative risk of getting infected at that particular age. For this reason, we refer to f̂α as the uniform infection fatality rate (UIFR) (i.e. the IFR under the assumption of uniform attack rate between ages), as compared to fα, which is the real (potentially non-uniform) IFR associated to the disease. Both measures are only equal if rα = 1 for all α.”
Seoane, pg. 8, Paragraph 4 “…we obtain a very good fit of the data to f̂α ∝ exp(A × ageα) (8)…”
Discloses two different statistical quantities/functions and the fitting of an age-dependent model to the data.
Seoane, pg. 4, Under-reporting of deaths, Paragraph 3 “…In order to compare data between age groups, we normalize this difference by the number of confirmed deaths, that is: Fraction of under-counting = [Deaths (COVID-19 suspected & confirmed) − Deaths (confirmed)] / Deaths (confirmed) (1).”
Discloses a difference correction quantity for age-stratified comparison.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply Seoane’s difference correction computation to Brody’s two fitted age-dependent models (group (population) vs. sub-population) to produce a standardized correction quantity (the difference at the same age bin). Brody identifies age-dependent divergence between expected and observed incidence, but lacks a general computational method for generating per-age correction information suitable for automated adjustment. Seoane provides this method with predictable results when applied to age-binned rate functions. Thus, the combination would yield per-age correction information that is scale consistent across ages, since both operate on age-binned rate functions and compute the discrepancy at corresponding age bins.
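For illustration only, the following is a minimal Python sketch of the combined teaching, i.e., computing a Seoane-style normalized per-age correction from two fitted age-binned rate functions (all values and names below are hypothetical):

    import numpy as np

    # Hypothetical, synthetic rate curves; the age bins and values are illustrative only.
    ages = np.array([25, 35, 45, 55, 65, 75, 85])
    rate_group = 0.020 * np.exp(0.09 * (ages - 25))    # fitted rate curve, full group
    rate_sample = 0.015 * np.exp(0.10 * (ages - 25))   # fitted rate curve, sub-population

    # Normalized per-age difference, analogous in form to Seoane's Eq. (1)
    # under-counting fraction: (suspected & confirmed - confirmed) / confirmed.
    correction = (rate_group - rate_sample) / rate_sample

    for age, c in zip(ages, correction):
        print(f"age bin {age}: correction {c:+.3f}")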
Also, thus far, the combination of Brody and Seoane does not explicitly teach a learning processing of generating a first prediction model by performing machine learning using the first actual measurement result and first feature amount data corresponding to the first actual measurement result.
Zhou teaches a learning processing of generating a first prediction model by performing machine learning using the first actual measurement result and first feature amount data corresponding to the first actual measurement result.
Zhou, pg. 1, Abstract “…In this research, we reported an explainable, intuitive, and accurate machine learning model based on logistic regression to predict the fatality rate of COVID-19 patients using only three important blood biomarkers, including lactic dehydrogenase, lymphocyte (%) and high-sensitivity C-reactive protein, and their interactions…”
Zhou, pg. 1, Column 2, Results, Fitting a Logistic Regression Model., Paragraph 1 “We used the training dataset with 351 patients without missing any values of the three important features (i.e., LDH, lymphocyte, and hs-CRP) to fit a step-wise logistic regression model with all the second-order interaction items…”
Discloses a prediction model (logistic regression) produced via machine learning (learning processing), and “blood biomarkers” (measured numeric quantities (feature values or feature amount data)).
Zhou, pg. 2, Column 1, An Explainable Logistic Regression Model., Paragraph 1 “…Using Y = 1 and Y = 0 to indicate death and survival, respectively, we can formulate the logistic regression model in the following way (7)…”
Discloses an actual measurement result corresponding to the measured patient outcome (death vs. survival) which is the label used for learning (the dependent variable).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Zhou’s routine supervised-learning training into Brody/Seoane’s age distribution framework to generate a prediction model from measured features and outcomes, because both are standard quantitative disease risk modeling approaches using measured data.
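For illustration only, the following is a minimal Python sketch of Zhou-style learning processing, i.e., fitting a logistic regression prediction model with second-order interaction terms on three measured features (assuming scikit-learn; the data are synthetic and the names are hypothetical):

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.preprocessing import PolynomialFeatures

    rng = np.random.default_rng(0)

    # Synthetic stand-ins for three measured biomarkers and observed outcomes.
    X = rng.normal(size=(351, 3))                        # first feature amount data
    logits = X @ np.array([1.2, -0.8, 0.9])
    y = (logits + rng.normal(size=351) > 0).astype(int)  # first actual measurement result

    # Second-order interaction terms, as in an interaction logistic regression model.
    interactions = PolynomialFeatures(degree=2, interaction_only=True, include_bias=False)
    clf = LogisticRegression(max_iter=1000).fit(interactions.fit_transform(X), y)

    # "First prediction result": probabilities for new (second) samples.
    X_new = rng.normal(size=(5, 3))                      # second feature amount data
    print(clf.predict_proba(interactions.transform(X_new))[:, 1])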
Also, thus far, the combination of Brody/Seoane/Zhou does not explicitly teach a correction processing of correcting a first prediction result output by inputting second feature amount data of second samples different from the first samples to the first prediction model generated by the learning processing using the correction information calculated by the calculation processing, and outputting a second prediction result.
Zhang teaches a correction processing of correcting a first prediction result output by inputting second feature amount data of second samples different from the first samples to the first prediction model generated by the learning processing using the correction information calculated by the calculation processing, and outputting a second prediction result
Zhang, pg. 2-3, Section 2, Paragraph 1 “…In the special case of label shift, one posits that pte(x | y) = ptr(x | y), but the label distribution pte(y) ≠ ptr(y) (Saerens et al., 2002); i.e., the test distribution satisfies pte(x, y) = pte(y)ptr(x | y)...”
Discloses that the test label distribution differs from the training label distribution.
Zhang, pg. 3, (1) Fixed label shift, Paragraph 1 “Here, one assumes a-priori knowledge of pte(y). One may then adjust the outputs of a probabilistic classifier post-hoc to improve test performance (Elkan, 2001). Even when the precise distribution is unknown, it is common to posit a uniform pte(y)…”
Discloses output correction/adjustment of a probabilistic classifier post-hoc under distribution shift.
Thus, the combination teaches correcting a first prediction result (taught by Zhou) using correction information (taught by Seoane) and outputting a second prediction result, where the correction information corresponds to the estimated/known shift between the training and test label distributions and the second prediction result is the post-hoc adjusted output.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply Zhang’s post-hoc output adjustment to the probabilistic outputs of Zhou’s logistic regression predictor because Zhang specifically addresses improving test performance under label shift by adjusting classifier outputs post-hoc. Applying Zhou’s model to a different cohort can involve a different outcome rate due to demographic differences such as age (reflected by age-stratified analyses like those in Brody/Seoane). Using the age-dependent model differences as an estimate of the shift provides an expected benefit of improved calibration/accuracy of predicted fatality risk for the second population.
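For illustration only, the following is a minimal Python sketch of Zhang-style post-hoc output adjustment under label shift, i.e., reweighting classifier probabilities by the ratio of estimated test to training class priors in the manner of Saerens et al. (2002) (all values below are hypothetical):

    import numpy as np

    # Prior-correction rule: p'(y|x) is proportional to p(y|x) * p_test(y)/p_train(y).
    def adjust(probs, train_prior, test_prior):
        w = np.asarray(test_prior) / np.asarray(train_prior)
        adjusted = probs * w
        return adjusted / adjusted.sum(axis=1, keepdims=True)

    probs = np.array([[0.9, 0.1], [0.6, 0.4]])  # first prediction results (survive, die)
    train_prior = [0.8, 0.2]                    # label distribution in the first samples
    test_prior = [0.6, 0.4]                     # estimated distribution, second samples
    print(adjust(probs, train_prior, test_prior))  # second prediction results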
Claim 6:
Claim 6 is a method claim, including a processor, corresponding to data analysis apparatus claim 1. Otherwise, they include the same limitations: the processor performs a method that includes the same operations recited in claim 1. Therefore, claim 6 is rejected for the same reasons stated in the 35 U.S.C. 103 rejection of claim 1 above.
Claim 7:
Claim 7 is a program claim corresponding to data analysis apparatus claim 1. Otherwise, they include the same limitations: the program contains a method that includes the same operations recited in claim 1. Therefore, claim 7 is rejected for the same reasons stated in the 35 U.S.C. 103 rejection of claim 1 above.
Claims 2-5 are rejected under 35 U.S.C. 103 as being unpatentable over Brody/Seoane/Zhou/Zhang in view of Li.
Claim 2:
Regarding claim 2, the combination of Brody/Seoane/Zhou/Zhang/Li discloses: The data analysis apparatus according to claim 1, wherein
the processor is configured to execute additional learning processing of updating the first prediction model by performing
additional learning using a loss function based on the second prediction result output by the correction processing and a second actual measurement result relating to the second samples, and
Li, pg. 4, Column 1-2, Section 3, Paragraph 2 “First, we record responses yo on each new task image from the original network for outputs on the old tasks (defined by θs and θo). Our experiments involve classification, so the responses are the set of label probabilities for each training image…”
Li, pg. 4, Column 2, Paragraph 3 “…For new tasks, the loss encourages predictions ŷn to be consistent with the ground truth yn. The tasks in our experiments are multiclass classification, so we use the common [3], [27] multinomial logistic loss: Lnew(yn, ŷn) = −yn · log ŷn (1) where ŷn is the softmax output of the network and yn is the one-hot ground truth label vector…”
Discloses additional learning that updates a prediction model by minimizing a loss that includes a prediction-output-based term and an actual-measurement/ground-truth-based term.
Li, pg. 1, Column 2, Learning without Forgetting (LwF)., Paragraph 1 “Using only examples for the new task, we optimize both for high accuracy for the new task and for preservation of responses on the existing tasks from the original network...”
Discloses using only new-task inputs for additional learning, which encompasses a new/second prediction result.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply Li to the Brody/Seoane/Zhou/Zhang system because Li teaches a standard way to update an existing trained prediction model with new labeled data using a combined loss, while preserving prior behavior. Li’s update method would be a predictable improvement, as the Zhou predictor would routinely be updated as new outcome data becomes available.
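For illustration only, the following is a minimal Python sketch of a Li-style combined loss for additional learning, i.e., a new-task multinomial logistic loss against ground truth plus a preservation term against the original model’s recorded responses (Li additionally uses temperature-scaled distillation; the values below are hypothetical):

    import numpy as np

    # Combined loss: new-task term (Li Eq. (1) form) plus a preservation term that
    # keeps the updated model's old-task outputs near the recorded responses.
    # A real update would minimize this with autograd; here it is only evaluated.
    def lwf_loss(y_true, y_new, y_old_recorded, y_old_current, lam=1.0):
        l_new = -np.sum(y_true * np.log(y_new))                   # ground-truth term
        l_old = -np.sum(y_old_recorded * np.log(y_old_current))   # preservation term
        return l_new + lam * l_old

    y_true = np.array([0.0, 1.0])            # one-hot second actual measurement result
    y_new = np.array([0.3, 0.7])             # updated model's new-task output
    y_old_recorded = np.array([0.6, 0.4])    # original model's recorded response
    y_old_current = np.array([0.55, 0.45])   # updated model's old-task output
    print(lwf_loss(y_true, y_new, y_old_recorded, y_old_current))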
correct, in the correction processing, using the correction information, a third prediction result output by inputting third feature amount data of third samples different from the first samples and the second samples to the first prediction model updated by the additional learning processing, and output a fourth prediction result.
Li, pg. 5, Figure 3 “…Train: Define Ŷo ≡ CNN(Xn, θ̂s, θ̂o) // old task output Define Ŷn ≡ CNN(Xn, θ̂s, θ̂n) // new task output…”
Discloses that an updated model produces outputs by applying the network to input data.
Zhang, pg. 3, (1) Fixed label shift, Paragraph 1 “Here, one assumes a-priori knowledge of pte(y). One may then adjust the outputs of a probabilistic classifier post-hoc to improve test performance (Elkan, 2001). Even when the precise distribution is unknown, it is common to posit a uniform pte(y)…”
Discloses output correction/adjustment of a probabilistic classifier post-hoc under distribution shift.
In combination, they disclose/encompass a third iteration, i.e., correcting a third prediction result output by the updated first prediction model using the correction information and outputting a fourth prediction result.
Claim 3:
Regarding claim 3, the combination of Brody/Seoane/Zhou/Zhang/Li discloses: The data analysis apparatus according to claim 1,
wherein in the acquisition processing, the processor acquires an updated second statistical model based on a distribution of the first actual measurement result and a distribution of a second actual measurement result regarding the second samples,
Brody, pg.1-2, Thought Experiments, Paragraph 1 “We can begin with a gedanken experiment. Take 100,000 newborn human infants and put them in an isolated box. …When a subject is first diagnosed with a colon cancer, the exact age is recorded. The experiment runs for hundreds of years, then we make a histogram of the number of colon cancers diagnosed as a function of years since the beginning of the experiment.”
Brody, pg. 2, Paragraph 4 “…First, we record the age of all patients diagnosed with a specific cancer within a large geographic area in one year. Second, we record the age of each member in the entire population. Finally, for each age group, we divide the number of patients who had a tumor diagnosed in that year by the total number of people with that age in the population. By convention, these numbers are multiplied by 100,000 and are called the age-specific incidence.”
wherein in the calculation processing, the processor calculates updated correction information indicating a difference between the first statistical model and the updated second statistical model, and
Brody, pg. 4, Birth cohort effects on age-specific incidence data., Paragraph 1 “…The expected value, if 100% of the population were susceptible, for age 99 is about 850 per 100,000, as shown in data from 2000 shown in Figure 4. The observed value is about 227 with a 95% confidence interval of 96 to 357. The observed value is about one quarter of the expected value of colon carcinoma incidence, if 100% were susceptible to colorectal carcinoma....”
Seoane, pg. 4, Under-reporting of deaths, Paragraph 3 “…In order to compare data between age groups, we normalize this difference by the number of confirmed deaths, that is: Fraction of under-counting = [Deaths (COVID-19 suspected & confirmed) − Deaths (confirmed)] / Deaths (confirmed) (1).”
wherein in the correction processing, the processor corrects a third prediction result output by inputting third feature amount data of third samples different from the first samples and the second samples to the first prediction model using the updated correction information, and outputs a fourth prediction result.
Li, pg. 5, Figure 3 “…Train: Define Ŷo ≡ CNN(Xn, θ̂s, θ̂o) // old task output Define Ŷn ≡ CNN(Xn, θ̂s, θ̂n) // new task output…”
Discloses that an updated model produces outputs by applying the network to input data.
Zhang, pg. 3, (1) Fixed label shift, Paragraph 1 “Here, one assumes a-priori knowledge of pte(y). One may then adjust the outputs of a probabilistic classifier post-hoc to improve test performance (Elkan, 2001). Even when the precise distribution is unknown, it is common to posit a uniform pte(y)…”
Discloses output correction/adjustment of a probabilistic classifier post-hoc under distribution shift.
Claim 4:
Regarding claim 4, the combination of Brody/Seoane/Zhou/Zhang/Li discloses: The data analysis apparatus according to claim 1,
wherein in the acquisition processing, the processor acquires an updated second statistical model based on a distribution of the first actual measurement result and a distribution of a second actual measurement result regarding the second samples,
Brody, pg.1-2, Thought Experiments, Paragraph 1 “We can begin with a gedanken experiment. Take 100,000 newborn human infants and put them in an isolated box. …When a subject is first diagnosed with a colon cancer, the exact age is recorded. The experiment runs for hundreds of years, then we make a histogram of the number of colon cancers diagnosed as a function of years since the beginning of the experiment.”
Brody, pg. 2, Paragraph 4 “…First, we record the age of all patients diagnosed with a specific cancer within a large geographic area in one year. Second, we record the age of each member in the entire population. Finally, for each age group, we divide the number of patients who had a tumor diagnosed in that year by the total number of people with that age in the population. By convention, these numbers are multiplied by 100,000 and are called the age-specific incidence.”
wherein in the learning processing, the processor generates an updated first prediction model by performing re-machine learning using the first actual measurement result, the second actual measurement result, the first feature amount data, and the second feature amount data, and
Li, pg. 4, Column 1-2, Section 3, Paragraph 2 “First, we record responses yo on each new task image from the original network for outputs on the old tasks (defined by θs and θo). Our experiments involve classification, so the responses are the set of label probabilities for each training image…”
Li, pg. 4, Column 2, Paragraph 3 “…For new tasks, the loss encourages predictions ŷn to be consistent with the ground truth yn. The tasks in our experiments are multiclass classification, so we use the common [3], [27] multinomial logistic loss: Lnew(yn, ŷn) = −yn · log ŷn (1) where ŷn is the softmax output of the network and yn is the one-hot ground truth label vector…”
Li, pg. 1, Column 2, Learning without Forgetting (LwF)., Paragraph 1 “Using only examples for the new task, we optimize both for high accuracy for the new task and for preservation of responses on the existing tasks from the original network...”
wherein in the correction processing, the processor corrects a third prediction result output by inputting third feature amount data of third samples different from the first samples and the second samples to the updated first prediction model using the correction information, and outputs a fourth prediction result.
Li, pg. 5, Figure 3 “…Train: Define Ŷo ≡ CNN(Xn, θ̂s, θ̂o) // old task output Define Ŷn ≡ CNN(Xn, θ̂s, θ̂n) // new task output…”
Discloses that an updated model produces outputs by applying the network to input data.
Zhang, pg. 3, (1) Fixed label shift, Paragraph 1 “Here, one assumes a-priori knowledge of pte(y). One may then adjust the outputs of a probabilistic classifier post-hoc to improve test performance (Elkan, 2001). Even when the precise distribution is unknown, it is common to posit a uniform pte(y)…”
Discloses output correction/adjustment of a probabilistic classifier post-hoc under distribution shift.
Claim 5:
Regarding claim 5, the combination of Brody/Seoane/Zhou/Zhang/Li discloses: The data analysis apparatus according to claim 3, wherein
in the learning processing, the processor generates an updated first prediction model by performing re-machine learning using the first actual measurement result, the second actual measurement result, the first feature amount data, and the second feature amount data, and
Li, pg. 4, Column 1-2, Section 3, Paragraph 2 “First, we record responses yo on each new task image from the original network for outputs on the old tasks (defined by θs and θo). Our experiments involve classification, so the responses are the set of label probabilities for each training image…”
Li, pg. 4, Column 2, Paragraph 3 “…For new tasks, the loss encourages predictions ŷn to be consistent with the ground truth yn. The tasks in our experiments are multiclass classification, so we use the common [3], [27] multinomial logistic loss: Lnew(yn, ŷn) = −yn · log ŷn (1) where ŷn is the softmax output of the network and yn is the one-hot ground truth label vector…”
Li, pg. 1, Column 2, Learning without Forgetting (LwF)., Paragraph 1 “Using only examples for the new task, we optimize both for high accuracy for the new task and for preservation of responses on the existing tasks from the original network...”
wherein in the correction processing, the processor corrects the third prediction result output by inputting the third feature amount data of the third samples different from the first samples and the second samples to the updated first prediction model using the updated correction information, and outputs the fourth prediction result.
Li, pg. 5, Figure 3 “…Train: Define Ŷo ≡ CNN(Xn, θ̂s, θ̂o) // old task output Define Ŷn ≡ CNN(Xn, θ̂s, θ̂n) // new task output…”
Discloses that an updated model produces outputs by applying the network to input data.
Zhang, pg. 3, (1) Fixed label shift, Paragraph 1 “Here, one assumes a-priori knowledge of pte(y). One may then adjust the outputs of a probabilistic classifier post-hoc to improve test performance (Elkan, 2001). Even when the precise distribution is unknown, it is common to posit a uniform pte(y)…”
Discloses output correction/adjustment of a probabilistic classifier post-hoc under distribution shift.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ISIS M BLACK whose telephone number is (703)756-1121. The examiner can normally be reached Monday - Friday 10:00 am - 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang can be reached on (571) 270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/I.M.B./Examiner, Art Unit 2124
/MIRANDA M HUANG/Supervisory Patent Examiner, Art Unit 2124