Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 8-14 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. The claims do not fall within at least one of the four categories of patent eligible subject matter because the claims recite a computer-readable medium whose broadest reasonable interpretation includes transitory signals per se.
Claims 1, 4-7; 8, 11-14; 15, and 18-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Claim 1 recites a method, thus a process, one of the four statutory categories of patentable subject matter. However, Claim 1 further recites the steps of determining a label uncertainty score of the labeled data based on a difference between an average entropy score and an adjustment score and determining whether to re-label one or more labeled data in the set of labeled data based on the label uncertainty score, which are both mental processes clearly performable in the human mind. Thus, the claim recites an abstract idea.
The claim does not include any additional elements which could integrate the abstract idea into a practical application. The additional elements consist of obtaining, by an electronic device, a set of labeled data, wherein each of the labeled data comprises a feature vector and a label, and processing the labeled data to obtain a plurality of classification results by using a plurality of machine learning models, wherein each of the plurality of classification results is obtained by using a different machine learning model in the plurality of machine learning models to process the feature vector of the labeled data. Neither of these elements applies or makes use of the abstract idea, and thus neither can provide a practical application of the abstract idea. Both additional elements are therefore insignificant extra-solution activity (see MPEP 2106.05(g)), and the claim is directed to the abstract idea.
The additional elements cannot provide significantly more than the abstract idea itself because, taken alone and in combination, they are well-understood, routine, and conventional. Obtaining a set of labeled data is mere data gathering (see MPEP 2106.05(d), “transmitting or receiving data over a network”), and obtaining a plurality of classification results by using a plurality of machine learning models is conventional, as shown by the fact that an entire CPC classification, G06N20/20, is devoted to ensemble learning. Therefore, the claim is ineligible.
Claims 4-6 merely recite the particular technological environment to which the invention is applied, which by MPEP 2106.05(h) cannot integrate the abstract idea into a practical application nor provide significantly more than the abstract idea itself. Claim 7 recites only an additional mental process step, and thus no additional elements which could integrate the abstract idea into a practical application nor provide significantly more than the abstract idea itself.
Claims 8 and 11-14 recite a computer-readable medium containing instructions to perform precisely the methods of Claims 1 and 4-7, respectively. As performance of an abstract idea on generic computer components cannot integrate the abstract idea into a practical application nor provide significantly more than the abstract idea itself (MPEP 2106.05(f)(2)), Claims 8 and 11-14 are rejected for reasons set forth in the rejections of Claims 1 and 4-7, respectively. Similarly, Claims 15 and 18-20 recite an electronic device, comprising: one or more computers; and one or more computer memory devices, and are thus rejected for reasons set forth in the rejections of Claims 1 and 4-6, along with MPEP 2106.05(f)(2).
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-3, 5-10, 12-17, 19, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Li, “Improved Regularization for Fine-Tuning in Neural Networks,” in view of Yildirim, “Leveraging Uncertainty in Deep Learning for Selective Classification.” An additional reference Gal, “Dropout as Bayesian Approximation: Representing Model Uncertainty in Deep Learning” is relied upon to demonstrate an inherency in Yildirim.
Regarding Claim 1, Li teaches a method, comprising: … by an electronic device … (Li, pg. 8, Section 5, “Our code is available” demonstrates that they perform their method on a computer) obtaining a set of labeled data, wherein each of the labeled data comprises a feature vector and a label (Li, pg. 3, Section 3, “We have a training data set … Let $(x_1^{(t)}, y_1^{(t)}), \dots, (x_n^{(t)}, y_n^{(t)})$ be the feature vectors and the labels of the training data set”); for each labeled data in the set of labeled data: processing the labeled data to obtain a plurality of classification results … determining a label uncertainty score of the labeled data … and determining whether to re-label one or more labeled data in the set of labeled data based on the label uncertainty scores (Li, pg. 7, Section 4.2, 3rd paragraph, “Self label-correction: We propose a label correction step to augment the number of correct labels. We leverage the discriminating power of a network … and correct data points for which the model has high confidence”).
Li uses merely the prediction probability as a confidence/uncertainty score, rather than the claimed steps of using a plurality of machine learning models to determine the uncertainty score. However, Yildirim computes an uncertainty score by, for each … data in the set of … data: processing the … data to obtain a plurality of classification results by using a plurality of machine learning models, wherein each of the plurality of classification results is obtained by using a different machine learning model in the plurality of machine learning models to process the feature vector of the … data (Yildirim, pg. 6, 2nd column, Section 4, 2nd paragraph, “To quantify model uncertainty … we train a dropout neural network … We run trained and optimized DNN 100 times with its dropouts open. We use empirical … standard deviation of these 100 softmax outputs as the … model uncertainty”). Yildirim notes that their “model uncertainty” determination is via the method of Gal (see Yildirim, pg. 3, 1st column, 4th paragraph), and Gal makes clearer that the uncertainty is calculated by using a plurality of machine learning models (Gal, pg. 4, 1st column, 2nd paragraph, Eq. (6): the predictions are performed using T different versions of the trained machine learning model, each of the T different models obtained by randomly dropping out some of the neurons of the model, yielding a plurality of different machine learning models to process the feature vector. Yildirim uses 100 different models with different dropout patterns). Yildirim further teaches determining a label uncertainty score of the … data based on a difference between an average entropy score and an adjustment score (Yildirim, pg. 3, Fig. 1 & Eqs. (1-6), where the label uncertainty score is the region to which the data point is assigned, e.g. $A_1$, $A_2$, etc., based on the adjustment score $\sigma$ for the data point, calculated as the standard deviation of the 100 dropout/different model results, being greater than or less than an average entropy score of $\sigma_L$ or $\sigma_R$, i.e. Eqs. (2) & (5)).
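As an illustration of the Monte Carlo dropout computation cited from Yildirim and Gal, the following minimal sketch runs a toy one-layer classifier repeatedly with dropout left active and reports the per-class mean and standard deviation of the softmax outputs. The toy network, its weights, the function names (`softmax`, `stochastic_pass`, `mc_dropout`), the keep probability, and the fixed seed are hypothetical and are not taken from either reference.

```python
import math
import random
import statistics


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def stochastic_pass(x, weights, keep_prob, rng):
    """One forward pass with a freshly sampled dropout mask over the input units."""
    mask = [1.0 if rng.random() < keep_prob else 0.0 for _ in x]
    # Inverted-dropout scaling keeps the expected activation unchanged.
    dropped = [xi * mi / keep_prob for xi, mi in zip(x, mask)]
    logits = [sum(w * d for w, d in zip(row, dropped)) for row in weights]
    return softmax(logits)


def mc_dropout(x, weights, keep_prob=0.9, passes=100, seed=0):
    """Return per-class mean and standard deviation over `passes` stochastic passes."""
    rng = random.Random(seed)
    outputs = [stochastic_pass(x, weights, keep_prob, rng) for _ in range(passes)]
    n_classes = len(weights)
    mean = [statistics.fmean([o[c] for o in outputs]) for c in range(n_classes)]
    std = [statistics.pstdev([o[c] for o in outputs]) for c in range(n_classes)]
    return mean, std
```

Each pass samples a different mask, so each pass can be viewed as a different member of a virtual ensemble; the standard deviation across passes plays the role of the model uncertainty discussed above.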
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply Yildirim’s method of determining whether an output is confident or not to the labeled data of Li in order to perform label correction. The motivation to do so is that Yildirim’s method “combines model uncertainty and predictive mean to identify optimal classification and rejection regions. Our results indicate superior performance of our framework in both non-rejected accuracy and rejection quality on several publicly available dataset” (Yildirim, Abstract) – that is, Yildirim provides a more sophisticated/improved method to determine whether a datapoint is confidently predicted as compared to Li’s method, which is then used to decide which datapoints to relabel as Li does.
Regarding Claim 2, the Li/Yildirim combination of Claim 1 teaches the method of Claim 1 (and thus the rejection of Claim 1 is incorporated). The combination has already been shown to teach wherein the adjustment score is determined based on a standard deviation of the classification results of the labeled data (Yildirim, pg. 3, Fig. 1 & Eqs. (1-6), where the label uncertainty score is the region to which the data point is assigned, e.g. $A_1$, $A_2$, etc., based on the adjustment score $\sigma$ for the data point, calculated as the standard deviation of the 100 dropout/different model results).
Regarding Claim 3, the Li/Yildirim combination of Claim 2 teaches the method of Claim 2 (and thus the rejection of Claim 2 is incorporated). Yildirim further teaches wherein the adjustment score is determined further based on a scaling factor (Yildirim, pg. 6, 2nd column, Section 4, 2nd paragraph, “To quantify model uncertainty … we train a dropout neural network … We apply a grid search among dropout rates of (0.005, 0.01, 0.02) and regularization coefficients (0.1,0.25) to achieve optimal DNN configuration” where both “dropout rates” and “regularization coefficients” are scaling factors on which the adjustment score $\sigma$ for a point depends).
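To make the claim language of Claims 1-3 concrete, the sketch below computes one possible reading of a label uncertainty score: the average entropy of the per-model class distributions minus an adjustment score equal to a scaling factor times a standard deviation. Taking the standard deviation over the per-model entropies, and the names `entropy`, `label_uncertainty_score`, and `scale`, are assumptions made purely for illustration; neither the claim language quoted in this action nor the cited references fix these details.

```python
import math
import statistics


def entropy(probs):
    """Shannon entropy (in nats) of one class-probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def label_uncertainty_score(class_probs_per_model, scale=1.0):
    """Average entropy minus a scaled standard deviation (assumed reading)."""
    entropies = [entropy(p) for p in class_probs_per_model]
    average_entropy = statistics.fmean(entropies)
    # Hypothetical adjustment score: scaling factor times the spread of entropies.
    adjustment = scale * statistics.pstdev(entropies)
    return average_entropy - adjustment
```

Under this reading, models that agree produce a zero adjustment, so the score reduces to the average entropy, while a larger `scale` subtracts more when the models disagree.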
Regarding Claim 5, the Li/Yildirim combination of Claim 1 teaches the method of Claim 1 (and thus the rejection of Claim 1 is incorporated). The combination has already been shown to teach wherein the plurality of machine learning models are part of a virtual ensemble (Yildirim, pg. 6, 2nd column, Section 4, 2nd paragraph, “To quantify model uncertainty … we train a dropout neural network … We run trained and optimized DNN 100 times with its dropouts open”).
Regarding Claim 6, the Li/Yildirim combination of Claim 5 teaches the method of Claim 5 (and thus the rejection of Claim 5 is incorporated). Yildirim’s computation of running the dropout network a plurality of times, as evidenced by Gal, further teaches wherein each neuron in the plurality of machine learning models is associated with a random function that returns an indicator to indicate whether the neuron is turned on or off (Gal, pg. 3, 1st column, 3rd paragraph, “With dropout, we sample binary variables for every input point and for every network unit in each layer … Each binary variable takes value 1 with probability $p_i$”).
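The sampling quoted from Gal can be pictured with the short sketch below, in which each unit draws an independent binary indicator that takes the value 1 with probability $p_i$. The function name `sample_mask`, the example probabilities, and the fixed seed are hypothetical.

```python
import random


def sample_mask(keep_probs, rng):
    """Return one 0/1 indicator per unit; unit i is on (1) with probability keep_probs[i]."""
    return [1 if rng.random() < p else 0 for p in keep_probs]


rng = random.Random(0)
# Draw 1000 independent masks for four units with different probabilities.
masks = [sample_mask([0.9, 0.5, 0.0, 1.0], rng) for _ in range(1000)]
```

A unit with probability 0.0 is never on and a unit with probability 1.0 is always on; the others vary from draw to draw.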
Regarding Claim 7, the Li/Yildirim combination of Claim 1 teaches the method of Claim 1 (and thus the rejection of Claim 1 is incorporated). The combination has already been shown to teach wherein determining whether to re-label one or more labeled data comprises comparing the label uncertainty scores of the one or more labeled data with a configured threshold (Li, pg. 7, Section 4.2, 3rd paragraph, “Self label-correction: We … correct data points for which the model has high confidence” where “high confidence” is determined in the combination as taught by Yildirim, pg. 3, Fig. 1, i.e. whether the uncertainty score is in region $A_1$, i.e. a threshold of $A_1$ compared to $A_2$).
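The threshold comparison discussed for Claim 7 can be sketched as follows. Whether a low or a high score triggers relabeling, and the names `select_for_relabel` and `threshold`, are assumptions; in keeping with Li's correction of data points for which the model has high confidence, this sketch flags items whose uncertainty score falls below the threshold.

```python
def select_for_relabel(scores, threshold):
    """Return indices of items whose uncertainty score is below the threshold."""
    return [i for i, s in enumerate(scores) if s < threshold]
```

For example, `select_for_relabel([0.1, 0.7, 0.3], 0.5)` returns the indices of the first and third items.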
Claims 8-10 and 12-14 recite a computer-readable medium containing instructions to perform precisely the methods of Claims 1-3 and 5-7, respectively. As Li performs their method on a computer, in which such a computer-readable medium is inherent (Li, pg. 8, Section 5, “Our code is available”), Claims 8-10 and 12-14 are rejected for reasons set forth in the rejections of Claims 1-3 and 5-7, respectively. Similarly, Claims 15-17, 19, and 20 recite an electronic device, comprising: one or more computers; and one or more computer memory devices, and are thus rejected for reasons set forth in the rejections of Claims 1-3, 5, and 6, respectively.
Claims 4, 11, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Li, in view of Yildirim, and further in view of Singh, US PG Pub 2017/0185667.
Regarding Claim 4, the Li/Yildirim combination of Claim 1 teaches the method of Claim 1 (and thus the rejection of Claim 1 is incorporated). Li teaches that their method is generic and can be used for datasets of many types (Li, pg. 8, Section 5, “in a wide range of tasks”) but never specifically teaches wherein each of the labeled data represents a software code, and the label indicates whether the software code is potentially malicious.
Singh, however, in the context of an invention which can relabel mislabeled data (Singh, [0017], “Relabel modules can determine if an classification assigned to the instances needs to be changed”) teaches classification on datasets of malicious software (Singh, [0003], “Malicious software (“malware”) that infects a host computer may be able to perform any number of malicious actions … attempts to identify malware rely on the proper classification of data”). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the Li/Yildirim combination to datasets of malicious code, such as that of Singh. The motivation to do so is that “it can be difficult and time consuming to properly classify large amounts of data [for use in malware detection]” (Singh, [0003]) and Li/Yildirim provides a solution to properly classify and relabel the malware data of Singh.
Claim 11 recites a computer-readable medium containing instructions to perform precisely the method of Claim 4. As Li performs their method on a computer, in which such a computer-readable medium is inherent (Li, pg. 8, Section 5, “Our code is available”), Claim 11 is rejected for reasons set forth in the rejection of Claim 4. Similarly, Claim 18 recites an electronic device, comprising: one or more computers; and one or more computer memory devices, and is thus rejected for reasons set forth in the rejection of Claim 4.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BRIAN M SMITH whose telephone number is (469)295-9104. The examiner can normally be reached Monday - Friday, 8:00am - 4pm Pacific.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki, can be reached at (571) 272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/BRIAN M SMITH/Primary Examiner, Art Unit 2122