DETAILED ACTION
This action is responsive to the Application filed on 03/21/2023. Claims 1-20 are pending in the case. Claims 1, 8, and 15 are independent claims. Claims 1, 6, 9, 11, 15, 18, 19, and 20 are amended.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Cordeiro et al., “PropMix: Hard Sample Filtering and Proportional MixUp for Learning with Noisy Labels.”
Claim 1
Cordeiro teaches, a processor that executes computer-executable components stored in a non-transitory computer-readable memory, wherein the computer-executable components comprise: (pg 6 Section 4.2 “For CIFAR-10 and CIFAR-100 we used a 18-layer PreaAct-ResNet-18 (PRN18) [18] as our backbone model… For CIFAR-10 and CIFAR-100, PRN18 is trained with a WarmUp stage of 30 epochs…” One of ordinary skill in the art would understand that such an algorithm and its training are implemented on a computer with memory.)
an access component that accesses a deep learning classifier and a training dataset on which the deep learning classifier was trained; and (pg 6 Section 4.2 “For CIFAR-10 and CIFAR-100 we used a 18-layer PreaAct-ResNet-18 (PRN18) [18] as our backbone model” The model is a deep model because it has at least 18 layers. pg 10 Table 3 “Test accuracy (%) for WebVision… by methods trained with 100 epochs.” The table shows the test accuracy of the authors' model, indicating that the model is accessed and trained. pg 4 “After the pre-training, we warm-up the classifier by training it for a few epochs on the (noisy) training data” The classifier is trained with training data.)
a re-training component that re-trains the deep learning classifier using a loss function that is based on a Gaussian mixture model constructed from the training dataset. (Figure 1 [figure image: Figure 1 of Cordeiro] caption “Our proposed PropMix has a self-supervised pre-training stage [6, 7, 12, 19], followed by a supervised training stage, where we first warm-up the classifier with a classification loss, using the pre-trained weights. Then, using the classification loss, we train a GMM to separate the samples into clean and noisy. Next, using classification confidence for the noisy set, we train a second GMM to separate the easy and hard noisy samples. The clean and easy noisy samples are proportionally combined in the MixUp for training” pg 5 “To optimise the classification term we rely on the regularised CE loss… [equation image: regularised CE loss]” The model is first trained, then re-trained using a loss function that is based on the dataset as filtered by the Gaussian mixture model.)
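For illustration only, and not as part of the cited reference or the grounds of rejection, the clean/noisy separation described in the Figure 1 caption can be sketched as a two-component Gaussian mixture fitted to per-sample loss values. The function names, the plain EM fitting routine, the default threshold, and the example loss values below are all assumptions made for this sketch; Cordeiro's actual implementation may differ.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_component_gmm(values, iterations=50):
    """Fit a two-component 1-D Gaussian mixture with plain EM (hypothetical helper)."""
    lo, hi = min(values), max(values)
    means = [lo, hi]  # initialise the two means at the extremes of the data
    variances = [((hi - lo) / 4.0) ** 2 + 1e-6] * 2
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of each component for each value
        resp = []
        for v in values:
            dens = [weights[k] * gaussian_pdf(v, means[k], variances[k]) for k in range(2)]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate means, variances, and mixing weights
        for k in range(2):
            nk = sum(r[k] for r in resp) + 1e-12
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            variances[k] = sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values)) / nk + 1e-6
            weights[k] = nk / len(values)
    return means, variances, weights


def split_clean_noisy(losses, tau=0.5):
    """Return (clean, noisy) index lists; tau is an assumed posterior threshold."""
    means, variances, weights = fit_two_component_gmm(losses)
    # The smaller-mean component is treated as the clean component.
    clean_k = 0 if means[0] <= means[1] else 1
    clean, noisy = [], []
    for i, v in enumerate(losses):
        dens = [weights[k] * gaussian_pdf(v, means[k], variances[k]) for k in range(2)]
        p_clean = dens[clean_k] / (sum(dens) or 1e-300)
        (clean if p_clean >= tau else noisy).append(i)
    return clean, noisy


# Made-up per-sample losses: four small (likely clean), four large (likely noisy).
example_losses = [0.05, 0.1, 0.08, 0.12, 2.1, 2.4, 1.9, 2.2]
clean_idx, noisy_idx = split_clean_noisy(example_losses)
```

In this sketch a sample is placed in the clean set when its posterior probability under the smaller-mean component is at least the threshold, which mirrors the role of the classification threshold τ quoted in the rejection of claim 2.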
Claim 2
Cordeiro teaches claim 1
Cordeiro teaches, wherein the deep learning classifier is configured to receive a data candidate as input… and to produce a classification label and a confidence score as output, (pg 4 “After the pre-training, we warm-up the classifier by training it for a few epochs on the (noisy) training data set with the cross-entropy (CE) loss. The clean and noisy sets, X ,U ⊆ D… [equation image: definition of the clean and noisy sets X and U] … with τ denoting a classification threshold, … being a function that estimates the probability that (xi ,yi) is a clean label sample” The probability estimate of a clean label sample is a confidence score, while Figure 1, cited above, indicates the label estimation, corresponding to the classification label, of the candidate data in the dataset D.)
and wherein the computer-executable components further comprise: a data component that generates a set of confidence lists collated according to class, by executing the deep learning classifier on the training dataset. (pg 4 “After the pre-training, we warm-up the classifier by training it for a few epochs…The clean and noisy sets, X ,U ⊆ D, are formed… Next, we obtain the sets of easy and hard noisy samples UE,UH ⊆ U, as follows.” Clean/noisy and easy/hard samples are organized into sets, or lists, according to class, where the attribute of the sample (i.e., easy/hard) corresponds to the class. This is accomplished by executing the classifier.)
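For illustration only, the claimed limitation of “a set of confidence lists collated according to class” can be sketched as grouping each sample's top confidence score under its predicted class. This sketch reflects one plain reading of the claim language rather than anything disclosed in Cordeiro; the function name `collate_confidences` and the example probability vectors are hypothetical.

```python
from collections import defaultdict


def collate_confidences(probabilities):
    """Group each sample's top confidence score under its predicted class.

    `probabilities` is a list of per-sample class-probability vectors,
    assumed here to be the classifier's outputs on the training dataset.
    """
    lists = defaultdict(list)
    for probs in probabilities:
        # Predicted label is the index of the largest probability.
        label = max(range(len(probs)), key=probs.__getitem__)
        lists[label].append(probs[label])
    return dict(lists)


# Made-up classifier outputs for three samples and two classes.
example_outputs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
confidence_lists = collate_confidences(example_outputs)
```

Each key of the resulting dictionary is a class label and each value is the list of confidence scores the classifier produced for samples assigned to that class.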
Claim 3
Cordeiro teaches claim 2
Cordeiro teaches, a Gaussian component that generates the Gaussian mixture model based on the set of confidence lists, wherein constituent Gaussian distributions of the Gaussian mixture model respectively correspond to unique classes. (pg 4 “The function … in Eq.3 is a bi-modal Gaussian mixture model (GMM)… where γ denotes the GMM parameters and the larger mean component is the noisy component whereas the smaller mean component is the clean component…. The function … in Eq. 4 is a GMM, where γ denotes the GMM parameters and the smaller mean component is the hard noise component whereas the larger mean component is the easy noise component” The Gaussian mixture models are generated based on the set of confidence lists. Each Gaussian component of the models corresponds to a particular unique class (i.e., noisy/clean/hard/easy).)
Claim 4
Cordeiro teaches claim 3
Cordeiro teaches, wherein the access component accesses a training data candidate on which the deep learning classifier has not been trained… and wherein the re-training component executes the deep learning classifier on the training data candidate (pg 2 “To improve the feature representation and model confidence in high noise scenarios, we also add a self-supervised pre-training stage” pg 3 “Our method proposes a hybrid approach. We claim that hard noisy samples are unlikely to have their label corrected, mainly in a high noise scenario. On the other hand, we can find easy noisy samples that are likely to be correctly relabelled and used in a supervised training. The main difference of existing filtering methods and our approach is that we filter out hard noisy samples, while keeping easy noisy samples to be relabelled and included in the training process” The process first involves training with initial data. The labels are then revised, i.e., a new dataset is created to re-train the model; the method thus accesses data on which the classifier has not been trained, and that data is then used for training.) thereby yielding a first classification label and a first confidence score. (pg 2 “PropMix filters out hard noisy samples via a two-stage process, where the first stage classifies samples as clean or noisy using the loss values, and the second stage eliminates hard noisy samples using their classification confidence. Then, by re-labelling the easy noisy samples with the model output, adding these samples to the training set, and running a regular classification training with MixUp” The model extracts a first classification confidence, or score, and re-labels samples, thus yielding a first classification label.)
Claim 5
Cordeiro teaches claim 4
Cordeiro teaches, wherein the first classification label corresponds to a first constituent Gaussian distribution of the Gaussian mixture model (pg 4-5 “Next, we obtain the sets of easy and hard noisy samples UE,UH ⊆ U, as follows… [equation image: definition of the easy and hard noisy sets UE and UH] … The function p(hard | pθ(c*_i | fφ(xi)), γ) in Eq. 4 is a GMM” The classification label y corresponds to the constituent Gaussian distribution of the Gaussian mixture model.)
and wherein the re-training component determines, via the Gaussian mixture model, a measure of fit between the first confidence score and the first constituent Gaussian distribution (pg 5 “To optimise the classification term we rely on the regularised CE loss… [equation image: loss terms of the regularised CE loss]” The re-training according to the loss function is based on the output of the Gaussian mixture model, i.e., via the GMM, and measures the fit between the confidence score determined from the GMM and the Gaussian distribution via the KL loss term with parameters θ.)
Claim 6
Cordeiro teaches claim 5
Cordeiro teaches, wherein the loss function comprises a first term that is based on the first classification label, and wherein the loss function comprises a second term that is based on the measure of fit (pg 5 [equation image: regularised CE loss] [equation image: loss terms of the regularised CE loss] The first term l_CE is based on the classification label y. The second term l_r is the measure of fit via KL divergence.)
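For illustration only, a loss with a label-based first term and a KL-divergence second term can be sketched as below. The exact equations of Cordeiro appear only as images in this action and are not reproduced here; the specific regulariser used in the sketch (the KL divergence between a uniform prior and the batch-mean prediction), the `weight` parameter, and all function names are assumptions.

```python
import math


def cross_entropy(probs, label):
    """Cross-entropy of one prediction against an integer class label."""
    return -math.log(probs[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def two_term_loss(batch_probs, labels, weight=1.0):
    """Return (total, l_ce, l_r) for a batch of class-probability vectors.

    l_ce depends on the labels; l_r is an assumed KL-based regulariser
    comparing a uniform prior with the batch-mean prediction.
    """
    n = len(batch_probs)
    num_classes = len(batch_probs[0])
    l_ce = sum(cross_entropy(p, y) for p, y in zip(batch_probs, labels)) / n
    mean_pred = [sum(p[c] for p in batch_probs) / n for c in range(num_classes)]
    uniform = [1.0 / num_classes] * num_classes
    l_r = kl_divergence(uniform, mean_pred)
    return l_ce + weight * l_r, l_ce, l_r


# Made-up predictions: a balanced batch and a batch skewed toward class 0.
balanced = two_term_loss([[0.5, 0.5], [0.5, 0.5]], [0, 1])
skewed = two_term_loss([[0.9, 0.1], [0.8, 0.2]], [0, 0])
```

In this sketch the second term is zero when the average prediction is uniform and grows as predictions collapse toward a single class, which is one common way a KL term acts as a measure of fit between predictions and a reference distribution.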
Claim 7
Cordeiro teaches claim 2
Cordeiro teaches, wherein the data component generates the set of confidence lists based on applying a drop out technique to the training dataset. (pg 4 “Our proposed PropMix” [figure image] pg 3 Section 3.2 “Then, we perform a supervised training, with a new filtering step to identify clean samples, easy noisy samples, and hard noisy samples, which are removed from training” Discarding hard samples amounts to applying dropout to the training dataset. The filtering is performed at each training iteration; thus the dropout is based on the filtering described in the figure, and the confidence list of filtered samples is based in part on the removed samples.)
Claim 8
Cordeiro teaches, A computer-implemented method ( pg 6 Section 4.2 “For CIFAR-10 and CIFAR-100 we used a 18-layer PreaAct-ResNet-18 (PRN18) [18] as our backbone model”)
The remaining limitations are rejected for the reasons set forth in claim 1
Claims 9-14
The claims are rejected for the reasons set forth in the rejections of claims 2-7, in connection with claim 1
Claim 15
Cordeiro teaches, A computer program product for facilitating preservation of deep learning classifier confidence distributions, the computer program product comprising a non-transitory computer-readable memory having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to: (pg 6 Section 4.2 “For CIFAR-10 and CIFAR-100 we used a 18-layer PreaAct-ResNet-18 (PRN18) [18] as our backbone model… For CIFAR-10 and CIFAR-100, PRN18 is trained with a WarmUp stage of 30 epochs…” One of ordinary skill in the art would understand that such an algorithm and its training are implemented on a computer with memory.)
The remaining limitations are rejected for the reasons set forth in claim 1
Claims 16-20
The claims are rejected for the reasons set forth in the rejections of claims 2-6, in connection with claim 1
Conclusion
Prior art:
Lee et al., “Training Confidence-Calibrated Classifiers for Detecting Out-of-Distribution Samples,” describes generating training samples using a Gaussian mixture model tuned to underlying class distributions.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOHNATHAN R GERMICK whose telephone number is (571)272-8363. The examiner can normally be reached M-F 9:30-4:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki can be reached on 571-272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/J.R.G./
Examiner, Art Unit 2122
/KAKALI CHAKI/Supervisory Patent Examiner, Art Unit 2122