DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first-inventor-to-file provisions of the AIA.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1-8 and 10-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Yang et al. (Autolabeling-Enhanced Active Learning for Cost-Efficient Surface Defect Visual Classification, 2021, IEEE Transactions on Instrumentation and Measurement, Vol. 70, Pages 1-15), hereinafter “Yang”.
Regarding claim 1, Yang teaches:
A computing system (See the Abstract.) comprising:
one or more processing devices (See page 10, left column: “All the experiments were executed on a computer running the Windows 10 64-bit operating system and equipped with an Nvidia Titan Xp GPU with 64 GB of RAM and an Intel Xeon E5-2600 v3 processor.”) configured to:
receive a first labeled image set including a plurality of first images, wherein each of the first images includes one or more first identified regions of interest that have one or more respective first labels (See Fig. 4, receiving of “initial labeled samples” at the top left in green. The depicted set of images meets the claimed “first identified regions of interest”, since these images are described by the authors as ROIs, as seen in Fig. 1.);
receive an unlabeled image set including a plurality of second images without respective labels (See Fig. 4, receiving of unlabeled samples at the bottom left in pink.);
identify a plurality of second identified regions of interest included in the plurality of second images (See Fig. 4, DCEQS identifies a set of “label-needed samples” in purple DtD, which meet the claimed “second identified regions of interest”.);
compute a respective feature similarity value between each of the second identified regions of interest and the plurality of first identified regions of interest (See Fig. 4, ASMN computes similarity between the labeled samples in green and label-needed samples in purple. Also see Fig. 6 and page 3, left column: “In this study, a novel attention-based similarity measurement network (ASMN) is proposed as an implementation of this autolabeling module by measuring the similarity between the unlabeled and labeled samples.”);
identify, in one or more of the second images, a subset of the plurality of the second identified regions of interest (See Fig. 4, subset of four samples in teal that are described as “samples labeled by ASMN”.) that have feature similarity values above a predetermined similarity threshold (See Fig. 6, Decision Part when the threshold is met. Also see page 7, Eq. 17.);
apply respective second labels to the second identified regions of interest included in the subset (See page 8, left column: “The samples autolabeled by the ASMN are denoted by DtS.”);
construct a second labeled image set (See Fig. 4, DtN in brown and described as “newly labeled samples in one iteration”.) including:
the one or more second images that include the second identified regions of interest included in the subset (See Fig. 4, DtN includes the samples from DtD (in purple).); and
the second labels (See Fig. 4, DtN includes the labels in DtS automatically applied by ASMN.); and
train an image classification machine learning model with a training data set that includes the first labeled image set and the second labeled image set to thereby produce a trained image classification machine learning model (See the training/fine-tuning of the classifier using initial labeled samples and samples labeled by ASMN on the right side of Fig. 4.).
Regarding claim 2, Yang teaches:
The computing system of claim 1, wherein the first identified regions of interest include image data of: inkjet printing defects; rivets; cracks in objects; or additive manufacturing defects (See the line or ring Mura defects (which the examiner asserts meet the claimed “cracks in objects”) in Figs. 1 and 2. Also, on page 1, Yang states that “[a]ll types of surface defects should be detected and classified efficiently during manufacturing to control product quality”. Yang’s disclosure is applicable to a wide range of industrial product surface defect inspection, including steel, ceramic, paper, circuit boards, and TFT-LCD displays.).
Regarding claim 3, Yang teaches:
The computing system of claim 1, wherein the one or more processing devices are further configured to: identify an additional subset of the plurality of second identified regions of interest (See Fig. 4, subset of two samples in lime green that are described as “samples labeled by human”.) that have respective feature similarity values below the predetermined similarity threshold (See Fig. 6, Decision Part when the threshold is not met. Also see page 7, Eq. 17.); output an additional labeling request in response to determining that the feature similarity values are below the predetermined similarity threshold; subsequently to outputting the additional labeling request, receive a plurality of additional labels associated with an additional subset of the second identified regions of interest; and apply the plurality of additional labels to the second identified regions of interest included in the additional subset (See Fig. 4, step 5 Human Labeling, which is understood to output an additional labeling request to human annotators. Also see page 3, left column: “Finally, the remaining unlabeled samples are annotated by human experts, and all the newly labeled samples are used to retrain the classifier.”).
Regarding claim 4, Yang teaches:
The computing system of claim 1, wherein the one or more processing devices are further configured to: during a testing phase, receive a plurality of test images (See page 8, right column: “when model training achieves convergence, only the classifier is deployed. Thus, in a final deployment, only the classifier module performs inference. Hence, the classifier module is separated from ALEAL after its performance is satisfactory. Then, the test defect images I are input into the classifier.”); at the image classification machine learning model, compute a plurality of test labels respectively associated with a plurality of test regions of interest included in the test images (See page 11, left column: “Then, unlabeled training samples are input into the classifier, which estimates category confidences for those samples.”); compute a model accuracy of the plurality of test labels (See Eq. 21 on page 10.); determine that the model accuracy is below a model accuracy threshold (See page 11, right column: “When δP = 0.1, the model does not converge and achieves low accuracy, which means that a large number of noisy labels exist. When δP = 0.15, the results improve: the accuracy reaches 0.87, higher than the previous accuracy of 0.64, although some noisy labels remain. The performance becomes satisfactory when δP = 0.2, which means that the pseudolabels almost match the ground truth.” Low accuracy that results in unsatisfactory performance implies an accuracy threshold.); output an additional labeling request in response to determining that the model accuracy is below the model accuracy threshold; and subsequently to outputting the additional labeling request, receive a plurality of additional labels associated with an additional subset of the second identified regions of interest (The examiner interprets the adjustment of the similarity threshold to retrain the classifier (see the threshold δP in Fig. 6 and the autolabeling loop in Fig. 4) to meet the claimed “output an additional labeling request”. Given the examiner’s interpretation, see page 11, right column: “When δP = 0.1, the model does not converge and achieves low accuracy, which means that a large number of noisy labels exist. When δP = 0.15, the results improve: the accuracy reaches 0.87, higher than the previous accuracy of 0.64, although some noisy labels remain. The performance becomes satisfactory when δP = 0.2, which means that the pseudolabels almost match the ground truth.”).
Regarding claim 5, Yang teaches:
The computing system of claim 4, wherein the one or more processing devices are further configured to: compute a third labeled image set including: the plurality of second images that include the second identified regions of interest included in the additional subset; and the additional labels; and perform additional training at the image classification machine learning model using the third labeled image set (See Fig. 4, “Loop until the stop criterion is satisfied”. With each loop, an updated labeled image set is computed.).
Regarding claim 6, Yang teaches:
The computing system of claim 1, wherein the one or more processing devices are further configured to: during a testing phase, receive a plurality of test images (See page 8, right column: “when model training achieves convergence, only the classifier is deployed. Thus, in a final deployment, only the classifier module performs inference. Hence, the classifier module is separated from ALEAL after its performance is satisfactory. Then, the test defect images I are input into the classifier.”); at the image classification machine learning model, compute a plurality of test labels respectively associated with a plurality of test regions of interest included in the test images (See page 11, left column: “Then, unlabeled training samples are input into the classifier, which estimates category confidences for those samples.”); compute a model accuracy of the plurality of test labels (See Eq. 21 on page 10.); determine that the model accuracy is below a model accuracy threshold (See page 11, right column: “When δP = 0.1, the model does not converge and achieves low accuracy, which means that a large number of noisy labels exist. When δP = 0.15, the results improve: the accuracy reaches 0.87, higher than the previous accuracy of 0.64, although some noisy labels remain. The performance becomes satisfactory when δP = 0.2, which means that the pseudolabels almost match the ground truth.” Low accuracy that results in unsatisfactory performance implies an accuracy threshold.); and in response to determining that the model accuracy is below the model accuracy threshold, modify a similarity metric with which the feature similarity values are computed (The examiner interprets the adjustment of the similarity threshold to retrain the classifier (see the threshold δP in Fig. 6 and the autolabeling loop in Fig. 4) to meet the claimed “modify a similarity metric”. 
Further see page 11, right column, where it appears the similarity threshold is adjusted: “When δP = 0.1, the model does not converge and achieves low accuracy, which means that a large number of noisy labels exist. When δP = 0.15, the results improve: the accuracy reaches 0.87, higher than the previous accuracy of 0.64, although some noisy labels remain. The performance becomes satisfactory when δP = 0.2, which means that the pseudolabels almost match the ground truth.”).
Regarding claim 7, Yang teaches:
The computing system of claim 1, wherein the one or more processing devices are further configured to: receive an inferencing-time image; at the image classification machine learning model, compute one or more inferencing-time labels respectively associated with one or more inferencing-time regions of interest included in the inferencing-time image; and output the one or more inferencing-time labels (The examiner interprets the claimed “inferencing-time” to mean any kind of object/characteristic/attribute/etc. of a scene in an image. Turning to Fig. 4, see any of the input images (meeting “inferencing-time image”), labels from the autolabeling module ASMN (meeting “inferencing-time labels”), and output to the classifier.).
Regarding claim 8, Yang teaches:
The computing system of claim 1, wherein the one or more processing devices are further configured to: compute a plurality of average representations of respective object classes indicated by the plurality of first labels (See Eq. 12 and page 7, right column: “Fig. 6 shows only the similarity measurement between a single label-needed sample and a single labeled sample. However, to achieve high confidence, the similarity of the label-needed sample should be compared with multiple labeled samples…Each individual final score between fIci and fIj is assigned a weight αi.”); and compute the feature similarity values based at least in part on the average representations (See Eq. 17.).
Regarding claim 10, Yang teaches:
The computing system of claim 1, wherein the one or more processing devices are configured to: identify the second identified regions of interest using the image classification machine learning model (See page 5, left column: “2) The unlabeled samples D0U are input into the classifier, and their entropy values are obtained. 3) Based on the entropy values, the proposed DCEQS selects a portion of the unlabeled samples D0U to label.”); and iteratively re-select the second labeled image set and train the image classification machine learning model over a plurality of sampling iterations (See “Loop until the stop criterion is satisfied” in Fig. 4.).
Regarding claim 11, Yang teaches:
The computing system of claim 10, wherein the one or more processing devices are configured to compute the feature similarity values at an image similarity neural network (See the CNN architecture in Fig. 6.).
Regarding claim 12, Yang teaches:
A method for use with a computing system (See the Abstract.), the method comprising:
receiving a first labeled image set including a plurality of first images, wherein each of the first images includes one or more first identified regions of interest that have one or more respective first labels (See Fig. 4, receiving of “initial labeled samples” at the top left in green. The depicted set of images meets the claimed “first identified regions of interest”, since these images are described by the authors as ROIs, as seen in Fig. 1.);
receiving an unlabeled image set including a plurality of second images without respective labels (See Fig. 4, receiving of unlabeled samples at the bottom left in pink.);
identifying a plurality of second identified regions of interest included in the plurality of second images (See Fig. 4, DCEQS identifies a set of “label-needed samples” in purple DtD, which meet the claimed “second identified regions of interest”.);
computing a respective feature similarity value between each of the second identified regions of interest and the plurality of first identified regions of interest (See Fig. 4, ASMN computes similarity between the labeled samples in green and label-needed samples in purple. Also see Fig. 6 and page 3, left column: “In this study, a novel attention-based similarity measurement network (ASMN) is proposed as an implementation of this autolabeling module by measuring the similarity between the unlabeled and labeled samples.”);
identifying, in one or more of the second images, a subset of the plurality of the second identified regions of interest (See Fig. 4, subset of four samples in teal that are described as “samples labeled by ASMN”.) that have feature similarity values above a predetermined similarity threshold (See Fig. 6, Decision Part when the threshold is met. Also see page 7, Eq. 17.);
applying respective second labels to the second identified regions of interest included in the subset (See page 8, left column: “The samples autolabeled by the ASMN are denoted by DtS.”);
constructing a second labeled image set (See Fig. 4, DtN in brown and described as “newly labeled samples in one iteration”.) including:
the one or more second images that include the second identified regions of interest included in the subset (See Fig. 4, DtN includes the samples from DtD (in purple).); and
the second labels (See Fig. 4, DtN includes the labels in DtS automatically applied by ASMN.); and
training an image classification machine learning model with a training data set that includes the first labeled image set and the second labeled image set to thereby produce a trained image classification machine learning model (See the training/fine-tuning of the classifier using initial labeled samples and samples labeled by ASMN on the right side of Fig. 4.).
Yang teaches claim 13 for the reasons given in the treatment of claim 2.
Yang teaches claim 14 for the reasons given in the treatment of claim 3.
Yang teaches claim 15 for the reasons given in the treatment of claim 4.
Yang teaches claim 16 for the reasons given in the treatment of claim 5.
Yang teaches claim 17 for the reasons given in the treatment of claim 6.
Yang teaches claim 18 for the reasons given in the treatment of claim 7.
Yang teaches claim 19 for the reasons given in the treatment of claim 10.
Regarding claim 20, Yang teaches:
A computing system (See the Abstract.) comprising:
one or more processing devices configured to (See page 10, left column: “All the experiments were executed on a computer running the Windows 10 64-bit operating system and equipped with an Nvidia Titan Xp GPU with 64 GB of RAM and an Intel Xeon E5-2600 v3 processor.”):
train an image classification machine learning model using a training data set that includes (See the classifier training in Fig. 4.):
a plurality of first labeled images that each include one or more first identified regions of interest with one or more respective first labels, wherein the first labeled images are received in one or more markup iterations (See Fig. 4, receiving of “initial labeled samples” at the top left in green. The depicted set of images meets the claimed “first identified regions of interest”, since these images are described by the authors as ROIs, as seen in Fig. 1. Page 4, right column states: “The classifier is trained using the initial human-labeled samples”. The examiner asserts that human labeling of the initial samples meets the claimed “markup iteration”.); and
a plurality of second labeled images that each include one or more second identified regions of interest with one or more respective second labels (See Fig. 4, receiving of unlabeled samples at the bottom left in pink and which are given labels by the ASMN.), wherein the second labels are applied to the second identified regions of interest based at least in part on respective feature similarity values between each of the second identified regions of interest and the plurality of first identified regions of interest (See Fig. 4, ASMN computes similarity between the labeled samples in green and label-needed samples in purple. Also see Fig. 6 and page 3, left column: “In this study, a novel attention-based similarity measurement network (ASMN) is proposed as an implementation of this autolabeling module by measuring the similarity between the unlabeled and labeled samples.”);
receive an inferencing-time image (The examiner interprets the claimed “inferencing-time” to mean any kind of object/characteristic/attribute/etc. of a scene in an image. Turning to Fig. 4, see any of the samples in green or red received by the DCEQS.);
at the image classification machine learning model, compute one or more inferencing-time labels respectively associated with one or more inferencing-time regions of interest included in the inferencing-time image (See any of the labels generated by the autolabeling module ASMN in Fig. 4.); and
output the one or more inferencing-time labels (See the output to the classifier on the right side of Fig. 4.).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Yang (Autolabeling-Enhanced Active Learning for Cost-Efficient Surface Defect Visual Classification, 2021, IEEE Transactions on Instrumentation and Measurement, Vol. 70, Pages 1-15) in view of Keum et al. (Mean Shift-based SIFT Keypoint Filtering for Region-of-Interest Determination, 2012, The 6th International Conference on Soft Computing and Intelligent Systems, and The 13th International Symposium on Advanced Intelligence Systems, Pages 266-271), hereinafter “Keum”.
Claim 9 is met by the combination of Yang and Keum, wherein
Yang teaches:
The computing system of claim 1, wherein the one or more processing devices are configured to
Yang does not disclose the following; however, Keum teaches:
identify the second identified regions of interest via scale-invariant feature transform (SIFT) extraction (See Fig. 1, determination of ROI from a photo based on clustering of SIFT keypoints. Also see Fig. 6 (which parallels Fig. 1 in Yang).).
Yang and Keum together teach the limitations of claim 9. Keum is directed to a related field of art (determination of regions-of-interest in given images). Therefore, Yang and Keum are combinable. Yang identifies the second identified regions of interest in Figs. 1 and 5, where in Fig. 1, an image is captured and a ROI is identified, and then in Fig. 5, samples are separated into groups. However, Yang does not disclose how the initial ROI is determined from the captured image. Modifying the system and method of Yang by adding the capability to “identify the second identified regions of interest via scale-invariant feature transform (SIFT) extraction”, as taught by Keum, would yield the expected and predictable result of a complete ROI identification process. Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine Yang and Keum in this way.
Contact
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JONATHAN S LEE whose telephone number is (571)272-1981. The examiner can normally be reached 11:30 AM - 7:30 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Bee can be reached at (571)270-5183. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Jonathan S Lee/Primary Examiner, Art Unit 2677