DETAILED ACTION
Contents
Notice of Pre-AIA or AIA Status
Claim Rejections - 35 USC § 101
Claim Rejections - 35 USC § 102
Claim Rejections - 35 USC § 103
Allowable Subject Matter
Conclusion
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is responsive to applicant’s claim set received on 3/28/24. Claims 1-20 are currently pending.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-2, 10-11, and 15-16 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter, as follows. Regarding claims 1, 10, and 15, the claims are directed to an abstract idea, namely mathematical operations and information processing. The abstract idea is not integrated into a practical application, and the claims lack an inventive concept. Furthermore, claims 2, 11, and 16 are also directed to an abstract idea, specifically evaluating confidence data against thresholds to determine label retention. The dependent claims do not integrate the abstract idea into a practical application and do not recite an inventive concept. Thus, all of the listed claims are considered non-statutory subject matter.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless - (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1, 2 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Liu et al (ICLR: “UNBIASED TEACHER FOR SEMI-SUPERVISED OBJECT DETECTION”). Regarding claim 1, Liu discloses a training method of an object detection model, comprising: acquiring an input image, and determining an object pseudo label of the input image based on an object detection model, wherein the input image is labeled with a real label (see section 3, 3.2; label images …….Figure 3: Overview of Unbiased Teacher. Unbiased Teacher consists of two stages. Burn-In: we first train the object detector using available labeled data. Teacher-Student Mutual Learning consists of two steps. Student Learning: the fixed teacher generates pseudo-labels to train the Student, while Teacher and Student are given weakly and strongly augmented inputs, respectively. Teacher Refinement: the knowledge that the Student learned is then transferred to the slowly progressing Teacher via exponential moving average (EMA) on network weights. When the detector is trained until converge in the Burn-In stage, we switch to the Teacher-Student Mutual Learning stage. the Teacher generates pseudo-labels to train the Student, and the Student updates the knowledge it learned back to the Teacher; hence, the pseudo-labels used to train the Student itself are improved. Lastly, there exists class-imbalance and foreground-background imbalance problems in object detection, which impedes the effectiveness of semi-supervised techniques of image classification (e.g., pseudo-labeling) being used directly on SS-OD. Therefore, in Sec. 3.3, we also discuss how Focal loss (Lin et al., 2017b) and EMA training alleviate the imbalanced pseudo-label issue.);
acquiring a multi-object detection result of the input image based on an auxiliary detection model (see 3.1, fig. 3; It is important to have a good initialization for both Student and Teacher models, as we will rely on the Teacher to generate pseudo-labels to train the Student in the later stage. To do so, we first use the available supervised data to optimize our model θ with the supervised loss L_sup. With the supervised data D_s = {x_i^s, y_i^s}, i = 1, …, N_s, the supervised loss of object detection consists of four losses: the RPN classification loss L_cls^rpn, the RPN regression loss L_reg^rpn, the ROI classification loss L_cls^roi, and the ROI regression loss L_reg^roi (Ren et al., 2015): L_sup = Σ_i [L_cls^rpn(x_i^s, y_i^s) + L_reg^rpn(x_i^s, y_i^s) + L_cls^roi(x_i^s, y_i^s) + L_reg^roi(x_i^s, y_i^s)] (Eq. 1). After Burn-In, we duplicate the trained weights θ for both the Teacher and the Student models (θ_t ← θ, θ_s ← θ). Starting from this trained detector, we further utilize the unsupervised data to improve the object detector via the following proposed training regimen.);
calculating a first loss according to the multi-object detection result of the input image and the real label of the input image (see section 3.1; It is important to have a good initialization for both Student and Teacher models, as we will rely on the Teacher to generate pseudo-labels to train the Student in the later stage. To do so, we first use the available supervised data to optimize our model θ with the supervised loss L_sup. With the supervised data D_s = {x_i^s, y_i^s}, i = 1, …, N_s, the supervised loss of object detection consists of four losses: the RPN classification loss L_cls^rpn, the RPN regression loss L_reg^rpn, the ROI classification loss L_cls^roi, and the ROI regression loss L_reg^roi (Ren et al., 2015): L_sup = Σ_i [L_cls^rpn(x_i^s, y_i^s) + L_reg^rpn(x_i^s, y_i^s) + L_cls^roi(x_i^s, y_i^s) + L_reg^roi(x_i^s, y_i^s)] (Eq. 1). After Burn-In, we duplicate the trained weights θ for both the Teacher and the Student models (θ_t ← θ, θ_s ← θ). Starting from this trained detector, we further utilize the unsupervised data to improve the object detector via the following proposed training regimen.), and
calculating a second loss according to the multi-object detection result of the input image and the object pseudo label of the input image (see 3.2; Student Learning with Pseudo-Labeling. To address the lack of ground-truth labels for unsupervised data, we adapt the pseudo-labeling method to generate labels for training the Student with unsupervised data. This follows the principle of existing successful examples in semi-supervised image classification task (Lee, 2013; Sohn et al., 2020a). Similar to classification-based methods, to prevent the consecutively detrimental effect of noisy pseudo-labels (i.e., confirmation bias or error accumulation), we first set a confidence threshold δ of predicted bounding boxes to filter low-confidence predicted bounding boxes, which are more likely to be false positive samples. While the confidence threshold method have achieved tremendous success in the image classification, it is however not sufficient for object detection. This is because there also exist duplicated box predictions and imbalanced prediction issues in the SS-OD (we leave the discussion of the imbalanced prediction issue in Sec. 3.3). To address the duplicated boxes prediction issue, we remove the repetitive predictions by applying class-wise non-maximum suppression (NMS) before the use of confidence thresholding as performed in STAC (Sohn et al., 2020b). In addition, noisy pseudo-labels can affect the pseudo-label generation model (Teacher). As a result, we detach the Student and the Teacher. To be more specific, after obtaining the pseudo-labels from the Teacher, only the learnable weights of the Student model is updated via back-propagation: θ_s ← θ_s + γ ∂(L_sup + λ_u L_unsup)/∂θ_s, where L_unsup = Σ_i [L_cls^rpn(x_i^u, ŷ_i^u) + L_cls^roi(x_i^u, ŷ_i^u)] (Eq. 2). Note that we do not apply unsupervised losses for the bounding box regression since the naive confidence thresholding is not able to filter the pseudo-labels that are potentially incorrect for bounding box regression (because the confidence of predicted bounding boxes only indicate the confidence of predicted object categories instead of the quality of bounding box locations (Jiang et al., 2018)).); and
updating the auxiliary detection model according to the first loss and the second loss, and updating the object detection model based on the auxiliary detection model that has been updated (see 3.2; Teacher Refinement via Exponential Moving Average. To obtain more stable pseudo-labels, we apply EMA to gradually update the Teacher model. The slowly progressing Teacher model can be regarded as the ensemble of the Student models in different training iterations. θt ← αθt + (1 − α)θs. (3) This approach has been shown to be effective in many existing works, e.g., ADAM optimization (Kingma & Ba, 2015), Batch Normalization (Ioffe & Szegedy, 2015), self-supervised learning (He et al., 2020; Grill et al., 2020), and SSL image classification (Tarvainen & Valpola, 2017), while we, for the first time, demonstrate its effectiveness also in alleviating pseudo-labeling bias issue for SS-OD (see next section).).
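For illustration only (not part of the record), the cited update mechanisms — a weighted combination of the first (supervised) and second (pseudo-label) losses per Liu's Eq. (2), and Teacher Refinement via EMA per Liu's Eq. (3) — can be sketched in Python as follows; the function names and the λ_u value are the author's assumptions:

```python
def ema_update(teacher_w, student_w, alpha=0.999):
    # Teacher Refinement, Liu Eq. (3): theta_t <- alpha*theta_t + (1-alpha)*theta_s,
    # applied elementwise over the network weights (here, a dict of parameters).
    return {k: alpha * teacher_w[k] + (1.0 - alpha) * student_w[k]
            for k in teacher_w}

def combined_loss(first_loss, second_loss, lambda_u=4.0):
    # Weighted sum of the supervised ("first") loss and the pseudo-label
    # ("second") loss, mirroring L_sup + lambda_u * L_unsup in Liu Eq. (2).
    # lambda_u = 4.0 is an assumed illustrative weight, not taken from the claims.
    return first_loss + lambda_u * second_loss
```

The slowly progressing teacher produced by `ema_update` corresponds to the claimed "updating the object detection model based on the auxiliary detection model that has been updated."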
Regarding claim 2, Liu discloses determining a preselected pseudo label of the input image based on the object detection model, wherein the preselected pseudo label corresponds to a detection box confidence; determining a first confidence threshold corresponding to each object category in the input image; and in response to the detection box confidence corresponding to the preselected pseudo label being greater than a first confidence threshold of a corresponding object category, retaining the preselected pseudo label, and determining the object pseudo label of the input image (see 3.2, fig. 8, 9).
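The thresholding mapped to claim 2 can be sketched as follows (an illustrative Python sketch; the dict-based label representation is an assumption, not from Liu or the claims):

```python
def filter_pseudo_labels(preselected, first_thresholds):
    # Retain a preselected pseudo label only when its detection-box confidence
    # is greater than the first confidence threshold of its object category;
    # the survivors become the object pseudo labels of the input image.
    return [p for p in preselected
            if p["confidence"] > first_thresholds[p["category"]]]
```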
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimedinvention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 10-11 and 15-16 are rejected under 35 U.S.C. 103 as being unpatentable over Liu et al. (ICLR: “UNBIASED TEACHER FOR SEMI-SUPERVISED OBJECT DETECTION”) in view of Aoki et al. (US 2020/0394415 A1).
Regarding claim 10, Liu teaches a method comprising: acquiring an input image, and determining an object pseudo label of the input image based on an object detection model, wherein the input image is labeled with a real label (see section 3, 3.2; label images …….Figure 3: Overview of Unbiased Teacher. Unbiased Teacher consists of two stages. Burn-In: we first train the object detector using available labeled data. Teacher-Student Mutual Learning consists of two steps. Student Learning: the fixed teacher generates pseudo-labels to train the Student, while Teacher and Student are given weakly and strongly augmented inputs, respectively. Teacher Refinement: the knowledge that the Student learned is then transferred to the slowly progressing Teacher via exponential moving average (EMA) on network weights. When the detector is trained until converge in the Burn-In stage, we switch to the Teacher-Student Mutual Learning stage. the Teacher generates pseudo-labels to train the Student, and the Student updates the knowledge it learned back to the Teacher; hence, the pseudo-labels used to train the Student itself are improved. Lastly, there exists class-imbalance and foreground-background imbalance problems in object detection, which impedes the effectiveness of semi-supervised techniques of image classification (e.g., pseudo-labeling) being used directly on SS-OD. Therefore, in Sec. 3.3, we also discuss how Focal loss (Lin et al., 2017b) and EMA training alleviate the imbalanced pseudo-label issue.);
acquiring a multi-object detection result of the input image based on an auxiliary detection model (see 3.1, fig. 3; It is important to have a good initialization for both Student and Teacher models, as we will rely on the Teacher to generate pseudo-labels to train the Student in the later stage. To do so, we first use the available supervised data to optimize our model θ with the supervised loss L_sup. With the supervised data D_s = {x_i^s, y_i^s}, i = 1, …, N_s, the supervised loss of object detection consists of four losses: the RPN classification loss L_cls^rpn, the RPN regression loss L_reg^rpn, the ROI classification loss L_cls^roi, and the ROI regression loss L_reg^roi (Ren et al., 2015): L_sup = Σ_i [L_cls^rpn(x_i^s, y_i^s) + L_reg^rpn(x_i^s, y_i^s) + L_cls^roi(x_i^s, y_i^s) + L_reg^roi(x_i^s, y_i^s)] (Eq. 1). After Burn-In, we duplicate the trained weights θ for both the Teacher and the Student models (θ_t ← θ, θ_s ← θ). Starting from this trained detector, we further utilize the unsupervised data to improve the object detector via the following proposed training regimen.);
calculating a first loss according to the multi-object detection result of the input image and the real label of the input image (see section 3.1; It is important to have a good initialization for both Student and Teacher models, as we will rely on the Teacher to generate pseudo-labels to train the Student in the later stage. To do so, we first use the available supervised data to optimize our model θ with the supervised loss L_sup. With the supervised data D_s = {x_i^s, y_i^s}, i = 1, …, N_s, the supervised loss of object detection consists of four losses: the RPN classification loss L_cls^rpn, the RPN regression loss L_reg^rpn, the ROI classification loss L_cls^roi, and the ROI regression loss L_reg^roi (Ren et al., 2015): L_sup = Σ_i [L_cls^rpn(x_i^s, y_i^s) + L_reg^rpn(x_i^s, y_i^s) + L_cls^roi(x_i^s, y_i^s) + L_reg^roi(x_i^s, y_i^s)] (Eq. 1). After Burn-In, we duplicate the trained weights θ for both the Teacher and the Student models (θ_t ← θ, θ_s ← θ). Starting from this trained detector, we further utilize the unsupervised data to improve the object detector via the following proposed training regimen.), and
calculating a second loss according to the multi-object detection result of the input image and the object pseudo label of the input image (see 3.2; Student Learning with Pseudo-Labeling. To address the lack of ground-truth labels for unsupervised data, we adapt the pseudo-labeling method to generate labels for training the Student with unsupervised data. This follows the principle of existing successful examples in semi-supervised image classification task (Lee, 2013; Sohn et al., 2020a). Similar to classification-based methods, to prevent the consecutively detrimental effect of noisy pseudo-labels (i.e., confirmation bias or error accumulation), we first set a confidence threshold δ of predicted bounding boxes to filter low-confidence predicted bounding boxes, which are more likely to be false positive samples. While the confidence threshold method have achieved tremendous success in the image classification, it is however not sufficient for object detection. This is because there also exist duplicated box predictions and imbalanced prediction issues in the SS-OD (we leave the discussion of the imbalanced prediction issue in Sec. 3.3). To address the duplicated boxes prediction issue, we remove the repetitive predictions by applying class-wise non-maximum suppression (NMS) before the use of confidence thresholding as performed in STAC (Sohn et al., 2020b). In addition, noisy pseudo-labels can affect the pseudo-label generation model (Teacher). As a result, we detach the Student and the Teacher. To be more specific, after obtaining the pseudo-labels from the Teacher, only the learnable weights of the Student model is updated via back-propagation: θ_s ← θ_s + γ ∂(L_sup + λ_u L_unsup)/∂θ_s, where L_unsup = Σ_i [L_cls^rpn(x_i^u, ŷ_i^u) + L_cls^roi(x_i^u, ŷ_i^u)] (Eq. 2). Note that we do not apply unsupervised losses for the bounding box regression since the naive confidence thresholding is not able to filter the pseudo-labels that are potentially incorrect for bounding box regression (because the confidence of predicted bounding boxes only indicate the confidence of predicted object categories instead of the quality of bounding box locations (Jiang et al., 2018)).); and
updating the auxiliary detection model according to the first loss and the second loss, and updating the object detection model based on the auxiliary detection model that has been updated (see 3.2; Teacher Refinement via Exponential Moving Average. To obtain more stable pseudo-labels, we apply EMA to gradually update the Teacher model. The slowly progressing Teacher model can be regarded as the ensemble of the Student models in different training iterations. θt ← αθt + (1 − α)θs. (3) This approach has been shown to be effective in many existing works, e.g., ADAM optimization (Kingma & Ba, 2015), Batch Normalization (Ioffe & Szegedy, 2015), self-supervised learning (He et al., 2020; Grill et al., 2020), and SSL image classification (Tarvainen & Valpola, 2017), while we, for the first time, demonstrate its effectiveness also in alleviating pseudo-labeling bias issue for SS-OD (see next section).). Liu does not teach expressly an electronic device, comprising: one or more processors; and a storage apparatus on which one or more programs are stored, wherein the one or more programs, when executed by the one or more processors, enable the one or more processors to implement a training method of an object detection model, and the training method of an object detection model comprises.
Aoki, in the same field of endeavor, teaches an electronic device, comprising: one or more processors; and a storage apparatus on which one or more programs are stored, wherein the one or more programs, when executed by the one or more processors, enable the one or more processors to implement a training method of an object detection model, and the training method of an object detection model comprises (see 0061, 0062; The procedure described in the above exemplary embodiment can be realized by a program that causes a computer (9000 in FIG. 9) that functions as the prediction model generation apparatus or the index generation apparatus to realize the functions of these apparatus. Such a computer is exemplified by a configuration including a CPU (Central Processing Unit) 9010, a communication interface 9020, a memory 9030, and an auxiliary storage device 9040 as shown in FIG. 9. That is, the CPU 9010 in FIG. 9 may execute a pre-processing program, a machine learning program, and/or a post-processing program, and execute the update processing of the data stored in the auxiliary storage device 9040 or the like. Of course, an image processing processor called GPU (Graphics Processing Unit) may be used instead of the CPU 9010. [0062] That is, each part (processing means and functions) of the prediction model generation apparatus and the index generation apparatus explained in the above-described exemplary embodiments executes the above-described processes by using a hardware mounted on a processor mounted on these apparatuses can be realized by a computer program).
It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to modify Liu to utilize the cited limitations as suggested by Aoki. The suggestion/motivation for doing so would have been to reduce the burden of inspectors that review the videos (see 0013). Furthermore, the prior art collectively includes each element claimed (though not all in the same reference), and one of ordinary skill in the art could have combined the elements in the manner explained above using known engineering design, interface and/or programming techniques, without changing a “fundamental” operating principle of Liu, while the teaching of Aoki continues to perform the same function as originally taught prior to being combined, in order to produce a repeatable and predictable result. It is for at least the aforementioned reasons that the examiner has reached a conclusion of obviousness with respect to the claim in question.
Regarding claim 11, Liu discloses determining a preselected pseudo label of the input image based on the object detection model, wherein the preselected pseudo label corresponds to a detection box confidence; determining a first confidence threshold corresponding to each object category in the input image; and in response to the detection box confidence corresponding to the preselected pseudo label being greater than a first confidence threshold of a corresponding object category, retaining the preselected pseudo label, and determining the object pseudo label of the input image (see 3.2, fig. 8, 9).
Regarding claim 15, Liu teaches a method comprising: acquiring an input image, and determining an object pseudo label of the input image based on an object detection model, wherein the input image is labeled with a real label (see section 3, 3.2; label images …….Figure 3: Overview of Unbiased Teacher. Unbiased Teacher consists of two stages. Burn-In: we first train the object detector using available labeled data. Teacher-Student Mutual Learning consists of two steps. Student Learning: the fixed teacher generates pseudo-labels to train the Student, while Teacher and Student are given weakly and strongly augmented inputs, respectively. Teacher Refinement: the knowledge that the Student learned is then transferred to the slowly progressing Teacher via exponential moving average (EMA) on network weights. When the detector is trained until converge in the Burn-In stage, we switch to the Teacher-Student Mutual Learning stage. the Teacher generates pseudo-labels to train the Student, and the Student updates the knowledge it learned back to the Teacher; hence, the pseudo-labels used to train the Student itself are improved. Lastly, there exists class-imbalance and foreground-background imbalance problems in object detection, which impedes the effectiveness of semi-supervised techniques of image classification (e.g., pseudo-labeling) being used directly on SS-OD. Therefore, in Sec. 3.3, we also discuss how Focal loss (Lin et al., 2017b) and EMA training alleviate the imbalanced pseudo-label issue.);
acquiring a multi-object detection result of the input image based on an auxiliary detection model (see 3.1, fig. 3; It is important to have a good initialization for both Student and Teacher models, as we will rely on the Teacher to generate pseudo-labels to train the Student in the later stage. To do so, we first use the available supervised data to optimize our model θ with the supervised loss L_sup. With the supervised data D_s = {x_i^s, y_i^s}, i = 1, …, N_s, the supervised loss of object detection consists of four losses: the RPN classification loss L_cls^rpn, the RPN regression loss L_reg^rpn, the ROI classification loss L_cls^roi, and the ROI regression loss L_reg^roi (Ren et al., 2015): L_sup = Σ_i [L_cls^rpn(x_i^s, y_i^s) + L_reg^rpn(x_i^s, y_i^s) + L_cls^roi(x_i^s, y_i^s) + L_reg^roi(x_i^s, y_i^s)] (Eq. 1). After Burn-In, we duplicate the trained weights θ for both the Teacher and the Student models (θ_t ← θ, θ_s ← θ). Starting from this trained detector, we further utilize the unsupervised data to improve the object detector via the following proposed training regimen.);
calculating a first loss according to the multi-object detection result of the input image and the real label of the input image (see section 3.1; It is important to have a good initialization for both Student and Teacher models, as we will rely on the Teacher to generate pseudo-labels to train the Student in the later stage. To do so, we first use the available supervised data to optimize our model θ with the supervised loss L_sup. With the supervised data D_s = {x_i^s, y_i^s}, i = 1, …, N_s, the supervised loss of object detection consists of four losses: the RPN classification loss L_cls^rpn, the RPN regression loss L_reg^rpn, the ROI classification loss L_cls^roi, and the ROI regression loss L_reg^roi (Ren et al., 2015): L_sup = Σ_i [L_cls^rpn(x_i^s, y_i^s) + L_reg^rpn(x_i^s, y_i^s) + L_cls^roi(x_i^s, y_i^s) + L_reg^roi(x_i^s, y_i^s)] (Eq. 1). After Burn-In, we duplicate the trained weights θ for both the Teacher and the Student models (θ_t ← θ, θ_s ← θ). Starting from this trained detector, we further utilize the unsupervised data to improve the object detector via the following proposed training regimen.), and
calculating a second loss according to the multi-object detection result of the input image and the object pseudo label of the input image (see 3.2; Student Learning with Pseudo-Labeling. To address the lack of ground-truth labels for unsupervised data, we adapt the pseudo-labeling method to generate labels for training the Student with unsupervised data. This follows the principle of existing successful examples in semi-supervised image classification task (Lee, 2013; Sohn et al., 2020a). Similar to classification-based methods, to prevent the consecutively detrimental effect of noisy pseudo-labels (i.e., confirmation bias or error accumulation), we first set a confidence threshold δ of predicted bounding boxes to filter low-confidence predicted bounding boxes, which are more likely to be false positive samples. While the confidence threshold method have achieved tremendous success in the image classification, it is however not sufficient for object detection. This is because there also exist duplicated box predictions and imbalanced prediction issues in the SS-OD (we leave the discussion of the imbalanced prediction issue in Sec. 3.3). To address the duplicated boxes prediction issue, we remove the repetitive predictions by applying class-wise non-maximum suppression (NMS) before the use of confidence thresholding as performed in STAC (Sohn et al., 2020b). In addition, noisy pseudo-labels can affect the pseudo-label generation model (Teacher). As a result, we detach the Student and the Teacher. To be more specific, after obtaining the pseudo-labels from the Teacher, only the learnable weights of the Student model is updated via back-propagation: θ_s ← θ_s + γ ∂(L_sup + λ_u L_unsup)/∂θ_s, where L_unsup = Σ_i [L_cls^rpn(x_i^u, ŷ_i^u) + L_cls^roi(x_i^u, ŷ_i^u)] (Eq. 2). Note that we do not apply unsupervised losses for the bounding box regression since the naive confidence thresholding is not able to filter the pseudo-labels that are potentially incorrect for bounding box regression (because the confidence of predicted bounding boxes only indicate the confidence of predicted object categories instead of the quality of bounding box locations (Jiang et al., 2018)).); and
updating the auxiliary detection model according to the first loss and the second loss, and updating the object detection model based on the auxiliary detection model that has been updated (see 3.2; Teacher Refinement via Exponential Moving Average. To obtain more stable pseudo-labels, we apply EMA to gradually update the Teacher model. The slowly progressing Teacher model can be regarded as the ensemble of the Student models in different training iterations. θt ← αθt + (1 − α)θs. (3) This approach has been shown to be effective in many existing works, e.g., ADAM optimization (Kingma & Ba, 2015), Batch Normalization (Ioffe & Szegedy, 2015), self-supervised learning (He et al., 2020; Grill et al., 2020), and SSL image classification (Tarvainen & Valpola, 2017), while we, for the first time, demonstrate its effectiveness also in alleviating pseudo-labeling bias issue for SS-OD (see next section).). Liu does not teach expressly a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, causes the processor to perform operations comprising.
Aoki, in the same field of endeavor, teaches a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, causes the processor to perform operations comprising (see 0061, 0062; The procedure described in the above exemplary embodiment can be realized by a program that causes a computer (9000 in FIG. 9) that functions as the prediction model generation apparatus or the index generation apparatus to realize the functions of these apparatus. Such a computer is exemplified by a configuration including a CPU (Central Processing Unit) 9010, a communication interface 9020, a memory 9030, and an auxiliary storage device 9040 as shown in FIG. 9. That is, the CPU 9010 in FIG. 9 may execute a pre-processing program, a machine learning program, and/or a post-processing program, and execute the update processing of the data stored in the auxiliary storage device 9040 or the like. Of course, an image processing processor called GPU (Graphics Processing Unit) may be used instead of the CPU 9010. [0062] That is, each part (processing means and functions) of the prediction model generation apparatus and the index generation apparatus explained in the above-described exemplary embodiments executes the above-described processes by using a hardware mounted on a processor mounted on these apparatuses can be realized by a computer program).
It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to modify Liu to utilize the cited limitations as suggested by Aoki. The suggestion/motivation for doing so would have been to reduce the burden of inspectors that review the videos (see 0013). Furthermore, the prior art collectively includes each element claimed (though not all in the same reference), and one of ordinary skill in the art could have combined the elements in the manner explained above using known engineering design, interface and/or programming techniques, without changing a “fundamental” operating principle of Liu, while the teaching of Aoki continues to perform the same function as originally taught prior to being combined, in order to produce a repeatable and predictable result. It is for at least the aforementioned reasons that the examiner has reached a conclusion of obviousness with respect to the claim in question.
Regarding claim 16, Liu discloses determining a preselected pseudo label of the input image based on the object detection model, wherein the preselected pseudo label corresponds to a detection box confidence; determining a first confidence threshold corresponding to each object category in the input image; and in response to the detection box confidence corresponding to the preselected pseudo label being greater than a first confidence threshold of a corresponding object category, retaining the preselected pseudo label, and determining the object pseudo label of the input image (see 3.2, fig. 8, 9).
Allowable Subject Matter
Claims 3-9, 12-14, 17-20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Regarding claims 3, 12, 17, none of the references of record alone or in combination suggest or fairly teach determining a second confidence threshold corresponding to each object category in the input image, wherein the second confidence threshold is less than the first confidence threshold of the same object category; in response to the detection box confidence corresponding to the preselected pseudo label being greater than or equal to a second confidence threshold of a corresponding object category and less than or equal to the first confidence threshold of the same object category, taking the preselected pseudo label as an uncertain pseudo label; and in response to the detection box confidence corresponding to the preselected pseudo label being less than the second confidence threshold corresponding to each object category, taking the preselected pseudo label as a background pseudo label.
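Examiner's note (for illustration only, not part of the record): the dual-threshold scheme recited in claims 3, 12, and 17 partitions each preselected pseudo label into one of three outcomes. A minimal sketch, with hypothetical names and the second threshold strictly less than the first:

```python
def classify_pseudo_label(confidence, first_threshold, second_threshold):
    """Three-way split per the claimed dual-threshold scheme:
    above the first threshold -> retained object pseudo label;
    between the second and first thresholds (inclusive) -> uncertain
    pseudo label; below the second threshold -> background pseudo label."""
    assert second_threshold < first_threshold
    if confidence > first_threshold:
        return "object"
    if second_threshold <= confidence <= first_threshold:
        return "uncertain"
    return "background"
```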
Regarding claims 4, 13, 18, none of the references of record alone or in combination suggest or fairly teach a first preselected pseudo label and a second preselected pseudo label; an object category corresponding to the first preselected pseudo label belongs to a first category, and an object category corresponding to the second preselected pseudo label belongs to a second category; a sample proportion of the first category is greater than a sample proportion of the second category; and a first confidence threshold of the object category corresponding to the first preselected pseudo label is greater than a first confidence threshold of the object category corresponding to the second preselected pseudo label.
Regarding claims 5, 14, 19, none of the references of record alone or in combination suggest or fairly teach wherein the determining the first confidence threshold corresponding to each object category in the input image comprises: calculating an entropy of the preselected pseudo label of the input image; calculating an average entropy of each object category in the input image according to the entropy of the preselected pseudo label; and calculating the first confidence threshold corresponding to each object category in the input image according to the average entropy of each object category in the input image.
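Examiner's note (for illustration only, not part of the record): claims 5, 14, and 19 recite computing per-category first confidence thresholds from the average entropy of the preselected pseudo labels. The sketch below uses the standard Shannon entropy; the final mapping from average entropy to threshold is purely illustrative, as the claims do not fix a particular formula.

```python
import math

def entropy(probs):
    """Shannon entropy of a class-probability vector (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def first_thresholds_from_entropy(labels_by_category, base=0.9):
    """One possible reading of the claimed steps: average the entropy of
    each category's preselected pseudo labels, then map higher average
    entropy (more uncertainty) to a higher threshold. The mapping is a
    hypothetical stand-in, not the applicant's formula."""
    thresholds = {}
    for category, prob_vectors in labels_by_category.items():
        avg_h = sum(entropy(v) for v in prob_vectors) / len(prob_vectors)
        thresholds[category] = min(1.0, base * (1.0 - math.exp(-avg_h)) + 0.5)
    return thresholds
```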
Regarding claims 6-8, 20, none of the references of record alone or in combination suggest or fairly teach wherein the auxiliary detection model comprises a feature extraction network, and the method further comprises: acquiring a feature map of the input image extracted by the feature extraction network; inputting the feature map into a global classification module to acquire a global classification result of the input image; and acquiring a third loss according to the global classification result and a global classification label; and the updating the auxiliary detection model according to the first loss and the second loss comprises: updating the auxiliary detection model according to the first loss, the second loss and the third loss.
Regarding claim 9, none of the references of record alone or in combination suggest or fairly teach wherein acquiring the multi-object detection result of the input image based on the auxiliary detection model comprises: acquiring at least two datasets, wherein real labels of images in different datasets correspond to different object categories; determining any first image from the at least two datasets, and respectively calculating a similarity between the first image and any of remaining images in the at least two datasets except the first image; determining a preset number of second images satisfying a low similarity condition from the remaining images in the at least two datasets; synthesizing the first image and the preset number of second images to acquire a third image; and determining the first image as the input image, and inputting the third image into the auxiliary detection model to acquire the multi-object detection result of the input image.
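Examiner's note (for illustration only, not part of the record): the selection step of claim 9 picks a preset number of second images satisfying a low similarity condition relative to the first image. A minimal sketch, reading "low similarity condition" as the smallest similarity scores; the similarity function itself is an assumption supplied by the caller:

```python
def select_low_similarity(first_image, candidates, similarity, preset_number):
    """Rank the candidate images by similarity to `first_image`
    (ascending) and return the `preset_number` least similar ones."""
    ranked = sorted(candidates, key=lambda img: similarity(first_image, img))
    return ranked[:preset_number]
```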
Conclusion
Claims 1-2, 10-11, 15-16 are rejected. Claims 3-9, 12-14, 17-20 are objected to as being dependent upon a rejected base claim.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to EDWARD PARK. The examiner’s contact information is as follows:
Telephone: (571) 270-1576 | Fax: (571) 270-2576 | Email: Edward.Park@uspto.gov
For email communications, please note MPEP 502.03, which outlines procedures pertaining to communications via the internet and authorization. A sample authorization form is provided within MPEP 502.03, section II.
The examiner can normally be reached on M-F 9-6 CST.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Moyer, can be reached on (571) 272-9523. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/EDWARD PARK/
Primary Examiner, Art Unit 2666