DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-20 are pending in this application.
Priority
Acknowledgment is made of applicant's claim for foreign priority based on an application filed in Korea on 08/29/2023. It is noted, however, that applicant has not filed a certified copy of the foreign application as required by 37 CFR 1.55. According to the Filing Receipt filed on 04/11/2024, the Office has not received a statement under 37 CFR 1.55 regarding application KR10-2023-0114013.
The Office website provides additional information concerning the priority document exchange program (www.uspto.gov/PatentsPDX). This information includes the intellectual property offices that participate in the priority document exchange program, as well as the information necessary for each participating foreign intellectual property office to provide the Office with access to the foreign application.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 2, 5, 6, 7, 9, 10-12, 15, 16, 17, 19, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Azarian Yazdi et al. (U.S. Publication No. 2024/0078797 A1), hereinafter Azarian Yazdi in view of Kim et al. (KR 10-2353336 B1, see attached English translation for citations), hereinafter Kim.
Regarding claim 1, Azarian Yazdi teaches an apparatus (Azarian Yazdi teaches a “computing device or apparatus” in para. [0126] and FIGS. 1A and 1B) comprising:
one or more processors (Azarian Yazdi teaches “one or more processors” within an apparatus in para. [0126]); and
memory storing instructions that, when executed by the one or more processors (Azarian Yazdi teaches “example system 800 includes at least one processing unit (CPU or processor) 810 and connection 805 that communicatively couples various system components including system memory 815” in para. [0129]), cause the apparatus to:
recognize, via an interest network, an object of interest corresponding to a predetermined class by learning an image provided from a vehicle (Azarian Yazdi teaches “the object detection engine 508 (and/or the keypoint detection engine 506, as discussed above) is configured to process the features from the feature extraction engine 504 to detect one or more objects in the input image 502” in para. [0100]; see also that “the segmentation mask may be configured to keep information only for pixels corresponding to one or more classes (e.g., for pixels classified as a person), isolating the selected classified pixels from other classified pixels (e.g., isolating person pixels from background pixels)” in para. [0028], wherein the segmentation mask may be implemented in a vehicle because “the systems and techniques described herein may be implemented by any type of system or device. One illustrative example of a system that can be used to implement the systems and techniques described herein is a vehicle (e.g., an autonomous or semi-autonomous vehicle)” as shown in para. [0034]. See also FIG. 2A. Here, keeping information only for pixels corresponding to specific classes is interpreted as only recognizing objects corresponding to a predetermined class); and
obtain, via the interest network, one or more reliability scores indicating reliability with which the object of interest is recognized (Azarian Yazdi teaches that “a score is computed for an object’s set of keypoints collectively” in para. [0105] and “the system 500 can select bounding boxes from the object detection engine 508 for which the system 500 has high confidence (e.g., a confidence or objectness-score the object detection engine 508 estimates for each of the detected bounding box estimates is greater than or equal to a bounding box confidence threshold)” in para. [0106]);
perform, via an auxiliary network, a learning process associated with the image and detect, in the image, an occlusive object that affects a learning result of the interest network (Azarian Yazdi teaches “variants of mask-RCNN can be used that predict if a keypoint is visible or occluded” in para. [0116]; see also that “the keypoint detection engine 506 can predict if each keypoint is visible or occluded, which may lead to system improvement (e.g., generating more accurate pseudo-labels)” in para. [0105]. Here, the variant(s) of the mask-RCNN is interpreted as the auxiliary network. Additionally, the presence of an occluded object inherently implies the detection of an occlusive object); and
determine whether to label the image based on:
the one or more reliability scores (Azarian Yazdi teaches that “if a cell of the heatmap grid is greater than (or equal to in some cases) the individual keypoint score threshold, the cell of the heatmap can be used to generate a keypoint pseudo-label. For instance, a value of 1 can be added to the cell of the heatmap and used as a keypoint pseudo-label. Any grid that is determined to be less than (or equal to in some cases) can be disregarded and not used as a keypoint pseudo-label. In some cases, the keypoint detection engine 506 can predict if each keypoint is visible or occluded, which may lead to system improvement (e.g., generating more accurate pseudo-labels)” in para. [0105]; here, the individual keypoint score(s) are interpreted as the one or more reliability scores, and there inherently exists a scenario wherein parts of the detected object are apparent as shown in para. [0105]. See also para. [0106], which explains a method of determining whether to label the image based on scoring related to bounding boxes. Lastly, para. [0107] teaches replacing the ground truth with the above pseudo-labels in the training process).
Azarian Yazdi fails to specifically teach determine whether to label the image based on:
whether the occlusive object is detected.
However, Kim teaches determine whether to label the image based on: whether the occlusive object is detected (Kim teaches “when the recognition rate of the neural network model according to the identification result is less than a preset first threshold value, the processor 120 may identify an obstructive object related to the recognition rate of the neural network model among a plurality of objects” in para. [0050], but “if the recognition rate of the neural network model according to the identification result is greater than or equal to a preset first threshold value, it can be said that the port structure included in the first image is recognized at a satisfactory level, so do not perform the additional neural network model learning process” in para. [0051]. Here, while labelling may occur in the additional neural network model learning process, the process of determining to perform further learning/processing based on the detection of an obstructive (occlusive) object can be combined with the pseudo-labelling process as taught by Azarian Yazdi to teach the above claim limitation. Here, both the variant of the mask-RCNN as taught by Azarian Yazdi in para. [0116] and the additional neural network learning process as taught by Kim in para. [0051] may be interpreted as the claimed auxiliary network).
Azarian Yazdi and Kim are both considered to be analogous to the claimed invention because they are in the same field of determining whether to utilize images as training data. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Azarian Yazdi to incorporate the teachings of Kim and “determine whether to label the image based on: whether the occlusive object is detected”. The motivation for doing so would have been “to improve the recognition rate of the neural network model”, as suggested by Kim in para. [0052]. Therefore, it would have been obvious to one of ordinary skill at the time the invention was filed to combine Azarian Yazdi with Kim to obtain the invention specified in claim 1.
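For illustration only, the combination relied upon above can be sketched as follows. The sketch is not taken from either reference: the function names, the data layout, and the default threshold values are assumptions chosen for readability. It shows score thresholding of the kind quoted from Azarian Yazdi in para. [0105]-[0106] (cells or boxes at or above a threshold are kept as pseudo-labels, the rest are disregarded) alongside the gating quoted from Kim in para. [0050]-[0051] (additional learning is performed only when the recognition rate falls below a preset first threshold value).

```python
from typing import Dict, List, Tuple


def keypoint_pseudo_labels(
    heatmap: List[List[float]], score_threshold: float
) -> List[Tuple[int, int]]:
    """Keep (row, col) cells whose score meets the threshold; disregard the rest."""
    return [
        (r, c)
        for r, row in enumerate(heatmap)
        for c, score in enumerate(row)
        if score >= score_threshold
    ]


def box_pseudo_labels(
    boxes: List[Tuple[str, float]], box_threshold: float
) -> List[str]:
    """Keep bounding boxes whose confidence meets the bounding box threshold."""
    return [name for name, confidence in boxes if confidence >= box_threshold]


def needs_additional_learning(recognition_rate: float, first_threshold: float) -> bool:
    """Kim-style gate: below the first threshold, an obstructive object is
    identified and additional learning is performed; otherwise it is skipped."""
    return recognition_rate < first_threshold


def decide(
    heatmap: List[List[float]],
    boxes: List[Tuple[str, float]],
    recognition_rate: float,
    score_threshold: float = 0.5,   # hypothetical value
    box_threshold: float = 0.5,     # hypothetical value
    first_threshold: float = 0.8,   # hypothetical value
) -> Dict[str, object]:
    """Combine both checks into one labeling decision record."""
    return {
        "keypoint_labels": keypoint_pseudo_labels(heatmap, score_threshold),
        "box_labels": box_pseudo_labels(boxes, box_threshold),
        "additional_learning": needs_additional_learning(
            recognition_rate, first_threshold
        ),
    }
```

Under these assumptions, an image whose scores clear both thresholds receives pseudo-labels, while a low recognition rate separately triggers the additional learning path attributed to Kim.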
Regarding claim 2, Azarian Yazdi and Kim teach the apparatus of claim 1, wherein the instructions, when executed by the one or more processors, cause the apparatus to determine whether to label the image based on the one or more reliability scores by:
determining whether to label the image based on:
a first reliability score indicating accuracy of segmentation of the image (Azarian Yazdi teaches “the keypoint pseudo-label generation engine 620 may utilize individual keypoint thresholding (e.g., a keypoint scoring above a score threshold is accepted) or object-based keypoints thresholding (e.g., a score is computed for an object's set of keypoints collectively). In some examples, object-based keypoints thresholding can include person-based keypoints thresholding, such as using a score computed for all keypoints of a person class collectively” in para. [0105]), and
a second reliability score indicating accuracy of detecting the object of interest being output into a bounding box (Azarian Yazdi teaches a process of determining bounding boxes for the object(s) in para. [0100] wherein a confidence score is determined for the bounding boxes as shown in para. [0106], wherein “the keypoint pseudo-label generation engine 620 and/or the bounding box pseudo-label generation engine 622 may include a function (e.g., a neural network, a software program, or other function) or separate functions that are configured (e.g., trained etc.) to generate pseudo-labels” in para. [0104]).
Regarding claim 5, Azarian Yazdi and Kim teach the apparatus of claim 2, wherein the instructions, when executed by the one or more processors, cause the apparatus to determine whether to label the image by:
determining the first reliability score and the second reliability score based on recognition of the occlusive object through learning of the auxiliary network (Kim teaches “the processor 120 calculates a score related to each of the identification result according to the neural network model, the information obtained through the sensor 130 and the arrangement relationship between the plurality of objects, and based on the weighted sum of the calculated scores” in para. [0060]. Here, Kim teaches at least a first and second score. Azarian Yazdi additionally teaches the first and second reliability scores and the auxiliary network as taught in claims 1 and 2); and
determining to label the image for learning of the auxiliary network, based on at least one of the first reliability score or the second reliability score being less than a threshold value (While Azarian Yazdi teaches assigning pseudo-labels to images based on at least one of the first reliability score or the second reliability score being greater than a threshold value (which are then used as ground truth labels) in para. [0105]-[0107], Kim teaches determining to “perform the additional neural network model learning process” based on a first score being less than a threshold value. Here, both the variant of the mask-RCNN as taught by Azarian Yazdi in para. [0116] and the additional neural network learning process as taught by Kim in para. [0051] may be interpreted as the claimed auxiliary network). Similar motivations as applied to claim 1 can be applied here to claim 5.
Regarding claim 6, Azarian Yazdi and Kim teach the apparatus of claim 2, wherein the instructions, when executed by the one or more processors, cause the apparatus to determine whether to label the image by:
excluding the image from training data of the interest network and the auxiliary network based on the first reliability score and the second reliability score being greater than or equal to a threshold value (Azarian Yazdi teaches “any grid that is determined to be less than (or equal to in some cases) can be disregarded and not used as a keypoint pseudo-label” in para. [0105] (first reliability score). Here, the data being disregarded may be an entire image in some cases. Additionally, para. [0106] recites that objects that are erroneously detected are not given pseudo-labels (second reliability score). Additionally, since only images with pseudo-labels are used to train the models (see para. [0107]), it is inherent that an image with no pseudo-labels is excluded from training data of the system (which includes the interest/auxiliary network(s)). Additionally, Kim specifically teaches excluding an image from the training data if the recognition rate is greater than or equal to a preset first threshold value in para. [0050]-[0051]). Similar motivations as applied to claim 1 can be applied here to claim 6.
Regarding claim 7, Azarian Yazdi and Kim teach the apparatus of claim 2, wherein the instructions, when executed by the one or more processors, cause the apparatus to determine whether to label the image by:
determining the first reliability score based on the occlusive object being not detected by learning of the auxiliary network (Kim teaches “the processor 120 calculates a score related to each of the identification result according to the neural network model, the information obtained through the sensor 130 and the arrangement relationship between the plurality of objects, and based on the weighted sum of the calculated scores” in para. [0060]. Here, Kim teaches at least a first score. Azarian Yazdi additionally teaches the first reliability score and the auxiliary network as taught in claims 1 and 2); and
determining to label the image for learning of the auxiliary network based on the first reliability score being less than a threshold value (While Azarian Yazdi teaches assigning pseudo-labels to images based on at least one of the first reliability score or the second reliability score being greater than a threshold value (which are then used as ground truth labels) in para. [0105]-[0107], Kim teaches determining to “perform the additional neural network model learning process” based on a first score being less than a threshold value in para. [0051]. Here, the variant of the mask-RCNN as taught by Azarian Yazdi in para. [0116], the labelling process as taught by Azarian Yazdi in para. [0105]-[0107], and the additional neural network learning process as taught by Kim in para. [0051] may be combined to cover the scope of the claimed learning of the auxiliary network). Similar motivations as applied to claim 1 can be applied here to claim 7.
Regarding claim 9, Azarian Yazdi and Kim teach the apparatus of claim 2, wherein the instructions, when executed by the one or more processors, further cause the apparatus to determine whether to label the image by:
determine to label the image for learning the interest network based on the first reliability score and the second reliability score being greater than or equal to a threshold value (Azarian Yazdi teaches “the original ground truth values (or labels) used in the loss function may be replaced with the pseudo-labels” in para. [0107] wherein the pseudo-labels are only determined based on both the first and second reliability scores being greater than a threshold as shown in para. [0105] and [0106], respectively).
Regarding claim 10, Azarian Yazdi and Kim teach the apparatus of claim 2, wherein the instructions, when executed by the one or more processors, further cause the apparatus to
determine whether the occlusive object is detected based on the first reliability score and the second reliability score being less than a threshold value (Azarian Yazdi teaches the first and second reliability scores (see claim 2). Kim additionally teaches determining whether an occlusive object is detected based on a reliability score being less than a threshold value in para. [0050]. Here, the teaching of two reliability scores as taught by Azarian Yazdi can be combined with Kim’s teaching of determining whether an occlusive object is detected based on a reliability score being less than a threshold value to teach the above limitation). Similar motivations as applied to claim 1 can be applied here to claim 10.
Regarding claim 11, Azarian Yazdi teaches a method comprising:
recognizing, via an interest network, an object of interest corresponding to a predetermined class by learning an image from a vehicle (Azarian Yazdi teaches “the object detection engine 508 (and/or the keypoint detection engine 506, as discussed above) is configured to process the features from the feature extraction engine 504 to detect one or more objects in the input image 502” in para. [0100]; see also that “the segmentation mask may be configured to keep information only for pixels corresponding to one or more classes (e.g., for pixels classified as a person), isolating the selected classified pixels from other classified pixels (e.g., isolating person pixels from background pixels)” in para. [0028], wherein the segmentation mask may be implemented in a vehicle because “the systems and techniques described herein may be implemented by any type of system or device. One illustrative example of a system that can be used to implement the systems and techniques described herein is a vehicle (e.g., an autonomous or semi-autonomous vehicle)” as shown in para. [0034]. See also FIG. 2A. Here, keeping information only for pixels corresponding to specific classes is interpreted as only recognizing objects corresponding to a predetermined class);
obtaining, via the interest network, one or more reliability scores with which the object of interest is recognized (Azarian Yazdi teaches that “a score is computed for an object’s set of keypoints collectively” in para. [0105] and “the system 500 can select bounding boxes from the object detection engine 508 for which the system 500 has high confidence (e.g., a confidence or objectness-score the object detection engine 508 estimates for each of the detected bounding box estimates is greater than or equal to a bounding box confidence threshold)” in para. [0106]);
performing, via an auxiliary network, a learning process associated with the image and detecting, in the image, an occlusive object that affects a learning result of the interest network (Azarian Yazdi teaches “variants of mask-RCNN can be used that predict if a keypoint is visible or occluded” in para. [0116]; see also that “the keypoint detection engine 506 can predict if each keypoint is visible or occluded, which may lead to system improvement (e.g., generating more accurate pseudo-labels)” in para. [0105]. Here, the variant(s) of the mask-RCNN is interpreted as the auxiliary network. Additionally, the presence of an occluded object inherently implies the detection of an occlusive object); and
determining whether to label the image based on:
the one or more reliability scores (Azarian Yazdi teaches that “if a cell of the heatmap grid is greater than (or equal to in some cases) the individual keypoint score threshold, the cell of the heatmap can be used to generate a keypoint pseudo-label. For instance, a value of 1 can be added to the cell of the heatmap and used as a keypoint pseudo-label. Any grid that is determined to be less than (or equal to in some cases) can be disregarded and not used as a keypoint pseudo-label. In some cases, the keypoint detection engine 506 can predict if each keypoint is visible or occluded, which may lead to system improvement (e.g., generating more accurate pseudo-labels)” in para. [0105]; here, the individual keypoint score(s) are interpreted as the one or more reliability scores, and there inherently exists a scenario wherein parts of the detected object are apparent as shown in para. [0105]. See also para. [0106], which explains a method of determining whether to label the image based on scoring related to bounding boxes. Lastly, para. [0107] teaches replacing the ground truth with the above pseudo-labels in the training process).
Azarian Yazdi fails to specifically teach determine whether to label the image based on:
whether the occlusive object is detected.
However, Kim teaches determine whether to label the image based on: whether the occlusive object is detected (Kim teaches “when the recognition rate of the neural network model according to the identification result is less than a preset first threshold value, the processor 120 may identify an obstructive object related to the recognition rate of the neural network model among a plurality of objects” in para. [0050], but “if the recognition rate of the neural network model according to the identification result is greater than or equal to a preset first threshold value, it can be said that the port structure included in the first image is recognized at a satisfactory level, so do not perform the additional neural network model learning process” in para. [0051]. Here, while labelling may occur in the additional neural network model learning process, the process of determining to perform further learning/processing based on the detection of an obstructive (occlusive) object can be combined with the pseudo-labelling process as taught by Azarian Yazdi to teach the above claim limitation. Here, both the variant of the mask-RCNN as taught by Azarian Yazdi in para. [0116] and the additional neural network learning process as taught by Kim in para. [0051] may be interpreted as the claimed auxiliary network).
Azarian Yazdi and Kim are both considered to be analogous to the claimed invention because they are in the same field of determining whether to utilize images as training data. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Azarian Yazdi to incorporate the teachings of Kim and “determine whether to label the image based on: whether the occlusive object is detected”. The motivation for doing so would have been “to improve the recognition rate of the neural network model”, as suggested by Kim in para. [0052]. Therefore, it would have been obvious to one of ordinary skill at the time the invention was filed to combine Azarian Yazdi with Kim to obtain the invention specified in claim 11.
Regarding claim 12, Azarian Yazdi and Kim teach the method of claim 11, wherein the recognizing of the object of interest comprises:
determining a first reliability score indicating accuracy of segmentation (Azarian Yazdi teaches “the keypoint pseudo-label generation engine 620 may utilize individual keypoint thresholding (e.g., a keypoint scoring above a score threshold is accepted) or object-based keypoints thresholding (e.g., a score is computed for an object's set of keypoints collectively). In some examples, object-based keypoints thresholding can include person-based keypoints thresholding, such as using a score computed for all keypoints of a person class collectively” in para. [0105]); and
determining a second reliability score indicating accuracy of detecting the object of interest being output into a bounding box (Azarian Yazdi teaches a process of determining bounding boxes for the object(s) in para. [0100] wherein a confidence score is determined for the bounding boxes as shown in para. [0106], wherein “the keypoint pseudo-label generation engine 620 and/or the bounding box pseudo-label generation engine 622 may include a function (e.g., a neural network, a software program, or other function) or separate functions that are configured (e.g., trained etc.) to generate pseudo-labels” in para. [0104]).
Regarding claim 15, Azarian Yazdi and Kim teach the method of claim 12, wherein the determining of whether to label the image comprises:
determining the first reliability score and the second reliability score based on recognition of the occlusive object through learning of the auxiliary network (Kim teaches “the processor 120 calculates a score related to each of the identification result according to the neural network model, the information obtained through the sensor 130 and the arrangement relationship between the plurality of objects, and based on the weighted sum of the calculated scores” in para. [0060]. Here, Kim teaches at least a first and second score. Azarian Yazdi additionally teaches the first and second reliability scores and the auxiliary network as taught in claims 11 and 12); and
determining to label the image for learning of the auxiliary network, based on at least one of the first reliability score or the second reliability score being less than a threshold value (While Azarian Yazdi teaches assigning pseudo-labels to images based on at least one of the first reliability score or the second reliability score being greater than a threshold value (which are then used as ground truth labels) in para. [0105]-[0107], Kim teaches determining to “perform the additional neural network model learning process” based on a first score being less than a threshold value. Here, both the variant of the mask-RCNN as taught by Azarian Yazdi in para. [0116] and the additional neural network learning process as taught by Kim in para. [0051] may be interpreted as the claimed auxiliary network). Similar motivations as applied to claim 11 can be applied here to claim 15.
Regarding claim 16, Azarian Yazdi and Kim teach the method of claim 12, wherein the determining of whether to label the image comprises:
excluding the image from training data of the interest network and the auxiliary network based on the first reliability score and the second reliability score being greater than or equal to a threshold value (Azarian Yazdi teaches “any grid that is determined to be less than (or equal to in some cases) can be disregarded and not used as a keypoint pseudo-label” in para. [0105] (first reliability score). Here, the data being disregarded may be an entire image in some cases. Additionally, para. [0106] recites that objects that are erroneously detected are not given pseudo-labels (second reliability score). Additionally, since only images with pseudo-labels are used to train the models (see para. [0107]), it is inherent that an image with no pseudo-labels is excluded from training data of the system (which includes the interest/auxiliary network(s)). Additionally, Kim specifically teaches excluding an image from the training data if the recognition rate is greater than or equal to a preset first threshold value in para. [0050]-[0051]). Similar motivations as applied to claim 11 can be applied here to claim 16.
Regarding claim 17, Azarian Yazdi and Kim teach the method of claim 12, wherein the determining of whether to label the image comprises:
determining the first reliability score based on the occlusive object being not detected by learning of the auxiliary network (Kim teaches “the processor 120 calculates a score related to each of the identification result according to the neural network model, the information obtained through the sensor 130 and the arrangement relationship between the plurality of objects, and based on the weighted sum of the calculated scores” in para. [0060]. Here, Kim teaches at least a first score. Azarian Yazdi additionally teaches the first reliability score and the auxiliary network as taught in claims 11 and 12); and
determining to label the image for learning of the auxiliary network based on the first reliability score being less than a threshold value (While Azarian Yazdi teaches assigning pseudo-labels to images based on at least one of the first reliability score or the second reliability score being greater than a threshold value (which are then used as ground truth labels) in para. [0105]-[0107], Kim teaches determining to “perform the additional neural network model learning process” based on a first score being less than a threshold value in para. [0051]. Here, both the variant of the mask-RCNN as taught by Azarian Yazdi in para. [0116] and the additional neural network learning process as taught by Kim in para. [0051] may be interpreted as the claimed auxiliary network). Similar motivations as applied to claim 11 can be applied here to claim 17.
Regarding claim 19, Azarian Yazdi and Kim teach the method of claim 12, wherein the determining of whether to label the image comprises:
determining to label the image for learning the interest network based on the first reliability score and the second reliability score being greater than or equal to a threshold value (Azarian Yazdi teaches “the original ground truth values (or labels) used in the loss function may be replaced with the pseudo-labels” in para. [0107] wherein the pseudo-labels are only determined based on both the first and second reliability scores being greater than a threshold as shown in para. [0105] and [0106], respectively).
Regarding claim 20, Azarian Yazdi and Kim teach the method of claim 12,
wherein the determining of the first reliability score and the second reliability score is performed before the detecting of the occlusive object (Azarian Yazdi teaches the first and second reliability score in claim 12, wherein in the case of Kim, Kim teaches calculating a reliability score before comparing said “score” with a threshold to determine if there exists an occlusive object in para. [0050]), and
wherein the detecting of the occlusive object comprises detecting the occlusive object based on the first reliability score and the second reliability score being less than a threshold value (Azarian Yazdi teaches the first and second reliability scores (see claim 12). Kim additionally teaches determining whether an occlusive object is detected based on a reliability score being less than a threshold value in para. [0050]. Here, the teaching of two reliability scores as taught by Azarian Yazdi can be combined with Kim’s teaching of determining whether an occlusive object is detected based on a reliability score being less than a threshold value to teach the above limitation). Similar motivations as applied to claim 11 can be applied here to claim 20.
Claims 3 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Azarian Yazdi et al. (U.S. Publication No. 2024/0078797 A1), hereinafter Azarian Yazdi in view of Kim et al. (KR 10-2353336 B1, see attached English translation for citations), hereinafter Kim, Tonogai et al. (U.S. Publication No. 2023/0364795 A1), hereinafter Tonogai and *THE INVENTOR HAS WAIVED THE RIGHT TO BE CITED* (applicant: Hiscene Shanghai Information Tech Co LTD) (CN 108765481 A, see attached English translation), hereinafter Hiscene.
Regarding claim 3, Azarian Yazdi and Kim teach the apparatus of claim 2.
Azarian Yazdi and Kim fail to teach to obtain entropy of each pixel of the image; determine, based on the entropy of each pixel of the image, a representative entropy value; and determine the first reliability score, wherein the first reliability score is inversely proportional to the representative entropy value.
However, Tonogai teaches to obtain entropy of each pixel of the image (Tonogai teaches that “the entropy of a certain range is the sum or the representative value (the average, median, or mode) of entropy of all voxels in the range” in para. [0105]; here, it is inherent that a range could include an entire image);
determine, based on the entropy of each pixel of the image, a representative entropy value (Tonogai teaches “the entropy of a certain range is the sum or the representative value (the average, median, or mode) of entropy of all voxels in the range” in para. [0105]. Here, the average/median/mode entropy is interpreted as equivalent to the representative entropy value).
Azarian Yazdi, Kim, and Tonogai are all considered to be analogous to the claimed invention because they are in the same field of object detection in images. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Azarian Yazdi (as modified by Kim) to incorporate the teachings of Tonogai and include “to obtain entropy of each pixel of the image; determine, based on the entropy of each pixel of the image, a representative entropy value”. The motivation for doing so would have been to “indicat[e] the reliability of determination as to whether an object is included in the range” in order to determine whether to expand the viewing range, as suggested by Tonogai in para. [0113]. Therefore, it would have been obvious to one of ordinary skill at the time the invention was filed to combine Azarian Yazdi and Kim with Tonogai to obtain the invention specified in the above claim limitations.
Azarian Yazdi, Kim, and Tonogai fail to teach determine the first reliability score, wherein the first reliability score is inversely proportional to the representative entropy value.
However, Hiscene teaches determine the first reliability score, wherein the first reliability score is inversely proportional to the representative entropy value (Hiscene teaches “that the initial uncertainty u(x) is inversely proportional to the initial confidence c(x), that is, the higher the initial confidence, the lower the initial uncertainty calculated” in para. [0062]. Here, it would have been obvious to combine the averaging method as taught by Tonogai with the inversely proportional relationship as taught by Hiscene to teach a representative entropy value which is inversely proportional to the confidence (reliability score). See also Azarian Yazdi’s teaching of the first reliability score in claim 2).
Azarian Yazdi, Kim, Tonogai, and Hiscene are all considered to be analogous to the claimed invention because they are in the same field of object detection in images. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Azarian Yazdi (as modified by Kim and Tonogai) to incorporate the teachings of Hiscene and include “determine the first reliability score, wherein the first reliability score is inversely proportional to the representative entropy value”. The motivation for doing so would have been to “increase the precision of prediction of depth map, and [] obtain the uncertainty distribution of depth map”, as suggested by Hiscene in the abstract. Therefore, it would have been obvious to one of ordinary skill at the time the invention was filed to combine Azarian Yazdi, Kim, and Tonogai with Hiscene to obtain the invention specified in claim 3.
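For illustration only, the three limitations of claim 3 discussed above can be sketched in code. This is a minimal sketch under stated assumptions and is not taken from any cited reference: it assumes each pixel carries a class-probability vector, uses Shannon entropy per pixel, takes the mean as the representative value (Tonogai para. [0105] also names the median and mode), and maps entropy to a score with a hypothetical scale / (H + eps) relationship. All function and parameter names are hypothetical.

```python
import math
from statistics import mean


def pixel_entropy(probs):
    """Shannon entropy (in nats) of one pixel's class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def representative_entropy(prob_map):
    """Mean per-pixel entropy over the image.

    prob_map is a list of rows; each row is a list of probability vectors.
    The mean is an assumed choice of representative value.
    """
    return mean(pixel_entropy(p) for row in prob_map for p in row)


def first_reliability_score(prob_map, scale=1.0, eps=1e-6):
    """Score that falls as the representative entropy rises.

    eps is an assumed guard against division by zero for a fully
    confident image; scale is an assumed proportionality constant.
    """
    return scale / (representative_entropy(prob_map) + eps)
```

Under these assumptions, an image whose pixels are classified with high confidence yields a higher first reliability score than an image whose pixels have near-uniform class probabilities.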
Regarding claim 13, Azarian Yazdi and Kim teach the method of claim 12.
While Azarian Yazdi teaches determining the first reliability score (see claim 12), Azarian Yazdi and Kim fail to teach wherein the determining of the first reliability score comprises: obtaining entropy of each pixel of the image; determining, based on the entropy of each pixel of the image, a representative entropy value; and determining the first reliability score, wherein the first reliability score is inversely proportional to the representative entropy value.
However, Tonogai teaches obtaining entropy of each pixel of the image (Tonogai teaches that “the entropy of a certain range is the sum or the representative value (the average, median, or mode) of entropy of all voxels in the range” in para. [0105]; here, it is inherent that a range could include an entire image);
determining, based on the entropy of each pixel of the image, a representative entropy value (Tonogai teaches “the entropy of a certain range is the sum or the representative value (the average, median, or mode) of entropy of all voxels in the range” in para. [0105]. Here, the average/median/mode entropy is interpreted as equivalent to the representative entropy value).
Azarian Yazdi, Kim, and Tonogai are all considered to be analogous to the claimed invention because they are in the same field of object detection in images. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Azarian Yazdi (as modified by Kim) to incorporate the teachings of Tonogai and include “obtaining entropy of each pixel of the image; determining, based on the entropy of each pixel of the image, a representative entropy value”. The motivation for doing so would have been to “indicat[e] the reliability of determination as to whether an object is included in the range” in order to determine whether to expand the viewing range, as suggested by Tonogai in para. [0113]. Therefore, it would have been obvious to one of ordinary skill at the time the invention was filed to combine Azarian Yazdi and Kim with Tonogai to obtain the invention specified in the above claim limitations.
Azarian Yazdi, Kim, and Tonogai fail to teach determining the first reliability score, wherein the first reliability score is inversely proportional to the representative entropy value.
However, Hiscene teaches determining the first reliability score, wherein the first reliability score is inversely proportional to the representative entropy value (Hiscene teaches “that the initial uncertainty u(x) is inversely proportional to the initial confidence c(x), that is, the higher the initial confidence, the lower the initial uncertainty calculated” in para. [0062]. Here, it would have been obvious to combine the averaging method as taught by Tonogai with the inversely proportional relationship as taught by Hiscene to teach a representative entropy value which is inversely proportional to the confidence (reliability score). See also Azarian Yazdi’s teaching of the first reliability score in claim 12).
Azarian Yazdi, Kim, Tonogai, and Hiscene are all considered to be analogous to the claimed invention because they are in the same field of object detection in images. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Azarian Yazdi (as modified by Kim and Tonogai) to incorporate the teachings of Hiscene and include “determining the first reliability score, wherein the first reliability score is inversely proportional to the representative entropy value”. The motivation for doing so would have been to “increase the precision of prediction of depth map, and [] obtain the uncertainty distribution of depth map”, as suggested by Hiscene in the abstract. Therefore, it would have been obvious to one of ordinary skill at the time the invention was filed to combine Azarian Yazdi, Kim, and Tonogai with Hiscene to obtain the invention specified in claim 13.
Claims 4 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Azarian Yazdi et al. (U.S. Publication No. 2024/0078797 A1), hereinafter Azarian Yazdi in view of Kim et al. (KR 10-2353336 B1, see attached English translation for citations), hereinafter Kim and Gil (U.S. Publication No. 2022/0165068 A1).
Regarding claim 4, Azarian Yazdi and Kim teach the apparatus of claim 2.
Azarian Yazdi and Kim fail to teach to obtain entropy of at least one bounding box in the image; and determine the second reliability score, wherein the second reliability score is inversely proportional to the entropy of the at least one bounding box.
However, Gil teaches to obtain entropy of at least one bounding box in the image (Gil, see para. [0069] and FIGs. 4A, 4B, 5A, and 5B, wherein an uncertainty is determined for a bounding box. Here, the uncertainty is interpreted as equivalent to the claimed entropy); and
determine the second reliability score, wherein the second reliability score is inversely proportional to the entropy of the at least one bounding box (Gil teaches that “the uncertainty value is inversely proportional to the first and second probabilities” in claim 4, wherein the uncertainty is “an uncertainty value for each of the estimated center point of the bounding box and the region outside the center point” as shown in claim 1, and “the classifier may calculate a probability of whether the object in each aligned ROI belongs to a specific class” as shown in para. [0052]. See also Azarian Yazdi’s teaching of the second reliability score in claim 2).
Azarian Yazdi, Kim, and Gil are all considered to be analogous to the claimed invention because they are in the same field of determining whether to utilize images as training data. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Azarian Yazdi (as modified by Kim) to incorporate the teachings of Gil and include “to obtain entropy of at least one bounding box in the image; and determine the second reliability score, wherein the second reliability score is inversely proportional to the entropy of the at least one bounding box”. The motivation for doing so would have been “determining whether the travel is possible based on an area having the uncertainty within a preset region of interest or a maximum magnitude of the uncertainty value” and to avoid a situation wherein, “when recognition of the object is not properly achieved during autonomous driving because of the weather, an edge case, and the like, that is, when it is uncertain whether the object displayed in the image filmed through the camera 100 is actually the object or not, an accident may be caused with incorrect vehicle control”, as suggested by Gil in para. [0024] and para. [0068], respectively. Therefore, it would have been obvious to one of ordinary skill at the time the invention was filed to combine Azarian Yazdi and Kim with Gil to obtain the invention specified in claim 4.
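For illustration only, the two limitations of claim 4 discussed above can be sketched in code. This is a minimal sketch under stated assumptions and is not Gil's method: it assumes each bounding box carries a class-probability vector, treats the Shannon entropy of that vector as the box's entropy, aggregates multiple boxes by taking the largest entropy, and uses a hypothetical scale / (H + eps) relationship. All function and parameter names are hypothetical.

```python
import math


def box_entropy(class_probs):
    """Shannon entropy (in nats) of one bounding box's class probabilities."""
    return -sum(p * math.log(p) for p in class_probs if p > 0.0)


def second_reliability_score(boxes, scale=1.0, eps=1e-6):
    """Score that falls as the entropy of the least certain box rises.

    boxes is a list of class-probability vectors, one per bounding box.
    Using the maximum entropy across boxes is an assumed aggregation;
    eps is an assumed guard against division by zero.
    """
    if not boxes:
        raise ValueError("at least one bounding box is required")
    worst = max(box_entropy(probs) for probs in boxes)
    return scale / (worst + eps)
```

Under these assumptions, adding a single ambiguous bounding box to an otherwise confident detection set lowers the second reliability score.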
Regarding claim 14, Azarian Yazdi and Kim teach the method of claim 12.
While Azarian Yazdi teaches the second reliability score (see claim 12), Azarian Yazdi and Kim fail to teach wherein the determining of the second reliability score comprises: obtaining entropy of at least one bounding box in the image; and determining the second reliability score, wherein the second reliability score is inversely proportional to the entropy of the at least one bounding box.
However, Gil teaches obtaining entropy of at least one bounding box in the image (Gil, see para. [0069] and FIGs. 4A, 4B, 5A, and 5B, wherein an uncertainty is determined for a bounding box. Here, the uncertainty is interpreted as equivalent to the claimed entropy); and
determining the second reliability score, wherein the second reliability score is inversely proportional to the entropy of the at least one bounding box (Gil teaches that “the uncertainty value is inversely proportional to the first and second probabilities” in claim 4, wherein the uncertainty is “an uncertainty value for each of the estimated center point of the bounding box and the region outside the center point” as shown in claim 1, and “the classifier may calculate a probability of whether the object in each aligned ROI belongs to a specific class” as shown in para. [0052]. See also Azarian Yazdi’s teaching of the second reliability score in claim 12).
Azarian Yazdi, Kim, and Gil are all considered to be analogous to the claimed invention because they are in the same field of determining whether to utilize images as training data. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Azarian Yazdi (as modified by Kim) to incorporate the teachings of Gil and include “obtaining entropy of at least one bounding box in the image; and determining the second reliability score, wherein the second reliability score is inversely proportional to the entropy of the at least one bounding box”. The motivation for doing so would have been “determining whether the travel is possible based on an area having the uncertainty within a preset region of interest or a maximum magnitude of the uncertainty value” and to avoid a situation wherein, “when recognition of the object is not properly achieved during autonomous driving because of the weather, an edge case, and the like, that is, when it is uncertain whether the object displayed in the image filmed through the camera 100 is actually the object or not, an accident may be caused with incorrect vehicle control”, as suggested by Gil in para. [0024] and para. [0068], respectively. Therefore, it would have been obvious to one of ordinary skill at the time the invention was filed to combine Azarian Yazdi and Kim with Gil to obtain the invention specified in claim 14.
Allowable Subject Matter
Claims 8 and 18 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter.
The best prior art of record is Azarian Yazdi, Kim, Garrett et al. (U.S. Publication No. 2020/0103274 A1), hereinafter Garrett, and Gil (U.S. Publication No. 2022/0165068 A1). The prior art of record, applied alone or in combination, fails to anticipate or render obvious claims 8 and 18.
Claim 8
Regarding claim 8, Azarian Yazdi and Kim teach the apparatus of claim 2.
Azarian Yazdi further teaches the first and second reliability score (see claims 1 and 2).
However, none of Azarian Yazdi, Kim, Garrett, or Gil, alone or in combination, teaches to determine the second reliability score based on the first reliability score being greater than or equal to a threshold value; and [to] determine to add a new class of the interest network based on the second reliability score being less than the threshold value.
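For illustration only, the two-stage condition recited in claim 8 can be sketched in code. This is a minimal sketch of one reading of the limitation, not the applicant's implementation: it assumes both scores are compared against the same threshold, that the second score is computed only when the first score meets the threshold, and that no new class is added otherwise. The function name, the callable parameter, and the fallback behavior are hypothetical.

```python
from typing import Callable


def should_add_new_class(
    first_score: float,
    compute_second_score: Callable[[], float],
    threshold: float,
) -> bool:
    """Return True when a new class should be added to the interest network.

    The second reliability score is determined only if the first score is
    greater than or equal to the threshold. Returning False when the first
    score is below the threshold is an assumption; the claim language does
    not address that case.
    """
    if first_score < threshold:
        return False  # second score is never determined on this path
    return compute_second_score() < threshold
```

Passing the second score as a callable reflects the claimed ordering, in which the second reliability score is determined conditionally rather than in advance.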
Similar analysis as applied to claim 8 can be applied to corresponding claim 18.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner
should be directed to KYLA G ALLEN whose telephone number is (703)756-5315. The examiner can
normally be reached M-F 7:30am - 4:30pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a
USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use
the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor,
John Villecco can be reached on (571) 272-7319. The fax phone number for the organization where this
application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from
Patent Center. Unpublished application information in Patent Center is available to registered users. To
file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit
https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and
https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional
questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like
assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or
571-272-1000.
/Kyla Guan-Ping Tiao Allen/
Examiner, Art Unit 2661
/JOHN VILLECCO/
Supervisory Patent Examiner, Art Unit 2661