DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Objections
Claims 1, 11 and therefore claims 2–10 and 12–19 which depend therefrom are objected to because of the following informalities: claims 1 and 11 recite “a set of detected objects” then later “each detected object” thus individually referencing each detected object in the set and providing antecedent basis therein, but later recite “a detected object in the set of detected objects,” and thus should instead recite “one of the detected objects in the set of detected objects.” Appropriate correction is required.
Statement on Subject Matter Eligibility under 35 USC § 101
The examiner finds that independent claims 1, 11 and 20 (and therefore the claims depending therefrom) are subject matter eligible under 35 U.S.C. 101 for the following reasons. While the claims are generally directed towards image processing for object detection, and do not recite technical details of how the claimed machine-learning model, software detector and processors (claims 1 and 11) or detector (claim 20) perform the disclosed method steps, the recited method steps are nonetheless not those practically performed by the human mind. For example, the human mind does not practically determine a similarity metric between a template and a detected object. Object detection by discerning qualitatively against a reference involves an abstract idea. However, when specifically considering a (quantitative) similarity metric between a template and a detected object, in combination with all of the other limitations of claims 1, 11 and 20, which are directed towards technical differences between a machine vision approach to object detection and that practically performed in the human mind, claims 1, 11 and 20 considered as a whole do not recite abstract ideas.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1–4, 6, 8–14, 16, and 18–19 are rejected under 35 U.S.C. 103 as being unpatentable over Hao et al., US Patent Application Publication No. US 2023/0177800 A1 (herein “Hao”) in view of Pulijala et al., US Patent Application Publication No. US 2025/0069399 A1 (herein “Pulijala”).
Regarding claims 1 and 11, with claim 1 as exemplary, substantive differences between claims 1 and 11 noted in curly brackets {}, and with deficiencies of Hao noted in square brackets [], Hao teaches {a method – claim 1 / a system – claim 11} for user-assisted object detection, the {method/system} comprising (Hao fig. 6, ¶158, Abstract, object recognition process including receiving user input regarding a candidate object detection by a robot):
{at least one processor; and at least one memory storing instructions that, when executed by the at least one processor, causes the system to perform a set of operations, the set of operations comprising – claim 11 only, (Hao ¶¶217–218, 221–223, object recognition apparatus performing the disclosed object recognition method via processors and a memory storing code to execute the method by the processor)}
receiving an input image (Hao ¶157, robot camera obtains an image including an object);
performing object detection by a software detector to identify a set of detected objects (Hao ¶157, robot performs object recognition based on a first eigenvector of the object to obtain a reference label (identify) of the object, where ¶110 teaches the detected objects to be one or more objects in the image), the software detector including a machine-learning model (Hao ¶¶116–119, image is input into a multi-layer neural network to obtain eigenvectors, a bounding box and eventually a subclass label of the object);
outputting one or more indicators of the set of detected objects, each detected object in the set of detected objects being associated with a confidence level (Hao ¶¶157–158, robot notifies the user through voice what it guesses the object is (one or more indicators), based on a first confidence level of the label);
receiving a user input (Hao ¶158, user responds to robot saying “Wrong, it is A” or “This is A” where A is label for the pictured object);
identifying a template including an image portion associated with the user input (Hao ¶157, if the confidence level of the handheld object is lower than a preset confidence level, then an incremental feature database is accessed to determine (identifying) an eigenvector (template) and corresponding label where the eigenvector has a confidence level that is higher than the preset confidence level);
determining a similarity metric between the template and a detected object in the set of detected objects (Hao ¶140, a reference label of the object is determined by calculating a distance (similarity metric) between the first eigenvector of the pictured object (detected object) and each first eigenvector in a first feature database, where the first eigenvector in the first feature database (template) is selected that has the highest confidence level based on the distance);
modifying [a confidence level] of the detected object, based at least in part on the determined similarity metric (Hao ¶158, when the robot incorrectly recognizes an object due to the selected eigenvector and label thereof being incorrect and corrected by the user, a second eigenvector is obtained from different angles of the object, and the label provided by the user is given to the second eigenvector (associated with the user input), and both are saved into the incremental feature database, thus modifying the entry in the incremental feature database for the object); and
generating an output including an indicator of the object, based at least in part on the modified [confidence level] (Hao ¶159, in this way, when the next recognition (resulting in a recognition result) is performed (based at least in part on the modified), the label is obtained from the incremental feature database, where ¶¶146–147 teach that the recognition result is presented on a display or notified to the user through voice (generating an output)),
{wherein the method is performed using one or more processors – claim 1 only (Hao ¶¶217–218, 221–223, object recognition apparatus performing the disclosed object recognition method via processors and a memory storing code to execute the method by the processor)}.
While Hao teaches updating eigenvectors and labels in a database that are used to calculate confidence values for object detection, Hao does not explicitly teach the confidence levels being modified. That is, the confidence level for an object provided in the object detection system does essentially feature a confidence level modification by way of the changed eigenvector which is used to find future confidence levels, but Hao does not explicitly teach modifying a confidence level.
Pulijala teaches modifying a confidence level (Pulijala ¶86, an initial confidence factor determined from a preliminary identification is modified to a value that is the average of additionally acquired information, including feedback from the user who utters the name of the object).
Therefore taking the teachings of Hao and Pulijala together as a whole, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the confidence level of Hao explicitly as taught by Pulijala at least because doing so would improve the reliability of identification of objects and overcome the problem of not being able to identify objects within an environment to an adequate degree of confidence to reliably identify them for certain applications. See Pulijala ¶12.
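For illustration only, the combination discussed above can be sketched as follows. This is a minimal sketch and not a characterization of either reference's actual implementation. It assumes Euclidean distance as the similarity metric, a hypothetical mapping from distance to confidence of 1 / (1 + d), and an average taken over the initial confidence together with the additionally acquired values. All function names, vectors and numeric values are hypothetical and appear in neither Hao nor Pulijala.

```python
import math

def best_match(query, feature_db):
    """Return (label, distance) of the stored vector nearest to the query.

    feature_db is a list of (label, vector) pairs; Euclidean distance is
    an assumption made for this sketch.
    """
    label, vec = min(feature_db, key=lambda entry: math.dist(query, entry[1]))
    return label, math.dist(query, vec)

def distance_to_confidence(distance):
    # Hypothetical mapping: smaller distance gives higher confidence in (0, 1].
    return 1.0 / (1.0 + distance)

def modify_confidence(initial, additional):
    # Assumed reading of the averaging step: mean of the initial confidence
    # and any additionally acquired values (e.g. derived from user feedback).
    values = [initial] + list(additional)
    return sum(values) / len(values)

# Hypothetical two-entry feature database and query vector.
feature_db = [("A", [3.0, 0.0]), ("B", [0.0, 1.0])]
label, dist = best_match([0.0, 4.0], feature_db)
initial_conf = distance_to_confidence(dist)
modified_conf = modify_confidence(initial_conf, [0.75])
```

In this sketch the stored vector plays the role of the template, the distance plays the role of the similarity metric, and the averaging step is the explicit confidence modification that the combination relies on.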
Regarding claims 2 and 12, with claim 2 being exemplary, Hao teaches wherein the determining a similarity metric comprises determining the similarity metric using a similarity machine-learning model (Hao ¶140, a distance (similarity metric) is determined between two eigenvectors, where ¶153 teaches the eigenvectors are output from a last pooling layer of an object recognition model (similarity machine-learning model)).
Regarding claims 3 and 13, with claim 3 being exemplary, Hao teaches wherein the determining a similarity metric comprises determining a similarity distance between the template and the image portion associated with the user input (Hao ¶140, distance (similarity distance) is determined between the eigenvector from the user input image and object detected therein versus eigenvectors stored in a feature database of known eigenvectors for various object labels, where the eigenvectors are a type of template defining features characteristic of a particular object).
Regarding claims 4 and 14, with claim 4 being exemplary, Hao teaches wherein: the user input includes an indication of a missed object that is not in the set of detected objects (Hao ¶158, user provides feedback telling the robot it is wrong, and what the object label is, or in the alternative, the robot tells the user it does not recognize the object “missed object” and the user tells the robot what the label should be “This is A”); and the method further comprises adding the missed object to the set of detected objects (Hao ¶158, robot subsequently saves (adding) into the incremental feature database the eigenvector and user supplied label for the object, thus including the object now into the set of detected objects for whatever objects were detected via having a matching eigenvector already in the database).
Regarding claims 6 and 16, with claim 6 being exemplary, Hao teaches wherein: the user input includes an indication of one object in the set of detected objects being a valid detection (Hao ¶158, the user responds “Correct” to the robot’s presentation of object detection results of “I guess it is B, right?”); and the template is designated as a positive template (Hao ¶21, the label of an identified object (positive ID on the object) is obtained from a matching eigenvector (template) in the incremental recognition database).
Regarding claims 8 and 18, with claim 8 being exemplary, Hao teaches further comprising adding the template to a template library (Hao ¶158, a determined first eigenvector is saved into the incremental feature database along with the label for the object it represents).
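For illustration only, the save-and-reuse behavior relied upon for claims 4, 8, 14 and 18 can be sketched as follows. This is a minimal sketch under stated assumptions, not Hao's implementation: the class name, the distance threshold and the vectors are hypothetical, and Euclidean distance stands in for whatever distance Hao actually uses.

```python
import math

class IncrementalFeatureDB:
    """Hypothetical stand-in for a feature database of labeled vectors."""

    def __init__(self, max_distance):
        # Assumed acceptance threshold: matches farther than this are rejected.
        self.max_distance = max_distance
        self.entries = []  # list of (label, vector) pairs

    def add(self, label, vector):
        # Save a feature vector together with its (user-supplied) label.
        self.entries.append((label, list(vector)))

    def lookup(self, query):
        # Return the label of the nearest stored vector, or None when no
        # stored vector lies within max_distance of the query.
        best_label, best_dist = None, float("inf")
        for label, vec in self.entries:
            d = math.dist(query, vec)
            if d < best_dist:
                best_label, best_dist = label, d
        return best_label if best_dist <= self.max_distance else None

db = IncrementalFeatureDB(max_distance=1.0)
db.add("B", [0.0, 0.0])

query = [5.0, 5.0]
first_guess = db.lookup(query)   # no stored vector is close enough
db.add("A", query)               # user supplies the label "A"
second_guess = db.lookup(query)  # the saved entry now matches
```

The first lookup fails because nothing stored is near the query; after the user-labeled vector is saved, the same query resolves to the supplied label on the next recognition.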
Regarding claims 9 and 19, with claim 9 being exemplary, Hao teaches wherein the [modifying a confidence level] of the object comprises: identifying a new object that is not in the set of detected objects (Hao ¶158, user provides feedback telling the robot it is wrong, and what the object label is, or in the alternative, the robot tells the user it does not recognize the object, thus a new object not in the robot’s set of detected objects, and the user tells the robot what the label should be “This is A”); adding the new object to the set of detected objects (Hao ¶158, robot subsequently saves (adding) into the incremental feature database the eigenvector and user supplied label for the object, thus including the object now into the set of detected objects for whatever objects were detected via having a matching eigenvector already in the database); and determining the confidence level associated with the new object based at least in part on the template (Hao ¶¶140 and 158, a new eigenvector is determined from additional angled photos taken of the object that is not recognized, and where in the new/second eigenvector determination process, a new confidence level is determined, where the second/new eigenvector and corresponding confidence level is prompted to be found due to the first eigenvector not being accurate enough to match with one already existing in the database, thus “based at least in part on the first template”).
Hao does not explicitly teach, but Pulijala teaches modifying a confidence level (Pulijala ¶86, an initial confidence factor determined from a preliminary identification is modified to a value that is the average of additionally acquired information, including feedback from the user who utters the name of the object).
Therefore taking the teachings of Hao and Pulijala together as a whole, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the confidence level of Hao explicitly as taught by Pulijala at least because doing so would improve the reliability of identification of objects and overcome the problem of not being able to identify objects within an environment to an adequate degree of confidence to reliably identify them for certain applications. See Pulijala ¶12.
Regarding claim 10, Hao teaches wherein the image is a first image in a sequence of images, wherein the template is a first template (Hao ¶¶157–158, first eigenvector is determined initially for the object from a first image), and wherein the method further comprises predicting a second template for a second image subsequent to the first image based at least in part on the first template (Hao ¶¶157–158, when the robot incorrectly recognizes the object based on an imprecise first eigenvector having been determined, and realizes this through the user feedback, the robot guides the user to flip the object and obtain a plurality of frames of images of the object at different angles, from which a second eigenvector is obtained).
Claims 5 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Hao in view of Pulijala, and further in view of Pugdeethosapol et al., "Automatic Image Labeling with Click Supervision on Aerial Images," 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow, UK, 2020, pp. 1-8, doi: 10.1109/IJCNN48605.2020.9207363 (herein “Pugdeethosapol”).
Regarding claims 5 and 15, with claim 5 as exemplary, and with deficiencies of Hao noted in square brackets [], Hao teaches wherein: [the user input includes a boundary drawn by a user]; the image portion is an image portion within the boundary (Hao ¶¶111–112, an initial bounding box is obtained for each of the one or more detected objects in the image); and the template is designated as a positive template (Hao ¶21, the label of an identified object (positive ID on the object) is obtained from a matching eigenvector (template) in the incremental recognition database).
Hao as modified above by Pulijala does not explicitly teach, but Pugdeethosapol teaches, that the user input includes a boundary drawn by a user (Pugdeethosapol page 4, section B. Adjustment Model, a user manually adjusts a bounding box around an object to be detected).
Therefore taking the teachings of Hao as modified by Pulijala and Pugdeethosapol together as a whole, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the bounding box of Hao to be user input as disclosed by Pugdeethosapol at least because doing so would allow for the bounding box to be more precisely defined over the object and adjust the detection model to be more accurate. See Pugdeethosapol, page 4, section B. Adjustment Model.
Claims 7, 17 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Hao in view of Pulijala, and further in view of Wu et al., "Semi-Automatically Labeling Objects in Images," in IEEE Transactions on Image Processing, vol. 18, no. 6, pp. 1340-1349, June 2009, doi: 10.1109/TIP.2009.2017360 (herein “Wu”).
Regarding claims 7 and 17, with claim 7 being exemplary, and with deficiencies of Hao noted in square brackets [], Hao teaches wherein: the user input includes an indication of one object in the set of detected objects being an invalid detection (Hao ¶158, the user responds “Wrong, it is A” to the robot’s presentation of object detection results of “I guess it is B, right?”); and [the template is designated as a negative template].
Hao as modified above by Pulijala does not, but Wu teaches the template is designated as a negative template (Wu page 1344, section V. SmartLabel-2, negative samples are sampled from the input image by detecting uninformative areas and sampling negative patches in the image, then obtaining a background map (template) of these uninformative (negative) regions).
Therefore, taking the teachings of Hao as modified above by Pulijala and Wu together as a whole, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the eigenvector storage disclosed in Hao to include negative samples as well, as disclosed in Wu, at least because doing so would improve performance of an object detection classifier by providing training samples including both positive samples and negative samples. See Wu page 1344, right column, first paragraph under section A.
Regarding claim 20, with deficiencies of Hao noted in square brackets [], Hao teaches a method for user-assisted object detection, the method comprising (Hao fig. 6, ¶158, Abstract, object recognition process including receiving user input regarding a candidate object detection by a robot):
receiving an input image (Hao ¶157, robot camera obtains an image including an object);
performing object detection, by a detector, to identify a set of detected objects comprising one or more detected objects (Hao ¶157, robot performs object recognition based on a first eigenvector of the object to obtain a reference label (identify) of the object, where ¶110 teaches the detected objects to be one or more objects in the image);
outputting one or more indicators of the one or more detected objects, each detected object of the set of detected objects being associated with a confidence level (Hao ¶¶157–158, robot notifies the user through voice what it guesses the object is (one or more indicators), based on a first confidence level of the label);
receiving a user input that indicates a missed object that is not in the set of detected objects (Hao ¶158, user responds to robot saying “Wrong, it is A” or “This is A” where A is label for the pictured object that was not detected by the robot (missed object)); and
adding [an image portion] associated with the user input as a template to a template library (Hao ¶158, a determined first eigenvector is saved into the incremental feature database along with the label for the object it represents);
scanning the input image, using the template, to update the set of detected objects (Hao ¶¶110–112 and 116, bounding box regression including a continuous panning and zooming (scanning), where the bounding box regression is performed on the first eigenvector (using the template), the bounding box regression being used to identify objects in the image);
determining one or more similarities between the template and one or more detected objects of the updated set of detected objects (Hao ¶¶140 and 158, a new eigenvector is determined from additional angled photos taken of the object that is not recognized, and where in the new/second eigenvector determination process, a new confidence level is determined, where the second/new eigenvector and corresponding confidence level is prompted to be found due to the first eigenvector not being accurate enough to match with one already existing in the database, thus “based at least in part on the first template”);
modifying [one or more confidence levels of the confidence levels] associated with the detected objects of the set of detected objects, based at least in part on the one or more determined similarities (Hao ¶158, when the robot incorrectly recognizes an object due to the selected eigenvector and label thereof being incorrect and corrected by the user, a second eigenvector is obtained from different angles of the object, and the label provided by the user is given to the second eigenvector (associated with the user input), and both are saved into the incremental feature database, thus modifying the entry in the incremental feature database for the object); and
generating an output including one or more indicators of the one or more [modified confidence levels] and their respective one or more detected objects in the set of detected objects (Hao ¶159, in this way, when the next recognition (resulting in a recognition result) is performed (based at least in part on the modified), the label is obtained from the incremental feature database, where ¶¶146–147 teach that the recognition result is presented on a display or notified to the user through voice (generating an output)).
While Hao teaches updating eigenvectors and labels in a database that are used to calculate confidence values for object detection, Hao does not explicitly teach the confidence levels being modified or modified confidence levels. That is, the confidence level for an object provided in the object detection system does essentially feature a confidence level modification by way of the changed eigenvector which is used to find future confidence levels, but Hao does not explicitly teach modifying a confidence level.
Further, while Hao teaches updating a feature database including signature/template data corresponding to object recognition, Hao does not explicitly state “an image portion” interpreted to mean a section of the image itself, as being added to the database.
Pulijala teaches modifying one or more confidence levels and the one or more modified confidence levels (Pulijala ¶86, an initial confidence factor determined from a preliminary identification is modified to a value that is the average of additionally acquired information, including feedback from the user who utters the name of the object).
Wu teaches adding “an image portion” to a set of object recognition data that serves in a template/signature capacity (Wu page 1344, section V. SmartLabel-2, samples are sampled from the input image by detecting uninformative areas and sampling negative patches (an image portion) in the image, then obtaining a background map (template) of these uninformative regions).
Therefore taking the teachings of Hao and Pulijala together as a whole, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the confidence level of Hao explicitly as taught by Pulijala at least because doing so would improve the reliability of identification of objects and overcome the problem of not being able to identify objects within an environment to an adequate degree of confidence to reliably identify them for certain applications. See Pulijala ¶12.
Further, taking the teachings of Hao as modified above by Pulijala and Wu together as a whole, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the eigenvector storage disclosed in Hao to include negative samples as well, as disclosed in Wu, at least because doing so would improve performance of an object detection classifier by providing training samples including both positive samples and negative samples. See Wu page 1344, right column, first paragraph under section A.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Sabini et al., US 2023/0136672 A1, directed towards using templates for object recognition.
Soble et al., US 2023/0114066 A1, directed towards using templates in medical imaging.
Bossard et al., US 2022/0392209 A1, directed towards object recognition and determining confidence levels including requesting user feedback to improve confidence levels.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHELLE M KOETH whose telephone number is (571)272-5908. The examiner can normally be reached Monday-Thursday, 09:00-17:00, EDT/EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vincent Rudolph, can be reached at 571-272-8243. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
MICHELLE M. KOETH
Primary Examiner
Art Unit 2671
/MICHELLE M KOETH/Primary Examiner, Art Unit 2671