Prosecution Insights
Last updated: April 19, 2026
Application No. 17/586,360

OBJECT RECOGNITION SYSTEMS AND METHODS

Final Rejection §103

Filed: Jan 27, 2022
Examiner: CONNER, SEAN M
Art Unit: 2663
Tech Center: 2600 — Communications
Assignee: Grubbrr Spv LLC
OA Round: 4 (Final)

Grant Probability: 79% (Favorable)
Grant Probability With Interview: 99%
Expected OA Rounds: 5-6
Time to Grant: 2y 9m

Examiner Intelligence

Career Allow Rate: 79% (357 granted / 454 resolved; +16.6% vs TC avg; above average)
Interview Lift: +27.1% (allow rate for resolved cases with an interview vs. without)
Typical Timeline: 2y 9m average prosecution; 22 applications currently pending
Career History: 476 total applications across all art units
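
As a quick sanity check, the headline numbers above can be reproduced from the raw counts. A minimal sketch in plain Python (the 62.0% Tech Center average is inferred from the stated +16.6-point delta, not reported directly, and "vs TC avg" is assumed to be a percentage-point difference):

    granted = 357    # career grants
    resolved = 454   # career resolved applications
    tc_avg = 0.620   # implied TC average allow rate (assumption: 78.6% - 16.6 points)

    allow_rate = granted / resolved
    print(f"Career allow rate: {allow_rate:.1%}")          # -> 78.6%, displayed as 79%
    print(f"Delta vs TC avg: {allow_rate - tc_avg:+.1%}")  # -> +16.6%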

Statute-Specific Performance

§101: 11.5% (-28.5% vs TC avg)
§103: 47.9% (+7.9% vs TC avg)
§102: 12.0% (-28.0% vs TC avg)
§112: 21.1% (-18.9% vs TC avg)

TC figures are Tech Center average estimates • Based on career data from 454 resolved cases
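
Notably, the four deltas are internally consistent with a single flat baseline: back-computing "rate minus delta" yields 40.0% in every row, suggesting the Tech Center average shown is one flat estimate rather than a per-statute figure. A minimal sketch of that back-calculation (assuming, as above, that the deltas are percentage points):

    stats = {"§101": (11.5, -28.5), "§103": (47.9, +7.9),
             "§102": (12.0, -28.0), "§112": (21.1, -18.9)}

    for statute, (rate, delta) in stats.items():
        # Implied Tech Center average = examiner's rate minus the stated delta.
        print(f"{statute}: {rate:.1f}% ({delta:+.1f}) -> implied TC avg {rate - delta:.1f}%")
    # Every row prints an implied TC avg of 40.0%.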

Office Action (§103)
DETAILED ACTION

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. The Amendment filed 30 December 2025 (hereinafter “the Amendment”) has been entered and considered. Claims 1, 8, 10, 12, 14, and 20 have been amended. Claims 1-20, all the claims pending in the application, are rejected. All new grounds of rejection set forth in the present action were necessitated by Applicant’s claim amendments; accordingly, this action is made final.

Response to Amendment

Claim Objections

On page 1 of the Remarks of the Amendment, Applicant contends that the objections have been obviated through amendment. The Examiner partially agrees and partially disagrees.

In view of the amendment to independent claim 1, the objection to this claim and its dependent claims is overcome. However, the amendment to claim 1 necessitates a further objection, detailed below.

Independent claim 8 has not been amended to overcome the prior objection, as the claim continues to recite “the presence and location data” interchangeably with “the presence and the location data”. The Examiner recommends amending one of these recitations in the claim (lines 11 and 12) to be consistent with the other. This objection is maintained.

Claim 12 has been amended to recite “the second single output feed image”. Initially, this is inconsistent with the language of claim 8, which recites “single image output feed” (the term “image” immediately follows the term “single”, unlike in claim 12). Additionally, as noted in the previous action, there is insufficient antecedent basis for a “second” image in the claims. The Examiner recommends either amending claim 12 to change its dependency from claim 8 to claim 10 (which does recite “a second single image output feed”) or amending claim 12 to recite “[[the]] a second single image output feed”.

In view of the amendment to independent claim 14, the objection to this claim and its dependent claims is overcome. However, the amendment to claim 14 necessitates a further objection, detailed below.

Prior Art Rejections

In view of the claim amendments to the independent claims, the previously applied prior art rejections are withdrawn. Applicant’s arguments are rendered moot in view of the new grounds of rejection set forth below.

Claim Objections

Claims 1-20 are objected to because of the following informalities:

Independent claim 1 has been amended to recite “wherein the single image output feed that includes the plurality of objects oriented on the horizontal surface”, which should be further amended to recite “wherein the single image output feed [[that]] includes the plurality of objects oriented on the horizontal surface”. Claims 2-7 inherit the above informality by virtue of their dependency on claim 1.

Also, claim 4 recites “the moving image”, which has been canceled from independent claim 1, from which claim 4 depends. Accordingly, this limitation lacks antecedent basis in the claim and will be interpreted as reciting “the single image output feed”.

Independent claim 8 twice recites “the moving image”, in both lines 10 and 12. However, the first recitation of this term has been canceled from the claim and replaced with “a single image output feed”. Accordingly, this limitation lacks antecedent basis in the claim and will be interpreted as reciting “the single image output feed”. Also, independent claim 8 recites “A method for comprising”, which should be amended to recite “A method [[for]] comprising” for clarity of language.
Additionally, independent claim 8 recites “detecting, at the second location, the presence and the location data” (line 11) and later recites “utilizing the moving image and the presence and location data” (line 12). For consistency of language, the Examiner recommends either amending the former to recite “detecting, at the second location, the presence and [[the]] location data” or the latter to recite “utilizing the moving image and the presence and the location data”. Claims 9-13 are objected to by virtue of their dependency on claim 8.

Also, claim 12 recites “the second single output feed image”. Although claim 10 recites “a second single image output feed”, claim 12 is directly dependent on claim 8. Therefore, there is insufficient antecedent basis for the limitation recited in claim 12.

Independent claim 14 recites “the moving image” in line 13. However, the first recitation of this term has been canceled from the claim and replaced with “a single image output feed”. Accordingly, this limitation lacks antecedent basis in the claim and will be interpreted as reciting “the single image output feed”. Claims 15-20 are objected to by virtue of their dependency on claim 14.

Also, claim 17 recites “the moving image” in line 2. However, the first recitation of this term has been canceled from claim 14 (from which claim 17 depends) and replaced with “a single image output feed”. Accordingly, “the moving image” lacks antecedent basis in the claim and will be interpreted as reciting “the single image output feed”.

Appropriate correction is required.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-5, 8-10, 12-18 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Patent Application Publication No. 2023/0186266 to Johnson et al. (hereinafter “Johnson”) in view of “MetaSearch: Incremental Product Search via Deep Meta-Learning” by Wang et al. (hereinafter “Wang”), and further in view of U.S. Patent Application Publication No. 2022/0083959 to Skaff et al. (hereinafter “Skaff”).

As to independent claim 1, Johnson discloses a method (the Abstract and Fig. 2A, reproduced and annotated below, disclose that Johnson is directed to a “self-checkout kiosk” 200) comprising: providing a substantially horizontal surface; receiving a plurality of objects oriented by a user on the horizontal surface ([0017] and Fig. 2A disclose that the “self-checkout kiosk 200 includes a scan area 203 where a customer can arrange items”, wherein the scan area 203 is shown to be a horizontal surface); using only a single image sensor to capture a single image output feed of the horizontal surface, wherein the single image output feed that includes the plurality of objects oriented on the horizontal surface ([0017, 0037] discloses only a single “camera 211 that has a field of view 213 on the scan area 203”, wherein the camera 211 “may continuously capture video in the scanning area”); processing the single image output feed to detect presence data for each of the plurality of objects; classifying the plurality of objects ([0037-0040] discloses using the “camera to detect objects (e.g., produce objects) in real-time captured imagery, and identify a produce type or a produce brand of each detected produce object”); and updating a machine learning model with classification data generated by classifying the plurality of objects ([0024] discloses that the classification is performed by a “neural network” which is trained on “an ongoing basis” by prompting a user to confirm the classification output by the neural network).

[media_image1.png: annotated Fig. 2A of Johnson]

Johnson does not expressly disclose detecting location data, wherein the location data includes a pair of rectangular coordinates identifying a position of a corresponding object on the horizontal surface. Also, Johnson does not expressly disclose utilizing the image and the presence and location data to create individual representations of the plurality of objects, or that the classifying is performed through employment of the individual representations.

Wang, like Johnson, is directed to an “automatic checkout system” containing “a product capturing platform” shown to be substantially horizontal and having a plurality of products placed thereon (Section IV and Fig. 7, reproduced and annotated below). Wang discloses analyzing images of the products using “object detection models” for “product detection”, which necessarily identify presence and location data in order to arrive at “product images cropped by the detection model” (Section IV). Wang further shows that the “query images” extracted from the “product image” are localized in a cropped bounding “box”, each including an individual representation of a product for query (Fig. 14). Indeed, the “online search” stage of the disclosed framework similarly shows that each cropped “query image” is an individual representation of a product for query (Fig. 2). Wang discloses using the cropped product images for “product search” using the MetaSearch framework which, “given a query image” such as the cropped product image, identifies a “maximum classification score” therefor (Sections III(D), IV). Finally, Wang discloses that, when a retailer wants to add new products in their store, they can take pictures of the new products and upload them to the system such that “the searching model can be efficiently updated by the MetaSearch framework” that performs the classification (Section IV).

That is, Wang discloses providing a substantially horizontal surface; receiving a plurality of objects oriented by a user on the horizontal surface (Section IV and Fig. 7, reproduced and annotated below, disclose an “automatic checkout system” containing “a product capturing platform” shown to be substantially horizontal and having a plurality of products placed thereon); using a single image sensor to capture an image of the horizontal surface, wherein the image includes the plurality of objects oriented on the horizontal surface (Section IV and Fig. 7 disclose “cameras to take photos for the products put on” the product capturing platform; using plural “cameras” requires the use of a single one of them); processing the image to detect presence data and location data for each of the plurality of objects (Section IV discloses using “object detection models” for “product detection”, which necessarily identify presence and location data in order to arrive at “product images cropped by the detection model”; Fig. 14 further shows that the “query images” extracted from the “product image” are localized in a cropped bounding “box”); utilizing the image and the presence and location data to create individual representations of the plurality of objects (Section IV discloses “product images cropped by the detection model”, which presupposes the use of presence and location data; Fig. 14 further shows that the “query images” extracted from the “product image” are localized in a cropped bounding “box”, each including an individual representation of a product for query; the “online search” stage of the disclosed framework in Fig. 2 similarly shows that each cropped “query image” is an individual representation of a product for query); classifying the plurality of objects through employment of the individual representations (Section IV discloses using the cropped product images for “product search” using the MetaSearch framework; more specifically, Section III(D) discloses that, “given a query image” such as the cropped product image, the MetaSearch framework identifies a “maximum classification score” therefor); and updating a machine learning model with classification data generated by classifying the plurality of objects (Section IV discloses that, when a retailer wants to add new products in their store, they can take pictures of the new products and upload them to the system such that “the searching model can be efficiently updated by the MetaSearch framework” that performs the classification).

[media_image2.png: annotated Fig. 7 of Wang]

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Johnson to use a MetaSearch framework which detects location data of the imaged objects, utilizes the image and the presence and location data of the detected objects to create cropped sub-images of the plurality of objects, and performs the classifying through employment of the cropped sub-images, as taught by Wang, to arrive at the claimed invention discussed above. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. It is predictable that the proposed modification would have achieved “high search accuracy for both base categories and new products from few-shot samples” (Section I of Wang).

Skaff, like Johnson, is directed to image-based product detection in a retail environment (Abstract and [0040, 0062-0068]). Similar to Wang, Skaff discloses that products in images are localized by a bounding box and extracted for input to a trained product classifier (Abstract and [0040, 0062-0068]).
In particular, Skaff discloses a product detector 402 which produces a product image with a bounding box around each product therein, each bounding box being “represented…as a tuple of data of the form BB= {x, y, w, h}”, wherein x and y are the coordinates of a corner of the bounding box, and w and h respectively represent the width and height thereof ([0065]). That is, Skaff discloses that the location data includes a pair of rectangular coordinates identifying a position of a corresponding object on the horizontal surface ([0065]).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the proposed combination of Johnson and Wang to represent the bounding boxes of each product localized in the product image by the product detector using a pair (x, y) of rectangular coordinates identifying a position of the product in the image, as taught by Skaff, to arrive at the claimed invention discussed above. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. It is predictable that the proposed modification would have provided a means of accurately extracting each product for input to the classifier ([0068] of Skaff).

As to claim 2, Johnson as modified above further teaches that the single image sensor is a video camera ([0023, 0037] of Johnson discloses that the single camera 211 may be “a digital video camera”).

As to claim 3, Johnson as modified by Wang and Skaff further teaches that the machine learning model is a deep learning model ([0024-0025] of Johnson discloses that the model is a neural network; Section III and Fig. 2 of Wang disclose that the model is a deep convolutional neural network (CNN); the reasons for combining the references are the same as those discussed above in conjunction with claim 1).

As to claim 4, Johnson as modified by Wang and Skaff further teaches that utilizing comprises cropping each of the plurality of objects from the single image output feed to create the individual representations (Section IV of Wang discloses “product images cropped by the detection model”; [0023, 0037] of Johnson discloses that the single camera 211 may be “a digital video camera” that captures video; the reasons for combining the references are the same as those discussed above in conjunction with claim 1).

As to claim 5, Johnson as modified by Wang and Skaff further teaches that updating comprises adding classification information for the plurality of objects to a pre-existing machine learning model (Section III and Fig. 2 of Wang disclose that the “baseline model” (pre-existing) and “base class” database are updated to a novel model by training on a “novel class” as part of the updating; the reasons for combining the references are the same as those discussed above in conjunction with claim 1).

As to independent claim 8, Johnson discloses a method (the Abstract and Fig. 2A, reproduced and annotated below, disclose that Johnson is directed to a “self-checkout kiosk” 200) for comprising: providing a substantially horizontal surface; receiving a plurality of objects oriented by a user on the horizontal surface ([0017] and Fig. 2A disclose that the “self-checkout kiosk 200 includes a scan area 203 where a customer can arrange items”, wherein the scan area 203 is shown to be a horizontal surface); using only a single image sensor, at a first location, to capture a single image output feed that includes the plurality of objects ([0017, 0037] discloses only a single “camera 211 that has a field of view 213 on the scan area 203”, wherein the camera 211 “may continuously capture video in the scanning area”, and wherein the store location at which the self-checkout kiosk is located corresponds to the claimed first location); processing the single image output feed to detect presence data for each of the plurality of objects; classifying the plurality of objects ([0037-0040] discloses using the “camera to detect objects (e.g., produce objects) in real-time captured imagery, and identify a produce type or a produce brand of each detected produce object”); and updating a machine learning model with classification data generated by classifying the plurality of objects ([0024] discloses that the classification is performed by a “neural network” which is trained on “an ongoing basis” by prompting a user to confirm the classification output by the neural network).

[media_image1.png: annotated Fig. 2A of Johnson]

Johnson does not expressly disclose detecting location data, wherein the location data includes a pair of rectangular coordinates identifying a position of a corresponding object on the horizontal surface. Also, Johnson does not expressly disclose sending the moving image over a network to a second location; detecting, at the second location, the presence and the location data of the plurality of objects; utilizing the moving image and the presence and location data to create individual representations of the plurality of objects; sending the machine learning model over the network to the first location; or that the classifying is performed through employment of the individual representations.

Wang, like Johnson, is directed to an “automatic checkout system” containing “a product capturing platform” shown to be substantially horizontal and having a plurality of products placed thereon (Section IV and Fig. 7, reproduced and annotated below). Wang discloses analyzing images of the products using “object detection models” for “product detection”, which necessarily identify presence and location data in order to arrive at “product images cropped by the detection model” (Section IV). Wang further shows that the “query images” extracted from the “product image” are localized in a cropped bounding “box”, each including an individual representation of a product for query (Fig. 14). Indeed, the “online search” stage of the disclosed framework similarly shows that each cropped “query image” is an individual representation of a product for query (Fig. 2). Wang discloses using the cropped product images for “product search” using the MetaSearch framework which, “given a query image” such as the cropped product image, identifies a “maximum classification score” therefor (Sections III(D), IV). Finally, Wang discloses that, when a retailer wants to add new products in their store, they can take pictures of the new products and upload them to the system such that “the searching model can be efficiently updated by the MetaSearch framework” that performs the classification, wherein the updated system is “deployed” (i.e., sent) to the store (Section IV).

That is, Wang discloses providing a substantially horizontal surface; receiving a plurality of objects oriented by a user on the horizontal surface (Section IV and Fig. 7, reproduced and annotated below, disclose an “automatic checkout system” containing “a product capturing platform” shown to be substantially horizontal and having a plurality of products placed thereon); using a single image sensor, at a first location, to capture an image that includes the plurality of objects (Section IV and Fig. 7 disclose “cameras to take photos for the products put on” the product capturing platform, wherein the first location is the “store” at which the platform, camera, and checkout system are located; using plural “cameras” requires the use of a single one of them); processing the image to detect presence data and location data for each of the plurality of objects (Section IV discloses using “object detection models” for “product detection”, which necessarily identify presence and location data in order to arrive at “product images cropped by the detection model”; Fig. 14 further shows that the “query images” extracted from the “product image” are localized in a cropped bounding “box”); sending the image over a network to a second location (Section IV discloses that the “cameras then transmit the photos to the back-end system for product detection, which can be achieved by the off-the-shelf object detection models”); detecting, at the second location, the presence and the location data of the plurality of objects (Section IV discloses using “object detection models” for “product detection”, which necessarily identify presence and location data in order to arrive at “product images cropped by the detection model”, wherein the “object detection models” are located at “the back-end system” corresponding to the claimed second location; Fig. 14 further shows that the “query images” extracted from the “product image” are localized in a cropped bounding “box”); utilizing the image and the presence and location data to create individual representations of the plurality of objects (Section IV discloses “product images cropped by the detection model”, which presupposes the use of presence and location data; Fig. 14 further shows that the “query images” extracted from the “product image” are localized in a cropped bounding “box”, each including an individual representation of a product for query; the “online search” stage of the disclosed framework in Fig. 2 similarly shows that each cropped “query image” is an individual representation of a product for query); classifying the plurality of objects through employment of the individual representations (Section IV discloses using the cropped product images for “product search” using the MetaSearch framework; more specifically, Section III(D) discloses that, “given a query image” such as the cropped product image, the MetaSearch framework identifies a “maximum classification score” therefor); updating a machine learning model with classification data generated by classifying the plurality of objects; and sending the machine learning model over the network to the first location (Section IV discloses that, when a retailer wants to add new products in their store, they can take pictures of the new products and upload them to the system such that “the searching model can be efficiently updated by the MetaSearch framework” that performs the classification, wherein the updated system is “deployed” (i.e., sent) to the store).
[media_image2.png: annotated Fig. 7 of Wang]

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Johnson to use a MetaSearch framework which detects location data of the imaged objects, utilizes the image and the presence and location data of the detected objects to create cropped sub-images of the plurality of objects, and performs the classifying through employment of the cropped sub-images, wherein the framework is trained remotely at a back-end system which deploys the trained model to the store, as taught by Wang, to arrive at the claimed invention discussed above. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. It is predictable that the proposed modification of adopting the MetaSearch framework would have achieved “high search accuracy for both base categories and new products from few-shot samples” (Section I of Wang). It is further predictable that the proposed modification of remote training at a back-end system would have reduced the need for additional computation resources at the store.

Skaff, like Johnson, is directed to image-based product detection in a retail environment (Abstract and [0040, 0062-0068]). Similar to Wang, Skaff discloses that products in images are localized by a bounding box and extracted for input to a trained product classifier (Abstract and [0040, 0062-0068]). In particular, Skaff discloses a product detector 402 which produces a product image with a bounding box around each product therein, each bounding box being “represented…as a tuple of data of the form BB= {x, y, w, h}”, wherein x and y are the coordinates of a corner of the bounding box, and w and h respectively represent the width and height thereof ([0065]). That is, Skaff discloses that the location data includes a pair of rectangular coordinates identifying a position of a corresponding object on the horizontal surface ([0065]).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the proposed combination of Johnson and Wang to represent the bounding boxes of each product localized in the product image by the product detector using a pair (x, y) of rectangular coordinates identifying a position of the product in the image, as taught by Skaff, to arrive at the claimed invention discussed above. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. It is predictable that the proposed modification would have provided a means of accurately extracting each product for input to the classifier ([0068] of Skaff).

As to claim 9, Johnson as modified by Wang and Skaff further teaches loading the machine learning model at a user terminal of a point-of-sale system at the first location (Section IV of Wang discloses that the updated model is “deployed” in an automatic checkout system at the retailer’s “store”; the reasons for combining the references are the same as those discussed above in conjunction with claim 8).
As to claim 10, Johnson as modified above further teaches capturing a second single image output feed of a second plurality of objects at the point-of-sale system; using the machine learning model to identify the second plurality of objects; creating a checkout cart including the second plurality of objects; and enabling the customer to purchase the second plurality of objects through the checkout cart ([0011, 0017, 0037] of Johnson discloses that the camera 211 “may continuously capture video in the scanning area”, wherein the self-checkout kiosk is a point-of-sale “POS” system; [0024-0025, 0037-0040] of Johnson discloses using the neural network to analyze the images to detect objects and identify a produce type or a produce brand of each detected produce object; [0048-0056] of Johnson discloses adding the items to “a list of items to be purchased” by the customer).

As to claim 12, Johnson as modified by Wang and Skaff further teaches that the second single output feed image is a two-dimensional image ([0023, 0037] of Johnson discloses that the camera is a video camera that captures video of the products; Figs. 2 and 14 of Wang show that the product images are 2D images; the reasons for combining the references are the same as those discussed above in conjunction with claim 8).

As to claim 13, Johnson as modified above further teaches that the single image sensor is a video camera ([0023, 0037] of Johnson discloses that the camera is a video camera that captures video of the products).

Independent claim 14 recites an apparatus comprising: a substantially horizontal surface to receive a plurality of objects oriented by a user ([0017] and Fig. 2A of Johnson disclose that the “self-checkout kiosk 200 includes a scan area 203 where a customer can arrange items”, wherein the scan area 203 is shown to be a horizontal surface); a processor; and a memory coupled with the processor, the memory comprising executable instructions that when executed by the processor cause the processor to effectuate operations ([0032-0034] of Johnson discloses processing circuitry 301 which executes “program instructions stored in memory” to “perform the corresponding functions described” in the reference) comprising the steps recited in independent claim 1. Accordingly, claim 14 is rejected for reasons analogous to those discussed above in conjunction with claim 1, mutatis mutandis.

Claims 15-18 and 20 recite features nearly identical to those recited in claims 2-5 and 12, respectively. Accordingly, claims 15-18 and 20 are rejected for reasons analogous to those discussed above in conjunction with claims 2-5 and 12, respectively.

Claims 6, 11, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Johnson in view of Wang and Skaff, and further in view of U.S. Patent No. 11,481,751 to Chaubard et al. (hereinafter “Chaubard”).

As to claim 6, Johnson as modified by Wang and Skaff further teaches displaying a two-dimensional image on an output display device (Fig. 7 of Wang shows a monitor displaying a 2D image thereon). However, the proposed combination of Johnson, Wang and Skaff does not expressly disclose that the image is of each of the objects, or using the individual representations to draw a boundary around each of the plurality of objects on the output display device.

Chaubard, like Johnson, is directed to a “retail store automated checkout system [that] uses images…to recognize products being purchased” (Abstract).
Chaubard discloses that the system includes a monitor that displays what the computer vision algorithm is seeing, showing boxes around each of the products in the image (col. 3, lines 16-22).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the proposed combination of Johnson, Wang and Skaff to display, on the monitor, the product image being analyzed by the machine-learning algorithm with bounding boxes around each product, as taught by Chaubard, to arrive at the claimed invention discussed above. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. It is predictable that the proposed modification would have given “the cashier or customer visual feedback…which can make them more accurate and faster at checking out”, as taught by Chaubard (col. 3, lines 16-22).

Each of claims 11 and 19 (though broader in scope than claim 6) recites features similar to those recited in claim 6. Accordingly, claims 11 and 19 are rejected for reasons analogous to those discussed above in conjunction with claim 6, mutatis mutandis.

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Johnson in view of Wang and Skaff, and further in view of U.S. Patent Application Publication No. 2022/0414899 to Datar et al. (hereinafter “Datar”).

As to claim 7, the proposed combination of Johnson, Wang and Skaff does not expressly disclose determining whether or not the plurality of objects are oriented such that individual representations of each of the plurality of objects can be created; and if individual representations cannot be created, then prompting the user to reorient the plurality of objects.

Datar, like Johnson, is directed to a system which “allow[s] users to checkout parts or supplies by themselves”, the system including a platform 202 on which products 204 are placed and cameras 108 for imaging the products, wherein each product is extracted from the images thereof (Abstract, [0051-0074] and Figs. 2A, 4, and 5A-C). Datar discloses “determining that one or more of the items 204 on the platform 202 have not been identified”, and in response, “output[s] a request for the user to reposition one or more items 204 on the platform 202 to assist the item tracking device 104 with identifying some of the items 204 on the platform” ([0082]).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the proposed combination of Johnson, Wang and Skaff to prompt a user to rearrange the products in response to determining that one or more of the products are not detected, as taught by Datar, to arrive at the claimed invention discussed above. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. It is predictable that the proposed modification would have “prevent[ed] the item tracking device 104 from double counting items 204” or missing items ([0082] of Datar).

Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.
In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SEAN M CONNER, whose telephone number is (571) 272-1486. The examiner can normally be reached 10 AM - 6 PM, Monday through Friday, and some Saturday afternoons.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Greg Morse, can be reached at (571) 272-3838. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/SEAN M CONNER/
Primary Examiner, Art Unit 2663
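
For readers mapping the rejection onto the claimed method, the following is a minimal sketch of the detect-crop-classify pipeline as the Office Action characterizes the Johnson/Wang/Skaff combination: detect each object's presence and location, represent the location with a Skaff-style bounding box BB = {x, y, w, h} (where (x, y) is a corner of the box), crop an individual representation per object, and classify each crop. The detector and classifier below are hypothetical stand-ins, not code from any cited reference, and NumPy-style image arrays are assumed.

    from dataclasses import dataclass
    import numpy as np

    @dataclass
    class BoundingBox:
        # Skaff [0065]: BB = {x, y, w, h}; (x, y) is a corner of the box.
        x: int
        y: int
        w: int
        h: int

    def detect(frame: np.ndarray) -> list[BoundingBox]:
        # Placeholder detector: pretend two items sit on the scan surface.
        return [BoundingBox(10, 20, 50, 40), BoundingBox(80, 15, 30, 60)]

    def classify(crop: np.ndarray) -> str:
        # Placeholder classifier standing in for a MetaSearch-style
        # "maximum classification score" lookup.
        return f"item_{crop.shape[0]}x{crop.shape[1]}"

    def process_frame(frame: np.ndarray) -> list[tuple[BoundingBox, str]]:
        results = []
        for bb in detect(frame):                              # presence + location data
            crop = frame[bb.y:bb.y + bb.h, bb.x:bb.x + bb.w]  # individual representation
            results.append((bb, classify(crop)))              # classification per crop
        return results

    # Dummy frame standing in for the single image output feed from one sensor.
    for bb, label in process_frame(np.zeros((480, 640, 3), dtype=np.uint8)):
        print(bb, label)

Note how every claimed step that the Examiner maps to Wang and Skaff (location data, cropping, per-crop classification) appears in the loop; the dispute is over which reference supplies each piece, not over whether the pipeline itself is conventional.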

Prosecution Timeline

Jan 27, 2022: Application Filed
Mar 09, 2024: Non-Final Rejection — §103
Sep 16, 2024: Response Filed
Dec 06, 2024: Final Rejection — §103
Jun 11, 2025: Request for Continued Examination
Jun 12, 2025: Response after Non-Final Action
Jun 26, 2025: Non-Final Rejection — §103
Dec 30, 2025: Response Filed
Feb 21, 2026: Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12586374
MULTIMODAL VIDEO SUMMARIZATION
Granted Mar 24, 2026 (2y 5m to grant)
Patent 12586412
USING TWO-DIMENSIONAL IMAGES AND MACHINE LEARNING TO IDENTIFY INFORMATION PERTAINING TO EYE SHAPE
Granted Mar 24, 2026 (2y 5m to grant)
Patent 12585862
Training Data for Training Artificial Intelligence Agents to Automate Multimodal Software Usage
Granted Mar 24, 2026 (2y 5m to grant)
Patent 12579778
Pattern Matching Device, Pattern Measuring System, Pattern Matching Program
Granted Mar 17, 2026 (2y 5m to grant)
Patent 12573180
COLLECTION OF IMAGE DATA FOR USE IN TRAINING A MACHINE-LEARNING MODEL
Granted Mar 10, 2026 (2y 5m to grant)
Study what changed in these cases to get past this examiner. Based on the 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 5-6
Grant Probability: 79%
With Interview: 99% (+27.1%)
Median Time to Grant: 2y 9m
PTA Risk: High

Based on 454 resolved cases by this examiner. Grant probability is derived from the career allow rate.
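
These figures also hang together arithmetically if "interview lift" is read as the percentage-point gap between with-interview and without-interview allow rates (an assumed reading, not stated by the tool): a 99% with-interview rate minus the +27.1-point lift implies roughly a 72% rate without an interview, and blending the two back to the 79% career rate implies about a quarter of this examiner's resolved cases involved an interview.

    overall = 0.79   # career allow rate
    with_iv = 0.99   # allow rate among resolved cases with an interview
    lift    = 0.271  # stated interview lift, read as a percentage-point gap

    without_iv = with_iv - lift                # -> 71.9%
    share_iv = (overall - without_iv) / lift   # solve: overall = p*with_iv + (1-p)*without_iv
    print(f"Without interview: {without_iv:.1%}")            # -> 71.9%
    print(f"Implied share with interview: {share_iv:.1%}")   # -> ~26%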
