Prosecution Insights
Last updated: April 18, 2026
Application No. 18/779,845

ITEM IDENTIFICATION IN AN IMAGE BY A VISUAL IDENTIFICATION MODEL TRAINED USING INFORMATION FROM AN ITEM SCANNER SYSTEM

Non-Final OA: rejections under §101, §102 and §103

Filed: Jul 22, 2024
Examiner: WERONSKI, MATTHEW S
Art Unit: 3627
Tech Center: 3600 — Transportation & Electronic Commerce
Assignee: DRAGONFRUIT AI, INC.
OA Round: 1 (Non-Final)

Grant Probability: 10% (At Risk)
Estimated OA Rounds: 1-2
Estimated Time to Grant: 4y 0m
Grant Probability with Interview: 29%

Examiner Intelligence

Career Allow Rate: 10% (11 granted of 115 resolved); -42.4% vs TC avg. This examiner grants only 10% of cases.
Interview Lift: +19.8% among resolved cases with an interview, a strong lift of roughly +20%.
Typical Timeline: 4y 0m average prosecution; 32 applications currently pending.
Career History: 147 total applications across all art units.
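The headline examiner figures above can be reproduced from the raw counts shown, assuming the dashboard rounds the career allow rate (11 granted of 115 resolved is about 9.6%, displayed as 10%) and computes interview lift as the difference between allow rates with and without an interview. A minimal sketch of that arithmetic; the per-group interview counts are not disclosed in this report, so `interview_lift` is shown as a formula with placeholder arguments only:

```python
# Career allow rate from the counts shown in the report.
granted, resolved = 11, 115
allow_rate = granted / resolved  # ~0.096, displayed as 10%

def interview_lift(granted_with, resolved_with, granted_without, resolved_without):
    """Difference between allow rates with and without an interview.

    The report does not disclose the per-group counts; the arguments
    here are placeholders that illustrate the calculation only.
    """
    return granted_with / resolved_with - granted_without / resolved_without
```
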

Statute-Specific Performance

§101: 31.5% (-8.5% vs TC avg)
§103: 37.7% (-2.3% vs TC avg)
§102: 22.3% (-17.7% vs TC avg)
§112: 7.5% (-32.5% vs TC avg)

Tech Center averages are estimates. Based on career data from 115 resolved cases.
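If the "vs TC avg" figures are simple differences between the examiner's per-statute rate and the Tech Center average, the implied baseline can be recovered by adding each delta back. A short sketch under that assumption (the report does not state the formula explicitly):

```python
# Per-statute rates and "vs TC avg" deltas as shown in the report (percent).
examiner_rate = {"101": 31.5, "103": 37.7, "102": 22.3, "112": 7.5}
delta_vs_tc   = {"101": -8.5, "103": -2.3, "102": -17.7, "112": -32.5}

# Implied Tech Center average per statute: examiner rate minus delta.
tc_avg = {s: round(examiner_rate[s] - delta_vs_tc[s], 1) for s in examiner_rate}
# Every statute implies the same 40.0% baseline estimate.
```
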

Office Action

Rejections: §101, §102, §103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Step 1: Whether a Claim is to a Statutory Category

In the instant case, claims 1-13 recite method/process claims, claims 14-19 recite method/process claims, and claim 20 recites an apparatus/machine claim, each performing a series of functions. Therefore, these claims fall within the four statutory categories of invention as a process and a machine. Step 1 is satisfied.

Step 2A – Prong 1: Does the Claim Recite a Judicial Exception

Claim 1 (and similarly claims 14 and 20) recites the following abstract concepts that are found to include an enumerated "abstract idea": A method for training a model to visually identify items, the method comprising: receiving an image captured at a capture time of a checkout space including a scanner system; receiving an indication that an item has been scanned by the scanner system, wherein the indication includes an identity of the item and identifies a scan time when the item was scanned; correlating the scan time with the capture time; and providing the image and the identity of the item to a visual identification model to train the visual identification model to identify the item from other images.

[Emphasis added to show the bolded abstract idea being executed by unbolded additional elements that do not meaningfully limit the abstract idea]

This claim is grouped within the "certain methods of organizing human activity" grouping of abstract ideas in prong one of step 2A of the Alice/Mayo test because the claims involve a series of steps for following rules or instructions to identify the item from other images, which is a process that is encompassed by the abstract idea of managing personal behavior. See MPEP 2106.04(a)(2)(II)(C). Accordingly, claim 1 (and similarly claims 14 and 20) recites an abstract idea.

Step 2A – Prong 2: Does the Claim Recite Additional Elements that Integrate the Judicial Exception into a Practical Application

This judicial exception is not integrated into a practical application because, when analyzed under prong two of step 2A of the Alice/Mayo test, the additional elements of the claims, such as the scanner system and visual identification model, merely use a computer as a tool to perform an abstract idea. Specifically, the scanner system and visual identification model perform the steps or functions of following rules or instructions to identify the item from other images. The use of a processor/computer as a tool to implement the abstract idea and/or generally linking the use of the abstract idea to a particular technological environment does not integrate the abstract idea into a practical application because it requires no more than a computer (or technical elements disclosed at a high level of generality, such as the scanner system and visual identification model) performing functions of receiving, capturing, scanning, identifying, correlating, providing and training that correspond to acts required to carry out the abstract idea or merely attempt to limit the use of the abstract idea to a particular technological environment (MPEP 2106.05(f) and (h)).

Accordingly, the additional elements do not impose any meaningful limits on practicing the abstract idea, and the claims are directed to an abstract idea.

Step 2B: Does the Claim Amount to Significantly More

The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The additional element analysis of Step 2A Prong 2 is equally applied to Step 2B. "Another consideration when determining whether a claim recites significantly more than a judicial exception is whether the additional element(s) are well-understood, routine, conventional activities previously known to the industry. This consideration is only evaluated in Step 2B of the eligibility analysis." MPEP 2106.05(d). The courts have recognized the following computer functions as well-understood, routine, and conventional ("WURC") functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity. Exemplary claim 1 recites the following limitations that the courts have found to be WURC:

Claim 1 includes several limitations relating to receiving/transmitting (providing as claimed) data. See MPEP 2106.05(d)(II), where courts found the following to be WURC: i. Receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information); TLI Communications LLC v. AV Auto. LLC, 823 F.3d 607, 610, 118 USPQ2d 1744, 1745 (Fed. Cir. 2016) (using a telephone for image transmission); OIP Techs., Inc. v. Amazon.com, Inc., 788 F.3d 1359, 1363, 115 USPQ2d 1090, 1093 (Fed. Cir. 2015) (sending messages over a network); buySAFE, Inc. v. Google, Inc., 765 F.3d 1350, 1355, 112 USPQ2d 1093, 1096 (Fed. Cir. 2014) (computer receives and sends information over a network); but see DDR Holdings, LLC v. Hotels.com, L.P., 773 F.3d 1245, 1258, 113 USPQ2d 1097, 1106 (Fed. Cir. 2014) ("Unlike the claims in Ultramercial, the claims at issue here specify how interactions with the Internet are manipulated to yield a desired result--a result that overrides the routine and conventional sequence of events ordinarily triggered by the click of a hyperlink." (emphasis added)).

Claim 1 includes limitations relating to performing repetitive calculations (correlating as claimed). See MPEP 2106.05(d)(II), where the courts found performing repetitive calculations to be WURC: ii. Performing repetitive calculations, Flook, 437 U.S. at 594, 198 USPQ2d at 199 (recomputing or readjusting alarm limit values); Bancorp Services v. Sun Life, 687 F.3d 1266, 1278, 103 USPQ2d 1425, 1433 (Fed. Cir. 2012) ("The computer required by some of Bancorp's claims is employed only for its most basic function, the performance of repetitive calculations, and as such does not impose meaningful limits on the scope of those claims.").

Claim 1 recites capturing data by a scanner system. See MPEP 2106.05(d)(II), where courts found the following to be WURC: v. Electronically scanning or extracting data from a physical document, Content Extraction and Transmission, LLC v. Wells Fargo Bank, 776 F.3d 1343, 1348, 113 USPQ2d 1354, 1358 (Fed. Cir. 2014) (optical character recognition).

Accordingly, when viewed alone and in ordered combination, these additional elements are not found to recite significantly more than the underlying abstract idea.

Independent claim 14 describes the abstract idea of managing personal behavior by following rules or instructions to identify the items from subsequent images. Independent claim 14 does not include additional elements to perform the respective functions of receiving, capturing, identifying, scanning and training beyond technical elements disclosed at a high level of generality, such as cameras, checkout scanners and a visual identification model, that integrate the abstract idea into a practical application or that provide significantly more than the abstract idea, for the same reasons as noted above regarding claim 1. Therefore, independent claim 14 is also not patent eligible.

Independent claim 20 describes the abstract idea of managing personal behavior by following rules or instructions to identify the item from other images. Independent claim 20 does not include additional elements to perform the respective functions of storing, directing, receiving, capturing, scanning, identifying, correlating, providing and training beyond technical elements disclosed at a high level of generality, such as computer readable storage media, a processing system, a scanner system and a visual identification model, that integrate the abstract idea into a practical application or that provide significantly more than the abstract idea, for the same reasons as noted above regarding claim 1. Therefore, independent claim 20 is also not patent eligible.

Dependent claims 2-13 further describe the abstract idea of managing personal behavior by following rules or instructions to identify the item from other images. Said dependent claims merely recite descriptive material relating to the environment of use of the abstract idea or do not include additional elements to perform the respective functions of receiving, capturing, feeding, identifying, scanning, correlating, providing, training, cropping, determining and decrementing beyond the technical elements disclosed at a high level of generality, such as the visual identification model, scanner system and second scanner system, as disclosed in independent claim 1, that integrate the abstract idea into a practical application or that provide significantly more than the abstract idea. Therefore, said dependent claims are also not patent eligible. Further, the dependency of these claims on ineligible independent claim 1 also renders said dependent claims not patent eligible.

Dependent claims 15-19 further describe the abstract idea of managing personal behavior by following rules or instructions to identify the items from subsequent images. Said dependent claims merely recite descriptive material relating to the environment of use of the abstract idea or do not include additional elements to perform the respective functions of receiving, inputting, identifying, capturing, transmitting, failing to indicate and training beyond the technical elements disclosed at a high level of generality, such as cameras, the visual identification model, communication network, premises equipment and remote processing system, as disclosed in independent claim 14, that integrate the abstract idea into a practical application or that provide significantly more than the abstract idea. Therefore, said dependent claims are also not patent eligible. Further, the dependency of these claims on ineligible independent claim 14 also renders said dependent claims not patent eligible.

Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C.
102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless – (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-7, 9-11 and 14-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Brakob et al. (US 2023/0037427 A1).

Regarding Claim 1, Brakob teaches: A method for training a model to visually identify items (See Brakob ¶ [0048] - the cameras or other imaging devices can be configured to capture images of products as the customer scans such products. These images can be used by the computing system to identify products that are being purchased by the customer. Such images can also be used to train and/or improve one or more machine learning models that are used to identify the products), the method comprising: receiving an image captured at a capture time of a checkout space including a scanner system (See Brakob ¶ [0048] - the flatbed [checkout space by example] can include one or more cameras or other imaging devices… the cameras or other imaging devices can be configured to capture images of products as the customer scans such products. These images can be used by the computing system to identify products that are being purchased by the customer. Such images can also be used to train and/or improve one or more machine learning models that are used to identify the products); receiving an indication that an item has been scanned by the scanner system, wherein the indication includes an identity of the item and identifies a scan time when the item was scanned (See Brakob ¶ [0049-0050] - The one or more scanning devices can be barcode, SKU, or other label identifying devices… The POS terminal can be configured to identify products that are scanned using the one or more scanning devices and [0053] - During training of the machine learning models, these features can be labeled and confidence of such labeling can increase since the image data can be associated with a timestamp of a correct barcode scan at the POS terminal); correlating the scan time with the capture time (See Brakob ¶ [0053] - During training of the machine learning models, these features can be labeled and confidence of such labeling can increase since the image data can be associated with a timestamp of a correct barcode scan at the POS terminal); and providing the image and the identity of the item to a visual identification model to train the visual identification model to identify the item from other images (See Brakob ¶ [0053] - During training of the machine learning models, these features can be labeled and confidence of such labeling can increase since the image data can be associated with a timestamp of a correct barcode scan at the POS terminal and [0098] - The computing system can train a product classification model to map visual features of the products into multi-dimensional feature space using the image training data).

Regarding Claim 2, Brakob teaches: The method of claim 1, comprising: receiving a second image captured of a different space from the checkout space (See Brakob ¶ [0056] - multiple overhead cameras can be positioned in a checkout area. One overhead camera can then be trained on multiple checkout lanes, thereby having a different FOV … the overhead camera can continuously capture image data of the checkout lane [different space from the checkout space by example]); feeding the second image to the visual identification model (See Brakob ¶ [0056] - image data captured by the overhead camera can be still images and/or video feeds. The low resolution image data can be used to build training datasets for training and improving the machine learning models); and receiving output from the visual identification model, wherein the output identifies the item in the second image (See Brakob ¶ [0056] - The low resolution image data can be used to build training datasets for training and improving the machine learning models described herein… the low resolution image data can be beneficial to improve accuracy of detection and identification of products using the machine learning models).

Regarding Claim 3, Brakob teaches: The method of claim 2, wherein the checkout space and the different space are collocated at a location of an entity (See Brakob ¶ [0048] - the flatbed [checkout space by example] can include one or more cameras or other imaging devices… the cameras or other imaging devices can be configured to capture images of products as the customer scans such products and [0056] - the overhead camera can continuously capture image data of the checkout lane [different space from the checkout space by example]).

Regarding Claim 4, Brakob teaches: The method of claim 2, wherein the checkout space is at a first location of a first entity and the different space is located at a second location of a second entity (See Brakob ¶ [0047] - The computing system can be in communication with components (e.g., POS terminals) at multiple checkout lanes in one store [first location of a first entity by example] and/or across multiple different stores [second location of a second entity by example]. The computing system can also be a cloud service, an edge computing device, and/or any combination thereof and ¶ [0048] & [0056] as noted above regarding claim 3 for teaching [checkout space by example] and [different space from the checkout space by example]).

Regarding Claim 5, Brakob teaches: The method of claim 1, comprising: receiving a second image captured at a second capture time of the checkout space including the scanner system (See Brakob ¶ [0048] - the flatbed [checkout space by example] can include one or more cameras or other imaging devices… the cameras or other imaging devices can be configured to capture images of products as the customer scans such products and [0060] - The scanned barcodes and/or product identifications can also be transmitted to/accessed by the computing system as the customer scans each barcode in real-time [second capture time by example]); receiving a second indication that the item has been scanned by the scanner system, wherein the second indication includes the identity of the item and identifies a second scan time when the item was scanned (See Brakob ¶ [0049-0050] - The one or more scanning devices can be barcode, SKU, or other label identifying devices… The POS terminal can be configured to identify products that are scanned using the one or more scanning devices, [0053] - During training of the machine learning models, these features can be labeled and confidence of such labeling can increase since the image data can be associated with a timestamp of a correct barcode scan at the POS terminal and [0060] - The scanned barcodes and/or product identifications can also be transmitted to/accessed by the computing system as the customer scans each barcode in real-time [second capture time by example]); correlating the second scan time with the second capture time (See Brakob ¶ [0053] - During training of the machine learning models, these features can be labeled and confidence of such labeling can increase since the image data can be associated with a timestamp of a correct barcode scan at the POS terminal and [0060] - The scanned barcodes and/or product identifications can also be transmitted to/accessed by the computing system as the customer scans each barcode in real-time [second capture time by example]); and providing the second image and the identity of the item to the visual identification model to train the visual identification model to identify the item from the other images (See Brakob ¶ [0053] - During training of the machine learning models, these features can be labeled and confidence of such labeling can increase since the image data can be associated with a timestamp of a correct barcode scan at the POS terminal, [0060] - The scanned barcodes and/or product identifications can also be transmitted to/accessed by the computing system as the customer scans each barcode in real-time [second capture time by example] and [0098] - The computing system can train a product classification model to map visual features of the products into multi-dimensional feature space using the image training data).

Regarding Claim 6, Brakob teaches: The method of claim 5, wherein the second image captures the item from an angle not captured in the image (See Brakob ¶ [0055] - The overhead camera can face down to get a top down view of the checkout lane. The overhead camera can also be positioned or oriented at an angle to capture more than just the checkout lane. For example, the overhead camera can be angled such that the FOV includes an area surrounding the checkout lane where the user may place a shopping cart, basket, or products to be purchased).
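The labeling mechanism that the examiner maps to Brakob ¶ [0053], pairing each barcode scan timestamp with the camera frame captured closest in time so that the scanned identity becomes the training label, can be sketched as below. The function and field names are illustrative only; they are not drawn from the application or from Brakob:

```python
from bisect import bisect_left

def label_frames(frames, scans, max_gap=0.5):
    """Pair each scan event with the frame captured closest in time.

    frames:  list of (capture_time, image) tuples sorted by capture_time
    scans:   list of (scan_time, item_id) tuples from the scanner system
    max_gap: widest scan/capture time difference (seconds) to accept;
             an illustrative threshold, not from the application.
    Returns (image, item_id) pairs usable as labeled training examples.
    """
    times = [t for t, _ in frames]
    pairs = []
    for scan_time, item_id in scans:
        i = bisect_left(times, scan_time)
        # Candidate frames: the ones just before and just after the scan.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(frames)]
        if not candidates:
            continue
        j = min(candidates, key=lambda k: abs(times[k] - scan_time))
        if abs(times[j] - scan_time) <= max_gap:
            pairs.append((frames[j][1], item_id))
    return pairs
```

Each returned pair corresponds to the claimed step of providing the image and the identity of the item to the visual identification model; the pairs would then be fed to the model as supervised training examples.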
Regarding Claim 7, Brakob teaches: The method of claim 1, comprising: receiving a second image captured at a second capture time of a second space including a second scanner system (See Brakob ¶ [0047] - The computing system can be in communication with components (e.g., POS terminals) at multiple checkout lanes in one store [second space by example] and/or across multiple different stores and ¶ [0048] & [0060] as noted above regarding claim 3 for teaching scanner systems and [second capture time by example]); receiving a second indication that the item has been scanned by the second scanner system, wherein the second indication includes the identity of the item and identifies a second scan time when the item was scanned (See Brakob ¶ [0049-0050] - The one or more scanning devices can be barcode, SKU, or other label identifying devices… The POS terminal can be configured to identify products that are scanned using the one or more scanning devices, [0053] - During training of the machine learning models, these features can be labeled and confidence of such labeling can increase since the image data can be associated with a timestamp of a correct barcode scan at the POS terminal and [0060] - The scanned barcodes and/or product identifications can also be transmitted to/accessed by the computing system as the customer scans each barcode in real-time [second capture time by example]); correlating the second scan time with the second capture time (See Brakob ¶ [0053] - During training of the machine learning models, these features can be labeled and confidence of such labeling can increase since the image data can be associated with a timestamp of a correct barcode scan at the POS terminal and [0060] - The scanned barcodes and/or product identifications can also be transmitted to/accessed by the computing system as the customer scans each barcode in real-time [second capture time by example]); and providing the second image and the identity of the item to the visual identification model to train the visual identification model to identify the item from the other images (See Brakob ¶ [0053] - During training of the machine learning models, these features can be labeled and confidence of such labeling can increase since the image data can be associated with a timestamp of a correct barcode scan at the POS terminal, [0060] - The scanned barcodes and/or product identifications can also be transmitted to/accessed by the computing system as the customer scans each barcode in real-time [second capture time by example] and [0098] - The computing system can train a product classification model to map visual features of the products into multi-dimensional feature space using the image training data).

Regarding Claim 9, Brakob teaches: The method of claim 1, wherein the image is a video, and the method comprising: determining a time frame including the scan time in which the item can be seen in the video (See Brakob ¶ [0063-0064] - the overhead camera can capture image data (e.g., images, video feeds) of a scanning area of the checkout lane. The scanning area can include the flatbed and/or any other portion or area surrounding the checkout lane where the scanning event was detected … when the scanning event is identified, the overhead camera can select or otherwise identify a portion of the captured image data that corresponds to a same timestamp at which the scanning event was detected. The captured image data or the selected portion of the image data can be transmitted to the computing system for further processing and use in making product identification and matching determinations).

Regarding Claim 10, Brakob teaches: The method of claim 1, comprising: receiving a second image captured of a retail space displaying a plurality of items (See Brakob ¶ [0055] - the overhead camera can be angled such that the FOV includes an area surrounding the checkout lane where the user may place a shopping cart, basket, or products to be purchased. Having the surrounding area in the FOV can provide more context around a checkout process at the checkout lane, [0060] - The scanned barcodes and/or product identifications can also be transmitted to/accessed by the computing system as the customer scans each barcode in real-time [second capture time by example] and [0098] - The computing system can train a product classification model to map visual features of the products into multi-dimensional feature space using the image training data); and feeding the second image into the visual identification model (See Brakob ¶ [0056] - image data captured by the overhead camera can be still images and/or video feeds. The low resolution image data can be used to build training datasets for training and improving the machine learning models), wherein the visual identification model provides output identifying at least one instance of the item in the second image (See Brakob ¶ [0088-0091] - The computing system can map the image training data into n-dimensional space. A number of dimensions in space can depend on a number of features that are identified… The more image training data of bananas that falls into these other dimensions, the more likely a cluster of points or values will form in these dimensions to identify the banana, thereby differentiating the banana from a cluster of points or values around the oblong shape and yellow coloring dimensions that represent the zucchini).

Regarding Claim 11, Brakob teaches: The method of claim 10, wherein the visual identification model is also trained to identify a second item of the plurality of items and wherein the output also identifies at least one instance of the second item in the second image (See Brakob ¶ [0088-0091] - The computing system can map the image training data into n-dimensional space. A number of dimensions in space can depend on a number of features that are identified… The more image training data of bananas that falls into these other dimensions, the more likely a cluster of points or values will form in these dimensions to identify the banana, thereby differentiating the banana from a cluster of points or values around the oblong shape and yellow coloring dimensions that represent the zucchini [second item by example]).

Regarding Claim 14, Brakob teaches: A method for training a model to visually identify items (See Brakob ¶ [0048] - the cameras or other imaging devices can be configured to capture images of products as the customer scans such products. These images can be used by the computing system to identify products that are being purchased by the customer. Such images can also be used to train and/or improve one or more machine learning models that are used to identify the products), the method comprising: receiving images captured by a plurality of cameras directed towards a plurality of checkout spaces including a plurality of checkout scanners (See Brakob ¶ [0047] - The computing system can be in communication with components (e.g., POS terminals) at multiple checkout lanes in one store and/or across multiple different stores. The computing system can also be a cloud service, an edge computing device, and/or any combination thereof, [0048] - Referring to the checkout lane, the one or more scanning devices can be integrated into the flatbed. The flatbed can include one or more cameras or other imaging devices… the cameras or other imaging devices can be configured to capture images of products as the customer scans such products and [0055] - The overhead camera can face down to get a top down view of the checkout lane.
The overhead camera can also be positioned or oriented at an angle to capture more than just the checkout lane); identifying items being scanned in the images from scan information received from the plurality of checkout scanners when the items are scanned (See Brakob ¶ [0049-0050] - The one or more scanning devices can be barcode, SKU, or other label identifying devices… The POS terminal can be configured to identify products that are scanned using the one or more scanning devices); and training a visual identification model to identify the items from subsequent images (See Brakob ¶ [0053] - During training of the machine learning models, these features can be labeled and confidence of such labeling can increase since the image data can be associated with a timestamp of a correct barcode scan at the POS terminal and [0098] - The computing system can train a product classification model to map visual features of the products into multi-dimensional feature space using the image training data). Regarding Claim 15, Brakob teaches: The method of claim 14, comprising: receiving the subsequent images from a second plurality of cameras (See Brakob ¶ [0047] - The computing system can be in communication with components (e.g., POS terminals) at multiple checkout lanes in one store and/or across multiple different stores. The computing system can also be a cloud service, an edge computing device, and/or any combination thereof, [0048] - Referring to the checkout lane, the one or more scanning devices can be integrated into the flatbed the flatbed can include one or more cameras or other imaging devices… the cameras or other imaging devices can be configured to capture images of products as the customer scans such products, [0055] - The overhead camera can face down to get a top down view of the checkout lane. The overhead camera can also be positioned or oriented at an angle to capture more than just the checkout lane and Fig. 
2 – multiple overhead cameras (110A and 110B)); inputting the subsequent images into the visual identification model (See Brakob ¶ [0056] - image data captured by the overhead camera can be still images and/or video feeds. The low resolution image data can be used to build training datasets for training and improving the machine learning models); and receiving output from the visual identification model identifying at least one of the items in subsequent images (See Brakob ¶ [0088-0091] - The computing system can map the image training data into n-dimensional space. A number of dimensions in space can depend on a number of features that are identified… The more image training data of bananas that falls into these other dimensions, the more likely a cluster of points or values will form in these dimensions to identify the banana, thereby differentiating the banana from a cluster of points or values around the oblong shape and yellow coloring dimensions that represent the zucchini). Regarding Claim 16, Brakob teaches: The method of claim 14, wherein receiving the images comprises: receiving the images over a communication network from premises equipment at a plurality of locations having the plurality of checkout spaces (See Brakob ¶ [0047] - The computing system can be in communication with components (e.g., POS terminals) at multiple checkout lanes in one store and/or across multiple different stores. The computing system can also be a cloud service, an edge computing device, and/or any combination thereof and ¶ [0081] - The image training data can be image data (e.g., still images, video feeds) of every product that has been scanned at checkout lanes across the network of stores). 
Regarding Claim 17, Brakob teaches: The method of claim 14, comprising: in a camera connected to premises equipment at a location, capturing an image of the subsequent images (See Brakob ¶ [0047] - The computing system can be in communication with components (e.g., POS terminals) at multiple checkout lanes in one store [premises equipment by example] and/or across multiple different stores. The computing system can also be a cloud service, an edge computing device, and/or any combination thereof and ¶ [0081] - The image training data can be image data (e.g., still images, video feeds) of every product that has been scanned at checkout lanes across the network of stores); in the premises equipment, inputting the image into a portion of the visual identification model and transmitting the image over a communication network to a remote processing system (See Brakob ¶ [0076] - Based on detecting the scanning event, the camera can capture image data from the FOV. The image data can include the checkout lane where the scanning event was detected … The camera can then transmit the image data to the computing system and ¶ [0080] - The computing system and/or the other computing systems can be a remote computing system, server, network of computers or servers, cloud service, and/or edge computing device); in the remote processing system, inputting the image into a different portion of the visual identification model (See Brakob ¶ [0078] - The computing system can also use one or more machine learning models that are trained to identify a product from surrounding features, objects, and/or ambient environment. 
Therefore, using the machine learning models, the computing system can extract only the product from the image data and use that extracted portion of the image data for further processing and ¶ [0080] - The computing system and/or the other computing systems can be a remote computing system, server, network of computers or servers, cloud service, and/or edge computing device); and receiving output from the visual identification model identifying at least one of the items in the image (See Brakob ¶ [0079] - The computing system can then perform product matching techniques using the extracted portion of the image data and product identification(s) (e.g., scanned barcodes, product information, etc.) that can be received from the POS terminal at the checkout lane. The computing system can therefore determine whether the customer engaged in ticket switching at the checkout lane. The computing system can also identify what product the customer is likely purchasing).

Regarding Claim 18, Brakob teaches: The method of claim 17, wherein the image is transmitted in response to the portion of the visual identification model failing to indicate an item in the image (See Brakob ¶ [0149-0150] - The process can be used to determine whether a customer engaged in ticket switching and is trying to purchase a product with an incorrect barcode [failing to indicate an item in the image by example]… A customer, for example, can place the unknown product over/on a flatbed scanner, which can recognize and scan the barcode appearing on the unknown product. The barcode can be another type of label or product identifier, such as a sticker, QR code, and/or SKU [incorrect barcode]. The one or more candidate product identifications can be determined by the computing system after applying the product classification model and/or multiple product identification models to image data of the unknown product associated with the scanned barcode).
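The split-inference pattern the examiner maps onto claims 17-18 (premises equipment runs a local portion of the model; the image is transmitted to a remote system only when the local portion fails to indicate an item) can be sketched as a simple fallback. Everything below is an illustrative stand-in, not code from the application or Brakob.

```python
# Hedged sketch of edge-first inference with remote fallback, per the
# examiner's reading of claims 17-18. Image IDs and lookups are stand-ins
# for real model inference.

def local_model(image):
    """Stand-in for the on-premises portion of the visual identification model."""
    return {"img_001": "banana"}.get(image)  # None models a failed identification

def remote_model(image):
    """Stand-in for the remote portion, assumed here to be more capable."""
    return {"img_001": "banana", "img_002": "zucchini"}.get(image)

def identify(image, transmit_log):
    item = local_model(image)
    if item is None:                 # local portion failed to indicate an item,
        transmit_log.append(image)   # so transmit the image over the network
        item = remote_model(image)
    return item

log = []
print(identify("img_001", log), log)  # resolved locally; nothing transmitted
print(identify("img_002", log), log)  # escalated to the remote portion
```

The conditional transmission is the point of claim 18: bandwidth is spent only on images the edge model cannot resolve.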
Regarding Claim 19, Brakob teaches: The method of claim 17, wherein the portion of the visual identification model comprises an instance of the visual identification model trained from a portion of the images captured by a portion of the plurality of cameras at the location (See Brakob ¶ [0169] - The product classification model can be trained using image training data. The image training data can be retrieved, by the classification model generator, from the models data store. The classification model generator can also receive the image training data directly from one or more of the cameras).

Regarding Claim 20, Brakob teaches: An apparatus for training a model to visually identify items (See Brakob ¶ [0048] - the cameras or other imaging devices can be configured to capture images of products as the customer scans such products. These images can be used by the computing system to identify products that are being purchased by the customer. Such images can also be used to train and/or improve one or more machine learning models that are used to identify the products), the apparatus comprising: one or more computer readable storage media (See Brakob ¶ [0195] - The computing device includes a processor, a memory, a storage device, a high-speed interface connecting to the memory… The processor can process instructions for execution within the computing device, including instructions stored in the memory); a processing system operatively coupled with the one or more computer readable storage media (See Brakob ¶ [0195] - The computing device includes a processor, a memory, a storage device, a high-speed interface connecting to the memory… The processor can process instructions for execution within the computing device, including instructions stored in the memory); and program instructions stored on the one or more computer readable storage media that, when read and executed by the processing system (See Brakob ¶ [0195] - The computing device includes a processor, a
memory, a storage device, a high-speed interface connecting to the memory… The processor can process instructions for execution within the computing device, including instructions stored in the memory), direct the apparatus to: receive an image captured at a capture time of a checkout space including a scanner system (See Brakob ¶ [0048] - the flatbed [checkout space by example] can include one or more cameras or other imaging devices… the cameras or other imaging devices can be configured to capture images of products as the customer scans such products. These images can be used by the computing system to identify products that are being purchased by the customer. Such images can also be used to train and/or improve one or more machine learning models that are used to identify the products); receive an indication that an item has been scanned by the scanner system, wherein the indication includes an identity of the item and identifies a scan time when the item was scanned (See Brakob ¶ [0049-0050] - The one or more scanning devices can be barcode, SKU, or other label identifying devices… The POS terminal can be configured to identify products that are scanned using the one or more scanning devices and [0053] - During training of the machine learning models, these features can be labeled and confidence of such labeling can increase since the image data can be associated with a timestamp of a correct barcode scan at the POS terminal); correlate the scan time with the capture time (See Brakob ¶ [0053] - During training of the machine learning models, these features can be labeled and confidence of such labeling can increase since the image data can be associated with a timestamp of a correct barcode scan at the POS terminal); and provide the image and the identity of the item to a visual identification model to train the visual identification model to identify the item from other images (See Brakob ¶ [0053] - During training of the machine learning models, these 
features can be labeled and confidence of such labeling can increase since the image data can be associated with a timestamp of a correct barcode scan at the POS terminal and [0098] - The computing system can train a product classification model to map visual features of the products into multi-dimensional feature space using the image training data).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 8, 12 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Brakob et al. (US 2023/0037427 A1) in view of Bronicki et al. (US 2022/0114868 A1).

Regarding Claim 8, modified Brakob teaches: The method of claim 1, comprising: … While Brakob teaches a computer-based system for training vision-based machine learning models for image analysis to identify known and unknown items that are being scanned at a checkout area of a store by providing the image to the visual identification model (Brakob ¶ [0047-0048], [0053], [0060] and [0098]), Brakob does not explicitly teach cropping portions of the image other than the item before said images are provided to the visual identification model.
This is taught by Bronicki (See Bronicki ¶ [0147] - The transmitted information may include raw images, cropped images, processed image data, data about products identified in the images, and so forth, thereby showing images are cropped before transmission since the transmitted data includes cropped images).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include in the vision model image analysis system of Brakob the use of cropped images as taught by Bronicki to improve image acquisition (Bronicki ¶ [0168]), thereby increasing the accuracy and efficiency of the Brakob vision model image analysis system.

Regarding Claim 12, modified Brakob teaches: The method of claim 10, wherein the second image is a video image (See Brakob ¶ [0063-0064] as noted above regarding claim 9), the method comprising: While Brakob teaches a computer-based system for training vision-based machine learning models for image analysis to identify known and unknown items that are being scanned at a checkout area of a store (Brakob ¶ [0047-0048], [0053], [0060] and [0098]), Brakob does not explicitly teach determining a first instance of the at least one instance is absent from the video image at a second time; and decrementing an inventory of the item by one.

This is taught by Bronicki (See Bronicki ¶ [0563] - an inventory of an item may be tracked over time to determine a rate at which a product is removed from a retail shelf. For example, this may include detecting a quantity of product on shelf in multiple images over time and [0574] - if only one of product is available, the updates may include reducing the quantity of item by one and as shown in Figs. 43 and 44).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include in the vision model image analysis system of Brakob the use of item count tracking for captured images as taught by Bronicki to improve store execution, which may help store associates of a retail store to quickly address situations that may negatively impact revenue and customer experience in the retail store (Bronicki ¶ [0232]), thereby increasing the accuracy and efficiency of the Brakob vision model image analysis system.

Regarding Claim 13, modified Brakob teaches: The method of claim 12, comprising: identifying a customer in the video image (See Brakob ¶ [0052] - Images captured by the integrated camera can also be used to identify characteristics of the customer that can be used to objectively identify the customer, such as body movements, behavior, and/or appearance); ... While Brakob teaches a computer-based system for training vision-based machine learning models for image analysis to identify known and unknown items that are being scanned at a checkout area of a store (Brakob ¶ [0047-0048], [0053], [0060] and [0098]), Brakob does not explicitly teach determining the customer removed the first instance from the retail space.

This is taught by Bronicki (See Bronicki ¶ [0563] - an inventory of an item may be tracked over time to determine a rate at which a product is removed from a retail shelf. For example, this may include detecting a quantity of product on shelf in multiple images over time and as shown in Figs. 43 and 44).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include in the vision model image analysis system of Brakob the use of item count tracking for captured images as taught by Bronicki to improve store execution, which may help store associates of a retail store to quickly address situations that may negatively impact revenue and customer experience in the retail store (Bronicki ¶ [0232]), thereby increasing the accuracy and efficiency of the Brakob vision model image analysis system.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MATTHEW S WERONSKI whose telephone number is (571)272-5802. The examiner can normally be reached M-F 8 am - 5 pm EST.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Fahd A. Obeid, can be reached at (571)270-3324. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/MATTHEW S WERONSKI/
Examiner, Art Unit 3627

/PETER LUDWIG/
Primary Examiner, Art Unit 3627
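The correlation step that runs through independent claims 1, 14, and 20 (and that the examiner reads onto Brakob ¶ [0053]'s timestamped barcode scans) can be sketched as nearest-in-time matching: pair each scan event with the capture closest to its scan time and use the scanned identity as the training label. This is an illustrative reconstruction, not code from the application or the cited references; the data and the 2-second tolerance are invented.

```python
# Minimal sketch of correlating a scan time with a capture time to produce a
# labeled training example. Times, image IDs, and the tolerance window are
# hypothetical.

captures = [  # (capture_time_seconds, image_id)
    (10.0, "frame_a"),
    (12.5, "frame_b"),
    (15.1, "frame_c"),
]

def label_from_scan(scan_time, item_id, captures, tolerance=2.0):
    """Match a scan event to the nearest-in-time capture, within a tolerance."""
    capture_time, image_id = min(captures, key=lambda c: abs(c[0] - scan_time))
    if abs(capture_time - scan_time) <= tolerance:
        return (image_id, item_id)   # labeled example for the training set
    return None                      # no capture close enough to correlate

print(label_from_scan(12.3, "banana", captures))  # ('frame_b', 'banana')
```

The tolerance check matters in practice: a scan with no sufficiently close capture should yield no training example rather than a mislabeled one.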

Prosecution Timeline

Jul 22, 2024
Application Filed
Apr 01, 2026
Non-Final Rejection — §101, §102, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12443938
Point-of-Sale (POS) Operation System
2y 5m to grant Granted Oct 14, 2025
Patent 12400247
REPRESENTING SETS OF ENTITITES FOR MATCHING PROBLEMS
2y 5m to grant Granted Aug 26, 2025
Patent 12367454
METHOD AND SYSTEM FOR VEHICLE MANAGEMENT
2y 5m to grant Granted Jul 22, 2025
Patent 12333614
QUALITY, AVAILABILITY AND AI MODEL PREDICTIONS
2y 5m to grant Granted Jun 17, 2025
Patent 12327393
SYSTEM AND METHOD FOR CAPTURING CONSISTENT STANDARDIZED PHOTOGRAPHS AND USING PHOTOGRAPHS FOR CATEGORIZING PRODUCTS
2y 5m to grant Granted Jun 10, 2025
Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

1-2
Expected OA Rounds
10%
Grant Probability
29%
With Interview (+19.8%)
4y 0m
Median Time to Grant
Low
PTA Risk
Based on 115 resolved cases by this examiner. Grant probability derived from career allow rate.
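The projection figures above appear to follow directly from the examiner's career numbers. Assuming (my inference, not stated on the page) that the interview lift is applied as an additive percentage-point adjustment to the career allow rate:

```python
# How the headline projections seem to be derived from the stats shown above.
# The additive-lift model is an assumption on my part.
granted, resolved = 11, 115          # examiner's career totals (from the page)
interview_lift = 19.8                # percentage-point lift with an interview

base = granted / resolved * 100      # career allow rate
with_interview = base + interview_lift

print(round(base))            # 10  (matches "Grant Probability")
print(round(with_interview))  # 29  (matches "With Interview")
```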
