Last updated: May 29, 2026
Application No. 18/148,877
ITEM VERIFICATION SYSTEMS AND METHODS FOR RETAIL CHECKOUT STANDS

Non-Final OA §101§103
Filed
Dec 30, 2022
Examiner
WHITAKER, ANDREW B
Art Unit
3629
Tech Center
3600 — Transportation & Electronic Commerce
Assignee
Datalogic Usa Inc.
OA Round
3 (Non-Final)
This examiner grants 19% of cases after interview

— +19.1% interview lift. A telephonic interview to clarify the technical implementation could significantly improve the outcome.
Based on 558 resolved cases, 2023–2026
Examiner Intelligence

WHITAKER, ANDREW B View full profile →
Grants only 19% of cases
Career Allowance Rate
105 granted / 558 resolved
-33.2% vs TC avg
Strong +19% interview lift
Without
With
+19.1%
Interview Lift
resolved cases with interview
Typical timeline
4y 2m
Avg Prosecution
43 currently pending
Career history
613
Total Applications
across all art units
Statute-Specific Performance

§101
11.3%
-28.7% vs TC avg
§103
79.2%
+39.2% vs TC avg
§102
8.4%
-31.6% vs TC avg
§112
0.3%
-39.7% vs TC avg
Black line = Tech Center average estimate • Based on career data from 558 resolved cases
Office Action

§101 §103
DETAILED ACTION

Status of the Claims
The following is a Non-final Office Action in response to amendments and remarks filed 09 January 2026.
Claims 1 and 11 have been amended.
Claims 26-27 have been added.
Claims 1, 3-7, 9, 11, 13-17, and 19-27 are pending and have been examined.
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA .

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 09 January 2026 has been entered.
 
Response to Arguments
Applicants argue that the 35 U.S.C. 101 rejection under the Alice Corp. vs. CLS Bank Int’l be withdrawn; however the Examiner respectfully disagrees.  Again, the arguments are not compliant under 37 CFR 1.111(b) as they amount to a mere allegation of patent eligibility based upon a bare assertion of improvement.  The Examiner respectfully does not find the assertion persuasive because a bare assertion of an improvement without the detail necessary to be apparent is not sufficient to show an improvement (MPEP 2106.04(d)(1) (discussing MPEP 2106.05(a)). That is, the Examiner does not find any evidence that the claimed aspects are any improvement over conventional systems.  This argument appears to be whether or not the use of computer or computing components for increased speed and efficiency integrates the claims into a practical application as well as amounts to significantly more; however the Examiner respectfully disagrees.  Nor, in addressing the second step of Alice, does claiming the improved speed or efficiency inherent with applying the abstract idea on a computer provide a sufficient inventive concept. See Bancorp Servs., LLC v. Sun Life Assurance Co. of Can., 687 F.3d 1266, 1278 (Fed. Cir. 2012) (“[T]he fact that the required calculations could be performed more efficiently via a computer does not materially alter the patent eligibility of the claimed subject matter.”); CLS Bank, Int’l v. Alice Corp., 717 F.3d 1269, 1286 (Fed. Cir. 2013) (en banc) aff’d, 134 S. Ct. 2347 (2014) (“[S]imply appending generic computer functionality to lend speed or efficiency to the performance of an otherwise abstract concept does not meaningfully limit claim scope for purposes of patent eligibility.” (citations omitted)).  As such, this argument is not persuasive, and the rejection not withdrawn.  
Applicant's arguments fail to comply with 37 CFR 1.111(b) because they amount to a general allegation that the claims define a patentable invention without specifically pointing out how the language of the claims patentably distinguishes them from the references.
Applicant's arguments do not comply with 37 CFR 1.111(c) because they do not clearly point out the patentable novelty which he or she thinks the claims present in view of the state of the art disclosed by the references cited or the objections made. Further, they do not show how the amendments avoid such references or objections.
Applicant’s arguments that Chaubard does not teach the first classifying of the identified item to an item class have been fully considered but are not persuasive for a plurality of reasons.  Firstly, as recited, the claims do not negatively limit the classification from being performed by an algorithm nor do they negatively limit the same algorithm (or trained) being used in the classify and select steps; the claims only require that classification be performed based upon identity of the item and then selection of an analysis algorithm.  To put another way, the claims do not negatively limit from the “waterfall” method as disclosed by Chaubard; only that a classification is to occur and then an algorithm selection based upon the classification.  Secondly, as previously cited and contrary to Applicant’s arguments, Chaubard discusses the ability to “The output of this portion of the processing per product bounding box is then a series of bounding shapes, each with an associated set of “features”, which could be written words and their locations in the images, such as “Tide” or “Coca-Cola”, a bounding box around the barcode inside the product bounding shape if it is there, and the results of the template matching process which is a list of probabilities or distances of a match with each of the known classes. Then all of this information about this product within the bounding box is taken together and the system tries to infer the correct SKU (Stock Keeping Unit), UPC, or other product identifier of the product. To do this, the system preferably performs a “water-fall” classification scheme that works in the following way. First, if the barcode is read, then it is taken as truth and the process is finished. In some embodiments, the system will still continue the processing even after detecting a barcode to detect a form of theft called “tag switching”, a trick that people looking to steal from the store use, which consists of adding a tag of an inexpensive item (like ground beef) on an expensive item (like filet mignon) to get a lower price. If the barcode is not detected, the system tries to match the text read from the OCR to a database of known information about all the “known” products and to match the text to a dictionary or lexicon via Jaccard Similarity or otherwise. If there is a match above a certain threshold, then the process is finished and we have our match. In some embodiments, we will continue our processing to increase system accuracy. If there is no text or the Jaccard Similarity is not high enough, then the template matching pipeline or embedding vector method is run to produce a set of classifications, scores, or distances of all “known” products and the system will merge that information with other known information such as previous “track information” (which we will define further below) or the OCR Jaccard Similarity information or barcode information or any other information to produce a best estimated probability across all “known” products and take the argmax of those probabilities over all known products/SKUs to make a final guess as to what the product in the bounding shape is. Finally, the cleaned output of this pipeline is a list of bounding shapes for each product in the image, a guess of a product identifier per bounding shape, and an associated probability per guess per bounding shape. If this occurs at time (t), we will call this output Ft, (Chaubard Col. 9 line 34-Col. 10 line 9)” which is clearly classifying an item based upon features for which then the SKU of the actual item is identified.  To put another way, Chaubard performs a matching process (i.e. classification of the item based upon features in an image such as an OCR CNN algorithm) and then attempts to infer the SKU with an algorithm (now as input into a CRNN from the CNN) to make a final guess as to what the product is.  As such, this argument is not persuasive and the rejection not overcome.  
Applicant’s remaining arguments with respect to the prior art have been fully considered and addressed in the updated rejection below, as necessitated by amendments.  
In response to arguments in reference to any depending claims that have not been individually addressed, all rejections made towards these dependent claims are maintained due to a lack of reply by the Applicants in regards to distinctly and specifically pointing out the supposed errors in the Examiner's prior office action (37 CFR 1.111). The Examiner asserts that the Applicants only argue that the dependent claims should be allowable because the independent claims are unobvious and patentable over the prior art.

	
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1, 3-7, 9, 11, 13-17, and 19-27 is/are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.  The claims are directed to a process (an act, or series of acts or steps), a machine (a concrete thing, consisting of parts, or of certain devices and combination of devices), and a manufacture (an article produced from raw or prepared materials by giving these materials new forms, qualities, properties, or combinations, whether by hand labor or by machinery). Thus, each of the claims falls within one of the four statutory categories (Step 1).  However, the claim(s) recite(s) query the database and obtain the reference image data for the item; compare one or more item features in the reference image data to a corresponding one or more item features in the item image; compute an item match score corresponding to a match rate of the one or more item features in the reference image data to the one or more item features in the item image; compare the item match score to a threshold match score; and generate an exception in response to the item match score failing to equal or exceed the threshold match score which is an abstract idea of a performing computations in accordance with a mathematical formula as well as the abstract idea of a mental process.
The limitations of “query the database and obtain the reference image data for the item; compare one or more item features in the reference image data to a corresponding one or more item features in the item image; query a remote database located remotely from the data reading system to obtain the reference stock descriptor for the item if the reference stock descriptor for the item is not found within the local database; classify the identified item into an item class based on the identity of the item; select an analysis algorithm from a plurality of available analysis algorithms for comparing one or more item features of the reference stock descriptor to the corresponding one or more item features of the captured item image, wherein the selection of the analysis algorithm is based on the item class of the item; compare one or more item features in the reference stock descriptor to a corresponding one or more item features in the captured item image according to the selected analysis algorithm; compute an item match score corresponding to a match rate of the one or more item features in the reference stock descriptor to the one or more item features in the captured item image; compare the item match score to a threshold match score; and generate an exception in response to the item match score failing to equal or exceed the threshold match score, verify the item in response to the item match score for the item equaling or exceeding the threshold match score; and update a transaction list associated with the customer transaction with item information for the verified item” as drafted, is a process that, under its broadest reasonable interpretation, covers mathematical concepts—mathematical relationships, mathematical formulas or equations, mathematical calculations as well as a mental process—concepts performed in the human mind (including an observation, evaluation, judgment, opinion) but for the recitation of generic computer components (Step 2A Prong 1).  That is, other than reciting “a processor operable to…the processor is further configured to:,” (or “via a processor” in claim 11) nothing in the claim element precludes the step from the mathematical concept grouping and/or from practically being performed in the mind.  For example, but for the “processor” language, “query,” “query,” “classify,” “select,” “compare,” “compute,” “compare,” “generate” “verify,” and “update” in the context of this claim encompasses the user manually observing different types of merchandise (or images thereof) comparing to the threshold which is a mathematical concept as well as a mental process/judgement.  However, if possible, the Examiner should consider the limitations together as a single abstract idea rather than as a plurality of separate abstract ideas to be analyzed individually. “For example, in a claim that includes a series of steps that recite mental steps as well as a mathematical calculation, an Examiner should identify the claim as reciting both a mental process and a mathematical concept for Step 2A, Prong One to make the analysis clear on the record.” MPEP 2106.04, subsection II.B. Under such circumstances, however, the Supreme Court has treated such claims in the same manner as claims reciting a single judicial exception. Id. (discussing Bilski v. Kappos, 561 U.S. 593 (2010)). Here, the limitations are considered together as a single abstract idea for further analysis. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitations as a mathematical concept, while some of the limitations may be performed in the mind after certain limitations are performed, but for the recitation of generic computer components, then it falls within the grouping of abstract ideas. (Step 2A, Prong One: YES).  Accordingly, the claim(s) recite(s) an abstract idea.
This judicial exception is not integrated into a practical application (Step 2A Prong Two).  The “a data reading system and method for obtaining a decoding data from items processed by a customer during a retail transaction,, the system comprising: a housing supporting a scan window; one or more data readers disposed within the housing, each data reader having a field-of-view directed through the scan window, wherein each data reader is operable to capture an item image of an item as the item passes across the scan window during a customer transaction; memory disposed within the housing, the memory including a local database having stored therein a local cache of reference stock descriptors for a plurality of times, the reference stock descriptors derived from reference images of the plurality of items; and a processor operably coupled with the one or more data readers and the memory, the processor operable to receive the captured item image from at least one of the one or more data readers for the item and identify the item from the captured item image, wherein after identifying the item, the processor is further configured to” and “A method of data reading via a data reading system by obtaining a decoding data from items processed by a customer during a retail transaction, the method comprising: disposing a memory within a housing of a data reading system, the memory including a local database having stored therein a local cache of reference stock descriptors for a plurality of items, the reference stock descriptors derived from reference images of the plurality of items; capturing, via one or more data readers, an item image of an item as the item passes across a scan window of the data reading system during a customer transaction;” are additional elements that simply provide insignificant extrasolution data gathering activities.  Next, the claim only recites one additional element – using a processor to perform the steps. The processor the steps is recited at a high-level of generality (i.e., as a generic processor performing a generic computer function of electronic data query, storage, retrieval and arithmetic) such that it amounts no more than mere instructions to apply the exception using a generic computer component.  Specifically the claims amount to nothing more than an instruction to apply the abstract idea using a generic computer or invoking computers as tools by adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea - see MPEP 2106.04(d)(I) discussing MPEP 2106.05(f).  Accordingly, the combination of these additional elements does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea, even when considered as a whole (Step 2A Prong Two: NO). 
The claim does not include a combination of additional elements that are sufficient to amount to significantly more than the judicial exception (Step 2B).  As discussed above with respect to integration of the abstract idea into a practical application (Step 2A Prong 2), the combination of additional elements of using a processor to perform the steps amounts to no more than mere instructions to apply the exception using a generic computer component.  Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept.   Reevaluating here in step 2B, the “a data reading system and method for obtaining a decoding data from items processed by a customer during a retail transaction,, the system comprising: a housing supporting a scan window; one or more data readers disposed within the housing, each data reader having a field-of-view directed through the scan window, wherein each data reader is operable to capture an item image of an item as the item passes across the scan window during a customer transaction; memory disposed within the housing, the memory including a local database having stored therein a local cache of reference stock descriptors for a plurality of times, the reference stock descriptors derived from reference images of the plurality of items; and a processor operably coupled with the one or more data readers and the memory, the processor operable to receive the captured item image from at least one of the one or more data readers for the item and identify the item from the captured item image, wherein after identifying the item, the processor is further configured to” and “A method of data reading via a data reading system by obtaining a decoding data from items processed by a customer during a retail transaction, the method comprising: disposing a memory within a housing of a data reading system, the memory including a local database having stored therein a local cache of reference stock descriptors for a plurality of items, the reference stock descriptors derived from reference images of the plurality of items; capturing, via one or more data readers, an item image of an item as the item passes across a scan window of the data reading system during a customer transaction;” elements perform steps which are insignificant extrasolution activities are also determined to be well-understood, routine and conventional activity in the field.  The Symantec, TLI, and OIP Techs court decisions in MPEP 2106.05(d)(II) indicate that the mere receipt or transmission of data over a network is well-understood, routine, and conventional function when it is claimed in a merely generic manner (as is here).  Therefore, when considering the additional elements alone, and in combination, there is no inventive concept in the claim.  As such, the claim(s) is/are not patent eligible, even when considered as a whole (Step 2B: NO).
Claims 3, 7, 15, and 21-25 recites the additional limitations further limiting the data and how the data is processed/used which is still directed towards the abstract idea previously identified and is not an inventive concept that meaningfully limits the abstract idea.  Again, as discussed with respect to claims 1 and 11, the claims are simply limitations which are no more than mere instructions to apply the exception using a computer or with computing components.  Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea.  Even when considered as a whole, the claims do not integrate the judicial exception into a practical application at Step 2A or provide an inventive concept in Step 2B. 
Claims 4-6, 9, 16-17, 19-20, and 26-27 recites the additional limitations further limiting and including additional mathematical concepts which is still directed towards the abstract idea previously identified and is not an inventive concept that meaningfully limits the abstract idea.  Again, as discussed with respect to claims 1 and 11, the claims are simply limitations which are no more than mere instructions to apply the exception using a computer or with computing components.  Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea.  Even when considered as a whole, the claims do not integrate the judicial exception into a practical application at Step 2A or provide an inventive concept in Step 2B. 
Claims 13-14 recite the additional limitations further stopping and taking additional information which is not an inventive concept that meaningfully limits the abstract idea.  Again, as discussed with respect to claims 1 and 11, the claims are simply limitations which are no more than mere instructions to apply the exception using a computer or with computing components.  Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea.  Even when considered as a whole, the claims do not integrate the judicial exception into a practical application at Step 2A or provide an inventive concept in Step 2B. 
Claims 1, 3-7, 9, 11, 13-17, and 19-27 are therefore not eligible subject matter, even when considered as a whole.


	Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA  to pre-AIA ) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.


Claim(s) 1, 3-7, 9, 11, 13-17, and 19-27 is/are rejected under 35 U.S.C. 103 as being unpatentable over Chaubard (US Patent No. 11,481,751) and further in view of Goncalves et al. (US PG Pub. 2011/0286628)
	
As per claims 1 and 11, Chaubard discloses a data reading system and method for obtaining a decoding data from items processed by a customer during a retail transaction,, the system comprising (computer vision system to retrofit point of sale system, Chaubard Col. 2 line 53-Col. 3 line 15; Fig. 1; Fig. 1A): 
a housing supporting a scan window (scanner, Chaubard Col. 2 line 53-Col. 3 line 15; Fig. 1; Fig. 1A) 
one or more data readers disposed within the housing, each data reader having a field-of-view directed through the scan window, wherein each data reader is operable to capture an item image of an item as the item passes across the scan window during a customer transaction (cameras, visual images, Chaubard Col. 5 lines 18-37); 
memory disposed within the housing, the memory including a local database having stored therein a local cache of reference stock descriptors for a plurality of times, the reference stock descriptors derived from reference images of the plurality of items (library of pre-stored labeled images, Chaubard Col. 2 lines 18-37; The cameras and associated computer vision software in a system computer work to recognize each product by visual images, matched to a system database library of labeled images for each product in the store's inventory, Col. 5 lines 18-37) (Examiner notes the POS terminal having the library of pre-stored images as the local database within the memory); and 
a processor operably coupled with the one or more data readers and the memory, the processor operable to receive the captured item image from at least one of the one or more data readers for the item and identify the item from the captured item image, wherein after identifying the item, the processor is further configured to (For each time step t, the host CPU or GPU or otherwise will load the most recent readings captured from all attached and powered cameras and sensors into the processors memory. Depending on the camera or sensor type, we may preprocess the data in different ways to best prepare the data for later stages of the processing pipeline, Chaubard Col. 13 lines 26-31; library of pre-stored labeled images, Chaubard Col. 2 lines 18-37): 
query the local database stored locally within the memory within the housing to obtain the reference stock descriptor for the item (database of known products, Chaubard Col. 8 line 65-Col. 9 line 33; These images are compared in the software to a library of pre-stored labeled images, if available, and these may include multiple images for each of the store's products, taken from different angles. If an item in the checkout area is recognized to a sufficiently high probability, meeting a preset threshold of, e.g., 90%, that item is added to the checkout list and charged to the customer. If probability is too low, below the preset threshold, the system may prompt the customer or cashier to reposition the item if it appears stacking or occlusion has occurred, or to remove and scan that item using the store's existing scanning system. This ties the new image to the known store item, adding to the library of image data so that the item is more easily identified in future transactions at not just this register, but all registers that have this system. The customer or cashier can bag the items as the accelerated scanning is occurring. Finally, the customer checks out using cash, a credit card or a debit card as they do today or through a more modern method like face recognition, tap and go, or otherwise, Col. 2 lines 18-37; The output of this portion of the processing per product bounding box is then a series of bounding shapes, each with an associated set of “features”, which could be written words and their locations in the images, such as “Tide” or “Coca-Cola”, a bounding box around the barcode inside the product bounding shape if it is there, and the results of the template matching process which is a list of probabilities or distances of a match with each of the known classes. Then all of this information about this product within the bounding box is taken together and the system tries to infer the correct SKU (Stock Keeping Unit), UPC, or other product identifier of the product. To do this, the system preferably performs a “water-fall” classification scheme that works in the following way. First, if the barcode is read, then it is taken as truth and the process is finished. In some embodiments, the system will still continue the processing even after detecting a barcode to detect a form of theft called “tag switching”, a trick that people looking to steal from the store use, which consists of adding a tag of an inexpensive item (like ground beef) on an expensive item (like filet mignon) to get a lower price. If the barcode is not detected, the system tries to match the text read from the OCR to a database of known information about all the “known” products and to match the text to a dictionary or lexicon via Jaccard Similarity or otherwise. If there is a match above a certain threshold, then the process is finished and we have our match. In some embodiments, we will continue our processing to increase system accuracy. If there is no text or the Jaccard Similarity is not high enough, then the template matching pipeline or embedding vector method is run to produce a set of classifications, scores, or distances of all “known” products and the system will merge that information with other known information such as previous “track information” (which we will define further below) or the OCR Jaccard Similarity information or barcode information or any other information to produce a best estimated probability across all “known” products and take the argmax of those probabilities over all known products/SKUs to make a final guess as to what the product in the bounding shape is. Finally, the cleaned output of this pipeline is a list of bounding shapes for each product in the image, a guess of a product identifier per bounding shape, and an associated probability per guess per bounding shape. If this occurs at time (t), we will call this output Ft, Chaubard Col. 9 line 34-Col. 10 line 9); 
classify the identified item into an item class based on the identity of the item (The output of this portion of the processing per product bounding box is then a series of bounding shapes, each with an associated set of “features”, which could be written words and their locations in the images, such as “Tide” or “Coca-Cola”, a bounding box around the barcode inside the product bounding shape if it is there, and the results of the template matching process which is a list of probabilities or distances of a match with each of the known classes. Then all of this information about this product within the bounding box is taken together and the system tries to infer the correct SKU (Stock Keeping Unit), UPC, or other product identifier of the product. To do this, the system preferably performs a “water-fall” classification scheme that works in the following way. First, if the barcode is read, then it is taken as truth and the process is finished. In some embodiments, the system will still continue the processing even after detecting a barcode to detect a form of theft called “tag switching”, a trick that people looking to steal from the store use, which consists of adding a tag of an inexpensive item (like ground beef) on an expensive item (like filet mignon) to get a lower price. If the barcode is not detected, the system tries to match the text read from the OCR to a database of known information about all the “known” products and to match the text to a dictionary or lexicon via Jaccard Similarity or otherwise. If there is a match above a certain threshold, then the process is finished and we have our match. In some embodiments, we will continue our processing to increase system accuracy. If there is no text or the Jaccard Similarity is not high enough, then the template matching pipeline or embedding vector method is run to produce a set of classifications, scores, or distances of all “known” products and the system will merge that information with other known information such as previous “track information” (which we will define further below) or the OCR Jaccard Similarity information or barcode information or any other information to produce a best estimated probability across all “known” products and take the argmax of those probabilities over all known products/SKUs to make a final guess as to what the product in the bounding shape is. Finally, the cleaned output of this pipeline is a list of bounding shapes for each product in the image, a guess of a product identifier per bounding shape, and an associated probability per guess per bounding shape. If this occurs at time (t), we will call this output Ft, Chaubard Col. 9 line 34-Col. 10 line 9);
select an analysis algorithm from a plurality of available analysis algorithms for comparing one or more item features of the reference stock descriptor to the corresponding one or more item features of the captured item image, wherein the selection of the analysis algorithm is based on the item class of the item (First a Convolutional Neural Network (CNN) is run to infer if there exists a product at each part of the image, and if so impose a bounding shape to fit around each distinct product (perhaps a box or polygon, and if using depth, a bounding cuboid or prism or frustrum. Each bounding shape is processed individually, running optical character recognition (OCR) on it to attempt to read any words on the packaging of the product and using a barcode detection algorithm to try to find and recognize barcodes if in the image, if such exist or are able to be read. In addition, a template matching algorithm is applied, taking a database of labeled “known” products or by using a classifier that was already trained to detect those “known” products and attempting to match each one to the pixels in the bounding shape. This template matching process takes in a known image and the bounding box image and outputs a probability of match. This could be done with a Siamese CNN or by a more well studied, more classical, descriptor and homography based approach. Another approach is to train a CNN to output the probability per product over all the possible products. However, maintaining and training such a large CNN is difficult, especially since we would need to retrain this every time one of the products changes or we want to add one new product. This much retraining may not be scalable. Thus, the system can employ an “embedding” Convolutional Neural Network that is a function that maps from a set of pixels, video, or depth pixels to a compact description or embedding vector that can be used to quickly compare the new unknown embedding vector to millions of other “known” embedding vectors that came from images with labels from the master image library with a distance function like cosine distance or Lp-Norm or otherwise. Such a comparison from between the unknown embedding vector to millions of known embedding vectors can be done in milliseconds on a Graphical Processing Unit (GPU) or other parallelized processing system, Chaubard Col. 8 line 65-Col. 9 line 33; see also In another embodiment, we apply an Oriented 3D volume detection from Pixel-wise neural network predictions where we take each cameras pixels and push them through a fully-convolutional neural network as the backbone and merge this data with a Header Network that is tasked with localizing the Objects size (height, length, width), shape (classification, embedding, or otherwise), and heading (angle/orientation in the global coordinate system in α, β, and γ), Col. 14 lines 42-50; OCR to read any text on the Object We would use an Optical Character Recognition (OCR) algorithm on each image captured at time t. There are many OCR algorithms to choose from. In one embodiment, we would use a CNN to infer a rotated bounding box [x1, y1, x2, y2, angle] for every word in the image. We would then input each bounding box crop in the image into a Convolutional Recurrent Neural Network (CRNN) to output the character sequence inside the box, Col. 15 lines 53-62) (Examiner notes the classification of an item using a CNN which is then input into a CRNN as the ability to select an algorithm based upon the classification of the item or product).
compare one or more item features in the reference stock descriptor to a corresponding one or more item features in the captured item image according to the selected analysis algorithm (These images are compared in the software to a library of pre-stored labeled images, if available, and these may include multiple images for each of the store's products, taken from different angles. If an item in the checkout area is recognized to a sufficiently high probability, meeting a preset threshold of, e.g., 90%, that item is added to the checkout list and charged to the customer. If probability is too low, below the preset threshold, the system may prompt the customer or cashier to reposition the item if it appears stacking or occlusion has occurred, or to remove and scan that item using the store's existing scanning system. This ties the new image to the known store item, adding to the library of image data so that the item is more easily identified in future transactions at not just this register, but all registers that have this system. The customer or cashier can bag the items as the accelerated scanning is occurring. Finally, the customer checks out using cash, a credit card or a debit card as they do today or through a more modern method like face recognition, tap and go, or otherwise, Chaubard Col. 2 lines 18-37; ); 
compute an item match score corresponding to a match rate of the one or more item features in the reference stock descriptor to the one or more item features in the captured item image (These images are compared in the software to a library of pre-stored labeled images, if available, and these may include multiple images for each of the store's products, taken from different angles. If an item in the checkout area is recognized to a sufficiently high probability, meeting a preset threshold of, e.g., 90%, that item is added to the checkout list and charged to the customer. If probability is too low, below the preset threshold, the system may prompt the customer or cashier to reposition the item if it appears stacking or occlusion has occurred, or to remove and scan that item using the store's existing scanning system. This ties the new image to the known store item, adding to the library of image data so that the item is more easily identified in future transactions at not just this register, but all registers that have this system. The customer or cashier can bag the items as the accelerated scanning is occurring. Finally, the customer checks out using cash, a credit card or a debit card as they do today or through a more modern method like face recognition, tap and go, or otherwise, Chaubard Col. 2 lines 18-37); 
compare the item match score to a threshold match score (These images are compared in the software to a library of pre-stored labeled images, if available, and these may include multiple images for each of the store's products, taken from different angles. If an item in the checkout area is recognized to a sufficiently high probability, meeting a preset threshold of, e.g., 90%, that item is added to the checkout list and charged to the customer. If probability is too low, below the preset threshold, the system may prompt the customer or cashier to reposition the item if it appears stacking or occlusion has occurred, or to remove and scan that item using the store's existing scanning system. This ties the new image to the known store item, adding to the library of image data so that the item is more easily identified in future transactions at not just this register, but all registers that have this system. The customer or cashier can bag the items as the accelerated scanning is occurring. Finally, the customer checks out using cash, a credit card or a debit card as they do today or through a more modern method like face recognition, tap and go, or otherwise, Chaubard Col. 2 lines 18-37); 
generate an exception in response to the item match score failing to equal or exceed the threshold match score; (These images are compared in the software to a library of pre-stored labeled images, if available, and these may include multiple images for each of the store's products, taken from different angles. If an item in the checkout area is recognized to a sufficiently high probability, meeting a preset threshold of, e.g., 90%, that item is added to the checkout list and charged to the customer. If probability is too low, below the preset threshold, the system may prompt the customer or cashier to reposition the item if it appears stacking or occlusion has occurred, or to remove and scan that item using the store's existing scanning system. This ties the new image to the known store item, adding to the library of image data so that the item is more easily identified in future transactions at not just this register, but all registers that have this system. The customer or cashier can bag the items as the accelerated scanning is occurring. Finally, the customer checks out using cash, a credit card or a debit card as they do today or through a more modern method like face recognition, tap and go, or otherwise, Chaubard Col. 2 lines 18-37; In the event there is a low confidence or wrong prediction, this flags to the system that that class should be retrained and could mean that the original images that the system was trained on are not accurate so they should be flagged for a labeler to verify that the image is correct. In some embodiments, there is no labeling team and the new data is automatically retrained on. There may be cleaning steps to remove likely inaccurate data. Such an example of automatic cleaning is by calculating the loss of the objective function (typically cross-entropy loss) produced by the prediction and the truth label on a specific image, and if the loss is above a certain number (i.e. 10 for cross-entropy), then that is too high for the model to be that wrong, so we can remove this image from our master image library. In other embodiments, the labeling team is instructed to label and relabel certain data before retraining occurs. With an updated set of labeled images, we retrain the models in the pipeline and test it on a curated set of test transactions in a simulator to ensure the newly trained models outperform the previous models. In some embodiments, the labeling team would visualize a few error cases and cluster the type of errors into specific buckets called Pareto buckets (i.e. “over-exposed image”, “dark products”, “occluded item”, “apples”). Then the labeling team will relabel or gather more data to solve the highest offending bucket of errors to get more accurate data to retrain the models or change the way the data is collected with new angles or new hardware or change the training or model or inference methods or thresholds, Col. 10 line 57-Col. 11 line 18)
verify the item in response to the item match score for the item equaling or exceeding the threshold match score (training, retraining of the system, In the event there is a low confidence or wrong prediction, this flags to the system that that class should be retrained and could mean that the original images that the system was trained on are not accurate so they should be flagged for a labeler to verify that the image is correct. In some embodiments, there is no labeling team and the new data is automatically retrained on. There may be cleaning steps to remove likely inaccurate data. Such an example of automatic cleaning is by calculating the loss of the objective function (typically cross-entropy loss) produced by the prediction and the truth label on a specific image, and if the loss is above a certain number (i.e. 10 for cross-entropy), then that is too high for the model to be that wrong, so we can remove this image from our master image library. In other embodiments, the labeling team is instructed to label and relabel certain data before retraining occurs. With an updated set of labeled images, we retrain the models in the pipeline and test it on a curated set of test transactions in a simulator to ensure the newly trained models outperform the previous models. In some embodiments, the labeling team would visualize a few error cases and cluster the type of errors into specific buckets called Pareto buckets (i.e. “over-exposed image”, “dark products”, “occluded item”, “apples”). Then the labeling team will relabel or gather more data to solve the highest offending bucket of errors to get more accurate data to retrain the models or change the way the data is collected with new angles or new hardware or change the training or model or inference methods or thresholds, Chaubard Col. 10 line 43-Col. 11 line 18); and
update a transaction list associated with the customer transaction with item information for the verified item (training, retraining of the system, In the event there is a low confidence or wrong prediction, this flags to the system that that class should be retrained and could mean that the original images that the system was trained on are not accurate so they should be flagged for a labeler to verify that the image is correct. In some embodiments, there is no labeling team and the new data is automatically retrained on. There may be cleaning steps to remove likely inaccurate data. Such an example of automatic cleaning is by calculating the loss of the objective function (typically cross-entropy loss) produced by the prediction and the truth label on a specific image, and if the loss is above a certain number (i.e. 10 for cross-entropy), then that is too high for the model to be that wrong, so we can remove this image from our master image library. In other embodiments, the labeling team is instructed to label and relabel certain data before retraining occurs. With an updated set of labeled images, we retrain the models in the pipeline and test it on a curated set of test transactions in a simulator to ensure the newly trained models outperform the previous models. In some embodiments, the labeling team would visualize a few error cases and cluster the type of errors into specific buckets called Pareto buckets (i.e. “over-exposed image”, “dark products”, “occluded item”, “apples”). Then the labeling team will relabel or gather more data to solve the highest offending bucket of errors to get more accurate data to retrain the models or change the way the data is collected with new angles or new hardware or change the training or model or inference methods or thresholds, Chaubard Col. 10 line 43-Col. 11 line 18).
Chaubard does not expressly disclose query a remote database located remotely from the data reading system to obtain the reference stock descriptor for the item if the reference stock descriptor for the item is not found within the local database.
However, Goncalves teaches query a remote database located remotely from the data reading system to obtain the reference stock descriptor for the item if the reference stock descriptor for the item is not found within the local database (merchandise checkout, image based search, Goncalves ¶36; database [140], Fig. 1, ¶43-¶44; remote from system, ¶116).
Both the Goncalves and Chaubard references are analogous in that both are directed towards/concerned with image based merchandise checkout.  Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to use Goncalves’ ability to query a remote database for images in Chaubard’s system to improve the system and method with reasonable expectation that this would result in a merchandise checkout management system that could retrieve and analyze data from an image more efficiently.  
The motivation being that the difficulty with typical methods, however, is that as the database increases in size (i.e., as the number of known objects desired to be recognized increases), it becomes increasingly difficult to find the nearest-neighbors because the algorithms used for nearest-neighbor search are probabilistic. The algorithms do not guarantee that the exact nearest-neighbor is found, but that the nearest-neighbor is found with a high probability. As the database increases in size, that probability decreases, to the point that with a sufficiently large database, the probability approaches zero. Thus, the inventors have recognized a need to efficiently and reliably perform object recognition even when the database contains a large number (e.g., thousands, tens of thousands, hundreds of thousands or millions) of objects (Goncalves ¶5).	
In addition, the Examiner asserts that claim scope is not limited by claim language that suggests or makes optional but does not require steps to be performed, or by claim language that does not limit a claim to a particular structure. However, examples of claim language, although not exhaustive, that may raise a question as to the limiting effect of the language in a claim are:  (A) "adapted to" or "adapted for" clauses; (B) "wherein" clauses; and (C) "whereby" clauses (See MPEP 2111.04).  In the instant case, the recited "query a remote database located remotely from the data reading system to obtain the reference stock descriptor for the item if the reference stock descriptor for the item is not found within the local database " is not a positive method step as it do not require any actual positive recited claim steps to be performed; nor does it modify any of the positively claimed method steps.  Similarly, the recited limitation is not a positive system element since it doesn’t structurally limit the system and merely describes the intended use of the system and/or the intended result of the use of the system.  
Still further, Chaubard discloses the claimed invention except for querying a remote database. It would have been obvious to one having ordinary skill in the art before time the invention was made to have a master database be remote, since it has been held that making an old device (or database) portable or movable (or remote) without producing any new and unexpected result involves only routine skill in the art. In re Lindberg, 93 USPQ 23 (CCPA 1952).

As per claim 3, Chaubard and Goncalves disclose as shown above with respect to claim 1.  Chaubard further discloses wherein the threshold match score is based on an item class for the item (based on classified images, Chaubard Col. 8 lines 20-42).

As per claims 4 and 17, Chaubard and Goncalves disclose as shown above with respect to claims 1 and 11.  Chaubard further discloses wherein the reference stock descriptor includes a first optical code and a first set of optical characters including one or both of numeric and alphanumeric text adjacent the first optical code, and wherein the item image includes a second optical code and a second set of optical characters including one or both of numeric and alphanumeric text adjacent the second optical code, the processor further configured to: define a region-of-interest in the captured item image, the region-of-interest containing the second optical code and the second set of optical characters; and crop the captured item image to define a cropped image containing the region-of-interest, wherein the processor comparing one or more item features of the reference stock descriptor to a corresponding one or more item features of the captured item image includes the processor comparing the first set of optical characters in the reference stock descriptor to the second set of optical characters in the region-of-interest of the cropped image (These images are compared in the software to a library of pre-stored labeled images, if available, and these may include multiple images for each of the store's products, taken from different angles. If an item in the checkout area is recognized to a sufficiently high probability, meeting a preset threshold of, e.g., 90%, that item is added to the checkout list and charged to the customer. If probability is too low, below the preset threshold, the system may prompt the customer or cashier to reposition the item if it appears stacking or occlusion has occurred, or to remove and scan that item using the store's existing scanning system. This ties the new image to the known store item, adding to the library of image data so that the item is more easily identified in future transactions at not just this register, but all registers that have this system. The customer or cashier can bag the items as the accelerated scanning is occurring. Finally, the customer checks out using cash, a credit card or a debit card as they do today or through a more modern method like face recognition, tap and go, or otherwise, Chaubard Col. 2 lines 18-37; The output of this portion of the processing per product bounding box is then a series of bounding shapes, each with an associated set of “features”, which could be written words and their locations in the images, such as “Tide” or “Coca-Cola”, a bounding box around the barcode inside the product bounding shape if it is there, and the results of the template matching process which is a list of probabilities or distances of a match with each of the known classes. Then all of this information about this product within the bounding box is taken together and the system tries to infer the correct SKU (Stock Keeping Unit), UPC, or other product identifier of the product. To do this, the system preferably performs a “water-fall” classification scheme that works in the following way. First, if the barcode is read, then it is taken as truth and the process is finished. In some embodiments, the system will still continue the processing even after detecting a barcode to detect a form of theft called “tag switching”, a trick that people looking to steal from the store use, which consists of adding a tag of an inexpensive item (like ground beef) on an expensive item (like filet mignon) to get a lower price. If the barcode is not detected, the system tries to match the text read from the OCR to a database of known information about all the “known” products and to match the text to a dictionary or lexicon via Jaccard Similarity or otherwise. If there is a match above a certain threshold, then the process is finished and we have our match. In some embodiments, we will continue our processing to increase system accuracy. If there is no text or the Jaccard Similarity is not high enough, then the template matching pipeline or embedding vector method is run to produce a set of classifications, scores, or distances of all “known” products and the system will merge that information with other known information such as previous “track information” (which we will define further below) or the OCR Jaccard Similarity information or barcode information or any other information to produce a best estimated probability across all “known” products and take the argmax of those probabilities over all known products/SKUs to make a final guess as to what the product in the bounding shape is. Finally, the cleaned output of this pipeline is a list of bounding shapes for each product in the image, a guess of a product identifier per bounding shape, and an associated probability per guess per bounding shape. If this occurs at time (t), we will call this output Ft, Chaubard Col. 9 line 34-Col. 10 line 9).

As per claim 5, Chaubard and Goncalves disclose as shown above with respect to claim 4.  Chaubard further discloses where the processor is further configured to: compute an optical character match score based on the comparison of the first set of optical characters in the reference stock descriptor to the second set of optical characters in the region-of-interest of the cropped image; compare the one or more item features of the reference stock descriptor to the one or more item features of the captured item image using a trained neural network; compute a neural network match score based on the comparison; and determine a combined item score based on an aggregate of the optical character match score and the neural network match score, wherein the item match score includes the combined item score (FIG. 5 shows, in a simplified block diagram, the steps of a transaction from the point of view of the system. In the block 45, the system detects the placement of items on the conveyor belt or table. At block 46, the camera or cameras take images or video of each item and seek to match the images with system library images using computer vision, specifically deep neural networks. If the library includes any images of that product, or it is otherwise classified correctly, the system displays on the screen that items, with images, have been identified and sends that information to the POS to append that item to the current transaction. As one example, a green bounding box (box can be any shape) can be shown around each item as it is identified, as noted in the block 48. The system detects any stacked or occluded items or other miscellaneous things that might be on the conveyor belt, prompting the customer/cashier to reposition such items, as in the block 50. Following this, new images are taken by the system after items have been repositioned or moved (block 52). As noted in the block 54, if any items are still unidentified, including after any choices have been presented to the customer/cashier as in FIG. 4, the customer or cashier is prompted to scan those items using the barcode scanner, Chaubard Col. 8 lines 20-42; see also convolutional neural network, Col. 8 line 65-line 33).

As per claims 6 and 16, Chaubard and Goncalves disclose as shown above with respect to claims 1 and 11.  Chaubard further discloses wherein the processor is further configured to: classify the identified item into an item class based on the identity of the item; and select an analysis algorithm from a plurality of available analysis algorithms for comparing one or more item features of the reference stock descriptor to the corresponding one or more item features of the captured item image, wherein the selection of the analysis algorithm is based on the item class of the item (The output of this portion of the processing per product bounding box is then a series of bounding shapes, each with an associated set of “features”, which could be written words and their locations in the images, such as “Tide” or “Coca-Cola”, a bounding box around the barcode inside the product bounding shape if it is there, and the results of the template matching process which is a list of probabilities or distances of a match with each of the known classes. Then all of this information about this product within the bounding box is taken together and the system tries to infer the correct SKU (Stock Keeping Unit), UPC, or other product identifier of the product. To do this, the system preferably performs a “water-fall” classification scheme that works in the following way. First, if the barcode is read, then it is taken as truth and the process is finished. In some embodiments, the system will still continue the processing even after detecting a barcode to detect a form of theft called “tag switching”, a trick that people looking to steal from the store use, which consists of adding a tag of an inexpensive item (like ground beef) on an expensive item (like filet mignon) to get a lower price. If the barcode is not detected, the system tries to match the text read from the OCR to a database of known information about all the “known” products and to match the text to a dictionary or lexicon via Jaccard Similarity or otherwise. If there is a match above a certain threshold, then the process is finished and we have our match. In some embodiments, we will continue our processing to increase system accuracy. If there is no text or the Jaccard Similarity is not high enough, then the template matching pipeline or embedding vector method is run to produce a set of classifications, scores, or distances of all “known” products and the system will merge that information with other known information such as previous “track information” (which we will define further below) or the OCR Jaccard Similarity information or barcode information or any other information to produce a best estimated probability across all “known” products and take the argmax of those probabilities over all known products/SKUs to make a final guess as to what the product in the bounding shape is. Finally, the cleaned output of this pipeline is a list of bounding shapes for each product in the image, a guess of a product identifier per bounding shape, and an associated probability per guess per bounding shape. If this occurs at time (t), we will call this output Ft, Chaubard Col. 9 line 34-Col. 10 line 9).

As per claims 7 and 15, Chaubard and Goncalves disclose as shown above with respect to claims 1 and 11.  Chaubard further discloses wherein the reference stock descriptor further includes one or more reference descriptors associated with the item, wherein the one or more reference descriptors includes any of: text information, text size, font type, or coordinate information for a reference feature associated with the item (The output of this portion of the processing per product bounding box is then a series of bounding shapes, each with an associated set of “features”, which could be written words and their locations in the images, such as “Tide” or “Coca-Cola”, a bounding box around the barcode inside the product bounding shape if it is there, and the results of the template matching process which is a list of probabilities or distances of a match with each of the known classes. Then all of this information about this product within the bounding box is taken together and the system tries to infer the correct SKU (Stock Keeping Unit), UPC, or other product identifier of the product. To do this, the system preferably performs a “water-fall” classification scheme that works in the following way. First, if the barcode is read, then it is taken as truth and the process is finished. In some embodiments, the system will still continue the processing even after detecting a barcode to detect a form of theft called “tag switching”, a trick that people looking to steal from the store use, which consists of adding a tag of an inexpensive item (like ground beef) on an expensive item (like filet mignon) to get a lower price. If the barcode is not detected, the system tries to match the text read from the OCR to a database of known information about all the “known” products and to match the text to a dictionary or lexicon via Jaccard Similarity or otherwise. If there is a match above a certain threshold, then the process is finished and we have our match. In some embodiments, we will continue our processing to increase system accuracy. If there is no text or the Jaccard Similarity is not high enough, then the template matching pipeline or embedding vector method is run to produce a set of classifications, scores, or distances of all “known” products and the system will merge that information with other known information such as previous “track information” (which we will define further below) or the OCR Jaccard Similarity information or barcode information or any other information to produce a best estimated probability across all “known” products and take the argmax of those probabilities over all known products/SKUs to make a final guess as to what the product in the bounding shape is. Finally, the cleaned output of this pipeline is a list of bounding shapes for each product in the image, a guess of a product identifier per bounding shape, and an associated probability per guess per bounding shape. If this occurs at time (t), we will call this output Ft, Chaubard Col. 9 line 34-Col. 10 line 9; Every camera in the system has its own coordinate system. To fuse this information for each camera in a coherent way, the system will have a fixed and known position and orientation (position=x,y,z, orientation=α, β, γ and scaling=s.sub.x, s.sub.y, s.sub.z) from the global origin to each cameras coordinate system. These 9 numbers will be theoretically static for the life of the system but may require a recalibration from time to time. These parameters can be used to compute an Extrinsic Matrix, H, per camera which includes the following transformations: Rotation, Translation, and Scale. Depending on the type of camera used, each camera may also have a Intrinsic Matrix, K, which defines the deformation from the real world 3D space to the 2D space on the cameras sensor that is not modeled by H. Then H and K together will be used to convert each camera's specific point cloud in its own coordinate system to the global coordinate system, giving us a final fused point cloud across all cameras D.sub.t=[[R, G, B, X, Y, Z] . . . ], which if the cameras are positioned correctly and the products are not stacked, give us a full 360 volumetric coverage of each product.  Once we have the global fused point cloud D.sub.t, we can then run a clustering algorithm on D.sub.t to detect the existence of distinct objects. There are many clustering algorithms that can be used for this task. In one embodiment, we use a 3D CNN or RNN to cluster each specific point p=(x, y, z, R, G, B) to one of an unknown number of centroids. In other embodiments, we may be able to achieve the same thing with a simpler model like Density-based Scanning (DB-Scan) or Mean Shift, Col. 14 lines 9-37).

As per claim 9, Chaubard and Goncalves disclose as shown above with respect to claim 1.  Chaubard further discloses wherein the processor configured to compare one or more item features of the reference stock descriptor to the corresponding one or more item features of the captured item image further includes the processor employing a neural network for the comparison, the processor further configured to update neural network training parameters in response to validation feedback received, wherein the validation feedback includes exception feedback based on handling of the exception when the item match score fails to equal or exceed the threshold match score and verification feedback when the item match score for the item equals or exceeds the threshold match score (FIG. 5 shows, in a simplified block diagram, the steps of a transaction from the point of view of the system. In the block 45, the system detects the placement of items on the conveyor belt or table. At block 46, the camera or cameras take images or video of each item and seek to match the images with system library images using computer vision, specifically deep neural networks. If the library includes any images of that product, or it is otherwise classified correctly, the system displays on the screen that items, with images, have been identified and sends that information to the POS to append that item to the current transaction. As one example, a green bounding box (box can be any shape) can be shown around each item as it is identified, as noted in the block 48. The system detects any stacked or occluded items or other miscellaneous things that might be on the conveyor belt, prompting the customer/cashier to reposition such items, as in the block 50. Following this, new images are taken by the system after items have been repositioned or moved (block 52). As noted in the block 54, if any items are still unidentified, including after any choices have been presented to the customer/cashier as in FIG. 4, the customer or cashier is prompted to scan those items using the barcode scanner, Chaubard Col. 8 lines 20-42; see also convolutional neural network, Col. 8 line 65-line 33).

As per claim 13, Chaubard discloses as shown above with respect to claim 11.  Chaubard further discloses communicating, via the processor, the exception to a remote computer in operable communication with the data reading system; locking, via the processor, the data reading system to prevent the one or more data readers from capturing a second item image for a second item; receiving, via the processor, instructions for resolving the exception from the remote computer; and unlocking, via the processor, the data reading system in response to the received instructions (If probability is too low, below the preset threshold, the system may prompt the customer or cashier to reposition the item if it appears stacking or occlusion has occurred, or to remove and scan that item using the store's existing scanning system. This ties the new image to the known store item, adding to the library of image data so that the item is more easily identified in future transactions at not just this register, but all registers that have this system, Chaubard Col. 2 lines 26-32).

As per claim 14, Chaubard and Goncalves disclose as shown above with respect to claim 11.  Chaubard further discloses capturing, via a camera, an image of a customer associated with the transaction in response to the processor generating an exception; and transmitting, via the processor, the image of the customer from the camera to a remote computer (In another embodiment, the system would leverage another way to verify the age of the shopper which could be a login/password, a cameras system that performs Face Detection then either Facial Recognition or Age Recognition to verify the shoppers age, or a fingerprint scanning to perform identification of the shopper which would access a database of fingerprints matched to customer's ages, Chaubard Col. 17 lines 31-37).

As per claim 19, Chaubard and Goncalves disclose as shown above with respect to claim 17.  Chaubard further discloses computing, via the processor, a first match score for the first analysis technique based on the comparison of the one or more item features in the reference stock descriptor to the corresponding one or more item features in the captured item image; computing, via the processor, a second match score for the second analysis technique based on the comparison of the one or more item features in the reference stock descriptor to the corresponding one or more item features in the captured item image; determining, via the processor, a combined match score based on an aggregate of the first match score and the second match score; and comparing, via the processor, the combined match score to a threshold match score (If there is a match above a certain threshold, then the process is finished and we have our match. In some embodiments, we will continue our processing to increase system accuracy. If there is no text or the Jaccard Similarity is not high enough, then the template matching pipeline or embedding vector method is run to produce a set of classifications, scores, or distances of all “known” products and the system will merge that information with other known information such as previous “track information” (which we will define further below) or the OCR Jaccard Similarity information or barcode information or any other information to produce a best estimated probability across all “known” products and take the argmax of those probabilities over all known products/SKUs to make a final guess as to what the product in the bounding shape is. Finally, the cleaned output of this pipeline is a list of bounding shapes for each product in the image, a guess of a product identifier per bounding shape, and an associated probability per guess per bounding shape. If this occurs at time (t), we will call this output Ft, Chaubard Col. 9 line 34-Col. 10 line 9).

As per claim 20, Chaubard discloses as shown above with respect to claim 11.  Chaubard further discloses wherein the step of comparing, via the processor, the one or more item features in the reference stock descriptor to the corresponding one or more item features in the captured item image further comprises applying, via the processor, a trained neural network to both the reference stock descriptor and the captured item image, the method further comprising: receiving, via the processor, exception feedback based on handling of the exception when the item match score fails to equal or exceed the threshold match score; receiving, via the processor, verification feedback when the item match score for the item equals or exceeds the threshold match score; and updating, via the processor, training parameters of the trained neural network based on the exception feedback and the verification feedback (If there are still unidentified items, the customer/cashier is prompted to scan any such item with the scanner, as in the block 36. In the case where the system recognizes several possibilities for an item, each with a below-threshold possibility, the system can put on the screen several “guesses”. The customer is prompted to confirm the identity of a product, from two or more choices (block 36), and the customer makes a confirmation (block 37) on the touchscreen 20 (or using the keypad) to identify the correct product. Alongside the choices on the screen will be an image of the product in doubt, Col. 7 line 64-Col. 8 line 7; FIG. 5 shows, in a simplified block diagram, the steps of a transaction from the point of view of the system. In the block 45, the system detects the placement of items on the conveyor belt or table. At block 46, the camera or cameras take images or video of each item and seek to match the images with system library images using computer vision, specifically deep neural networks. If the library includes any images of that product, or it is otherwise classified correctly, the system displays on the screen that items, with images, have been identified and sends that information to the POS to append that item to the current transaction. As one example, a green bounding box (box can be any shape) can be shown around each item as it is identified, as noted in the block 48. The system detects any stacked or occluded items or other miscellaneous things that might be on the conveyor belt, prompting the customer/cashier to reposition such items, as in the block 50. Following this, new images are taken by the system after items have been repositioned or moved (block 52). As noted in the block 54, if any items are still unidentified, including after any choices have been presented to the customer/cashier as in FIG. 4, the customer or cashier is prompted to scan those items using the barcode scanner, Col. 8 lines 20-42; see also convolutional neural network, Col. 8 line 65-line 33; If the prediction probability is low, the system will prompt the user via the screen to select the fruit from the top guesses (e.g. the top five) as to what is the identity of the fruit or vegetable. Then the system will automatically send a message over a communication protocol to the existing system of the self checkout unit to insert that PLU. As explained above, the system takes the weight as measured by the scale multiplied by the price of that PLU to arrive at the total weight of the product, Col. 13 lines 10-18).

As per claim 21, Chaubard and Goncalves disclose as shown above with respect to claim 11.  Chaubard further discloses further comprising initiating an update process in response to a trigger, the update process including updating the reference stock descriptors stored in the local database of the data reading system responsive to receiving new image data at a remote host (These images are compared in the software to a library of pre-stored labeled images, if available, and these may include multiple images for each of the store's products, taken from different angles. If an item in the checkout area is recognized to a sufficiently high probability, meeting a preset threshold of, e.g., 90%, that item is added to the checkout list and charged to the customer. If probability is too low, below the preset threshold, the system may prompt the customer or cashier to reposition the item if it appears stacking or occlusion has occurred, or to remove and scan that item using the store's existing scanning system. This ties the new image to the known store item, adding to the library of image data so that the item is more easily identified in future transactions at not just this register, but all registers that have this system, Chaubard Col. 2 lines 18-37).

As per claim 22, Chaubard and Goncalves disclose as shown above with respect to claim 1.  Chaubard further discloses wherein the local cache of descriptors are selected based, at least in part, on at least one of a list of most common items scanned, most common items used in fraudulent transactions, most common items stolen, or item price (library of pre-stored labeled images, Chaubard Col. 2 lines 18-37).

As per claim 23, Chaubard and Goncalves disclose as shown above with respect to claim 1.  Chaubard further discloses wherein the reference stock descriptors are derived from a stock image and stored in a format that is not an image format (Thus, the system can employ an “embedding” Convolutional Neural Network that is a function that maps from a set of pixels, video, or depth pixels to a compact description or embedding vector that can be used to quickly compare the new unknown embedding vector to millions of other “known” embedding vectors that came from images with labels from the master image library with a distance function like cosine distance or Lp-Norm or otherwise. Such a comparison from between the unknown embedding vector to millions of known embedding vectors can be done in milliseconds on a Graphical Processing Unit (GPU) or other parallelized processing system, Col. 9 lines 22-33). 

As per claim 24, Chaubard and Goncalves disclose as shown above with respect to claim 23.  Chaubard further discloses wherein the reference stock descriptors include information that is organized into different groups based on different types of analysis for which the information is employed (Every camera in the system has its own coordinate system. To fuse this information for each camera in a coherent way, the system will have a fixed and known position and orientation (position=x,y,z, orientation=α, β, γ and scaling=s.sub.x, s.sub.y, s.sub.z) from the global origin to each cameras coordinate system. These 9 numbers will be theoretically static for the life of the system but may require a recalibration from time to time. These parameters can be used to compute an Extrinsic Matrix, H, per camera which includes the following transformations: Rotation, Translation, and Scale. Depending on the type of camera used, each camera may also have a Intrinsic Matrix, K, which defines the deformation from the real world 3D space to the 2D space on the cameras sensor that is not modeled by H. Then H and K together will be used to convert each camera's specific point cloud in its own coordinate system to the global coordinate system, giving us a final fused point cloud across all cameras D.sub.t=[[R, G, B, X, Y, Z] . . . ], which if the cameras are positioned correctly and the products are not stacked, give us a full 360 volumetric coverage of each product.  Once we have the global fused point cloud D.sub.t, we can then run a clustering algorithm on D.sub.t to detect the existence of distinct objects. There are many clustering algorithms that can be used for this task. In one embodiment, we use a 3D CNN or RNN to cluster each specific point p=(x, y, z, R, G, B) to one of an unknown number of centroids. In other embodiments, we may be able to achieve the same thing with a simpler model like Density-based Scanning (DB-Scan) or Mean Shift, Chaubard Col. 14 lines 9-38).

As per claim 25, Chaubard and Goncalves disclose as shown above with respect to claim 23.  Chaubard further discloses wherein the reference stock descriptors include multiple descriptors for a variation of an item, such that the multiple descriptors are retrieved from the local database or the remote database when the respective item is identified (The output of this portion of the processing per product bounding box is then a series of bounding shapes, each with an associated set of “features”, which could be written words and their locations in the images, such as “Tide” or “Coca-Cola”, a bounding box around the barcode inside the product bounding shape if it is there, and the results of the template matching process which is a list of probabilities or distances of a match with each of the known classes. Then all of this information about this product within the bounding box is taken together and the system tries to infer the correct SKU (Stock Keeping Unit), UPC, or other product identifier of the product. To do this, the system preferably performs a “water-fall” classification scheme that works in the following way. First, if the barcode is read, then it is taken as truth and the process is finished. In some embodiments, the system will still continue the processing even after detecting a barcode to detect a form of theft called “tag switching”, a trick that people looking to steal from the store use, which consists of adding a tag of an inexpensive item (like ground beef) on an expensive item (like filet mignon) to get a lower price. If the barcode is not detected, the system tries to match the text read from the OCR to a database of known information about all the “known” products and to match the text to a dictionary or lexicon via Jaccard Similarity or otherwise. If there is a match above a certain threshold, then the process is finished and we have our match. In some embodiments, we will continue our processing to increase system accuracy. If there is no text or the Jaccard Similarity is not high enough, then the template matching pipeline or embedding vector method is run to produce a set of classifications, scores, or distances of all “known” products and the system will merge that information with other known information such as previous “track information” (which we will define further below) or the OCR Jaccard Similarity information or barcode information or any other information to produce a best estimated probability across all “known” products and take the argmax of those probabilities over all known products/SKUs to make a final guess as to what the product in the bounding shape is. Finally, the cleaned output of this pipeline is a list of bounding shapes for each product in the image, a guess of a product identifier per bounding shape, and an associated probability per guess per bounding shape. If this occurs at time (t), we will call this output Ft, Chaubard Col. 9 line 34-Col. 10 line 9).

As per claim 26-27, Chaubard and Goncalves disclose as shown above with respect to claims 1 and 11.  Chaubard further discloses wherein a first item class and a second item class are based on different item prices and/or assessed risk factors for theft, and selection of the analysis algorithm includes applying a first algorithm when the item is identified as being part of the first class and applying a second algorithm when the item is identified as being part of the second class (FIG. 3 schematically indicates a retrofitting of an existing self-service checkout station in accordance with the invention. This is primarily to improve and complete the function of a self-service checkout stand by enabling the system to recognize produce and any other items that may not be capable of barcoding. The self-service stand 75 is updated with one or more cameras 76 for the purpose of identifying such non-machine readable items. A bag of oranges 78 is shown on the self-service kiosk's scale 80, which can be the platform on which barcode items are normally set for reading. The camera or cameras 76 take one or more images of the produce 78, and the process of identification as described above (and below) is initiated by the system of the invention. The kiosk's existing display screen 82 is used to display the identified item, preferably with price per pound and calculated cost of the produce item. Another screen 84 is indicated in this schematic view, illustrating a situation in which the system has a high level of confidence that the produce should be identified as oranges, but with a 4% possibility the produce could be grapefruit. The user/customer is prompted to select the correct item, which could be by touchscreen or by selection using the screen along with an input device such as shown at 86, which may be a component of the conventional system. The screen 84 can also provide prompts to the user, similar to those described above, when appropriate. Note that the screen 84, although shown separately, could be the same screen 82 as was included with the original equipment at the kiosk, Chaubard Col. 12 lines 34-61) (Examiner notes the ability to identify an item such as produce for which the price is a per pound calculation as the ability to classify items with different prices with specific algorithms).  

Conclusion
Any inquiry concerning this communication or earlier communications from the Examiner should be directed to ANDREW B WHITAKER whose telephone number is (571)270-7563.  The examiner can normally be reached on M-F, 8am-5pm, EST.
If attempts to reach the examiner by telephone are unsuccessful, the Examiner’s supervisor, Lynda Jasmin can be reached on (571) 272-6782.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from Patent Center. Status information for published applications may be obtained from Patent Center. Status information for unpublished applications is available through Patent Center for authorized users only. Should you have questions about access to Patent Center, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) Form at https://www.uspto.gov/patents/uspto-
automated- interview-request-air-form	


/ANDREW B WHITAKER/Primary Examiner, Art Unit 3629
Read full office action
Prosecution Timeline

Dec 30, 2022
Application Filed
Dec 05, 2024
Non-Final Rejection mailed — §101, §103
Jun 05, 2025
Response Filed
Jul 09, 2025
Final Rejection mailed — §101, §103
Jan 09, 2026
Request for Continued Examination
Feb 03, 2026
Response after Non-Final Action
Feb 27, 2026
Non-Final Rejection mailed — §101, §103 (current)
Precedent Cases

Applications granted by this same examiner with similar technology

18/529,651
Patent 12614192
SYSTEMS AND METHODS FOR TRACKING TECHNICAL ENGAGEMENT OF DIGITAL RESOURCES
2y 4m to grant Granted Apr 28, 2026
18/401,492
Patent 12600221
REAL ESTATE NAVIGATION SYSTEM FOR REAL ESTATE TRANSACTIONS
2y 3m to grant Granted Apr 14, 2026
18/469,861
Patent 12530700
SYSTEM AND METHOD FOR DETERMINING BLOCKCHAIN-BASED CRYPTOCURRENCY CORRESPONDING TO SCAM COIN
2y 4m to grant Granted Jan 20, 2026
18/317,684
Patent 12443963
License Compliance Failure Risk Management
2y 5m to grant Granted Oct 14, 2025
18/619,142
Patent 12299696
METHODS AND SYSTEMS FOR PROCESSING SMART GAS REGULATORY INFORMATION BASED ON REGULATORY INTERNET OF THINGS
1y 1m to grant Granted May 13, 2025
Study what changed to get past this examiner. Based on 5 most recent grants.
Strategy Recommendation AI-generated — please review before filing

Get a prosecution strategy drawn from examiner precedents, rejection analysis, and claim mapping.
Typically takes 5-10 seconds — AI-generated, attorney review required before filing
Prosecution Projections

3-4
Expected OA Rounds
19%
Grant Probability
38%
With Interview (+19.1%)
4y 2m (~9m remaining)
Median Time to Grant
High
PTA Risk
Based on 558 resolved cases by this examiner. Grant probability derived from career allowance rate.