DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Acknowledgment is made of Applicant’s claim of priority from U.S. Provisional Application No. 63/308,387, filed February 9, 2022.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on December 11, 2025 has been entered.
Status of Claims
Claims 1-20 are pending.
Response to Arguments
Applicant’s arguments, see pp. 10-14, filed November 12, 2025, with respect to the 35 USC 103 rejections have been fully considered but are moot because of the new grounds of rejection presented in the sections below. Applicant argues that the previously proposed references do not teach the newly added limitation. However, the newly presented Steele reference teaches pruning based on run-time optimization goals, which are objectives such as reducing the time it takes to generate predictions and reducing the amount of CPU or other resources consumed by predictions (see Steele, Para. [0176]). Examiner asserts that this is sufficient to teach the newly added limitation “wherein the quantization or pruning operations use one or more inference run-time metric to ensure that the model is scaled for deployment depending upon available resources on CPUs available in one or more retail store”, and in combination with the Zou, Adato and Liang references would render claim 1 obvious over the prior art. Thus, the 35 USC 103 rejections are maintained.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1, 3-5, 7-8, 10 and 12-17 are rejected under 35 U.S.C. 103 as being unpatentable over Zou et al. (US 11,842,321 B1, filed March 17, 2021) in view of Adato et al. (US 2019/0213546 A1) further in view of Liang et al. (US 2022/0327314 A1) and Steele et al. (US 2015/0379426 A1).
Regarding claim 1, Zou teaches a system for real-time, on-site empty shelf detection, the system comprising:
a plurality of cameras configured to capture corresponding images at a plurality of product displays at a retail environment (Zou, Col. 11, lines 62-67, multiple cameras are positioned throughout the store and oriented to capture and provide images of the various fixtures and the product instances supported or held by the fixtures);
an in-store computing system configured to receive the images from the plurality of cameras, the in-store computing system comprising (Zou, Col. 12, lines 15-23, the cameras are configured to capture still images of the fixtures and to provide the still images to one or more computer systems for processing. The computer systems may use the images for performing tasks related to inventory, checkout, payroll, time scheduling, and/or other aspects of store management):
a memory storing a machine learning model for empty space detection (Zou, Col. 14, lines 4-15, the classifier may comprise a convolutional neural network (CNN). Col. 39, lines 60-66, the memory provides storage of computer-readable instructions, data structures, program modules, and other data for the operation of the servers); and
a processor configured to implement the machine learning model to analyze the images and annotate the images with indications of empty space therein (Zou, Col. 39, lines 28-38, the servers may include one or more hardware processors configured to execute one or more stored instructions. Col. 32 line 61 – Col. 33 line 3, defining a fourth bounding box, representing an empty space, between two bounding boxes. The planogram data may be updated to indicate that the bounding box associated with the corresponding 3D coordinates is an empty space on the shelf or other fixture).
Although Zou teaches annotating an empty space on the shelf with a bounding box (Zou, Col. 32 line 61 – Col. 33 line 3), Zou does not explicitly teach “wherein the machine learning model is configured to determine a quantity of empty space at the plurality of different product displays corresponding to the images from the plurality of cameras”. However, in an analogous field of endeavor, Adato teaches the size of a vacant space may be determined and quantified by any suitable technique. A vacant space may be characterized by a linear measurement, a unit of area, or a unit of volume (e.g., cubic centimeters, cubic inches, etc.) (Adato, Para. [0593]).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Zou with the teachings of Adato by including determining from the bounding box the volume of the vacant space. One having ordinary skill in the art would have been motivated to combine these references, because doing so would allow for automatically analyzing images of products displayed in retail stores for providing one or more functions associated with the products, as recognized by Adato.
Although Zou in view of Adato teaches a classifier trained using supervised learning (Zou, Col. 16 line 64 – Col. 17 line 7), they do not explicitly teach “perform one or more quantization or pruning operations on the machine learning model” and “wherein the machine learning model is not replaced after the quantization or pruning operations”. However, in an analogous field of endeavor, Liang teaches pruning may be used to generally improve systems that include video analytics engines, and the like, for example by pruning out video analytics parameters that result in errors and/or by updating a full initial machine learning model to a smaller, hence faster, machine learning model. A faster, updated machine learning model may result in a video being acquired, and analyzed by the updated machine learning model, at higher framerates and/or higher resolution (Liang, Para. [0015]). The method may further comprise the controller and/or the video analytics engine, after pruning, applying the updated machine learning model (i.e., the machine learning model is not replaced after pruning) (Liang, Para. [0092]).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Zou in view of Adato with the teachings of Liang by including pruning the parameters of the machine learning model and applying it after pruning (i.e., not replacing it). One having ordinary skill in the art before the effective filing date would have been motivated to combine these references because doing so would allow for a smaller, faster machine learning model, as recognized by Liang.
Although Zou in view of Adato further in view of Liang teaches a pruning operation to update a machine learning model (Liang, Para. [0015]), they do not explicitly teach “wherein the quantization or pruning operations use one or more inference run-time metric to ensure that the model is scaled for deployment depending upon available resources on CPUs available in one or more retail store”. However, in an analogous field of endeavor, Steele teaches the trees may be pruned intelligently during a second pass of the training phase, e.g., to remove a subset of the nodes based on one or more run-time optimization goals (i.e., run-time metric). The term “run-time optimization goals” may be used herein to refer to objectives associated with executing a trained model to make predictions, such as reducing the time it takes to generate predictions for a test data set or a production data set, reducing the amount of CPU or other resources consumed for such predictions (i.e., depending upon available resources on CPUs available), and so on (Steele, Para. [0176]).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Zou in view of Adato further in view of Liang with the teachings of Steele by including that the pruning operation uses a run-time optimization goal (i.e., run-time metric) to ensure that the model is scaled for deployment depending on the amount of CPU or other resources consumed by the CPUs in the retail store (i.e., the in-store computing system of Zou). One having ordinary skill in the art would have been motivated to combine these references because doing so would allow for an optimized machine-learning model for making high-quality predictions, as recognized by Steele. Thus, the claimed invention would have been obvious to one having ordinary skill in the art before the effective filing date.
Regarding claim 3, Zou in view of Adato further in view of Liang and Steele teaches the system of claim 1, and further teaches wherein annotating the image with the indication of an empty space comprises annotating the image with a flat face representing a front of an empty shelf section (Zou, Fig. 29, empty space 2910, Col. 29, lines 36-65, the product-volume detection component may generate a new bounding box corresponding to the empty space. The created bounding box may be defined by coordinates corresponding to side faces that touch the side faces of the first and second bounding boxes, a bottom and front face that corresponds to the aligned bottom and front faces of the first and second bounding boxes).
Regarding claim 4, Zou in view of Adato further in view of Liang and Steele teaches the system of claim 3, and further teaches wherein the quantity of empty space at the product display is a volume of a cuboid region on the product display behind the flat face (Adato, Para. [0593], the size of a vacant space may be determined and quantified by any suitable technique. A vacant space may be characterized by a linear measurement, a unit of area, or a unit of volume (e.g., cubic centimeters, cubic inches, etc.)).
The proposed combination as well as the motivation for combining the Zou, Adato, Liang and Steele references presented in the rejection of Claim 1, apply to Claim 4 and are incorporated herein by reference. Thus, the system recited in Claim 4 is met by Zou in view of Adato further in view of Liang and Steele.
Regarding claim 5, Zou in view of Adato further in view of Liang and Steele teaches the system of claim 3, and further teaches wherein annotating the image with the indication of an empty space further comprises annotating the image with a flat face representing a back end of the empty shelf section (Zou, Fig. 29, empty space 2910, Col. 29, lines 36-65, the product-volume detection component may generate a new bounding box corresponding to the empty space. The created bounding box may be defined by coordinates corresponding to side faces that touch the side faces of the first and second bounding boxes, a bottom and front face that corresponds to the aligned bottom and front faces of the first and second bounding boxes).
Regarding claim 7, Zou in view of Adato further in view of Liang and Steele teaches the system of claim 1, and further teaches wherein determining a type of aisle corresponding to the at least one image comprises determining a type of product located at the aisle (Adato, Para. [0585], the system may access a database to obtain data indicating where products of a certain type, category, brand, etc. are located in a retail store and use that information to determine where an area represented in an image, for example, is located within a retail store. The area may be identified as an aisle within a store, as a certain shelf within an aisle, etc.).
The proposed combination as well as the motivation for combining the Zou, Adato, Liang and Steele references presented in the rejection of Claim 1, apply to Claim 7 and are incorporated herein by reference. Thus, the system recited in Claim 7 is met by Zou in view of Adato further in view of Liang and Steele.
Regarding claim 8, Zou in view of Adato further in view of Liang and Steele teaches the system of claim 7, and further teaches wherein the type of product located at the type of aisle is a product that is stacked on a shelf (Adato, Para. [0585], the area of the retail store may be identified by analyzing one or more images of the retail store to identify in the one or more images regions corresponding to the desired area, by analyzing a store map to identify in the one or more images regions corresponding to the desired area, and so forth. For example, the area may be identified based on the products detected in images depicting a store shelf).
The proposed combination as well as the motivation for combining the Zou, Adato, Liang and Steele references presented in the rejection of Claim 1, apply to Claim 8 and are incorporated herein by reference. Thus, the system recited in Claim 8 is met by Zou in view of Adato further in view of Liang and Steele.
Regarding claim 10, Zou teaches a method for empty shelf detection, the method comprising:
detecting, by a plurality of cameras each arranged at a retail location, an image corresponding to a corresponding plurality of product displays (Zou, Col. 11, lines 62-67, multiple cameras are positioned throughout the store and oriented to capture and provide images of the various fixtures and the product instances supported or held by the fixtures);
sending the images corresponding to the plurality of product displays to an in-store computing system in realtime (Zou, Col. 12, lines 15-23, the cameras are configured to capture still images of the fixtures and to provide the still images to one or more computer systems for processing. The computer systems may use the images for performing tasks related to inventory, checkout, payroll, time scheduling, and/or other aspects of store management);
implementing a machine learning model to analyze the image and annotate the images with indications of an empty space (Zou, Col. 32 line 61 – Col. 33 line 3, defining a fourth bounding box, representing an empty space, between two bounding boxes. The planogram data may be updated to indicate that the bounding box associated with the corresponding 3D coordinates is an empty space on the shelf or other fixture).
Although Zou teaches annotating an empty space on the shelf with a bounding box (Zou, Col. 32 line 61 – Col. 33 line 3), Zou does not explicitly teach receiving the images in “realtime”, “determining a quantity of empty space at the plurality of product displays corresponding to the images” and “adding a product to the product display at a location corresponding to the empty space”. However, in an analogous field of endeavor, Adato teaches a processing device that receives images in real-time (Adato, Para. [0678]), teaches the size of a vacant space may be determined and quantified by any suitable technique. A vacant space may be characterized by a linear measurement, a unit of area, or a unit of volume (e.g., cubic centimeters, cubic inches, etc.) (Adato, Para. [0593]), and teaches a restocking event may be detected when a part of a shelf is determined to be empty, when an amount of products on a part of a shelf is determined to be below a threshold associated with the part of the shelf and/or with the product type of the products (for example, according to a planogram), and so forth (Adato, Para. [0657]).
The proposed combination as well as the motivation for combining the Zou and Adato references presented in the rejection of Claim 1, apply to Claim 10 and are incorporated herein by reference.
Although Zou in view of Adato teaches a classifier trained using supervised learning (Zou, Col. 16 line 64 – Col. 17 line 7), they do not explicitly teach “performing one or more quantization or pruning operations on the machine learning model, wherein the machine learning model is not replaced after the quantization or pruning operations”. However, in an analogous field of endeavor, Liang teaches pruning may be used to generally improve systems that include video analytics engines, and the like, for example by pruning out video analytics parameters that result in errors and/or by updating a full initial machine learning model to a smaller, hence faster, machine learning model. A faster, updated machine learning model may result in a video being acquired, and analyzed by the updated machine learning model, at higher framerates and/or higher resolution (Liang, Para. [0015]). The method may further comprise the controller and/or the video analytics engine, after pruning, applying the updated machine learning model (i.e., the machine learning model is not replaced after pruning) (Liang, Para. [0092]).
The proposed combination as well as the motivation for combining the Zou, Adato and Liang references presented in the rejection of Claim 1, apply to Claim 10 and are incorporated herein by reference.
Although Zou in view of Adato further in view of Liang teaches a pruning operation to update a machine learning model (Liang, Para. [0015]), they do not explicitly teach “wherein the quantization or pruning operations use one or more inference run-time metric to ensure that the model is scaled for deployment depending upon available resources on CPUs available in one or more retail store”. However, in an analogous field of endeavor, Steele teaches the trees may be pruned intelligently during a second pass of the training phase, e.g., to remove a subset of the nodes based on one or more run-time optimization goals (i.e., run-time metric). The term “run-time optimization goals” may be used herein to refer to objectives associated with executing a trained model to make predictions, such as reducing the time it takes to generate predictions for a test data set or a production data set, reducing the amount of CPU or other resources consumed for such predictions (i.e., depending upon available resources on CPUs available), and so on (Steele, Para. [0176]).
The proposed combination as well as the motivation for combining the Zou, Adato, Liang and Steele references presented in the rejection of Claim 1, apply to Claim 10 and are incorporated herein by reference. Thus, the method recited in Claim 10 is met by Zou in view of Adato further in view of Liang and Steele.
Regarding claim 12, Zou in view of Adato further in view of Liang and Steele teaches the method of claim 10, and further teaches wherein annotating the image with the indication of an empty space comprises annotating the image with a flat face representing a front of an empty shelf section (Zou, Fig. 29, empty space 2910, Col. 29, lines 36-65, the product-volume detection component may generate a new bounding box corresponding to the empty space. The created bounding box may be defined by coordinates corresponding to side faces that touch the side faces of the first and second bounding boxes, a bottom and front face that corresponds to the aligned bottom and front faces of the first and second bounding boxes).
Regarding claim 13, Zou in view of Adato further in view of Liang and Steele teaches the method of claim 12, and further teaches wherein the quantity of empty space at the product display is a volume of a cuboid region on the product display behind the flat face (Adato, Para. [0593], the size of a vacant space may be determined and quantified by any suitable technique. A vacant space may be characterized by a linear measurement, a unit of area, or a unit of volume (e.g., cubic centimeters, cubic inches, etc.)).
The proposed combination as well as the motivation for combining the Zou, Adato, Liang and Steele references presented in the rejection of Claim 1, apply to Claim 13 and are incorporated herein by reference. Thus, the method recited in Claim 13 is met by Zou in view of Adato further in view of Liang and Steele.
Regarding claim 14, Zou in view of Adato further in view of Liang and Steele teaches the method of claim 12, and further teaches wherein annotating the image with the indication of an empty space further comprises annotating the image with a flat face representing a back end of the empty shelf section (Zou, Fig. 29, empty space 2910, Col. 29, lines 36-65, the product-volume detection component may generate a new bounding box corresponding to the empty space. The created bounding box may be defined by coordinates corresponding to side faces that touch the side faces of the first and second bounding boxes, a bottom and front face that corresponds to the aligned bottom and front faces of the first and second bounding boxes).
Regarding claim 15, Zou in view of Adato further in view of Liang and Steele teaches the method of claim 14, and further teaches wherein the quantity of empty space at the product display is a volume of a cuboid region between the flat face representing the front of the empty shelf section and the flat face representing the back end of the empty shelf section (Adato, Para. [0593], the size of a vacant space may be determined and quantified by any suitable technique. A vacant space may be characterized by a linear measurement, a unit of area, or a unit of volume (e.g., cubic centimeters, cubic inches, etc.)).
The proposed combination as well as the motivation for combining the Zou, Adato, Liang and Steele references presented in the rejection of Claim 1, apply to Claim 15 and are incorporated herein by reference. Thus, the method recited in Claim 15 is met by Zou in view of Adato further in view of Liang and Steele.
Regarding claim 16, Zou in view of Adato further in view of Liang and Steele teaches the method of claim 10, and further teaches wherein determining a type of aisle corresponding to the at least one image comprises determining a type of product located at the aisle (Adato, Para. [0585], the system may access a database to obtain data indicating where products of a certain type, category, brand, etc. are located in a retail store and use that information to determine where an area represented in an image, for example, is located within a retail store. The area may be identified as an aisle within a store, as a certain shelf within an aisle, etc.).
The proposed combination as well as the motivation for combining the Zou, Adato, Liang and Steele references presented in the rejection of Claim 1, apply to Claim 16 and are incorporated herein by reference. Thus, the method recited in Claim 16 is met by Zou in view of Adato further in view of Liang and Steele.
Regarding claim 17, Zou in view of Adato further in view of Liang and Steele teaches the method of claim 16, and further teaches wherein the type of product located at the type of aisle is a product that is stacked on a shelf (Adato, Para. [0585], the area of the retail store may be identified by analyzing one or more images of the retail store to identify in the one or more images regions corresponding to the desired area, by analyzing a store map to identify in the one or more images regions corresponding to the desired area, and so forth. For example, the area may be identified based on the products detected in images depicting a store shelf).
The proposed combination as well as the motivation for combining the Zou, Adato, Liang and Steele references presented in the rejection of Claim 1, apply to Claim 17 and are incorporated herein by reference. Thus, the method recited in Claim 17 is met by Zou in view of Adato further in view of Liang and Steele.
Claims 2 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Zou et al. (US 11,842,321 B1, filed March 17, 2021) in view of Adato et al. (US 2019/0213546 A1) further in view of Liang et al. (US 2022/0327314 A1) and Steele et al. (US 2015/0379426 A1), as applied to claims 1, 3-5, 7-8, 10 and 12-17 above, and further in view of Kumar (US 11,468,400 B1).
Regarding claim 2, Zou in view of Adato further in view of Liang and Steele teaches the system of claim 1, as described above.
Although Zou in view of Adato further in view of Liang and Steele teaches a classifier that comprises a convolutional neural network (Zou, Col. 14, lines 4-15), they do not explicitly teach “wherein the processor is configured to conduct a drift analysis by either adjust a frequency of imaging by the at least one camera or by causing the machine learning model to be updated upon detecting a predetermined threshold of drift”. However, in an analogous field of endeavor, Kumar teaches determining whether accumulated drift is greater than a drift threshold and if the accumulated drift is greater than the drift threshold, the stable buffer is reset (Kumar, Col. 13, lines 48-64).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Zou in view of Adato further in view of Liang and Steele with the teachings of Kumar by including conducting a drift analysis by updating the machine learning model (i.e., resetting the stable buffer) when drift exceeds a threshold. One having ordinary skill in the art would have been motivated to combine the proposed references because doing so would allow for preventing accuracy degradation of the model. Thus, the claimed invention would have been obvious to one having ordinary skill in the art before the effective filing date.
Regarding claim 11, Zou in view of Adato further in view of Liang and Steele teaches the method of claim 10, as described above.
Although Zou in view of Adato further in view of Liang and Steele teaches a classifier that comprises a convolutional neural network (Zou, Col. 14, lines 4-15), they do not explicitly teach “further comprising conducting a drift analysis by either adjusting a frequency of imaging by the at least one camera or by causing the machine learning model to be updated upon detecting a predetermined threshold of drift”. However, in an analogous field of endeavor, Kumar teaches determining whether accumulated drift is greater than a drift threshold and if the accumulated drift is greater than the drift threshold, the stable buffer is reset (Kumar, Col. 13, lines 48-64).
The proposed combination as well as the motivation for combining the Zou, Adato, Liang, Steele and Kumar references presented in the rejection of Claim 2, apply to Claim 11 and are incorporated herein by reference. Thus, the method recited in Claim 11 is met by Zou in view of Adato further in view of Liang, Steele and Kumar.
Claims 6 and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Zou et al. (US 11,842,321 B1, filed March 17, 2021) in view of Adato et al. (US 2019/0213546 A1) further in view of Liang et al. (US 2022/0327314 A1) and Steele et al. (US 2015/0379426 A1), as applied to claims 1, 3-5, 7-8, 10 and 12-17 above, and further in view of Yi et al. (US 10,943,096 B2).
Regarding claim 6, Zou in view of Adato further in view of Liang and Steele teaches the system of claim 1, further comprising an image modeling system remote from the retail environment (Zou, Col. 21, lines 36-39, a remote system may receive and store image data received from the cameras and may analyze the image data), the image modeling system communicatively coupled to the inference server (Zou, Col. 21, lines 24-35, the environment is communicatively coupled to a system (i.e., remote system) comprising one or more servers via one or more networks) and comprising a model development pipeline that includes:
receiving annotations of the filtered data set of image samples identifying one or more empty locations (Zou, Col. 17, lines 19-27, classifiers may be trained on manually annotated images to identify different items (i.e., empty locations));
forming a trained model usable to identify empty shelf regions, the trained model being based on the filtered data set of image samples and associated annotations (Zou, Col. 16 line 64 – Col. 17 line 7, the classifier can be trained using supervised learning, based on training images that have been manually annotated to show image segments corresponding to product instances or product lanes); and
wherein a model deployment platform is configured to:
receive, in a realtime data stream, one or more shelf camera images from cameras installed at the retail location (Adato, Para. [0678], processing device may continuously analyze the plurality of images and/or continuously receive real-time images); and
generate an output data stream indicative of shelf and product availability information based on the trained model generated via the model development pipeline (Zou, Col. 35, lines 22-32, the output data comprises information about the event. For example, where the event comprises an item being removed from an inventory location, the output data may comprise an item identifier indicative of the particular item that was removed from the inventory location and a user identifier of a user that removed the item. Output data may also include planogram data, such as coordinates of product volumes within the facility).
The proposed combination as well as the motivation for combining the Zou, Adato, Liang and Steele references presented in the rejection of Claim 1, apply to Claim 6 and are incorporated herein by reference.
Although Zou in view of Adato further in view of Liang and Steele teaches providing images of the various fixtures and product instances (Zou, Col. 11, lines 62-67), they do not explicitly teach “creating a filtered data set of image samples of a retail shelf, the image samples meeting predefined quality criteria”. However, in an analogous field of endeavor, Yi teaches performing a data cleaning operation to identify a subset of images in the group of images which are “noisy” or “dirty”, e.g., being incorrectly-labeled and/or having very poor image quality. The disclosed data cleaning technique for a raw training dataset includes an iterative operation which repeats a common data cleaning procedure for each and every group of identically-labeled face images (Yi, Col. 15, lines 6-26) and which outputs a high-quality training dataset comprising groups of cleaned and balanced images (Yi, Col. 25, lines 20-30).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Zou in view of Adato further in view of Liang and Steele with the teachings of Yi by including a data cleaning pipeline to clean the shelf images to create a filtered data set that meets quality criteria. One having ordinary skill in the art would have been motivated to combine these references, because doing so would allow for generating a clean and balanced training dataset, as recognized by Yi. Thus, the claimed invention would have been obvious to one having ordinary skill in the art before the effective filing date.
Regarding claim 18, Zou in view of Adato further in view of Liang and Steele teaches the method of claim 10 further comprising:
obtaining the machine learning model from an image modeling system remote from the retail environment (Zou, Col. 21, lines 36-39, a remote system may receive and store image data received from the cameras and may analyze the image data), the image modeling system communicatively coupled to the inference server (Zou, Col. 21, lines 24-35, the environment is communicably coupled to a system (i.e., remote system) comprising one or more servers via one or more networks) and comprising a model development pipeline that is executable on the one or more in-store computing systems, wherein the model development pipeline includes:
receiving annotations of the filtered data set of image samples identifying one or more empty locations (Zou, Col. 17, lines 19-27, classifiers may be trained on manually annotated images to identify different items (i.e., empty locations));
forming a trained model usable to identify empty shelf regions, the trained model being based on the filtered data set of image samples and associated annotations (Zou, Col. 16 line 64 – Col. 17 line 7, the classifier can be trained using supervised learning, based on training images that have been manually annotated to show image segments corresponding to product instances or product lanes);
wherein a model deployment platform is configured to:
receive, in a realtime data stream, one or more shelf camera images from cameras installed at the retail location (Adato, Para. [0678], processing device may continuously analyze the plurality of images and/or continuously receive real-time images); and
generate an output data stream indicative of shelf and product availability information based on the trained model generated via the model development pipeline (Zou, Col. 35, lines 22-32, the output data comprises information about the event. For example, where the event comprises an item being removed from an inventory location, the output data may comprise an item identifier indicative of the particular item that was removed from the inventory location and a user identifier of a user that removed the item. Output data may also include planogram data, such as coordinates of product volumes within the facility).
The proposed combination as well as the motivation for combining the Zou, Adato, Liang and Steele references presented in the rejection of Claim 1, apply to Claim 18 and are incorporated herein by reference.
Although Zou in view of Adato further in view of Liang and Steele teaches providing images of the various fixtures and product instances (Zou, Col. 11, lines 62-67), they do not explicitly teach “a data cleaning pipeline stage that is executable on the one or more in-store computing systems to create a filtered data set of image samples of a retail shelf, the image samples meeting predefined quality criteria”. However, in an analogous field of endeavor, Yi teaches performing a data cleaning operation to identify a subset of images in the group of images which are “noisy” or “dirty”, e.g., being incorrectly-labeled and/or having very poor image quality. The disclosed data cleaning technique for a raw training dataset includes an iterative operation which repeats a common data cleaning procedure for each and every group of identically-labeled face images (Yi, Col. 15, lines 6-26), which outputs a high-quality training dataset comprising groups of cleaned and balanced images (Yi, Col. 25, lines 20-30).
The proposed combination as well as the motivation for combining the Zou, Adato, Liang, Steele and Yi references presented in the rejection of Claim 6, apply to Claim 18 and are incorporated herein by reference. Thus, the method recited in Claim 18 is met by Zou in view of Adato further in view of Liang, Steele and Yi.
Regarding claim 19, Zou teaches a real time empty shelf detection system comprising:
one or more in-store computing systems at a retail location, the one or more in-store computing systems implementing a model development pipeline and a model deployment platform (Zou, Col. 12, lines 15-23, the cameras are configured to capture still images of the fixtures and to provide the still images to one or more computer systems for processing. The computer systems may use the images for performing tasks related to inventory, checkout, payroll, time scheduling, and/or other aspects of store management);
wherein the model development pipeline includes:
receiving annotations of the filtered data set of image samples identifying one or more empty locations (Zou, Col. 17, lines 19-27, classifiers may be trained on manually annotated images to identify different items (i.e., empty locations));
forming a trained model usable to identify empty shelf regions, the trained model being based on the filtered data set of image samples and associated annotations (Zou, Col. 16 line 64 – Col. 17 line 7, the classifier can be trained using supervised learning, based on training images that have been manually annotated to show image segments corresponding to product instances or product lanes); and
wherein the model deployment platform is configured to:
generate an output data stream indicative of shelf and product availability information based on the trained model generated via the model development pipeline (Zou, Col. 35, lines 22-32, the output data comprises information about the event. For example, where the event comprises an item being removed from an inventory location, the output data may comprise an item identifier indicative of the particular item that was removed from the inventory location and a user identifier of a user that removed the item. Output data may also include planogram data, such as coordinates of product volumes within the facility).
Although Zou teaches providing images of the various fixtures and product instances (Zou, Col. 11, lines 62-67), Zou does not explicitly teach the model deployment platform is configured to “receive, in a realtime data stream, one or more shelf camera images from cameras installed at the retail location”. However, in an analogous field of endeavor, Adato teaches continuously analyzing the plurality of images and/or continuously receiving real-time images (Adato, Para. [0678]).
The proposed combination as well as the motivation for combining the Zou and Adato references presented in the rejection of Claim 1, apply to Claim 19 and are incorporated herein by reference.
Although Zou in view of Adato teaches a classifier trained using supervised learning (Zou, Col. 16 line 64 – Col. 17 line 7), they do not explicitly teach “performing one or more quantization or pruning operations on the machine learning model, wherein the machine learning model is not replaced after the quantization or pruning operations”. However, in an analogous field of endeavor, Liang teaches pruning may be used to generally improve systems that include video analytics engines, and the like, for example by pruning out video analytics parameters that result in errors and/or by updating a full initial machine learning model to a smaller, hence faster, machine learning model. A faster, updated machine learning model may result in a video being acquired, and analyzed by the updated machine learning model, at higher framerates and/or higher resolution (Liang, Para. [0015]). The method may further comprise the controller and/or the video analytics engine, after pruning, applying the updated machine learning model (i.e., the machine learning model is not replaced after pruning) (Liang, Para. [0092]).
The proposed combination as well as the motivation for combining the Zou, Adato and Liang references presented in the rejection of Claim 1, apply to Claim 19 and are incorporated herein by reference.
Although Zou in view of Adato further in view of Liang teaches a pruning operation to update a machine learning model (Liang, Para. [0015]), they do not explicitly teach “wherein the quantization or pruning operations use one or more inference run-time metric to ensure that the model is scaled for deployment depending upon available resources on CPUs available in one or more retail store”. However, in an analogous field of endeavor, Steele teaches the trees may be pruned intelligently during a second pass of the training phase, e.g., to remove a subset of the nodes based on one or more run-time optimization goals (i.e., run-time metric). The term “run-time optimization goals” may be used herein to refer to objectives associated with executing a trained model to make predictions, such as reducing the time it takes to generate predictions for a test data set or a production data set, reducing the amount of CPU or other resources consumed for such predictions (i.e., depending upon available resources on CPUs available), and so on (Steele, Para. [0176]).
The proposed combination as well as the motivation for combining the Zou, Adato, Liang and Steele references presented in the rejection of Claim 1, apply to Claim 19 and are incorporated herein by reference.
Although Zou in view of Adato further in view of Liang and Steele teaches providing images of the various fixtures and product instances (Zou, Col. 11, lines 62-67), they do not explicitly teach “a data cleaning pipeline stage that is executable on the one or more in-store computing systems to create a filtered data set of image samples of a retail shelf, the image samples meeting predefined quality criteria”. However, in an analogous field of endeavor, Yi teaches performing a data cleaning operation to identify a subset of images in the group of images which are “noisy” or “dirty”, e.g., being incorrectly-labeled and/or having very poor image quality. The disclosed data cleaning technique for a raw training dataset includes an iterative operation which repeats a common data cleaning procedure for each and every group of identically-labeled face images (Yi, Col. 15, lines 6-26), which outputs a high-quality training dataset comprising groups of cleaned and balanced images (Yi, Col. 25, lines 20-30).
The proposed combination as well as the motivation for combining the Zou, Adato, Liang, Steele and Yi references presented in the rejection of Claim 6, apply to Claim 19 and are incorporated herein by reference. Thus, the system recited in Claim 19 is met by Zou in view of Adato further in view of Liang, Steele and Yi.
Regarding claim 20, Zou in view of Adato further in view of Liang, Steele and Yi teaches the real time empty shelf detection system of claim 19, and further teaches wherein generating an output data stream indicative of shelf and product availability information based on the trained model generated via the model development pipeline comprises annotating the one or more shelf camera images from the cameras installed at the retail location with a flat face corresponding to the empty shelf regions therein (Zou, Fig. 29, empty space 2910, Col. 29, lines 36-65, the product-volume detection component may generate a new bounding box corresponding to the empty space. The created bounding box may be defined by coordinates corresponding to side faces that touch the side faces of the first and second bounding boxes, a bottom and front face that corresponds to the aligned bottom and front faces of the first and second bounding boxes).
Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Zou et al. (US 11,842,321 B1, filed March 17, 2021) in view of Adato et al. (US 2019/0213546 A1) further in view of Liang et al. (US 2022/0327314 A1) and Steele et al. (US 2015/0379426 A1), as applied to claims 1, 3-5, 7-8, 10 and 12-17 above, and further in view of Shah et al. (US 2014/0299663 A1).
Regarding claim 9, Zou in view of Adato further in view of Liang and Steele teaches the system of claim 7, as described above.
Although Zou in view of Adato further in view of Liang and Steele teaches determining a type of aisle based on the type of product on the shelf in that aisle (Adato, Para. [0585]), they do not explicitly teach “wherein the type of product located at the type of aisle is a product that is arranged on a hanger”. However, in an analogous field of endeavor, Shah teaches identifying different types of products on the display hanger and determining a quantity of each of the different types of products (Shah, Para. [0043]).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Zou in view of Adato further in view of Liang and Steele with the teachings of Shah by determining a type of aisle based on the type of products on display hangers. One having ordinary skill in the art would have been motivated to combine these references because doing so would allow for monitoring inventory in a store, as recognized by Shah. Thus, the claimed invention would have been obvious to one having ordinary skill in the art before the effective filing date.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Emma Rose Goebel whose telephone number is (703)756-5582. The examiner can normally be reached Monday - Friday 7:30-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Amandeep Saini can be reached at (571) 272-3382. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Emma Rose Goebel/Examiner, Art Unit 2662
/AMANDEEP SAINI/Supervisory Patent Examiner, Art Unit 2662