Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
1. A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 01/27/2026 has been entered.
Independent claims 1, 7, and 13 are currently amended. Claims 1-18 are pending for examination. Claims 2-6 depend from claim 1, claims 8-12 depend from claim 7, and claims 14-18 depend from claim 13.
Claim Rejections - 35 USC § 101
2. 35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-18 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more, when analyzed as per MPEP 2106.
Step 1 analysis:
Claims 1-6 are directed to an article of manufacture, claims 7-12 are directed to a process comprising a series of steps, and claims 13-18 are directed to a device, all of which are statutory categories (Step 1: Yes).
Step 2A Analysis:
Claim 1 recites:
1. (Currently Amended) A non-transitory computer-readable recording medium having stored therein an information processing program that causes a computer to execute a process comprising:
(i) obtaining a video of a store's interior where a plurality of products are displayed as target products for purchase;
(ii) analyzing respective image frames of the obtained video in time series;
(iii) identifying for each image frame, based on a result of the analyzing, a first- type region that covers a target product from the plurality of products, a second-type region that covers a customer as a person purchasing the target product captured in the video, and a relationship that recognizes interaction between the first-type region and the second-type region;
(iv) detecting, from the video, a number of times and/or a time period when the relationship is identified for each of the target products in the first-type region that is identified in each image frame; [[and]]
(v) by generating a correspondence relationship obtained by associating each of the target products, the identified relationship, and the number of times and/or the time period when the relationship is identified, extracting, from the video of the store's interior, a product that increases purchase intention of the customer; and
wherein the extracting includes:
counting the number of relationships in which a same relationship between a same customer and a same product identified for a number of times within a predetermined time period is counted as one time, and
counting the time period in which a same relationship between a same customer and a same product identified across the predetermined number of image frames is counted as the time period of the relationship corresponding to the predetermined number of image frames.
Step 2A Prong 1 analysis: This part of the eligibility analysis evaluates whether the claim recites a judicial exception. As explained in MPEP 2106.04, subsection II, a claim “recites” a judicial exception when the judicial exception is “set forth” or “described” in the claim.
Claims 1-18 recite an abstract idea.
The highlighted limitations comprising, “analyzing respective image frames of the obtained video in time series; identifying for each image frame, based on a result of the analyzing, a first-type region that covers a target product from the plurality of products, a second-type region that covers a customer as a person purchasing the target product captured in the video, and a relationship that recognizes interaction between the first-type region and the second-type region; detecting, from the video, a number of times and/or a time period when the relationship is identified for each of the target products in the first-type region that is identified in each image frame; and generating a correspondence relationship obtained by associating each of the target products, the identified relationship, and the number of times and/or the time period when the relationship is identified,” under their broadest reasonable interpretation, fall within the mental processes grouping of abstract ideas because they cover concepts performed in the human mind, including observation, evaluation, judgment, and opinion. See MPEP 2106.04(a)(2), subsection III. That is, other than reciting “by a computer,” nothing in the claim elements precludes the steps from practically being performed in the mind. For example, but for the “by the computer” language, the claim encompasses a person obtaining a video of a physical store and analyzing the images of the video, wherein the video frames provide information about the products and the interactions of shoppers, such as how a shopper looks with interest at an item, and mentally associating that person with that item, including the activity of purchasing or merely looking with interest a number of times.
Using time series for this analysis can also be done manually: analyzing the video data collected over time to find trends, patterns, and seasonality, with the goal of forecasting future values, and using statistical techniques to model and predict future outcomes, are tasks that can be performed by hand. The mere nominal recitation of “by a computer” does not take the claim limitations out of the mental processes grouping.
The currently added limitations, “extracting, from the video of the store's interior, a product that increases purchase intention of the customer; and wherein the extracting includes: counting the number of relationships in which a same relationship between a same customer and a same product identified for a number of times within a predetermined time period is counted as one time, and counting the time period in which a same relationship between a same customer and a same product identified across the predetermined number of image frames is counted as the time period of the relationship corresponding to the predetermined number of image frames,” under their broadest reasonable interpretation, also fall within the mental processes grouping of abstract ideas because they cover concepts performed in the human mind, including observation, evaluation, judgment, and opinion. See MPEP 2106.04(a)(2), subsection III. That is, other than reciting “by a computer,” nothing in the claim elements precludes the steps from practically being performed in the mind. For example, but for the “by the computer” language, the claim encompasses a person who, viewing the video of the physical store, analyzes the image frames of the video to evaluate and count the number of times a potential shopper views or handles an identified or targeted product in the identified region of the shop and the time period spent there, and forms an opinion about the shopper's increased purchase intention. The mere nominal recitation of “by a computer” does not take the claim limitations out of the mental processes grouping.
See MPEP 2106.04(a)(2) Abstract Idea Groupings [R-07.2022], subsection II, Mental Processes: claims do recite a mental process when they contain limitations that can practically be performed in the human mind, including, for example, observations, evaluations, judgments, and opinions. Examples of claims that recite mental processes include:
• a claim to "collecting information, analyzing it, and displaying certain results of the collection and analysis," where the data analysis steps are recited at a high level of generality such that they could practically be performed in the human mind, Electric Power Group v. Alstom, S.A., 830 F.3d 1350, 1353-54, 119 USPQ2d 1739, 1741-42 (Fed. Cir. 2016);
• a claim to collecting and comparing known information (claim 1), which are steps that can be practically performed in the human mind, Classen Immunotherapies, Inc. v. Biogen IDEC, 659 F.3d 1057, 1067, 100 USPQ2d 1492, 1500 (Fed. Cir. 2011); and
• a claim to identifying head shape and applying hair designs, which is a process that can be practically performed in the human mind, …
Thus, claim 1 and its dependent claims 2-6 recite an abstract idea.
Since the limitations of the other two independent claims 7 and 13 are similar to the limitations of claim 1, they are analyzed on the same basis as reciting mental processes. Accordingly, claim 7 and its dependent claims 8-12, and claim 13 and its dependent claims 14-18, recite mental processes.
Thus, all pending claims 1-18 recite an abstract idea.
Step 2A Prong 2 analysis: This part of the eligibility analysis evaluates whether the claim as a whole integrates the recited judicial exception into a practical application of the exception or whether the claim is “directed to” the judicial exception. This evaluation is performed by (1) identifying whether there are any additional elements recited in the claim beyond the judicial exception, and (2) evaluating those additional elements individually and in combination to determine whether the claim as a whole integrates the exception into a practical application. See MPEP 2106.04(d).
Claims 1-18: The judicial exception is not integrated into a practical application.
Claim 1 recites the additional limitations of using generic computer components executing the steps of “(i) obtaining a video of a store's interior where a plurality of products are displayed as target products for purchase; (ii) analyzing respective image frames of the obtained video in time series; (iii) identifying for each image frame, based on a result of the analyzing, a first-type region that covers a target product from the plurality of products, a second-type region that covers a customer as a person purchasing the target product captured in the video, and a relationship that recognizes interaction between the first-type region and the second-type region; (iv) detecting, from the video, a number of times and/or a time period when the relationship is identified for each of the target products in the first-type region that is identified in each image frame; (v) by generating a correspondence relationship obtained by associating each of the target products, the identified relationship, and the number of times and/or the time period when the relationship is identified, (vi) extracting, from the video of the store's interior, a product that increases purchase intention of the customer; and wherein the extracting includes: counting the number of relationships in which a same relationship between a same customer and a same product identified for a number of times within a predetermined time period is counted as one time, and counting the time period in which a same relationship between a same customer and a same product identified across the predetermined number of image frames is counted as the time period of the relationship corresponding to the predetermined number of image frames.”
The limitations of “(i) obtaining a video of a store's interior where a plurality of products are displayed as target products for purchase” are mere data gathering recited at a high level of generality, and thus are insignificant extra-solution activity. See MPEP 2106.05(g) (“whether the limitation is significant”). In addition, all uses of the recited judicial exceptions require such data gathering and output, and, as such, these limitations do not impose any meaningful limits on the claim. These limitations amount to necessary data gathering and outputting. See MPEP 2106.05. The limitations in step (i) are recited as being performed by a computer. The computer is recited at a high level of generality and is used as a tool to perform the generic computer function of receiving data. See MPEP 2106.05(f). Further, in the limitations of steps (ii), (iii), (iv), (v), and (vi), the computer is used to perform an abstract idea, as discussed above in Step 2A, Prong One, such that it amounts to no more than mere instructions to apply the exception using a generic computer. See MPEP 2106.05(f).
Even when viewed individually and in combination, these additional elements do not integrate the recited judicial exception into a practical application because they do not add any meaningful limits on practicing the abstract idea (Step 2A, Prong Two: NO), and the claim is directed to the judicial exception (Step 2A: YES). Since the other two independent claims 7 and 13 recite similar limitations, they are analyzed on the same basis as claim 1 as being directed to an abstract idea.
Dependent claims 2-5 recite the steps of analyzing the image frames in time series, identifying, counting, generating product information, analyzing an image, generating skeletal frame information of the person, and registering information, which are mere extensions of the steps discussed for claim 1 and, under their broadest reasonable interpretation, cover performance in the mind. The computer is recited at a high level of generality. In the limitations of these steps, the computer is used to perform an abstract idea, as discussed above in Step 2A, Prong One, such that it amounts to no more than mere instructions to apply the exception using a generic computer. See MPEP 2106.05(f). The other steps of outputting and displaying information are mere transmitting/displaying, which is insignificant post-solution activity. See MPEP 2106.05(g) (“whether the limitation is significant”). These limitations do not impose any meaningful limits on the claim. The computer is recited at a high level of generality and is used as a tool to perform the generic computer functions of outputting and displaying data.
Regarding dependent claim 6, it recites applying a machine learning model to the obtained video for identifying a person and an object in a store and the interaction between the two. The steps of identifying a person and an object in the store and their interaction, as discussed for claim 1, relate to mental processes. The recitation of “using a machine learning model” merely indicates a field of use or technological environment in which the judicial exception is performed. Although the additional element “using a machine learning model” limits the identified judicial exceptions of “identifying a person and an object in the video in a region in a store and their interaction,” this type of limitation merely confines the use of the abstract idea to a particular technological environment (a machine learning model) and thus fails to add an inventive concept to the claims. See MPEP 2106.05(h).
Even when viewed in combination, these additional elements in the dependent claims 2-6 do not integrate the recited judicial exception into a practical application because they do not add any meaningful limits on practicing the abstract idea (Step 2A, Prong Two: NO), and the claims are directed to the judicial exception (Step 2A: YES). Since the limitations of the other dependent claims 8-12 and 14-18 are similar to the limitations of claims 2-6, they are analyzed on the same basis as directed to an abstract idea.
Thus, all pending claims 1-18, as per Step 2A, Prong Two, are directed to the judicial exception.
Step 2B analysis: This part of the eligibility analysis evaluates whether the claim as a whole amounts to significantly more than the recited exception i.e., whether any additional element, or combination of additional elements, adds an inventive concept to the claim. See MPEP 2106.05.
Claims 1-18 do not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Since the claims, per Step 2A, are directed to an abstract idea, they must be analyzed under Step 2B to determine whether they recite an inventive concept, i.e., whether the claims recite additional elements or a combination of elements that amount to “significantly more” than the judicial exception.
As discussed above with respect to Step 2A, Prong Two, the additional elements in claims 1-18 amount to no more than mere instructions to apply the exception using generic computer components, and generally link the judicial exception to a particular technological environment or field of use. The same analysis applies here in Step 2B: mere instructions to apply the exception using generic computer components, and general linking of the judicial exception to a particular technological environment or field of use, cannot integrate a judicial exception into a practical application at Step 2A or provide an inventive concept at Step 2B.
Additional elements of obtaining/gathering data, outputting data, and displaying data were found to be insignificant extra-solution activity in Step 2A, Prong Two, because they were determined to be insignificant limitations amounting to necessary data outputting and displaying. However, a conclusion that an additional element is insignificant extra-solution activity in Step 2A, Prong Two should be re-evaluated in Step 2B. See MPEP 2106.05, subsection I.A. At Step 2B, the evaluation of the insignificant extra-solution activity consideration takes into account whether or not the extra-solution activity is well understood, routine, and conventional in the field. See MPEP 2106.05(g). As discussed in Step 2A, Prong Two above, the recitations of outputting data and displaying data are recited at a high level of generality. These elements amount to transmitting and displaying data over a network and are well-understood, routine, conventional activity. See MPEP 2106.05(d), subsection II. The specification does not provide any indication that the computer components are anything other than generic, off-the-shelf computer components, and the Symantec, TLI, OIP Techs, and Versata court decisions cited in MPEP 2106.05(d), subsection II, indicate that mere receiving, acquiring, transmitting, and displaying steps using a generic computer are well-understood, routine, conventional functions when they are claimed in a merely generic manner (as they are here). Accordingly, the conclusion that the outputting and displaying of data, as recited here, are well-understood, routine, conventional activities is supported under Berkheimer Option 2. See MPEP 2106.05(f).
Even when considered in combination, the additional elements in claims 1-18 represent mere instructions to implement an abstract idea or other exception on a computer and insignificant extra-solution activity, which do not provide an inventive concept. (Step 2B: NO).
Thus, claims 1-18 are patent ineligible.
Claim Rejections - 35 USC § 103
3. The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
3.1. Claims 1-5, 7-11, and 13-17 are rejected under 35 U.S.C. 103 as being unpatentable over Lipton et al. [US 2008/0018738 A1, cited in the IDS filed 02/16/2024, hereinafter Lipton] in view of Cheriyadat et al. [US 2010/0322474 A1, hereinafter Cheriyadat].
Regarding claim 1, Lipton teaches “A non-transitory computer-readable recording medium having stored therein an information processing program that causes a computer to execute a process” [see paras 0031-0034 and claims 1 and 16] comprising:
“obtaining a video of a store's interior where a plurality of products are displayed as target products for purchase; analyzing respective image frames of the obtained video” [see paras 0031-0034: “[0031] … a system for video monitoring a retail business process comprising: a video analytics engine to process video obtained by a video camera and to generate video primitives regarding the video; a user interface to define at least one activity of interest regarding an area being viewed, wherein each activity of interest identifies at least one of a rule or a query regarding the area being viewed; and an activity inference engine to process the generated video primitives based on each defined activity of interest and to determine if an activity of interest occurred in the video …”. The video monitoring of a retail business includes a video including image frames of the inside of a store, and these image frames are analyzed]. Lipton does not teach using time series. Cheriyadat teaches using time series for analyzing video images; see para 0043: “The apparatus may include an image source that provides a time series of video images. For example, the image source can be a video camera that is configured to generate a time series of video images by continuously capturing video images in real time. Alternately, the image source can be a data storage device that stores a time series of video images, which is provided to the image analysis device. The image analysis device can include a processor in a computer and said image recording device may be embedded in said computer or externally connected to the computer.”
Therefore, in view of the teachings of Cheriyadat, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Lipton to include the concept of using time series while analyzing video images captured of the interior of a store, because the use of time series was a well-understood, routine, conventional activity previously known to the industry as a statistical method, and it could therefore be used with Lipton's method and system for analyzing video images of the store collected at intervals to identify patterns, such as trends and seasonality, for the items being observed and looked at by customers visiting the store, and to forecast future values.
“identifying for each image frame, based on a result of the analyzing, a first-type region that covers a target product from the plurality of products” [see para 0034 and paras 0213-0214: “a video camera to obtain video of an area; a video analytics engine to process the obtained video and generate video primitives regarding the video; and an activity inference engine to process the generated video primitives based on at least one activity of interest regarding the area being viewed to determine if an activity of interest occurred in the video, wherein each activity of interest defines at least one of a rule or a query selectively identified by a user regarding the area being viewed,” “[0213] … the automated video surveillance system 100 may generate a report 302 based on the rule 148 or query 150: how many people 304 enter the drug store aisle 306 during a time period, e.g., eight hours?”, and “[0214] … the automated video surveillance system 100 may generate a report 310 based on the rule 148 or query 150: how many people 312 loiter in front of the cosmetics 314 during the time period, e.g., eight hours?”. These excerpts describe identifying, based on the result of video analysis, different covered regions, such as pharmaceutical drugs, cosmetics, etc., which can include the target products being viewed, looked at, or purchased by different customers];
“a second-type region that covers a customer as a person purchasing the target product captured in the video, and a relationship that recognizes interaction between the first-type region and the second-type region” [see paras 0100-0114, which describe the second area being captured by the video relating to a person who is the target for selling the displayed objects in a retail store: “[0100] Blob video primitives may be generated when a blob is detected. A blob may refer to a single frame instance of a spatially continuous moving target. Blob video primitives may include generic primitive data and blob primitive data. The blob primitive data may be spatial descriptors. The following exemplary blob primitive data may include the following exemplary information: [0101] Area: Number of pixels comprising the blob. [0102] Perimeter: Number of pixels comprising a boundary of the blob. [0103] Bounding box: (x,y) coordinates of top-left and bottom-right of a blob bounding box. [0104] Centroid: (x,y) coordinates of a blob centroid. [0105] Foot location: (x,y) coordinates of a location of a bottom of the object, e.g., the feet of a human, the wheels of a vehicle or a shopping cart, etc. [0106] Number of objects/humans: A number of individual human objects detected in the blob. [0107] Human head locations: (x,y) location of heads and the radius of the heads detected in the blob. [0108] Color properties: histogram of blob colors and shades, e.g., 10 bins (7 colors, 3 shades) in HSV color space. [0109] Shape: Bitmask of an object shape. [0110] Skin-tone: Proportion and bitmask of pixels with skin-tone coloration. [0111] Blob imagery: Image of the blob. [0112] Target video primitives may describe a snapshot of a moving target and may include generic primitive data and target primitive data. A target may refer to a complete description of a target over time, e.g., a sequence of blobs.
The following exemplary target primitive data may include the following exemplary information: [0113] Target identifier: A GUID [globally unique identifier] for each target. [0114] Target age: Time since target was first seen.” These excerpts describe that the video images, when analyzed, relate to a blob that represents a person identified as a targeted person moving around the displayed objects in a retail space, and the person relates to the covered second-type region]. The video analysis, and para 0212, describe a relationship that recognizes interaction between the first-type region [cosmetics in the reference] and the second-type region [the person who is the customer]: “[0212] With reference again to FIG. 1, the report generation engine 80 may accumulate events into a report that may provide an overview of retail business processes. For example, for the retail business process of intelligence data gathering, an exemplary report may include people counting and shopping behavior. For people counting, if it is desired to count the number of people who enter or exit a store, the report generation engine may be used to agglomerate individual entry and exit events, detected via a tripwire(s), for example, and provide a time-based histogram of consumer behavior. … The report generation engine 180 may provide a report that may outline how many people travel through a particular area of a store, how many people stop in front of a particular display, how long people stop, on average, and, if the POS data is available, how the POS data compare with sales of the respective product.”; and
“detecting, from the video, a number of times and/or a time period when the relationship is identified for each of the target products in the first-type region that is identified in each image frame” [see para 0034 and paras 0213-0214: “a video camera to obtain video of an area; a video analytics engine to process the obtained video and generate video primitives regarding the video; and an activity inference engine to process the generated video primitives based on at least one activity of interest regarding the area being viewed to determine if an activity of interest occurred in the video, wherein each activity of interest defines at least one of a rule or a query selectively identified by a user regarding the area being viewed,” “[0213] … the automated video surveillance system 100 may generate a report 302 based on the rule 148 or query 150: how many people 304 enter the drug store aisle 306 during a time period, e.g., eight hours?”, and “[0214] … the automated video surveillance system 100 may generate a report 310 based on the rule 148 or query 150: how many people 312 loiter in front of the cosmetics 314 during the time period, e.g., eight hours?”. These excerpts describe detecting, based on the result of video analysis, different covered regions, such as pharmaceutical drugs, cosmetics, etc.,
which can include the target products being viewed, looked at, or purchased by different customers, including the time period, e.g., how many people entered the drug aisle region and how many people entered the cosmetics region in eight hours, based on predetermined rules; based on the analysis, a report is generated including the number of people who visited different regions in a predetermined time period [this corresponds to the claimed generating a correspondence relationship obtained by associating each of the target products, the identified relationship, and the number of times and/or the time period when the relationship is identified]. Further, see paras 0178, 0183, 0204, and 0212: “[0178] … Person grabs more than "x" items from a shelf. […]”, “[0183] … A number of items scanned by the point of sale (POS) system does not match a number of items detected by the video analytics engine.”, and “[0204] … FIG. 2D depicts an example of an application to monitor a shelf 230 of high value items 232, also known as shelf clearing. Here, the system is counting the number of times a person 234 grabs an item off the shelf. For example, a store policy in such application may dictate that if more than four items are grabbed, the activity should be flagged for later investigation.”, which show associating the identified relationship between a person and the item by detecting from the POS data that the item/cosmetics covered in the first region [grabbing of items] has been bought. Also, see paras 0145 and 0154: “[0145] Time: Is current time in a prescribed time window? Is the time a recurring or repetitive pattern?”, “[0154] Dwell time in AOI: Measure the time targets spend in an AOI, and report the time.
Reporting can happen for all targets, only for targets dwelling at least a predefined amount of time …”, wherein AOI refers to an area of interest. Lipton also discloses generating a time stamp of the video frames, which provides the time period taken for any relationship to be established between a person and an object being considered by the person in the store.
Lipton further suggests the limitations “extracting, from the video of the store's interior, a product that increases purchase intention of the customer; and wherein the extracting includes: counting the number of relationships in which a same relationship between a same customer and a same product identified for a number of times within a predetermined time period is counted as one time, and counting the time period in which a same relationship between a same customer and a same product identified across the predetermined number of image frames is counted as the time period of the relationship corresponding to the predetermined number of image frames.” See paras 0045, 0178, 0194, and 0204: “… [0178] Detection of clearing out shelf space: Person grabs more than "x" items from a shelf. … [0204] FIG. 2D depicts an example of an application to monitor a shelf 230 of high value items 232, also known as shelf clearing. Here, the system is counting the number of times a person 234 grabs an item off the shelf. For example, a store policy in such application may dictate that if more than four items are grabbed, the activity should be flagged for later investigation.” Para 0045 teaches generating, for each product, product information and outputting the generated product information to the display device: “processing the structured input according to prescribed rules, and producing results of the processing as output.”, and para 0198 teaches displaying: “Monitor a store for spills, fallen merchandise or displays.”. Also, see paras 0145 and 0154: “[0145] Time: Is current time in a prescribed time window? Is the time a recurring or repetitive pattern?”, “[0154] Dwell time in AOI: Measure the time targets spend in an AOI, and report the time.
Reporting can happen for all targets, only for targets dwelling at least a predefined amount of time …”, wherein AOI refers to an area of interest. Lipton also discloses generating a time stamp of the video frames, which provides the time period taken for any relationship to be established between a person and an object being considered by the person in the store, and also counting the number of times a person grabs the targeted item.
Regarding claim 2, the limitations, “The non-transitory computer-readable recording medium according to claim 1, wherein the process further includes: first analyzing the each image frame in time series; identifying the customer based on a result of the first analyzing; second analyzing the each image frame including the identified customer; identifying, in a repeated manner, based on a result of the second analyzing, a first-type region that covers the target product, a second-type region that covers the customer purchasing the target product in the store captured in the video, and a relationship that recognizes interaction between the first-type region and the second-type region; counting, based on the identified relationship, type-by-type count of the identified relationship; generating, for each of the target products, product information in which the type-by-type count of the relationship and the target product covered in the first-type region are held in a corresponding manner; and outputting the generated product information to the display device” are suggested by Lipton [see paras 0045, 0178, 0194, and 0204: “….[0178] Detection of clearing out shelf space: Person grabs more than "x" items from a shelf……[0204] FIG. 2D depicts an example of an application to monitor a shelf 230 of high value items 232, also known as shelf clearing. Here, the system is counting the number of times a person 234 grabs an item off the shelf. For example, a store policy in such application may dictate that if more than four items are grabbed, the activity should be flagged for later investigation.”]. Para 0045 teaches generating, for each product, product information and outputting the generated product information to the display device, “processing the structured input according to prescribed rules, and producing results of the processing as output.”, and para 0198 teaches displaying, “Monitor a store for spills, fallen merchandise or displays.”. Also, see paras 0145 and 0154: “[0145] Time: Is current time in a prescribed time window? Is the time a recurring or repetitive pattern?”, “[0154] Dwell time in AOI: Measure the time targets spend in an AOI, and report the time. Reporting can happen for all targets, only for targets dwelling at least a predefined amount of time…”, wherein AOI refers to an area of interest. Lipton also discloses generating a time stamp of the video frames, which provides the time period taken for any relationship to be established between a person and an object being considered by the person in the store.
Regarding claim 3, the limitations, “The non-transitory computer-readable recording medium according to claim 1, wherein the process further includes: first analyzing the each image frame in time series; identifying the customer based on a result of the first analyzing; second analyzing the each image frame including the identified customer; identifying, in a repeated manner, based on a result of the second analyzing, a first-type region that covers the target product, a second-type region that covers the customer purchasing the target product in the store captured in the video, and a relationship that recognizes interaction between the first-type region and the second-type region; counting, based on the identified relationship, time period taken for a particular relationship to be established from among a plurality of relationships; generating, for each of the target products, product information in which the time period taken for the particular relationship to be established and the target product covered in the first-type region are held in a corresponding manner” are already covered in the discussion for claims 1 and 2 above; and
displaying the generated product information in a display device [see para 0045 and as discussed for claim 2 above].
Regarding claim 4, the limitations, “The non-transitory computer-readable recording medium according to claim 1, wherein the process further includes: identifying, based on the identified relationship and a rule which is set in advance, a section of interest of the target product that attracted interest of the customer” are covered in the analysis of claim 1. For the limitations, “registering, in a storage, information in which the identified section of interest and the target product covered in the first-type region are held in a corresponding manner”, see Lipton para 0066, “FIG. 1 illustrates an exemplary automated video surveillance system 100. A video camera 102 may be positioned to view an area of a retail enterprise to obtain video data. Optionally, the video data from the video camera 102 may be stored in a video database 104. The video data from the video camera 102 may be processed by a video analytics engine 120 to produce video primitives. The video primitives may be stored in a video primitive database 140.”. These excerpts also teach generating, based on information registered in the storage, for each of the target products, product information in which the section of interest and the headcount of customers who showed interest in the section of interest are held in a corresponding manner [see paras 0189, 0190, 0204, 0210, 0214, and 0215, which describe counting the headcount showing interest in a product], and outputting the generated product information to a display device, as already discussed above for claims 2 and 3 [see para 0045].
Regarding claim 5, the limitations, “The non-transitory computer-readable recording medium according to claim 1, wherein the process further includes: first analyzing an image frame of the second-type region; generating, based on a result of the first analyzing, skeletal frame information of the customer; identifying, based on the generated skeletal frame information, manner of grasping the target product, which is covered in the first-type region, by a customer covered in the second-type region” are already covered in the discussion and analysis for claim 1 above, wherein the grasping relates to a person showing interest in an item being viewed. Regarding the limitations, “identifying, based on the identified manner of grasping, whether the customer took an action of showing interest in design of the target product or took an action of showing interest in raw materials of the target product; identifying, based on the identified action, section of interest of the target product that attracted interest of the customer; and registering, in a storage, information in which the identified section of interest and the target product are held in a corresponding manner”, see para 0212, as discussed for claim 1: “[0212] With reference again to FIG. 1, the report generation engine 80 may accumulate events into a report that may provide an overview of retail business processes. For example, for the retail business process of intelligence data gathering, an exemplary report may include people counting and shopping behavior. For people counting, if it is desired to count the number of people who enter or exit a store, the report generation engine may be used to agglomerate individual entry and exit events, detected via a tripwire(s), for example, and provide a time-based histogram of consumer behavior. ……... The report generation engine 180 may provide a report that may outline how many people travel through a particular area of a store, how many people stop in front a particular display, how long people stop, on average, and, if the POS data is available, how the POS data compare with sales of the respective product.”.
Regarding claims 7-11 and 13-17, their limitations are similar to the limitations of claims 1-5 and accordingly, they are analyzed and rejected as being unpatentable over Lipton in view of Cheriyadat on the same basis as claims 1-5.
3.2. Claims 6, 12, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Lipton in view of Cheriyadat, and further in view of Zalewski et al. [US Patent 10,573,134 B1], hereinafter Zalewski.
Regarding claim 6, Lipton in view of Cheriyadat teaches and renders obvious the limitations of claim 1, as analyzed above. The limitations, “The non-transitory computer-readable recording medium according to claim 1, wherein the identifying inputs the video to a machine learning model and identifies the first-type region, the second-type region, and the relationship, and the machine learning model is a HOID (Human Object Interaction Detection) model in which machine learning is implemented in such a way that first-type region information that indicates a first-type class indicating a person targeted for selling a product and indicates a region in which the person appears, second-type region information that indicates a second-type class indicating an object, which includes a product, and indicates a region in which the object appears, and interaction between the first-type class and the second-type class are identified.”, are similar to the ones already analyzed and covered in the analysis of claim 1, except that Lipton in view of Cheriyadat does not teach that the video is analyzed using a machine learning model. Zalewski, in the same field of tracking human interactions with items in a retail store, teaches using machine learning models to detect objects in which the shopper shows interest [see col. 138, lines 18-39 and Figs. 59 and 60A-60C: “(727) FIG. 59 shows the machine learning model trained that is trained 5910 to label additional shopping behavior as take 5951 and return 5952 events, but also classify whether a shopper has high interest in an item 5954 or whether there exists a high churn 5953 risk associated with an item. The machine learning algorithm 5900 is trained based on interaction data 5920-5923. In one configuration, in addition to using sensed interaction data for training the model, other data representing features tied to account history are used. It one embodiment, features are engineered to classify labels both of which may be selected from but not limited to that listed on the tables of FIG. 60A-B, 6000 and 6001. (728) FIGS. 60A-60C illustrates example tables, which some possible features that can be processed, in accordance with one or more embodiments. Multiple machine learning models may be used to characterize and track all relevant activity in a retail outlet. For example, FIG. 60B shows multiple classifiers and example relationships between classifiers and outputs. An item manipulation model may identify taken, returned or misplaced items as well as who manipulated the item. Other relationships are shown in FIG. 60B but shown for example and should not be deemed to be limiting.”]. Therefore, in view of the teachings of Zalewski, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method and system of Lipton in view of Cheriyadat, which analyzes video of a human interacting with an item in a physical store, to use a machine learning model such as a human object interaction detection model because, as shown in Zalewski, doing so enables labeling the shopping behavior of a human showing interest in an item. Further, the claimed invention is merely a combination of old elements; in the combination, each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
Regarding claims 12 and 18, their limitations are similar to the limitations of claim 6 and accordingly, they are analyzed and rejected as being unpatentable over Lipton in view of Cheriyadat and further in view of Zalewski on the same basis as claim 6.
Response to Arguments
4.1. Rejection of claims 1-18 under 35 USC 101:
Applicant's arguments filed 09/25/2025, on pages 14-15, have been fully considered but they are not persuasive. Examiner disagrees with the Applicant’s arguments, “The claims are amended to be more clearly directed toward statutory subject matter. For example, amended claim 1 recites the following: identifying for each image frame, based on a result of the analyzing, a first-type region that covers a target product from the plurality of products, a second-type region that covers a customer as a person purchasing the target product captured in the video, and a relationship that recognizes interaction between the first-type region and the second-type region; detecting, from the video, a number of times and/or a time period when the relationship is identified for each of the target products in the first-type region that is identified in each image frame; by generating a correspondence relationship obtained by associating each of the target products, the identified relationship, and the number of times and/or the time period when the relationship is identified, extracting, from the video of the store's interior, a product that increases purchase intention of the customer; and wherein the extracting includes: counting the number of relationships in which a same relationship between a same customer and a same product identified for a number of times within a predetermined time period is counted as one time, and counting the time period in which a same relationship between a same customer and a same product identified across the predetermined number of image frames is counted as the time period of the relationship corresponding to the predetermined number of image frames. (Underline emphasis is added.) As evidenced by the above, the amended claims are not directed toward an abstract idea. For example, the feature "extracting, from the video of the store's interior, a product that increases purchase intention of the customer" of amended claim 1 is not a mental process and does not fall under the abstract idea groupings (Step 2A, Prong 1: No). Specifically, the claim features specify a correspondence relationship based on the analysis of an image frame and specific operations for extracting information from the image frame. (Step 2A, No). Further, the claims are amended to be more directly tied to, e.g., the practical application of identification and characterization of a relationship between target product and customer, thereby providing an improved understanding of how a target product is being perceived by potential buyers. (Step 2B, Yes). Amended independent claims 7 and 13 are similarly amended and directed toward statutory subject matter based on similar reasoning. Accordingly, withdrawal of the rejection of claims 1, 9, 12 and all claims depending therefrom is respectfully requested.”, because, even when viewed individually and in combination, the additional elements, as analyzed above in paragraph 2, do not integrate the recited judicial exception into a practical application because they do not add any meaningful limits on practicing the abstract idea (Step 2A, Prong Two: No); thus the claims are directed to the judicial exception (Step 2A: Yes), and the additional elements do not amount to “significantly more” (Step 2B: No). Since the other two independent claims 7 and 13 recite similar limitations, they are analyzed on the same basis as claim 1 as being directed to an abstract idea.
The limitations comprising, “analyzing respective image frames of the obtained video in time series; identifying for each image frame, b………a second-type region that covers a customer as a person purchasing the target product captured in the video, and a relationship that recognizes interaction between the first-type region and the second-type region; detecting, from the video, ……; and generating a correspondence relationship obtained by associating each of the target products, the identified relationship, and the number of times and/or the time period when the relationship is identified.”, under their broadest reasonable interpretation, fall within the mental process grouping of abstract ideas because they cover concepts performed in the human mind, including observation, evaluation, judgment, and opinion. See MPEP 2106.04(a)(2), subsection III. That is, other than reciting “by a computer”, nothing in the claim elements precludes the steps from practically being performed in the mind. For example, but for the “by a computer” language, the claim encompasses a person obtaining a video of a physical store and analyzing the images of the video, wherein the video frames provide information about the products and the interaction of shoppers, including how a shopper looks with interest at an item, and associating that person with that item, including the activity of purchasing or merely looking with interest a number of times. Using time series for this analysis can be done manually, wherein the time series helps in analyzing the video data collected over time to find trends, patterns, and seasonality, with the goal of forecasting future values, and using statistical techniques to model and predict future outcomes. The mere nominal recitation of “by a computer” does not take the claim limitations out of the mental process grouping.
The currently added limitations, “extracting, from the video of the store's interior, …….and wherein the extracting includes: ……counting the number of relationships in which a same relationship between a same customer and a same product identified ……. and counting the time period in which a same relationship between a same customer and a same product identified across the predetermined number of image frames is counted as the time period of the relationship corresponding to the predetermined number of image frames.”, under their broadest reasonable interpretation, fall within the mental process grouping of abstract ideas because they cover concepts performed in the human mind, including observation, evaluation, judgment, and opinion. See MPEP 2106.04(a)(2), subsection III. That is, other than reciting “by a computer”, nothing in the claim elements precludes the steps from practically being performed in the mind. For example, but for the “by a computer” language, the claim encompasses a person viewing the video of the physical store, analyzing the image frames of the video to evaluate and count the number of times a potential shopper views or feels an identified or targeted product in the identified region of the shop and the time period spent therein, and forming an opinion about the shopper’s increased purchase intention. The mere nominal recitation of “by a computer” does not take the claim limitations out of the mental process grouping.
When analyzed per Step 2B, as discussed above with respect to Step 2A, Prong Two, the additional elements in claims 1-18 amount to no more than mere instructions to apply the exception using generic computer components, generally linking the judicial exception to a particular technological environment or field of use. The same analysis applies here at Step 2B: mere instructions to apply the exception using generic computer components, and generally linking the judicial exception to a particular technological environment or field of use, cannot integrate a judicial exception into a practical application at Step 2A or provide an inventive concept at Step 2B.
In view of the foregoing, the rejection of pending claims 1-18 under 35 USC 101 is sustainable and maintained.
4.2. Rejection of claims 1-18 under 35 USC 102 and 35 USC 103:
Applicant’s arguments, see pages 16-17, filed 01/27/2026, have been fully considered but are not persuasive, as the currently added limitations are also suggested by the reference Lipton. Lipton does suggest the newly added limitations, “extracting, from the video of the store's interior, a product that increases purchase intention of the customer; and wherein the extracting includes: counting the number of relationships in which a same relationship between a same customer and a same product identified for a number of times within a predetermined time period is counted as one time, and counting the time period in which a same relationship between a same customer and a same product identified across the predetermined number of image frames is counted as the time period of the relationship corresponding to the predetermined number of image frames” [see paras 0045, 0178, 0194, and 0204: “….[0178] Detection of clearing out shelf space: Person grabs more than "x" items from a shelf……[0204] FIG. 2D depicts an example of an application to monitor a shelf 230 of high value items 232, also known as shelf clearing. Here, the system is counting the number of times a person 234 grabs an item off the shelf. For example, a store policy in such application may dictate that if more than four items are grabbed, the activity should be flagged for later investigation.”]. Para 0045 teaches generating, for each product, product information and outputting the generated product information to the display device, “processing the structured input according to prescribed rules, and producing results of the processing as output.”, and para 0198 teaches displaying, “Monitor a store for spills, fallen merchandise or displays.”. Also, see paras 0145 and 0154: “[0145] Time: Is current time in a prescribed time window? Is the time a recurring or repetitive pattern?”, “[0154] Dwell time in AOI: Measure the time targets spend in an AOI, and report the time. Reporting can happen for all targets, only for targets dwelling at least a predefined amount of time…”, wherein AOI refers to an area of interest. Lipton also discloses generating a time stamp of the video frames, which provides the time period taken for any relationship to be established between a person and an object being considered by the person in the store, and also counts the number of times a person grabs the targeted item.
In view of the foregoing, the rejection of the independent claims 1, 7 and 13 and their dependent claims 2-6, 8-12, and 14-18 under 35 USC 103 is sustainable and maintained.
5. The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
NPL reference:
(i) J. Kröckel and F. Bodendorf, "Customer Tracking and Tracing Data as a Basis for Service Innovations at the Point of Sale," 2012 Annual SRII Global Conference, San Jose, CA, USA, 2012, pp. 691-696, retrieved from IP.com on 03/24/2026, describes [see Abstract] that, for stationary retailers such as grocery sellers, customized service offers require video recording of the customer’s movements and actions at the point of sale, because the information extracted from the recorded video of the customer’s movements provides a variety of relevant information about customer behavior. Based on the analysis, applications for retail managers, sales personnel, and automated customer services are introduced.
(ii) Suguna et al., "An Efficient Real time Product Recommendation using Facial Sentiment Analysis," 2019 IEEE International Conference on Electrical, Computer and Communication Technologies (ICECCT), Coimbatore, India, 2019, pp. 1-6, retrieved from IP.com on 10/23/2025 and cited in the Final Rejection mailed 10/27/2025, describes Amazon Rekognition, a web service that includes a simple, easy-to-use API that can quickly analyze any image file stored in Amazon S3 and that can be utilized to build an application for a retail store, so that the application captures the customers’ faces, performs facial analysis, and recommends appropriate products by displaying targeted advertisements.
Foreign reference:
(iii) WO 2022235637 A1, cited in the Non-Final Rejection mailed 06/25/2025 [see claims 21 and 38], describes a processor facilitating online shopping for customers in physical retail stores, including determining, based on automatic analysis of images captured by one or more cameras located in the physical retail store, that a customer located in the physical retail store is showing interest in a category of products and that at least one action is associated with a target product which is out of stock in the physical retail store. In response to these determinations, a notification is provided to the customer to order the target product via an online platform.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to YOGESH C GARG whose telephone number is (571)272-6756. The examiner can normally be reached on a Max-Flex schedule.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jeffrey A. Smith can be reached at 571-272-6763. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/YOGESH C GARG/Primary Examiner, Art Unit 3688