DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Notice to Applicant
The following is a Final Office Action. In response to Examiner's Non-Final Rejection of 6/24/25, Applicant, on 12/23/25, amended the claims. Claims 1-10 and 12-21 are pending in this application and have been rejected below.
Response to Amendment
Applicant’s amendments are acknowledged.
The previous 112(b) rejections are withdrawn in light of the amendments and explanations. Examiner notes that claims 1-2 are being interpreted as classifying the image as either “before,” “during,” or “after” (steps 40-44, FIG. 2), and any of those classifications assists in the “workflow validation” as best understood from the Specification (e.g., [0003] – to show the lawn needed care; snow was present). Applicant’s Remarks pointing to paragraphs 15-16 and 18 (as published/filed – same numbering) and FIG. 2 also provide support for the amendments to the independent claims.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-10 and 12-21 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e., an abstract idea) without reciting significantly more, and is therefore directed to non-statutory subject matter.
Step One - First, pursuant to Step 1 in MPEP 2106.03, claim 1 is directed to an apparatus, which is a statutory category.
Step 2A, Prong One - MPEP 2106.04 - Claim 1 recites:
“A … learning system for validating workflows, comprising:
….receiving at least one digital image relating to work being performed by a contractor; and
a workflow validation …:
processing the digital image using a filter … model to determine whether the digital image is suitable for validating whether a work action has been completed, the filter … configured to reject the digital image if the filter … model determines that the digital image is not suitable for validating whether the work action has been completed (perform image analysis… that the image is suitable for determining that the work action has been completed; based on Applicant’s Remarks, FIG. 2, and [0018] as published, this refers to the filter determining the image is “suitable” or “unsuitable” – akin to comparing to a threshold of “good image” or “bad image”); and
processing the digital image using an expert … model to classify whether the digital image depicts a scene occurring before, during, or after performance of the work action (perform image classification… that the scene in the image is either “before” performance, “during” performance, or “after” the work is completed).”
As drafted, this falls, under its broadest reasonable interpretation, within the abstract idea groupings of certain methods of organizing human activity – managing personal behavior or relationships or interactions between people (following rules or instructions – validating the work of contractors) – and mental processes (concepts performed in the human mind, including an evaluation). Here, work is performed by a contractor/person; a person could look at the image to judge whether the image is suitable for determining whether the work has been completed (the disclosure gives no details for how an image is “good/suitable” versus “bad/unsuitable”, so this is viewed as the same as the mental process of a person stating “the image is good” or judging “the image is bad”), and could then sort/classify the image as depicting a scene before, during, or after the performance of the contractor’s work. This is managing interactions between people because it helps manage workers to ensure that tasks/jobs are completed by looking at pictures that could be taken before, during, or after the work has been performed. There are also no details on how either machine learning model operates, so the claim encompasses a person looking at an image to evaluate it as “work completed” and to say “the picture is before/during/after” the work.
Step 2A, Prong Two - MPEP 2106.04 - This judicial exception is not integrated into a practical application. In particular, claim 1 recites additional elements:
“A machine learning system for validating workflows, comprising:
a computing device receiving at least one digital image relating to work being performed by a contractor; and
a workflow validation software module executing on the computing device, the workflow validation software module:
processing the digital image using a filter machine learning model to determine whether the digital image is suitable for validating whether a work action has been completed, the filter machine learning configured to reject the digital image if the filter machine model determines that the digital image is not suitable for validating whether the work action has been completed; and
processing the digital image using an expert machine learning model to classify whether the digital image depicts a scene occurring before, during, or after performance of the work action if the digital image is not rejected by the filter machine learning model.”
Individually or in combination, MPEP 2106.05(f) applies – the claim involves a computer and memory storing instructions that are executed, where a computing device processes the image using a filter machine learning model (specification [0018] mentions “whether the image is suitable”, but no other details are given; [0013] and [0018] mention a “filter machine learning model” but no other details are given) to see if work “is completed”, and a second, “expert” machine learning model (see [0018] as published – no details are given as to what this may require) to sort/classify images time-wise as before, during, or after performance of the work. Accordingly, the limitations, individually or in combination, are considered “apply it [the abstract idea] on a computer”; the claim merely uses a computer as a tool to perform an abstract idea (see July 2024 Subject Matter Eligibility Update, Example 47, claim 2; the machine learning models here are mere instructions to implement an abstract idea on a computer under MPEP 2106.05(f); see also MPEP 2106.05(h), “field of use”, for the combination of a computer with a first determination [work completed] and a second determination [before/during/after work]). Accordingly, the claim recites two separate evaluations: one for “is the image suitable for validating whether the work action has been completed?” and a second, unrelated one for determining whether the scene is “before, during, or after” performance of the work. There is no technical interaction between the learning models at this time in claim 1.
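To illustrate the breadth at issue, the full scope of these two determinations is captured by chaining any two generic classifiers. The following is a hypothetical sketch by the Examiner for illustration only (the model interface, threshold, and labels are assumptions; Applicant’s disclosure provides no implementation details):

    # Hypothetical sketch of the claimed two-determination scope; the model
    # interface, threshold, and labels are illustrative assumptions, not
    # Applicant's disclosure (which provides no such details).
    from typing import Optional

    def validate_workflow(image, filter_model, expert_model) -> Optional[str]:
        # First determination: any yes/no "suitability" score compared to a
        # threshold ("good image" vs. "bad image").
        if filter_model.predict(image) < 0.5:
            return None                        # image rejected by the filter model
        # Second determination: any three-way classification of the scene.
        return expert_model.predict(image)     # "before" | "during" | "after"

Any generic computer executing two such generic determinations falls within this scope, consistent with the MPEP 2106.05(f) treatment above.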
Step 2B in MPEP 2106.05 - The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of a computing system is treated under MPEP 2106.05(f) (Mere Instructions to Apply an Exception – “Thus, for example, claims that amount to nothing more than an instruction to apply the abstract idea using a generic computer do not render an abstract idea eligible.” Alice Corp., 134 S. Ct. at 2358) and under MPEP 2106.05(h) (field of use). Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept.
Independent claim 12 is directed to a method at Step 1, which is a statutory category. Claim 12 recites similar limitations as claim 1 and is rejected for the same reasons at Step 2A, Prong One; Step 2A, Prong Two; and Step 2B.
Claims 2 and 13 also further narrow the abstract idea, as Applicant’s [0022] as published states “It is noted that post image-level validation of a work order can be automatically evaluated to be complete based on an aggregation of the per image results. This aggregation can be performed via a statistical or learned process and could incorporate the use of neural network based models.” Claim 2 recites additional elements treated under Step 2A, Prong Two and Step 2B [the expert machine learning model], which are treated similarly as in claim 1.
Claims 3 and 14 narrow the abstract idea by validating location. The use of the computer is “apply it [the abstract idea] on a computer” under MPEP 2106.05(f).
Claims 4 and 15 narrow the abstract idea by validating the location using semantic information. This is only found in the disclosure at [0016] as published. As best understood, this refers to some words in the image being used to compare/confirm a location. This is similar to July 2024 Subject Matter Eligibility Update, Example 48, claim 1, where analyzing speech was treated as mathematical analysis or mental evaluation. The use of the computer is “apply it [the abstract idea] on a computer” under MPEP 2106.05(f).
Claims 5 and 16; 6 and 17; and 7 and 18 narrow the abstract idea by validating the location using metadata information (claims 5, 16); determining that the “location matches an address of a property” (claims 6, 17); and determining “whether time or date matches [the] proposed work order” (claims 7, 18). This is only found in the disclosure at [0016] as published, where it states “the system can utilize metadata contained within the input images in order to identify the location where an image was taken (e.g., metadata usually associated with but not limited to JPG file types contains information such as GPS coordinates and time/date of capture). The system can utilize this information to ensure that the location matches the address of a given property and the time/data of the proposed work order.” As best understood, claims 5, 16, 6, and 17 refer to comparing metadata (GPS coordinates) to an address; claims 7 and 18 refer to comparing the time and date of the picture to the time and date of the work order. This is similar to July 2024 Subject Matter Eligibility Update, Example 48, claim 1, where analyzing speech was treated as mathematical analysis or mental evaluation. The use of the computer is “apply it [the abstract idea] on a computer” under MPEP 2106.05(f).
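As best understood, these comparisons amount to simple metadata checks, as in the following hypothetical sketch (the metadata field names, coordinate tolerance, and timestamp format are assumptions by the Examiner; the disclosure does not specify any):

    # Hypothetical sketch of the [0016] metadata comparisons; field names,
    # tolerance, and formats are assumptions not found in the disclosure.
    from datetime import datetime
    from math import hypot

    def location_matches(meta, property_lat, property_lon, tol_deg=0.001):
        # Compare the image's GPS coordinates to the property's address coordinates.
        return hypot(meta["gps_lat"] - property_lat,
                     meta["gps_lon"] - property_lon) <= tol_deg

    def time_matches(meta, order_start, order_end):
        # Compare the image's capture time/date to the proposed work order window.
        captured = datetime.fromisoformat(meta["capture_time"])
        return order_start <= captured <= order_end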
Claims 8 and 19 narrow the abstract idea by “performing image forensics”, for similar reasons as claims 6 and 17, because the example in Applicant’s [0016] as published states “image forensics (e.g., the location of where an image was taken (e.g., using GPS coordinates or other location information) could be processed to verify that an image was actually taken at a location where work is described as having been performed.” As best understood, claims 8 and 19 refer to comparing metadata (GPS coordinates or other location information) to the location where the work is described as having been performed. This is similar to July 2024 Subject Matter Eligibility Update, Example 48, claim 1, where analyzing speech was treated as mathematical analysis or mental evaluation. The use of the computer is “apply it [the abstract idea] on a computer” under MPEP 2106.05(f).
Claims 9 and 20 narrow the abstract idea by stating that one of the machine learning models performs the known Laplace Redux method. This is only found in the disclosure at [0017] as published, where it states “The Laplace Redux (LA) allows of an additional measure of uncertainty for a given input if required.” As best understood, this refers to applying a known mathematical approximation to obtain an additional measure of uncertainty for a given input, which is mathematical in nature. To the extent the machine learning is executed “by a computer,” it is also “apply it [the abstract idea] on a computer” (MPEP 2106.05(f)).
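For context, the Laplace approximation named in [0017] is a known mathematical method: given trained (MAP) weights, the weight posterior is approximated by a Gaussian whose covariance is an inverse Hessian (notation per the Daxberger reference applied in the 103 rejection below):

    p(\theta \mid \mathcal{D}) \approx \mathcal{N}\big(\theta;\, \theta_{\mathrm{MAP}},\, \Sigma\big),
    \qquad
    \Sigma = \Big( \nabla^2_{\theta}\, \mathcal{L}(\mathcal{D};\theta) \,\big|_{\theta_{\mathrm{MAP}}} \Big)^{-1}

where \mathcal{L} is the negative log joint; the resulting predictive distribution p(y \mid x, \mathcal{D}) \approx \int p(y \mid x, \theta)\, \mathcal{N}(\theta; \theta_{\mathrm{MAP}}, \Sigma)\, d\theta supplies the recited “additional measure of uncertainty.” This underscores that the limitation is mathematical in nature.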
Claims 10 and 21 narrow the abstract idea by processing the image based upon a “pre-defined workflow type using one or more hierarchical machine learning processes, the filter model and the expert model forming the one or more hierarchical machine learning processes.” This is interpreted as using the same “filter model” and “expert model” from claim 1, but “tailored” for a specific type/kind of work (e.g., lawn cutting, snow removal – see [0003] – or any work at any site/property by a contractor), and narrows the abstract idea by using pre-defined sets of rules for specific business situations to check for work completion. The support is only found in Applicant’s [0016] and [0019] as published. As best understood, this refers to having specific rules/models for a specific kind of work. This is similar to July 2024 Subject Matter Eligibility Update, Example 48, claim 1, where analyzing speech was treated as mathematical analysis or mental evaluation. The use of the computer is “apply it [the abstract idea] on a computer” under MPEP 2106.05(f).
Therefore, the claim(s) are rejected under 35 U.S.C. 101 as being directed to non-statutory subject matter.
For more information on 101 rejections, see MPEP 2106.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-3, 7, 10, 12-14, 18, and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang (US 2021/0027485), and Sudry (US 2022/0207458).
Concerning claim 1, Zhang discloses:
A machine learning system for validating workflows (Zhang – see par 69 - the system 100 can provide real-time monitoring of locations and automatic detection of issues using machine learning; The system 100 can automatically generate and assign tasks in order to address the issues detected, then monitor the progress and completion of the tasks using the machine vision platform, providing reminders and status updates along the way; see par 75 - The system 100 includes cameras 110a, 110b, a local computer system 120, a remote computer system 130, and various client devices 140a-140c; see par 161 - completion of the tasks can be detected and verified (e.g., by detecting, from processing of subsequently captured images, that the condition prompting creation of the task is no longer present). The conditions detected using image data and machine learning model processing can be used to corroborate whether tasks that a user indicates have been completed have actually been completed.), comprising;
a computing device receiving at least one digital image relating to work being performed by a contractor (Zhang – See par 69 – The system 100 can automatically generate and assign tasks in order to address the issues detected, then monitor the progress and completion of the tasks using the machine vision platform, providing reminders and status updates along the way; see par 77, FIG. 1A - The cameras 110a and 110b respectively provide image data 114a, 114b representing the images 111, 112 to the computer system 120.).
Examiner notes that Zhang discloses people performing tasks. The title of the person being a “contractor” is not entitled to patentable weight, as the title of the person is just descriptive of the worker performing work in the image. This is “nonfunctional descriptive material”, not entitled to patentable weight, as the title of “contractor” has no functional relationship with the computer. See MPEP 2111.05. Nonetheless, for purposes of compact prosecution, Sudry discloses “contractor” performing work in an image (Sudry – see par 39 – work orders for contractors… for tasks; par 91-92 - The plurality of images from which the images for evaluation are selected may comprise frames of a video footage captured in the building, optionally by a Site-Tracker in accordance with an embodiment of the disclosure, and optionally stored in image database 141; par 101 – record of performance for contractor).
Zhang and Sudry disclose:
a workflow validation software module executing on the computing device, the workflow validation software module (Zhang - See par 161 - For example, tasks can be created to address detected conditions, tasks assigned to users, users can be notified and reminded of the tasks, and completion of the tasks can be detected and verified (e.g., by detecting, from processing of subsequently captured images, that the condition prompting creation of the task is no longer present); see par 165 - Embodiments of the invention and all of the functional operations described in this specification can be implemented in …computer software, firmware, or hardware, ... Embodiments of the invention can be implemented as one or more computer program products, e.g., one or more modules of computer program instructions encoded on a computer readable medium for execution by, or to control the operation of, data processing apparatus.):
processing the digital image using a filter machine learning model to determine whether the digital image is suitable for validating whether a work action has been completed, the filter machine learning configured to reject the digital image if the filter machine model determines that the digital image is not suitable for validating whether the work action has been completed (Zhang – see par 76, FIG. 1A - In the example of FIG. 1A, machine vision and machine learning are used to detect the status of areas of a restaurant in real-time and provide feedback to various workers at the restaurant; see par 77, FIG. 1A – cameras 110a, 110b capture image data of a location (e.g., an area of a restaurant; food available at the restaurant); see par 79 - In stage (B), the computing system 120 processes the sensor data and generates input for one or more machine learning models. For example, the computing system 120 receives the image data 114a, 144b and the audio data 116 and can use a data pre-processor 121 to extract feature values to be provided as input. The data preprocessor 121 may perform a variety of other tasks to manipulate the sensor data and prepare input to the neural networks, according to a set of predetermined settings 122. These settings 122 can be customized for the particular restaurant and even for individual sensors. To facilitate data processing, each set of sensor data is associated with an accompanying set of metadata that indicates, for example, a timestamp indicating a time of capture, a sensor identifier (e.g., indicating which camera), a location identifier (e.g., indicating the particular restaurant, and/or the portion of the restaurant where the sensor is located); see par 82 - To improve efficiency, the region proposal network (RPN) can be arranged to share full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals. An RPN can be a fully convolutional network that simultaneously predicts object bounds and objectness (e.g., a score indicative of how likely the region represents any of multiple types of objects as opposed to background or areas not of interest) at each position.); and
processing the digital image using an expert machine learning model to classify (Zhang – see par 76 - The system 110 has been set up with the cameras 110a, 110b and a microphone 115 installed, and with the computer system 120 having trained machine learning models loaded and ready to classify and predict whether a set of predetermined conditions are present. As discussed further below, the machine learning models can be trained so that, given image data, the models can detect the locations of objects in the image data, classify the status of the objects in the image data, and provide confidence scores indicating the confidence in the detection and status classification. The system 100 can then use the output of the machine learning models to automatically create, assign, track, and otherwise manage tasks to cause any unfavorable conditions to be corrected) whether the digital image depicts a scene occurring before, during, or after performance of the work action, if the digital image is not rejected by the filter machine learning model (Zhang – See par 83 - On top of these convolutional features, an RPN can be constructed by adding a few additional convolutional layers that simultaneously regress region bounds and objectness scores at each location on a regular grid. The RPN is thus a kind of fully convolutional network (FCN) and can be trained end-to-end specifically for the task for generating detection proposals. see par 98 - During stage (D), the computer system 120 processes the outputs of the models 123 using a post-processing module 124, which can filter or otherwise adjust and interpret the results from the models 123. For example, the module 124 may access a set of rules 125 that indicate rules and thresholds for the post-processing actions. These rules and thresholds may be different for different locations (e.g., different restaurant buildings) and for different models 123, and can even be tailored for specific cameras 110a, 110b. The post-processing module 124 actions can remove detected objects that have confidence scores less than a threshold indicated by the rule set 125; See par 106 - In stage (F), a task tracking module 128 tracks progress of assigned tasks over time. The computer system 120 stores the information specifying the pending and completed tasks in a task data store 129. As new images are captured by the cameras 110a, 110b and processed using the machine learning models 123, the task tracking module 128 can evaluate the object detection and status classification outputs to determine whether detected conditions that prompted the creation of the tasks are still present. Even when tasks have been marked as complete, the module 128 can evaluate whether the detected objects and conditions corroborate that the task is complete or not; … how tasks to correct the condition were carried out (e.g., how quickly tasks were completed after being assigned), and differences in occurrence of the condition and differences in completion of corresponding tasks for different time periods).
Zhang discloses using classification for progress of tasks and whether completion occurs. To the extent this does not disclose the “classify” into “before, during, or after performance,” Sudry discloses the limitations (Sudry – see par 39 – CAE = construction action element; represents a construction task to be performed; See par 48, 126 - classifier database 142 for storing or managing access to classifiers respectively designated to process images captured from construction sites and evaluate progress of construction of the site; See par 95- classifier comprised in classifier database 142 may be designated to evaluate images comprising of views of a given object to classify a state of a given CAE (construction action element – represents a task) associated with the given object. Optionally, the classifier is a classifier designated evaluate and classify the at least one image as indicating the CAE to be in one of a plurality of possible states, or to provide respective likelihoods of the CAE to be in two or more of a plurality of possible states. The classifiers stored in classifier database 142 may be generated through a machine learning process that trains classifiers using reference images of an object designated as indicating a particular state of a CAE; see par 96 - By way of example, if a state of flow model 200 as shown in FIG. 2A is such that the CAE of electrical second phase 204 has a state of ready state 212, and an object associated with CAE 204 is an electrical outlet, then the Binder Module may select a first classifier that is designated to evaluate the one or more images of electrical outlet 302 and determine whether or not CAE 204 is in ready state 212… “intermediate state 213”… 2nd intermediate state…. Completed state 215).
Both Zhang and Sudry are analogous art as they are directed to analyzing images of people performing tasks (See Zhang Abstract, par 76; Sudry Abstract, par 21 – process path for loan application). Zhang discloses having employees (par 73) or other workers (par 76, 104) performing tasks, object detection for whether a region represents objects of interest or not (see par 82), and classification of outputs to determine whether detected conditions for tasks are still present, along with analyzing images to track progress of assigned tasks over time (par 106). Sudry improves upon Zhang by disclosing that the person performing the task can be a contractor (Sudry par 39, 101) and that a machine learning process trains classifiers for different states (ready, “not ready,” intermediate, completed) of a task associated with an object (Sudry par 48, 95, 96, FIG. 2A). One of ordinary skill in the art would be motivated to have contractors performing tasks and to classify images into different states of the scene, including “not ready” for performance, to efficiently improve upon the classification of images and the tracking of progress of assigned tasks over time in Zhang.
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the machine learning models to analyze images for task completion in Zhang, to further include contractors performing tasks and classifying images into different states relative to task performance as disclosed in Sudry, since the claimed invention is merely a combination of old elements, and in combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable and there is a reasonable expectation of success.
Concerning independent claim 12, Zhang and Sudry disclose:
A machine learning method for validating workflows (Zhang – see par 69 - the system 100 can provide real-time monitoring of locations and automatic detection of issues using machine learning; The system 100 can automatically generate and assign tasks in order to address the issues detected, then monitor the progress and completion of the tasks using the machine vision platform, providing reminders and status updates along the way; see par 161 - completion of the tasks can be detected and verified (e.g., by detecting, from processing of subsequently captured images, that the condition prompting creation of the task is no longer present). The conditions detected using image data and machine learning model processing can be used to corroborate whether tasks that a user indicates have been completed have actually been completed), the method comprising:
receiving at a computing device at least one digital image relating to work being performed by a contractor (Zhang – See par 69 – The system 100 can automatically generate and assign tasks in order to address the issues detected, then monitor the progress and completion of the tasks using the machine vision platform, providing reminders and status updates along the way; see par 77, FIG. 1A - The cameras 110a and 110b respectively provide image data 114a, 114b representing the images 111, 112 to the computer system 120).
Sudry discloses “contractor” performing work in an image [same as claim 1] – (Sudry – see par 39 – work orders for contractors… for tasks; par 91-92 - The plurality of images from which the images for evaluation are selected may comprise frames of a video footage captured in the building, optionally by a Site-Tracker in accordance with an embodiment of the disclosure, and optionally stored in image database 141; par 101 – record of performance for contractor).
processing the digital image using a filter machine learning model executed by the computing device (Zhang - see par 165 - Embodiments of the invention and all of the functional operations described in this specification can be implemented in …computer software, firmware, or hardware, ... Embodiments of the invention can be implemented as one or more computer program products, e.g., one or more modules of computer program instructions encoded on a computer readable medium for execution by, or to control the operation of, data processing apparatus) to determine whether the digital image is suitable for validating whether a work action has been completed, the filter machine learning configured to reject the digital image if the filter machine model determines that the digital image is not suitable for validating whether the work action has been completed ([same as claim 1] - Zhang – see par 76, FIG. 1A - In the example of FIG. 1A, machine vision and machine learning are used to detect the status of areas of a restaurant in real-time and provide feedback to various workers at the restaurant; see par 77, FIG. 1A – cameras 110a, 110b capture image data of a location (e.g. area of a restaurant; food available at restaurant; see par 79 - In stage (B), the computing system 120 processes the sensor data and generates input for one or more machine learning models. For example, the computing system 120 receives the image data 114a, 144b and the audio data 116 and can use a data pre-processor 121 to extract feature values to be provided as input. The data preprocessor 121 may perform a variety of other tasks to manipulate the sensor data and prepare input to the neural networks, according to a set of predetermined settings 122; par 82 [same as cl. 1 above] …); and
processing the digital image using an expert machine learning model to classify (Zhang – see par 76 - The system 110 has been set up with the cameras 110a, 110b and a microphone 115 installed, and with the computer system 120 having trained machine learning models loaded and ready to classify and predict whether a set of predetermined conditions are present. As discussed further below, the machine learning models can be trained so that, given image data, the models can detect the locations of objects in the image data, classify the status of the objects in the image data, and provide confidence scores indicating the confidence in the detection and status classification. The system 100 can then use the output of the machine learning models to automatically create, assign, track, and otherwise manage tasks to cause any unfavorable conditions to be corrected) whether the digital image depicts a scene occurring before, during, or after performance of the work action, if the digital image is not rejected by the filter machine learning model ([same as claim 1] – Zhang see par 83; par 98 - During stage (D), the computer system 120 processes the outputs of the models 123 using a post-processing module 124, which can filter or otherwise adjust and interpret the results from the models 123. … The post-processing module 124 actions can remove detected objects that have confidence scores less than a threshold indicated by the rule set 125; See par 106 - In stage (F), a task tracking module 128 tracks progress of assigned tasks over time. The computer system 120 stores the information specifying the pending and completed tasks in a task data store 129. As new images are captured by the cameras 110a, 110b and processed using the machine learning models 123, the task tracking module 128 can evaluate the object detection and status classification outputs to determine whether detected conditions that prompted the creation of the tasks are still present. Even when tasks have been marked as complete, the module 128 can evaluate whether the detected objects and conditions corroborate that the task is complete or not; … how tasks to correct the condition were carried out (e.g., how quickly tasks were completed after being assigned), and differences in occurrence of the condition and differences in completion of corresponding tasks for different time periods).
Sudry discloses the “classify” into “before, during, or after performance,” the same as for claim 1 (Sudry [same as claim 1] – see par 39; See par 48, 126 - classifier database 142 for storing or managing access to classifiers respectively designated to process images captured from construction sites and evaluate progress of construction of the site; See par 95- classifier comprised in classifier database 142 may be designated to evaluate images comprising of views of a given object to classify a state of a given CAE (construction action element – represents a task) associated with the given object… The classifiers stored in classifier database 142 may be generated through a machine learning process that trains classifiers; see par 96 - By way of example, if a state of flow model 200 as shown in FIG. 2A … select a first classifier that is designated to evaluate the one or more images of electrical outlet 302 and determine whether or not CAE 204 is in ready state 212… “intermediate state 213”… 2nd intermediate state…. Completed state 215).
It would be obvious to combine Zhang and Sudry for the same reasons as claim 1.
Concerning claims 2 and 13, Zhang and Sudry disclose:
The machine learning system of claim 1, wherein the workflow validation software module processes classification of the digital image by the expert machine learning model to validate completion of the work action (Zhang – see par 106 - In stage (F), a task tracking module 128 tracks progress of assigned tasks over time. The computer system 120 stores the information specifying the pending and completed tasks in a task data store 129. As new images are captured by the cameras 110a, 110b and processed using the machine learning models 123, the task tracking module 128 can evaluate the object detection and status classification outputs to determine whether detected conditions that prompted the creation of the tasks are still present. Even when tasks have been marked as complete, the module 128 can evaluate whether the detected objects and conditions corroborate that the task is complete or not.
See also Sudry – see par 39 – CAE = construction action element; represents a construction task to be performed; See par 48, 126 - classifier database 142 for storing or managing access to classifiers respectively designated to process images captured from construction sites and evaluate progress of construction of the site; See par 95- classifier comprised in classifier database 142 may be designated to evaluate images comprising of views of a given object to classify a state of a given CAE (construction action element – represents a task) associated with the given object. see par 96 - By way of example, if a state of flow model 200 as shown in FIG. 2A is such that the CAE of electrical second phase 204 has a state of ready state 212, and an object associated with CAE 204 is an electrical outlet, then the Binder Module may select a first classifier that is designated to evaluate the one or more images of electrical outlet 302 and determine whether or not CAE 204 is in ready state 212… “intermediate state 213”… 2nd intermediate state…. Completed state 215.)
It would be obvious to combine Zhang and Sudry for the same reasons as claim 1.
Concerning claims 3 and 14, Zhang and Sudry disclose:
The machine learning system of claim 1, wherein the workflow validation software module validates a location of the digital image (Zhang – see par 120 - In the example, after the condition of “litter present” is detected, the computer system 120 can continue to monitor the associated location and determine if the condition has changed; see par 122 - FIG. 2 is a diagram showing an example of an image 200 having an overlay of annotations indicating results of machine learning analysis. The image 200 shows a dining area, with results of a neural network model that has been trained to detect tables, indicate the locations of the tables in image data, and classify the tables as clean, dirty, or occupied;
Sudry – see par 35 - Site-Tracker 120 may comprise one or more of: a data storage device configured to store images captured by the image capture device, a wireless communication module configured to transmit information including images captured by the image capturing device to an external device, by way of example, hub 130, and a position tracking device for tracking movement and position of itself. The position tracking device may comprise one or more of: a Global Positioning System (GPS) tracking device; see par 85 - If a location (or an expected location) of a given object within the building is known, by way of example, as coordinates within a 3D representation of the building, and a 6 DOF camera position within the building at which a given frame of the video was captured is known, then that information can be used to select frames from the video that comprise views of the object that was captured by the camera at one or more desired angles of view and within a desired distance.).
It would be obvious to combine Zhang and Sudry for the same reasons as claim 1.
Concerning claims 7 and 18, Zhang and Sudry disclose:
The machine learning system of claim 1, wherein the workflow validation software module determines whether a time or date of the digital image matches a time or date of a proposed work order ([0016] as published - In addition to performing verification via the semantic information contained within the image data, the system can utilize metadata contained within the input images in order to identify the location where an image was taken (e.g., metadata usually associated with but not limited to JPG file types contains information such as GPS coordinates and time/date of capture). The system can utilize this information to ensure that the location matches the address of a given property and the time/data of the proposed work order.
Zhang discloses the limitations based on broadest reasonable interpretation in light of the specification – see par 79 - The data preprocessor 121 may perform a variety of other tasks to manipulate the sensor data and prepare input to the neural networks, according to a set of predetermined settings 122. These settings 122 can be customized for the particular restaurant and even for individual sensors (e.g., to use different settings for different cameras 110a, 110b). To facilitate data processing, each set of sensor data is associated with an accompanying set of metadata that indicates, for example, a timestamp indicating a time of capture;
See also Sudry see par 92 - The plurality of images may be timestamped, and the at least one image may be selected based on time of capture, as compared to a timestamp for a state change of the corresponding CAE. Optionally, the selection of a given image comprises transposing the presumed object location to select a region of interest (ROI) within the image that includes the view of the object for further processing. The images may be selected responsive to their timestamp as well, so that the selected images are captured during the time window when the CAE is scheduled to be in the expected state).
It would be obvious to combine Zhang and Sudry for the same reasons as claim 1.
Concerning claims 10 and 21, Zhang and Sudry disclose:
The machine learning system of claim 1, wherein the workflow validation software module processes the input image based upon a pre-defined workflow type using one or more hierarchical machine learning processes, the filter model and the expert model forming the one or more hierarchical machine learning processes (Zhang – see par 80 – stage (C), one or more machine learning models 123; In some implementations, different models are used for different types of applications. For example, one model may be trained to process data representing images of a dining area and another model may be trained to process data representing images of display cases. Different applications and different areas may involve detecting different types of objects with different status classifications. For example, one model configured to detect people, chairs, tables, food, and litter can be used to process image data 114a for the dining area, and a different model configured to detect different types of food, amounts of food present, and display areas can be used to process the image data 114b for the display case; See par 77, 79, FIG. 1A – stage (B) – data pre-processor 121 – to prepare input; See par 98, stage (D) – output of models 123 using a post-process module 124, which can filter or otherwise adjust and interpret the results from the models 123. For example, the module 124 may access a set of rules 125 that indicate rules and thresholds for the post-processing actions; The post-processing module 124 actions can remove detected objects that have confidence scores less than a threshold indicated by the rule set 125 (disclosing a hierarchy from stage (B) (121) to stage (D) (124), where stage (D) can also remove some objects)).
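To illustrate the breadth of this limitation as mapped, selecting a filter/expert model pair per pre-defined workflow type is no more than a lookup. The following is a hypothetical sketch by the Examiner (the workflow types echo Applicant’s [0003] and Zhang’s per-application models; all names and stub models are illustrative assumptions):

    # Hypothetical sketch: selecting a filter/expert model pair per pre-defined
    # workflow type; the types and stub models are illustrative assumptions.
    def pipeline_for(workflow_type):
        registry = {
            # In practice each entry would hold models trained for that work type
            # (cf. Zhang par 80: different models for dining areas vs. display cases).
            "lawn_care":    (lambda img: 0.9, lambda img: "after"),
            "snow_removal": (lambda img: 0.2, lambda img: "before"),
        }
        return registry[workflow_type]

    def process(image, workflow_type):
        filter_model, expert_model = pipeline_for(workflow_type)
        if filter_model(image) < 0.5:   # first (filter) stage of the hierarchy
            return None                 # rejected as unsuitable
        return expert_model(image)      # second (expert) stage of the hierarchy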
Claims 4 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang (US 2021/0027485), and Sudry (US 2022/0207458), as applied to claims 1-3, 7, 10, 12-14, 18, and 21 above, and further in view of Gaffey (US 2006/0242419).
Concerning claims 4 and 15, Zhang discloses monitoring a location (See par 120, 122). Sudry discloses:
The machine learning system of claim 3, wherein the workflow validation software module validates the location using semantic information …(Sudry – see par 52 – assignment parameters for construction objects; see par 53 – assignment parameter may be a semantic parameter, such as an object’s name; see par 74 - Optimization of an ICP (initial camera position) towards an OCP (optimized camera position) may be performed responsive to one or more measures of position discrepancy based on a comparison between a simplified site image based on a captured image and a corresponding expected site image based on an ICP (or subsequent PCP) and a 3D representation of the building. Optionally, the measure of discrepancy is a combined measure of discrepancy comprising measures of discrepancy determined based on an evaluation of any two or more of semantically segmented images, boundary maps, depth maps, and corner maps).
It is not clear if this is considered semantic information “contained within the digital image.”
Gaffey discloses the limitations (Gaffey – see par 32 - The verification process may consist of actions such as enhancing the image using conventional or proprietary image enhancement software, checking to make sure that the image makes sense to the associated CSI code and location indications, and that the comments inserted by the field make sense; For example, if the data capturer has been asked to verify that a construction activity has been completed, and the documentation data shows that it is not, then the documentation schedule is adjusted to continue to request verification until the activity is complete.).
Zhang, Sudry, and Gaffey are analogous art as they are directed to analyzing images of people performing tasks (See Zhang Abstract, par 76; Sudry Abstract, par 21 – process path for loan application; Gaffey Abstract, par 32). Zhang discloses monitoring a location (See par 120, 122). Sudry discloses that images are semantically segmented and compared to boundary maps for a work site (See par 74), that frames of video can be selected at the location of an object (see par 85), and that the Site-Tracker storing captured images can have a position tracking device with GPS (See par 35). Gaffey improves upon Zhang and Sudry by disclosing that images have location indications and comments that make sense in a verification process for verifying that activities are completed (See par 32). One of ordinary skill in the art would be motivated to have comments and location indications for an image when verifying that activities are completed, to efficiently improve upon the classification of images and the tracking of progress of assigned tasks over time in Zhang, and upon the comparison between semantically segmented images and boundary maps at a work site (See par 74) and the checking of locations of images (See par 35, 85) in Sudry.
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the machine learning models to analyze images for task completion in Zhang, to further include contractors performing tasks and classifying images into different states relative to task performance as disclosed in Sudry, to further include images with location indications and comments for verifying activities are completed as disclosed in Gaffey, since the claimed invention is merely a combination of old elements, and in combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable and there is a reasonable expectation of success.
Claims 5-6, 8, 16-17, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang (US 2021/0027485), and Sudry (US 2022/0207458), as applied to claims 1-3, 7, 10, 12-14, 18, and 21 above, and further in view of Malnati (US 2016/0314546).
Concerning claims 5 and 16, Zhang discloses monitoring a location (See par 120, 122). Sudry discloses that images are semantically segmented and compared to boundary maps for a work site (See par 74), that frames of video can be selected at the location of an object (see par 85), and that the Site-Tracker storing captured images can have a position tracking device with GPS (See par 35).
Malnati discloses:
The machine learning system of claim 3, wherein the workflow validation software module validates the location of the digital image using metadata contained within the digital image (Malnati – see par 42, 86 - The processed task execution data from the metadata processor 209 is transferred to a result validation and scoring device 211. The result validation and scoring device 211 collaborates with a validation database 213 to validate the execution data. For instance, if the validation data includes a plurality of street level images of a building under inspection, the result validation and scoring device 211 matches each image of the plurality of images with images stored in the validation database 213, in order to authenticate the building under inspection; In such an instance, the result validation and scoring device 211 may communicate with the validation database via the API 20. Additionally, the result validation and scoring device 211 may assign a score to a performed task).
Zhang, Sudry, and Malnati are analogous art as they are directed to analyzing images of people performing tasks (See Zhang Abstract, par 76; Sudry Abstract, par 21 – process path for loan application; Malnati Abstract, par 41). Zhang discloses monitoring a location (See par 120, 122). Sudry discloses that images are semantically segmented and compared to boundary maps for a work site (See par 74), that frames of video can be selected at the location of an object (see par 85), and that the Site-Tracker storing captured images can have a position tracking device with GPS (See par 35). Malnati improves upon Zhang and Sudry by disclosing image metadata and street level images for validating that the person is at the building for the assigned task (See par 32). One of ordinary skill in the art would be motivated to have metadata and street level images to efficiently improve upon the classification of images and the tracking of progress of assigned tasks over time in Zhang, and upon the comparison between semantically segmented images and boundary maps at a work site (See par 74) and the checking of locations of images (See par 35, 85) in Sudry.
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the machine learning models to analyze images for task completion in Zhang, to further include contractors performing tasks and classifying images into different states relative to task performance as disclosed in Sudry, to further include images with metadata and street level images used for validating a building for a task as disclosed in Malnati, since the claimed invention is merely a combination of old elements, and in combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable and there is a reasonable expectation of success.
Concerning claims 6 and 17, Zhang discloses monitoring a location (See par 120, 122). Sudry discloses that images are semantically segmented and compared to boundary maps for a work site (See par 74), that frames of video can be selected at the location of an object (see par 85), and that the Site-Tracker storing captured images can have a position tracking device with GPS (See par 35).
Malnati discloses:
The machine learning system of claim 3, wherein the workflow validation software module determines whether the location matches an address of a property ([0016] as published - In addition to performing verification via the semantic information contained within the image data, the system can utilize metadata contained within the input images in order to identify the location where an image was taken (e.g., metadata usually associated with but not limited to JPG file types contains information such as GPS coordinates and time/date of capture). The system can utilize this information to ensure that the location matches the address of a given property and the time/data of the proposed work order.
Malnati discloses the limitations based on broadest reasonable interpretation in light of the specification – see par 87 - The street level images of the building under inspection can be obtained via Google images based on an initial address of the property, a GPS location of the mobile device 108, and other similar techniques. Upon a successful match of the captured street level images, the communicating device 101 may assign a high score to the inspection task).
It would be obvious to combine Zhang and Sudry and Malnati for the same reasons as claim 5. In addition, Zhang discloses a computer can have a GPS receiver (See par 168). Sudry discloses having a position tracking device of a GPS with a Site Tracker that captures images (See par 35). It would be obvious to have GPS locations in combination with images for scoring the task in Malnati to improve upon the GPS disclosures in Zhang and Sudry.
Concerning claims 8 and 19, Malnati discloses:
The machine learning system of claim 1, wherein the workflow validation software module performs image forensics on the digital image (Applicant’s [0016] as published states “the systems and methods of the present disclosure can perform location validation and image forensics (e.g., the location of where an image was taken (e.g., using GPS coordinates or other location information) could be processed to verify that an image was actually taken at a location where work is described as having been performed.”
Malnati discloses the limitations based on broadest reasonable interpretation in light of the specification – see par 87 - The street level images of the building under inspection can be obtained via Google images based on an initial address of the property, a GPS location of the mobile device 108, and other similar techniques. Upon a successful match of the captured street level images, the communicating device 101 may assign a high score to the inspection task.).
It would be obvious to combine Zhang, Sudry, and Malnati for the same reasons as claims 5 and 6.
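As best understood from Applicant’s [0016] and Malnati par 87, the check amounts to matching a submitted photo against reference imagery for the claimed address, as in the following hypothetical sketch (the third-party imagehash package and the distance cutoff are illustrative choices by the Examiner, not disclosures of the cited references):

    # Hypothetical sketch of the "image forensics" check as best understood:
    # match a submitted photo against reference images of the claimed address.
    # The imagehash package (pip install ImageHash) and cutoff are illustrative.
    from PIL import Image
    import imagehash

    def taken_at_claimed_location(submitted_path, reference_paths, max_distance=10):
        submitted = imagehash.phash(Image.open(submitted_path))
        # A small Hamming distance to any reference image supports the position
        # that the photo was actually taken at the work location.
        return any(submitted - imagehash.phash(Image.open(p)) <= max_distance
                   for p in reference_paths)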
Claims 9 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang (US 2021/0027485), and Sudry (US 2022/0207458), as applied to claims 1-3, 7, 10, 12-14, 18, and 21 above, and further in view of Daxberger et al., “Laplace Redux – Effortless Bayesian Deep Learning,” Advances in Neural Information Processing Systems, Vol. 34, 2021, pp. 20089-20103.
Concerning claims 9 and 20, Zhang discloses using machine learning models having confidence scores in detection and status classification of images (See par 76, 98) where neural network models for processing image data can include a deep fully convolutional network (See par 81-86). Sudry discloses assigning a CAE (par 39 – Construction action element represents a task to be performed) to an object with confidence using a neural network (See par 54) where classifiers are generated with a machine learning process (See par 95).
Daxberger discloses:
The machine learning system of claim 1, wherein at least one of the filter machine learning model or the expert machine learning model performs a Laplace Redux method ([0017] as filed states “For some workflows, it is of high importance for the model hierarchy to give an appropriate measure of prediction certainty. As such, for the filter model, expert model, or both models, the Laplace Redux method may be utilized. The Laplace Redux (LA) allows of an additional measure of uncertainty for a given input if required.”
Daxberger 2021, “Laplace Redux – Effortless Bayesian Deep Learning,” Abstract - The Laplace approximation (LA) is a classic, and arguably the simplest family of approximations for the intractable posteriors of deep neural networks).
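For context, the method is applied post hoc to an already-trained network. The following minimal sketch uses the companion laplace package released with the Daxberger reference (pip install laplace-torch); the model and data are toy stand-ins, and exact API details should be verified against that package:

    # Sketch of post-hoc uncertainty via the Laplace approximation (Daxberger
    # 2021), using the paper's companion package; toy model and data.
    import torch
    from torch import nn
    from torch.utils.data import DataLoader, TensorDataset
    from laplace import Laplace

    model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 3))
    X, y = torch.randn(64, 8), torch.randint(0, 3, (64,))
    loader = DataLoader(TensorDataset(X, y), batch_size=16)

    la = Laplace(model, "classification",
                 subset_of_weights="last_layer",   # last-layer LA, as in the paper
                 hessian_structure="kron")
    la.fit(loader)                        # fit the Gaussian around the trained weights
    la.optimize_prior_precision()         # tune the prior via marginal likelihood
    probs = la(torch.randn(1, 8))         # predictive probabilities with uncertainty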
Zhang, Sudry, and Daxberger are analogous art as they are directed to analyzing images using machine learning or neural networks (See Zhang Abstract, par 81-86; Sudry Abstract, par 54; Daxberger Abstract, page 7, Section 4.1 – “image classification”; page 10, Section 4.4 “Further application” - images). Zhang discloses using machine learning models having confidence scores in detection and status classification of images (See par 76, 98) where neural network models for processing image data can include a deep fully convolutional network (See par 81-86). Sudry discloses assigning a CAE (par 39 – Construction action element represents a task to be performed) to an object with confidence using a neural network (See par 54) where classifiers are generated with a machine learning process (See par 95). Daxberger improves upon Zhang and Sudry by disclosing the use of the known Laplace Redux method. One of ordinary skill in the art would be motivated to use the known Laplace Redux method with image analysis to efficiently improve upon the classification of images and the tracking of progress of assigned tasks over time, where the model can be a “deep” fully convolutional network (See par 81-86) along with confidence scores (See par 76, 98) in Zhang, and upon the classifiers for images using a neural network along with confidence in Sudry (See par 54).
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the machine learning models to analyze images for task completion in Zhang, to further include contractors performing tasks and classifying images into different states relative to task performance as disclosed in Sudry, to further include use of a known Laplace Redux algorithm as disclosed in Daxberger, since the claimed invention is merely a combination of old elements, and in combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable and there is a reasonable expectation of success.
Response to Arguments
Applicant’s arguments filed 12/23/25 have been considered but are not persuasive and/or are moot in view of the revised rejections necessitated by the amendments.
With regards to 101, Applicant argues the claims are “specific” and have unconventional machine learning models because the “filter machine learning model” filters images prior to processing by the “expert machine learning model.” Remarks, page 9. In response, Examiner respectfully disagrees with the analysis. At Step 2A, Prong Two, for practical application, we are looking for an improvement to computing technology. The first determination is only claimed as the “result” – the image is assessed as being “suitable” or not for validating the work of a contractor. However, here, all we have is a “bare assertion” of an improvement – with no details on the “filter machine learning model” or the “expert machine learning model”, and no details on how any of the steps relate to learning, as the models are already present and merely determine the “result”: “is the image suitable? If so, is the image one of before, during, or after performance?” Accordingly, at this time, this is viewed under MPEP 2106.04(d)(1): “Conversely, if the specification explicitly sets forth an improvement only in a conclusory manner (i.e., a bare assertion of an improvement without the detail necessary to be apparent to a person of ordinary skill in the art), the examiner should not determine that the claim improves technology or a technical field.”
Applicant then argues that by filtering out images that are “not suitable,” the “expert machine learning model” does not waste computing time and resources. Remarks, page 9. In response, Examiner respectfully disagrees. This argument of “we could have done something more complicated, but did not” is not persuasive on the current claims. It is similar to relying on the computer itself for eligibility – see MPEP 2106.05(f): “claiming the improved speed or efficiency inherent with applying the abstract idea on a computer” does not integrate a judicial exception into a practical application or provide an inventive concept. There are no details on the additional elements in the filtering step; the “filtering” appears to be simply any image being stated as “useful/suitable/relevant” in any manner. The claim limitations, individually or as a whole and when viewed in light of the specification, fall within the MPEP 2106.05(a) example that is not sufficient to show an improvement in computer functionality: accelerating a process of analyzing audit log data when the increased speed comes solely from the capabilities of a general-purpose computer, FairWarning IP, LLC v. Iatric Sys., 839 F.3d 1089, 1095, 120 USPQ2d 1293, 1296 (Fed. Cir. 2016), as here we have no technical details. The same process would occur manually – a user would “filter” images as being “suitable or not,” then sort the remaining images as being “before, during, or after” the person’s work occurs. Examiner also notes that this is not similar to Example 39 of the January 2019 Guidance, which recited an improvement with specific steps of how the “training” of a neural network occurred, along with a disclosure discussing the technical issues with the image analysis being performed. See MPEP 2106.04(a)(1)(vii), where details related to a two-stage training system are present.
For 101 purposes, Examiner suggests considering amendments combining the portions of the specification that are more focused on details of additional elements: [0015] – “Still further, the system can learn (e.g., via machine learning) as validations are conducted by the system, so as to improve the speed and accuracy of future validations conducted by the system.”; [0018] – “expert model scores the image, and if the score indicates a low confidence level, step 38 occurs, wherein the image is marked as void and is not utilized for further processing by the system. If the confidence level is not low, the image is classified as either depicting a scene that occurs before (step 40), during (step 42), or after (step 44) performance of a work action”; [0021] – unsupervised training for particular work orders?; [0023] – “have in place automated flagging of potentially fraudulent activity. To this end, the system utilizes the metadata attached to each image to perform several checks related to assessing whether the image may have been manipulated.”; and [0024] – computer vision models giving real-time feedback to the user as to which frames contain imagery suitable for the work order that they are looking to upload.
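For context only: the [0018] flow quoted above reduces to a simple confidence-gated, two-stage pipeline, sketched below as a hypothetical illustration. All object names, method names, and the numeric threshold are assumptions for exposition and are not drawn from Applicant’s disclosure.

# Hypothetical sketch of the flow described in [0018]: a filter model
# screens suitability, an expert model scores the image, a low
# confidence score voids the image, and otherwise the image is
# classified as "before", "during", or "after" the work action. The
# model objects, method names, and threshold are illustrative only.
from dataclasses import dataclass
from typing import Optional

LOW_CONFIDENCE = 0.5  # assumed cutoff; the specification recites no value

@dataclass
class Result:
    status: str             # "void" or "classified"
    stage: Optional[str]    # "before" | "during" | "after" when classified

def validate_image(image, filter_model, expert_model) -> Result:
    # Step 1: filter model rejects images unsuitable for validation.
    if not filter_model.is_suitable(image):
        return Result(status="void", stage=None)
    # Step 2: expert model scores the image; a low confidence level
    # marks it void so it is not used for further processing ([0018]).
    stage, confidence = expert_model.classify(image)
    if confidence < LOW_CONFIDENCE:
        return Result(status="void", stage=None)
    # Step 3: otherwise classify by work-action stage (steps 40/42/44).
    return Result(status="classified", stage=stage)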
With regards to 103, Applicant argues that Zhang does not filter images if not suitable. Remarks, page 10. In response, Examiner respectfully disagrees. Paragraphs 76-77 disclose a pre-processor to extract features to be provided as input; this preparation of extracting features discloses the limitations. Paragraph 82 of Zhang discloses examining images for whether an area of interest is present or not, which also discloses the limitations. Paragraph 98 then filters results and applies rules and thresholds for post-processing, which discloses the amended limitations as indicated in the revised rejection. Paragraph 106 evaluates different time periods for the tasks and images, and Sudry further discloses a classifier to evaluate progress at a construction site.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to IVAN R GOLDBERG whose telephone number is (571)270-7949. The examiner can normally be reached 8:30 AM - 4:30 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Anita Coupe, can be reached at 571-270-3614. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/IVAN R GOLDBERG/Primary Examiner, Art Unit 3619