Prosecution Insights
Last updated: April 19, 2026
Application No. 17/876,088

MANAGING DISPLAY DEVICES USING MACHINE LEARNING

Final Rejection (§§ 101, 102, 103)
Filed: Jul 28, 2022
Examiner: BALAKRISHNAN, VIJAY MURALI
Art Unit: 2143
Tech Center: 2100 (Computer Architecture & Software)
Assignee: Hughes Network Systems LLC
OA Round: 2 (Final)
Grant Probability: 43% (Moderate)
Predicted OA Rounds: 3-4
Time to Grant: 3y 12m
Grant Probability with Interview: 99%

Examiner Intelligence

Career Allow Rate: 43% (6 granted / 14 resolved; -12.1% vs TC avg)
Interview Lift: +85.7% (strong), comparing resolved cases with an interview vs. without
Avg Prosecution: 3y 12m (typical timeline); 26 applications currently pending
Total Applications: 40 (across all art units)

Statute-Specific Performance

§101: 26.4% (-13.6% vs TC avg)
§102: 13.2% (-26.8% vs TC avg)
§103: 31.5% (-8.5% vs TC avg)
§112: 24.3% (-15.7% vs TC avg)

Tech Center averages are estimates. Based on career data from 14 resolved cases.

Office Action

Statutes at issue: §101, §102, §103
DETAILED ACTION

This final action is in response to the amendment and remarks filed on 11/04/2025 for application 17/876,088. Claims 1, 7-9, 13, 17, and 19-20 have been amended. Claims 6 and 18 are cancelled. Claims 1-5, 7-17, and 19-20 remain pending in the application. Claims 1, 13, and 20 are the pending independent claims.

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment

The amendment filed 11/04/2025 has been entered. Applicant’s amendment to the claims with respect to resolving claim objections has been considered, and overcomes the objection set forth in the office action mailed 08/04/2025. Consequently, the objection is withdrawn.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-5, 7-17, and 19-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The analysis of the claims will follow the 2019 Revised Patent Subject Matter Eligibility Guidance, 84 Fed. Reg. 50 (“2019 PEG”).

Independent Claims (Claim 1, Claim 13, Claim 20):

Step 1: Claim 1 is drawn to a method, claim 13 is drawn to a system/apparatus, and claim 20 is drawn to a product. Therefore, each of these claims falls under one of the four categories of statutory subject matter (process/method, machine/apparatus, manufacture/product, or composition of matter).

Step 2A Prong 1: Claims 1, 13, and 20 each recite a judicially recognized exception of an abstract idea.
Claim 1 recites, inter alia:

processing the image data to evaluate status of display devices based on input of image data corresponding to the display devices – This limitation amounts to a process of observing an image (including its associated data/information) of a display, and based on a process of human reasoning, making a determination on the status of its associated device – for example, observing an image of a display that is missing an important content element (e.g., a button), and thereby identifying the associated device as malfunctioning. Therefore, it recites a process of evaluation capable of being performed in the human mind or using pen and paper.

selecting a classification for a status of the display device based on the image data, wherein the classification is selected from among the predetermined set of classifications; and the predetermined set of classifications comprises: a normal state; a broken user interface state; and a setup screen state – Similarly to the limitation recited above, this limitation further amounts to a process of observing an image of a display to make a determination on the status of an associated device – for example, given categories of “functioning” and “malfunctioning” devices, observing an image of a display that is missing an important content element (e.g., a button), and thereby identifying the associated device as “malfunctioning”. The recited classifications (“normal state”, “a broken user interface state”, “a setup screen state”) likewise fall within the scope of status determinations that a person could make based on mere observation. Therefore, it recites a process of evaluation capable of being performed in the human mind or using pen and paper.
determining, based on the selected classification, that the output of the display device is not correct or that the display device is not in a desired operating state; – This limitation further expands on the abstract idea recited in the independent claim [see Step 2A Prong 1 analysis of claim 1 on pages 3-4] of observing images to make a determination based on reasoning – in this instance, making a determination that the associated device is malfunctioning or displaying data incorrectly. Therefore, it recites a process of evaluation capable of being performed in the human mind or using pen and paper.

based on determining that the output of the display device is not correct or that the display device is not in a desired operating state, selecting a corrective action to improve output of the display device; – This limitation amounts to, based on a process of reasoning, identifying generic troubleshooting actions that may resolve a previously determined display issue (i.e., that a device is malfunctioning or displaying data incorrectly) – for example, changing a display setting, restarting the device, or closing/re-opening an application (i.e., simple troubleshooting steps which would be identifiable to a person as possibly resolving a display issue). Therefore, it recites a process of evaluation capable of being performed in the human mind that is simply being performed in a computer environment.

Claims 13 and 20 recite substantially similar abstract idea limitations to those recited in claim 1, and therefore recite the same judicial exception.

Step 2A Prong 2: The following additional elements recited in claims 1, 13, and 20 do not integrate the recited judicial exceptions into a practical application.

Claim 1 additionally recites:

A method performed by one or more computers; [processing/selecting] by the one or more computers; – These limitations amount to mere instructions to implement an abstract idea on a computer or computer components.
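The limitations analyzed above describe a classify-then-remediate flow. As a hypothetical sketch only (none of these identifiers come from the application or the cited references), the recited decision logic reduces to roughly:

```python
# Toy sketch of the recited flow: classify a display's status, then pick a
# corrective action when the status is not normal. Illustrative only.

CLASSIFICATIONS = ("normal", "broken_ui", "setup_screen")

def classify_status(image_data: bytes) -> str:
    """Stand-in for the trained model's inference; returns a fixed label here."""
    return "broken_ui"

CORRECTIVE_ACTIONS = {
    "broken_ui": "close_and_reopen_application",
    "setup_screen": "restore_default_settings",
}

def select_corrective_action(image_data: bytes):
    """Return an action when the display is not in a desired operating state."""
    label = classify_status(image_data)
    if label == "normal":
        return None  # output is correct; no instruction to send
    return CORRECTIVE_ACTIONS[label]

print(select_corrective_action(b"fake-image-bytes"))  # close_and_reopen_application
```

The examiner's point, in this framing, is that once the ML wrapper is stripped away, the remaining decision logic mirrors an ordinary human troubleshooting judgment.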
receiving, by the one or more computers, image data over a communication network, the image data representing an image provided for presentation by a display device – This limitation amounts to an insignificant pre-solution step of gathering data for use in a claimed process. Therefore, it recites insignificant extra-solution activity.

[processing] using a machine learning model that has been trained [to evaluate]; wherein the machine learning model has been trained based on training data examples that include image data from multiple display devices and include examples for different classifications in a predetermined set of classifications; [selecting] based on the output that the machine learning model generated – These limitations generically invoke a trained machine learning (ML) model merely as a tool to perform an existing mental process of observing various images and making determinations based on reasoning. Therefore, they amount to no more than mere instructions to apply an exception using an ML model.

sending, to the display device, an instruction for the display device to perform the selected corrective action – This limitation amounts to a mere post-solution step of outputting a result of a claimed process to a device, which is insignificant extra-solution activity.

Claim 13 recites substantially similar additional elements to those recited in claim 1, and further recites:

A system comprising: one or more computers; and one or more computer-readable media storing instructions that are operable, when executed by the one or more computers, to cause the system to perform operations comprising: – This limitation amounts to mere instructions to implement an abstract idea on a computer or computer components.
Claim 20 recites substantially similar additional elements to those recited in claim 1, and further recites:

One or more computer-readable media storing instructions that are operable, when executed by one or more computers, to cause the one or more computers to perform operations comprising: – This limitation amounts to mere instructions to implement an abstract idea on a computer or computer components.

Step 2B: The additional elements recited in claims 1, 13, and 20, viewed individually or as a combination, do not provide an inventive concept or otherwise amount to significantly more than the recited abstract ideas themselves.

Claim 1 additionally recites:

A method performed by one or more computers; [processing/selecting] by the one or more computers; – Mere instructions to implement an abstract idea on a computer or computer components do not provide an inventive concept or significantly more to the recited abstract idea.

receiving, by the one or more computers, image data over a communication network, the image data representing an image provided for presentation by a display device – Receiving and transmitting data is well-understood, routine, and conventional activity (see MPEP § 2106.05(d); “Receiving or transmitting data over a network”) and therefore does not provide an inventive concept or significantly more to the recited abstract idea.

[processing] using a machine learning model that has been trained [to evaluate]; wherein the machine learning model has been trained based on training data examples that include image data from multiple display devices and include examples for different classifications in a predetermined set of classifications; [selecting] based on the output that the machine learning model generated – Merely invoking a trained machine learning (ML) model as a tool to perform an existing mental process does not provide an inventive concept or significantly more to the recited abstract idea.
sending, to the display device, an instruction for the display device to perform the selected corrective action – Transmitting data is well-understood, routine, and conventional activity (see MPEP § 2106.05(d); “Receiving or transmitting data over a network”) and therefore does not provide an inventive concept or significantly more to the recited abstract idea.

Claim 13 recites substantially similar additional elements to those recited in claim 1, and further recites:

A system comprising: one or more computers; and one or more computer-readable media storing instructions that are operable, when executed by the one or more computers, to cause the system to perform operations comprising: – Mere instructions to implement an abstract idea on a computer or computer components do not provide an inventive concept or significantly more to the recited abstract idea.

Claim 20 recites substantially similar additional elements to those recited in claim 1, and further recites:

One or more computer-readable media storing instructions that are operable, when executed by one or more computers, to cause the one or more computers to perform operations comprising: – Mere instructions to implement an abstract idea on a computer or computer components do not provide an inventive concept or significantly more to the recited abstract idea.

As such, claims 1, 13, and 20 are not patent eligible.

Dependent Claims (Claims 2-5 and 7-12, Claims 14-17 and 19):

Dependent claims 2-5, 7-12, 14-17, and 19 narrow the scope of independent claims 1 and 13, and thus merely narrow the recited judicial exceptions. With respect to the independent claims, the recited judicial exceptions are not meaningfully integrated into a practical application, and also do not amount to significantly more than the recited abstract ideas themselves.
The dependent claims recite abstract idea limitations similar to those recited within the independent claims, as they also do not provide anything more than mathematical concepts or mental processes that are capable of being performed in the human mind and/or using pen and paper. The dependent claims also do not recite any further additional elements that successfully integrate the recited judicial exceptions into a practical application or amount to significantly more than the recited abstract ideas themselves. Consequently, claims 2-5, 7-12, 14-17, and 19 are also rejected under 35 U.S.C. 101.

Step 1: Claims 2-5 and 7-12 are drawn to a method, and claims 14-17 and 19 are drawn to a system/apparatus. Therefore, each of these claims falls under one of the four categories of statutory subject matter (process/method, machine/apparatus, manufacture/product, or composition of matter).

Step 2A Prong 1: Claims 2-5, 7-12, 14-17, and 19 each recite a judicially recognized exception of an abstract idea.

Claims 2-5 recite the same judicial exception as claim 1.

Claim 7 recites, inter alia: wherein the corrective action comprises at least one of changing content to display, changing a display setting, changing a network setting, changing an operating mode, restarting the display device, closing or re-opening an application, initiating a content refresh cycle, restoring one or more settings to a default or reference state, or clearing or refilling a cache of content – This limitation further expands on the abstract idea recited in parent claim 6 (i.e., identifying relevant troubleshooting actions in response to a determined display issue) by merely reciting a set of simple troubleshooting steps which would be identifiable to a person as possibly resolving a display issue. It therefore still recites a process of evaluation capable of being performed in the human mind that is simply being performed in a computer environment.
Claim 8 recites, inter alia: wherein selecting the corrective action comprises using rules that specify different corrective actions to perform for different classifications in the predetermined set of classifications – This limitation further expands on the abstract idea recited in parent claim 6 (i.e., identifying relevant troubleshooting actions in response to a determined display issue) by merely invoking a generic set of rules which should be observed when identifying a relevant action. It therefore still recites a process of evaluation capable of being performed in the human mind that is simply being performed in a computer environment.

Claim 9 recites, inter alia: tracking a status of the display device over time to verify whether normal operation of the display device occurs after instructing the corrective action to be performed – This limitation amounts to merely continuing to observe a device after it performed an identified troubleshooting action to make a determination on whether or not the determined display issue is resolved. It therefore still recites a process of evaluation capable of being performed in the human mind that is simply being performed in a computer environment.

Claim 10 recites, inter alia: determining a classification for each of the images; tracking status of the display device [via] records indicating the classifications determined for the images – This limitation further expands on the abstract idea recited in the independent claim [see Step 2A Prong 1 analysis of claim 1 on pages 3-4] of observing images to make a determination based on reasoning – in this instance, making a determination on an associated device for a series of images, and also continuing to observe images to make and record additional determinations on the device over time. Therefore, it recites a process of evaluation capable of being performed in the human mind or using pen and paper.
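Claims 8-10, as characterized here, amount to a lookup table of actions plus a log of observed statuses over time. A minimal hypothetical sketch (all identifiers invented for illustration, not taken from the claims):

```python
# Illustrative only: rule-based action selection (claim 8) and status
# tracking over time to verify recovery (claims 9-10).

RULES = {  # classification -> corrective action
    "broken_ui": "restart_display_device",
    "setup_screen": "change_operating_mode",
    "normal": None,
}

status_log = []  # records of (device_id, observation_index, classification)

def record_status(device_id, index, label):
    """Store a classification record for later verification."""
    status_log.append((device_id, index, label))

def recovered(device_id):
    """True if the device's most recent recorded classification is 'normal'."""
    labels = [lbl for dev, _, lbl in sorted(status_log) if dev == device_id]
    return bool(labels) and labels[-1] == "normal"

record_status("display-7", 0, "broken_ui")  # issue observed
record_status("display-7", 1, "normal")     # observed again after the action
print(RULES["broken_ui"], recovered("display-7"))  # restart_display_device True
```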
Claim 11 recites, inter alia: provide, in response to receiving input image data, a set of scores comprising a score for each of the classifications in the predetermined set of classifications – This limitation further expands on the abstract idea recited in the independent claim [see Step 2A Prong 1 analysis of claim 1 on pages 3-4] of observing images to make a determination based on reasoning – in this instance, given a set of categories regarding the status of associated devices (e.g., “functioning”, “malfunctioning”, etc.), determining a numerical score for each category. Therefore, it recites a process of evaluation capable of being performed in the human mind or using pen and paper.

Claim 12 recites the same judicial exception as claim 1.

Claims 14-17 recite the same judicial exception as claim 13.

Claim 19 recites substantially similar abstract idea limitations to those recited in claim 7, and therefore recites the same judicial exception.

Step 2A Prong 2: Claims 7, 9, 11, and 19 do not recite any further additional elements besides those already recited in the independent claims, and the following additional elements recited in claims 2-5, 8, 10, 12, and 14-17 also do not integrate the recited judicial exceptions into a practical application.

Claim 2 additionally recites: wherein the machine learning model is a convolutional neural network – Whereas the independent claim merely invoked a trained ML model as a tool to perform an existing mental process [see Step 2A Prong 2 analysis of claim 1 on page 5], this limitation further specifies the ML model as being a convolutional neural network (CNN). Therefore, it merely limits the use of the recited judicial exception to the technological environment of CNNs without providing anything more.
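For context, the "score for each of the classifications" in claim 11 is, in conventional classifier terms, one value per class, such as a softmax over model outputs. A generic sketch (not the application's model; the logit values are made up):

```python
# Generic per-class scoring: softmax turns raw logits into one score per
# classification in the predetermined set. Illustrative only.
import math

CLASSIFICATIONS = ("normal", "broken_ui", "setup_screen")

def scores_from_logits(logits):
    """Return a probability-like score for each classification."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]  # shift for numerical stability
    total = sum(exps)
    return dict(zip(CLASSIFICATIONS, (e / total for e in exps)))

scores = scores_from_logits([0.2, 2.1, -0.5])
best = max(scores, key=scores.get)  # highest score -> selected classification
print(best)  # broken_ui
```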
Claim 3 additionally recites: training the machine learning model based on training data examples from multiple display devices, each of the training examples comprising an image and a label indicating a classification for the image – Whereas the independent claim merely invoked a trained ML model as a tool to perform a mental process [see Step 2A Prong 2 analysis of claim 1 on page 5], this limitation simply further specifies training the ML model via a generic supervised learning procedure to perform the mental process. It therefore continues to merely invoke an ML model as a tool to perform an existing mental process of observing various images and making determinations based on reasoning.

[training examples comprising] a screen capture image; [classification for] the screen capture image – Specifying the observed images as being screen captures (i.e., screenshots) amounts to merely specifying a particular data source or type of data to be manipulated, which is insignificant extra-solution activity.

Claim 4 additionally recites: comprising providing an application programming interface (API) that enables remote devices to request classification of image data using the API; wherein receiving the image data comprises receiving the image data using the API; and wherein providing the output indicating the selected classification comprises providing the output using the API – Whereas claim 1 recited steps of receiving and transmitting image data over a network [see Step 2A Prong 2 on page 4 and Step 2B on page 6], these limitations further specify using an API to perform the receiving and transmitting. They therefore recite insignificant implementation steps regarding the pre-solution gathering and post-solution outputting of data, which is insignificant extra-solution activity.
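The "generic supervised learning procedure" the examiner describes for claim 3 is labeled (image, classification) pairs gathered from multiple devices. A toy stand-in using nearest-centroid classification on invented two-number features (nothing here is from the application; a real system would train on image data):

```python
# Toy supervised training on labeled examples from multiple display devices.
# Features are invented 2-D stand-ins for image data; illustrative only.

TRAINING_EXAMPLES = [  # (device_id, features, label)
    ("dev-1", [0.9, 0.1], "normal"),
    ("dev-2", [0.2, 0.8], "broken_ui"),
    ("dev-3", [0.85, 0.15], "normal"),
    ("dev-3", [0.1, 0.9], "broken_ui"),
]

def train_centroids(examples):
    """'Train' by averaging feature vectors per label (nearest-centroid)."""
    sums, counts = {}, {}
    for _, feats, label in examples:
        acc = sums.setdefault(label, [0.0] * len(feats))
        for i, f in enumerate(feats):
            acc[i] += f
        counts[label] = counts.get(label, 0) + 1
    return {lbl: [v / counts[lbl] for v in vec] for lbl, vec in sums.items()}

def classify(model, feats):
    """Pick the label whose centroid is closest to the input features."""
    return min(model, key=lambda lbl: sum((a - b) ** 2
                                          for a, b in zip(model[lbl], feats)))

model = train_centroids(TRAINING_EXAMPLES)
print(classify(model, [0.15, 0.85]))  # broken_ui
```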
Claim 5 additionally recites: wherein providing the output comprises providing the output to the display device, to a server associated with the display device, or to a client device of an administrator for the display device – This limitation merely recites a post-solution step of outputting a result of a claimed process to a device or server, which is insignificant extra-solution activity.

Claim 8 additionally recites: [selecting the corrective action comprises] using stored rules – This limitation amounts to a generic recitation of storing data for use in a claimed process, which is insignificant extra-solution activity.

Claim 10 additionally recites: for each of multiple display devices: receiving a series of different screen capture images obtained at different times; – Specifying the observed images as being screen captures (i.e., screenshots) amounts to merely specifying a particular data source or type of data to be manipulated, which is insignificant extra-solution activity.

[determining classification] using the machine learning model – This limitation generically invokes an ML model as a tool to perform an existing mental process; therefore, it amounts to no more than mere instructions to apply an exception using an ML model.

[tracking status] by storing records – This limitation amounts to a generic recitation of storing data for use in a claimed process, which is insignificant extra-solution activity.

Claim 12 additionally recites: wherein the received image data is a down-sampled version of an image generated by the display device – This limitation amounts to further specifying the received images as being down-sampled (i.e., compressed) versions, and therefore further specifies a particular data source or type of data to be manipulated, which is insignificant extra-solution activity.
[version of] a screen captured image – Specifying the observed images as being screen captures (i.e., screenshots) amounts to merely specifying a particular data source or type of data to be manipulated, which is insignificant extra-solution activity.

Claims 14-17 recite substantially similar additional elements to those recited in claims 2-5, and therefore also do not integrate the recited judicial exceptions into a practical application.

Step 2B: The additional elements recited in claims 2-5, 8, 10, 12, and 14-17, viewed individually or as a combination, do not provide an inventive concept or otherwise amount to significantly more than the recited abstract ideas themselves.

Claim 2 additionally recites: wherein the machine learning model is a convolutional neural network – Merely limiting the use of the recited judicial exception to the technological environment of CNNs without providing anything more does not provide an inventive concept or significantly more to the recited abstract idea.

Claim 3 additionally recites: training the machine learning model based on training data examples from multiple display devices, each of the training examples comprising an image and a label indicating a classification for the image – Merely invoking an ML model as a tool to perform an existing mental process does not provide an inventive concept or significantly more to the recited abstract idea.

[training examples comprising] a screen capture image; [classification for] the screen capture image – Using screenshots (i.e., screen captures) to troubleshoot computer problems is well-understood, routine, and conventional activity (see Huang, “Exploring the antecedents of screenshot-based interactions in the context of advanced computer software learning” [Abstract and page 2 Introduction]), and therefore does not provide an inventive concept or significantly more to the recited abstract idea.
Claim 4 additionally recites: comprising providing an application programming interface (API) that enables remote devices to request classification of image data using the API; wherein receiving the image data comprises receiving the image data using the API; and wherein providing the output indicating the selected classification comprises providing the output using the API – Using APIs to receive and transfer data over a network (e.g., web APIs over the internet) is well-understood, routine, and conventional activity (see Ehsan et al., “RESTful API Testing Methodologies: Rationale, Challenges, and Solution Directions” [page 1 Introduction]) and therefore does not provide an inventive concept or significantly more to the recited abstract idea.

Claim 5 additionally recites: wherein providing the output comprises providing the output to the display device, to a server associated with the display device, or to a client device of an administrator for the display device – Transmitting data is well-understood, routine, and conventional activity (see MPEP § 2106.05(d); “Receiving or transmitting data over a network”) and therefore does not provide an inventive concept or significantly more to the recited abstract idea.

Claim 8 additionally recites: [selecting the corrective action comprises] using stored rules – Storing and retrieving information in memory is well-understood, routine, and conventional activity (see MPEP § 2106.05(d); “Storing and retrieving information in memory”) and therefore does not provide an inventive concept or significantly more to the recited abstract idea.
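The claim 4 API, as characterized, reduces to "image data in, classification out." A minimal hypothetical handler (the function names and response shape are invented, not from the application):

```python
# Illustrative only: an API-style handler that accepts image bytes and
# returns the selected classification as JSON.
import json

def classify_image(image_data: bytes) -> str:
    """Stand-in for model inference; returns a fixed label for illustration."""
    return "setup_screen"

def handle_classification_request(body: bytes) -> bytes:
    """Receive image data via the API and return the selected classification."""
    return json.dumps({"classification": classify_image(body)}).encode()

print(handle_classification_request(b"fake-image-bytes"))
# b'{"classification": "setup_screen"}'
```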
Claim 10 additionally recites: for each of multiple display devices: receiving a series of different screen capture images obtained at different times; – Using screenshots (i.e., screen captures) to troubleshoot computer problems is well-understood, routine, and conventional activity (see Huang, “Exploring the antecedents of screenshot-based interactions in the context of advanced computer software learning” [Abstract and page 2 Introduction]), and therefore does not provide an inventive concept or significantly more to the recited abstract idea.

[determining classification] using the machine learning model – Merely invoking an ML model as a tool to perform an existing mental process does not provide an inventive concept or significantly more to the recited abstract idea.

[tracking status] by storing records – Storing and retrieving information in memory is well-understood, routine, and conventional activity (see MPEP § 2106.05(d); “Storing and retrieving information in memory”) and therefore does not provide an inventive concept or significantly more to the recited abstract idea.

Claim 12 additionally recites: wherein the received image data is a down-sampled version of an image generated by the display device – Down-sampling image data (e.g., creating a thumbnail) is well-understood, routine, and conventional activity [see instant specification ¶ 0004, 0077] and therefore does not provide an inventive concept or significantly more to the recited abstract idea.

[version of] a screen captured image – Using screenshots (i.e., screen captures) to troubleshoot computer problems is well-understood, routine, and conventional activity (see Huang, “Exploring the antecedents of screenshot-based interactions in the context of advanced computer software learning” [Abstract and page 2 Introduction]), and therefore does not provide an inventive concept or significantly more to the recited abstract idea.
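The down-sampling claim 12 recites (and which the examiner treats as conventional, e.g. thumbnail creation) can be shown with a 2x2 box filter on a toy grayscale array. A real system would use an imaging library; the pixel values here are made up:

```python
# Toy 2x box-filter down-sampling of a grayscale "screen capture".
# Illustrative only.

def downsample_2x(pixels):
    """Average each 2x2 block of pixels into a single output pixel."""
    out = []
    for r in range(0, len(pixels) - 1, 2):
        row = []
        for c in range(0, len(pixels[r]) - 1, 2):
            total = (pixels[r][c] + pixels[r][c + 1]
                     + pixels[r + 1][c] + pixels[r + 1][c + 1])
            row.append(total // 4)
        out.append(row)
    return out

screen = [[0, 0, 255, 255],
          [0, 0, 255, 255],
          [10, 20, 30, 40],
          [50, 60, 70, 80]]
print(downsample_2x(screen))  # [[0, 255], [35, 55]]
```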
Claims 14-17 recite substantially similar additional elements to those recited in claims 2-5, and therefore also do not provide an inventive concept or significantly more to the recited abstract idea.

As such, claims 2-5, 7-12, 14-17, and 19 also are not patent eligible.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 5, 7-11, 13-15, 17, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Gupte et al. (Pub. No. US 20220198640 A1, “Systems and Methods for Visual Anomaly Detection in a Multi-Display System”, filed 12/18/2020, cited in IDS dated 08/03/2023), hereinafter Gupte, in view of Feiz et al. (“Understanding Screen Relationships from Screenshots of Smartphone Applications”, published 22 March 2022), hereinafter Feiz, and Liu et al.
(“Owl Eyes: Spotting UI Display Issues via Visual Understanding”, available arXiv 7 Sep 2020, included in Notice of References Cited mailed 08/04/2025), hereinafter Liu.

Regarding claim 1, Gupte teaches A method performed by one or more computers, (“FIG. 1 shows a block diagram of a system 100 for detecting a visual anomaly in content displayed via a multi-display system in accordance with an example embodiment. As shown in FIG. 1, system 100 includes one or more computing device 102A-102N, a multi-display system 108, a cloud services platform 122, and an admin computing device 112. Computing device(s) 102A-102N and admin computing device 112 are computing devices via which a user is enabled to run applications, visit web pages compatible with various web browsers, etc. Computing device(s) 102A-102N and admin computing device 112 may be any type of mobile computing device, such as… a stationary computing device such as a desktop computer or PC (personal computer), etc” [Gupte ¶ 0023])

comprising: receiving, by the one or more computers, image data over a communication network, the image data representing an image provided for presentation by a display device; (“The contents generated by a corresponding computing device of computing device 102A-102N may be displayed on a respective display of multi-display system 108. For example, each of computing device(s) 102A-102N may execute an application that renders a respective application window (e.g., application windows 114A-114N) on a display device coupled to the computing device” [Gupte ¶ 0025]; “Referring again to FIG. 1, anomaly detector 110 is configured to detect such anomalies. For instance, anomaly detector 110 may analyze images representative of the contents being displayed by each of computing device(s) 102A-102N. For example, as shown in FIG. 1, each of computing device(s) 102A-102N may comprise a respective agent (e.g., agents 116A-116N).
Agent 116A may be configured to periodically generate an image or screenshot of the contents generated by computing device 102A…and agent 116N may be configured to periodically generate an image or screenshot of the contents generated by computing device 102N…Each of agents 116A-116N may generate the image and provide the image to cloud services platform 122 in accordance with a predetermined frequency (e.g., once every 1 minute, 5 minutes, etc.)…Agent 116A may provide images (shown as image 118A) to a storage system (not shown) communicatively coupled to anomaly detector 110, e.g., via the network by which its computing device 102A and cloud services platform 122 are coupled” [Gupte ¶ 0031-0032]; Each computing device (i.e., computer), wherein each computing device is coupled to a display device that can present application windows (i.e., content), can generate an image or screenshot of content (i.e., image data) and provide the image to the cloud services platform over a network)

processing, by the one or more computers, the image data using a machine learning model that has been trained to evaluate status of display devices based on input of image data corresponding to the display devices, wherein the machine learning model has been trained based on training data examples that include image data from multiple display devices and include examples for different classifications in a predetermined set of classifications; (see Fig.
1 of Drawings – Cloud Services Platform 122 includes Anomaly Detector 110; “Cloud services platform 122 may comprise a group or collection of one or more servers or nodes (e.g., computing devices) that are each hosted on a network such as the Internet (e.g., in a “cloud-based” embodiment) to store, manage, and process data” [¶ 0023]; “Anomaly detector 110 retrieves images 118A-118N from the storage system and analyzes images 118A-118N to determine whether anomalies are present therein…Anomaly detector 110 may utilize machine-learning based techniques to determine whether a visual anomaly is present in any of images 118A-118N received by anomaly detector 110. For instance, anomaly detector 110 may comprise a plurality of classification models 106” [Gupte ¶ 0035]; “Anomaly detector 310 is an example of anomaly detector 110, as described above with reference to FIG. 1. Anomaly detector 310 comprises an image retriever 304, a model selector 308, a plurality of visual anomaly detection models 306A-306N, a monitor 312, and a portal 326. Detection models 306A-306N are examples of detection models 106, as described above with reference to FIG. 1.” [Gupte ¶ 0040]; “Portal 326 may also be utilized to configure anomaly detector 326 to operate in different modes. For instance, a first mode may be a training mode in which anomaly detector 326 utilizes previously-collected images to train a supervised machine learning algorithm for generation of visual anomaly detection models 306A-306N…A second mode may be a real-time detection mode in which anomaly detector 310 detects visual anomalies in images 118A-118C as described above” [Gupte ¶ 0054]; “Training data 502 represents images (e.g., images 118A-118N, as described above with reference to FIGS. 1 and 3) that were previously generated (e.g., by agents 116A-116N, as described above with reference to FIG. 
1) over the course of a past predetermined time period, such as, but not limited to, one or more days, weeks, months, or year…Training data 502 may comprise hundreds, thousands, or hundreds of thousands of such images. Each of the images in training data 704 may be labeled to indicate whether or not a visual anomaly exists therein. For instance, images that have known visual anomalies may be labeled as positively-labeled images 508, and images that have no visual anomalies may be labeled as negatively-labeled images” [Gupte ¶ 0068-0069]; The cloud services platform, which can comprise one or more computing devices (i.e., computers), can process images using an anomaly detector comprising visual anomaly detection models (i.e., machine learning models), wherein the detection models can be trained on previously collected data that was generated by agents (i.e., training data from multiple display devices) which can include hundreds of thousands of images and may be labeled positively or negatively indicating whether or not a visual anomaly exists (i.e., include examples for different classifications)) selecting, by the one or more computers, a classification for a status of the display device based on the output that the machine learning model generated based on the image data, wherein the classification is selected from among the predetermined set of classifications; (“For instance, anomaly detector 110 may comprise a plurality of classification models 106. 
Each of classification models 106 is configured to generate a score for images 118A-118N received from a particular computing device of computing device(s) 102A-102N” [Gupte ¶ 0035]; “The generated score indicates a likelihood that an image processed thereby comprises a visual anomaly in accordance with the computing device identifier received for that image… The score may comprise a value between 0.0 and 1.0, where higher the number, the greater the likelihood that an image comprises a visual anomaly” [Gupte ¶ 0036]; “Anomaly detector 110 may determine that an anomaly is present in an image if the score has a predetermined relationship with a predetermined threshold. For example, anomaly detector 110 may compare the score with a predetermined threshold to determine whether or not the value exceeds the predetermined threshold (e.g., a score of 0.85)” [Gupte ¶ 0037]; “In accordance with an embodiment, anomaly detector 110 may be configured to generate a plurality of scores, where each score is indicative of a particular type of visual anomaly” [Gupte ¶ 0038]; The score, between predetermined range of 0.0 and 1.0, can indicate presence/absence of a visual anomaly (i.e., classification of status), and can further indicate type of a visual anomaly); and the predetermined set of classifications comprises a normal state ([Gupte ¶ 0035-0038] as detailed above; Absence of a visual anomaly (i.e., normal state) may be determined based on a score value of predetermined 0.0-1.0 range); determining, based on the selected classification, that the output of the display device is not correct or that the display device is not in a desired operating state; (“In response to determining that a visual anomaly exists, anomaly detector 110 may perform one or more automated actions that cause the visual anomaly to be removed” [Gupte ¶ 0039]) based on determining that the output of the display device is not correct or that the display device is not in a desired operating state, selecting a 
corrective action to improve output of the display device ([Gupte ¶ 0039] as detailed above; An automated corrective action is selected in response to the detected anomaly); and sending, to the display device, an instruction for the display device to perform the selected corrective action. (“The action performed may depend on the determined type of visual anomaly detected…Another example of an action includes causing a computing device that provided the image (e.g., one of computing device(s) 102A-102N) to be automatically restarted. For example, anomaly detector 110 may provide a command (shown as command 120), to the computing device of computing device(s) 102A-102N displaying the content with the visual anomaly, that automatically causes the computing device to be restarted” [Gupte ¶ 0039]; Based on determining that a visual anomaly exists (i.e., that the output of the display device is not correct), a command (i.e., instruction) can be sent to the device to perform a particular corrective action (e.g., automatic restart)). However, Gupte does not expressly teach the predetermined set of classifications compris[ing] a setup screen state. In the same field of endeavor, Feiz teaches a means of visually analyzing screen captures of device screens via a machine learning model (“Understanding screen relationships is a difficult task as instances of the same screen may have visual and structural variation, for example due to different content in a database-backed application, scrolling, dialog boxes opening or closing, or content loading delays. At the same time, instances of different screens from the same app may share some similarities in terms of design, structure, and content. 
This paper uses a dataset of screenshots from more than 1K iPhone applications to train two ML models that understand similarity in different ways: (1) a screen similarity model that combines a UI object detector with a transformer model architecture to recognize instances of the same screen from a collection of screenshots from a single app, and (2) a screen transition model that uses a siamese network architecture to identify both similarity and three types of events that appear in an interaction trace: the keyboard or a dialog box appearing or disappearing, and scrolling” [Feiz Abstract]) wherein the predetermined set of classifications comprises a setup screen state (“We built an annotation interface to enable workers to group screenshots as a card sorting task (see Figure 2)… The workers used our interface to group the screenshots of a single app in the dataset, and we instructed them to stop when each group of screenshots in their opinion represented a different screen…To create a final set of grouping annotations, we consider two screenshots to belong to the same group if both annotators put the pair in the same group…The final set of consensus groups has an average of 6.94 groups per app (standard deviation: 2.63) with an average of 3.09 screenshots per group (standard deviation: 1.15). Finally, in order maximize the size of our similarity dataset, we (1) added all consecutive “same” screenshots to our grouped screens, and (2) created a dataset containing every combination of two screens from the annotated groups and labeled them as same if they belonged to the same group and different otherwise. 
Going forward, we refer to this dataset as the screen similarity dataset” [Feiz page 6 Screen Similarity: Annotation]; see Figure 2 – “Annotation interface for grouping same screenshots of an app as a card-sorting task” – see, e.g., top right group (i.e., predetermined classification) including “Settings” page (i.e., setup screen) [Feiz page 6]; “Training Procedure: We trained the Similarity Transformer model on our screen similarity dataset (Section 4.1)… Each example consisted of two featurized screens and a label describing their relationship. We trained our model to predict the similarity label from the two input screens by minimizing the binary cross-entropy loss” [Feiz page 7 Screen Similarity: Modeling]; Via the pre-training step of grouping screenshots of applications into predetermined groups (i.e., classifications) with each group representing a particular screen type (e.g., setup screen), the disclosed model learns to implicitly identify screen type of new examples based on similarity to other screens from the same application). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated wherein the predetermined set of classifications comprises a setup screen state as taught by Feiz into Gupte because they are both directed towards visually analyzing screen captures of device screens via a machine learning model. Given that Gupte already teaches detecting visual anomalies via a plurality of classification models, with each model being configured for a particular device (“For instance, anomaly detector 110 may comprise a plurality of classification models 106. 
Each of classification models 106 is configured to generate a score for images 118A-118N received from a particular computing device of computing device(s) 102A-102N… For instance, a first classification model of classification models 106 analyzes images 118A from computing device 102A to detect visual artifacts that are atypical for that computing device” [Gupte ¶ 0035-0036]), a person of ordinary skill in the art would recognize the value of incorporating the teachings of Feiz to further specialize configuration of classification models for particular use cases, e.g., detecting anomalies for a particular application running on a particular device. Based on learning, via the screen similarity model and screen transition model of Feiz, both screen types and relationships between screens respectively, the resulting detection model would be better trained for recognizing what is truly visually atypical for a given application on a given device, thereby boosting detection accuracy with a reasonable expectation of success. However, the combination of Gupte and Feiz does not expressly teach the predetermined set of classifications compris[ing] a broken user interface state. In the same field of endeavor, Liu teaches a means of detecting visual anomalies of device screen captures via a machine learning model (“According to our pilot study of crowdtesting bug reports, display issues such as text overlap, blurred screen, missing image always occur during GUI rendering on different devices due to the software or hardware compatibility. They negatively influence the app usability, resulting in poor user experience. To detect these issues, we propose a novel approach, OwlEye, based on deep learning for modelling visual information of the GUI screenshot. 
Therefore, OwlEye can detect GUIs with display issues and also locate the detailed region of the issue in the given GUI for guiding developers to fix the bug” [Liu Abstract]) wherein the predetermined set of classifications comprises a broken user interface state (“During the manually examination process, we notice that there are different types of UI issues, a categorization of these issues would facilitate the design and evaluation of related approach. Following the Card Sorting [59] method, we classify those UI issues into five categories including component occlusion, text overlap, missing image, null value and blurred screen with details as follows” [Liu pages 2-3 Categorizing UI Display Issues]; see Figure 2 – “Examples of five categories of UI display issues” – including (c) Missing image and (d) NULL value [Liu page 3]; “This paper proposes OwlEye to automatically detect and localize UI display issues in the screenshots of the application under test, as shown in Figure 3. Given one UI screenshot, our CNN-based model can first classify if it relates with any display issues via the visual understanding” [Liu page 3]; see Figure 3 – “Overview of OwlEye” – including CNN-based Issues Detection (Sec 3.1) [Liu page 4]; Upon training on screenshots labeled with predetermined issue types (including user interfaces with missing/null elements (i.e., broken user interface states)), the disclosed model learns to identify and classify screenshots of new examples depicting similar display issues). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated wherein the predetermined set of classifications comprises a broken user interface state as taught by Liu into the combination of Gupte and Feiz because both Gupte and Liu are directed towards detecting visual anomalies of device screen captures via a machine learning model. 
While Gupte does not expressly limit the scope of the term “visual anomalies”, which are described as “display content that was not expected to be displayed” [Gupte ¶ 0030], the disclosure largely describes means of detection of content obstruction/mispositioning, or display of undesired/wrong content type at a given time, rather than expressly disclosing means of detecting, e.g., visual glitches / broken visuals in displayed content. As such, a person of ordinary skill in the art would recognize the value of incorporating the teachings of Liu to improve the overall detection model by enabling detection of a wider scope of possible “visual anomalies”. Regarding claim 13, it is a system/apparatus claim that corresponds to the method of claim 1, which is already taught by the combination of Gupte, Feiz, and Liu as detailed above. Gupte further teaches a system comprising: one or more computers; and one or more computer-readable media storing instructions that are operable, when executed by the one or more computers, to cause the system to perform operations comprising: the claimed functions ([Gupte ¶ 0023, 0025], as detailed above; “Furthermore, FIG. 8 depicts an exemplary implementation of a computing device 1000 in which embodiments may be implemented, including system 100, computing device(s) 102A-102N…and/or each of the components described therein…The description of computing device 800 provided herein is provided for purposes of illustration” [Gupte ¶ 0085]; “Computing device 800 also has one or more of the following drives: a hard disk drive 814 for reading from and writing to a hard disk…The drives and their associated computer-readable media provide nonvolatile storage of computer-readable instructions, data structures, program modules and other data for the computer” [Gupte ¶ 0087]). Consequently, it is rejected for the same reasons as claim 1. 
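The score-and-threshold scheme quoted above from Gupte ¶ 0035-0038 (a per-anomaly-type likelihood in [0.0, 1.0], compared against a threshold such as 0.85, with the normal state selected when no score exceeds it) can be sketched as follows. This is an illustration only, not part of the record; all names and values are hypothetical.

```python
# Hypothetical sketch of the score-and-threshold classification described
# in Gupte ¶ 0035-0038; all names and values here are illustrative only.

ANOMALY_THRESHOLD = 0.85  # example threshold value given in Gupte ¶ 0037

def classify_image(scores):
    """Select a status classification from per-anomaly-type scores.

    `scores` maps each anomaly type to a likelihood in [0.0, 1.0]
    (cf. Gupte ¶ 0036, 0038). If no score exceeds the threshold,
    the display is classified as being in the normal state.
    """
    anomaly_type, top_score = max(scores.items(), key=lambda kv: kv[1])
    if top_score > ANOMALY_THRESHOLD:
        return anomaly_type
    return "normal"

print(classify_image({"window_occlusion": 0.91, "wrong_content": 0.10}))  # window_occlusion
print(classify_image({"window_occlusion": 0.20, "wrong_content": 0.10}))  # normal
```

Generating one score per anomaly type, as in ¶ 0038, is what lets a single threshold comparison double as both detection (anomaly vs. normal) and typing (which anomaly).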
Regarding claim 20, it is a product claim that corresponds to the method of claim 1, which is already taught by the combination of Gupte, Feiz, and Liu as detailed above. Gupte further teaches One or more computer-readable media storing instructions that are operable, when executed by one or more computers, to cause the one or more computers to perform operations comprising: the claimed functions ([Gupte ¶ 0023, 0025, 0085, 0087], as detailed above). Consequently, it is rejected for the same reasons as claim 1. Regarding claim 2, the combination of Gupte, Feiz, and Liu teaches the limitations of parent claim 1, and Gupte further teaches wherein the machine learning model is a convolutional neural network (“Visual anomaly detection models 506A-506N may be artificial neural network-based models, convolution neural network-based models, K-nearest neighbor-based models, decision tree-models, support vector machine-based models, etc” [Gupte ¶ 0072]). Regarding claim 3, the combination of Gupte, Feiz, and Liu teaches the limitations of parent claim 1, and Gupte further teaches further comprising training the machine learning model based on training data examples from multiple display devices, each of the training examples comprising a screen capture image and a label indicating a classification for the screen capture image (“Training data 502 represents images (e.g., images 118A-118N, as described above with reference to FIGS. 1 and 3) that were previously generated (e.g., by agents 116A-116N, as described above with reference to FIG. 1) over the course of a past predetermined time period, such as, but not limited to, one or more days, weeks, months, or year…Training data 502 may comprise hundreds, thousands, or hundreds of thousands of such images. 
Each of the images in training data 704 may be labeled to indicate whether or not a visual anomaly exists therein” [Gupte ¶ 0068-0069]; “The computing device may further periodically provide images or screenshots of the content” [Gupte ¶ 0019]; “For example, as shown in FIG. 1, each of computing device(s) 102A-102N may comprise a respective agent (e.g., agents 116A-116N). Agent 116A may be configured to periodically generate an image or screenshot of the contents generated by computing device 102A” [Gupte ¶ 0031]). Regarding claim 5, the combination of Gupte, Feiz, and Liu teaches the limitations of parent claim 1, and Gupte further teaches wherein providing the output comprises providing the output to the display device, to a server associated with the display device, or to a client device of an administrator for the display device (“In response to determining that a visual anomaly exists, anomaly detector 110 may perform one or more automated actions that cause the visual anomaly to be removed. The action performed may depend on the determined type of visual anomaly detected. One example of an action includes, but is not limited to, providing an alert 124 to a computing device of an administrator (e.g., admin computing device 112) indicating that the visual anomaly has been detected in an image, along with specifying the identifier associated with the image” [Gupte ¶ 0039]; The alert that indicates presence of a visual anomaly (i.e., provides output) can be sent to an admin computing device (i.e., client device of an administrator)). 
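The labeled training data mapped to claim 3 above (Gupte ¶ 0068-0069: screen captures from multiple devices, each labeled for the presence or absence of a visual anomaly) might be organized as in the following hypothetical sketch; the names and structure are illustrative only, not from the record.

```python
# Hypothetical sketch of the labeled training data described in Gupte
# ¶ 0068-0069: screen captures from multiple devices, each labeled for
# the presence or absence of a visual anomaly. Names are illustrative only.
from dataclasses import dataclass

@dataclass
class TrainingExample:
    device_id: str        # which computing/display device produced the capture
    screenshot_path: str  # the screen capture image (cf. images 118A-118N)
    label: int            # 1 = visual anomaly present, 0 = no anomaly

def split_by_label(examples):
    """Separate positively- and negatively-labeled images (Gupte ¶ 0069)."""
    positives = [e for e in examples if e.label == 1]
    negatives = [e for e in examples if e.label == 0]
    return positives, negatives

examples = [
    TrainingExample("102A", "captures/102A_0001.png", 1),
    TrainingExample("102N", "captures/102N_0001.png", 0),
]
positives, negatives = split_by_label(examples)
print(len(positives), len(negatives))  # 1 1
```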
Regarding claim 7, the combination of Gupte, Feiz, and Liu teaches the limitations of parent claim 1, and Gupte further teaches wherein the corrective action comprises at least one of changing content to display, changing a display setting, changing a network setting, changing an operating mode, restarting the display device, closing or re-opening an application, initiating a content refresh cycle, restoring one or more settings to a default or reference state, or clearing or refilling a cache of content (“For example, anomaly detector 110 may provide a command (shown as command 120), to the computing device of computing device(s) 102A-102N displaying the content with the visual anomaly, that automatically causes the computing device to be restarted” [Gupte ¶ 0039]). Regarding claim 8, the combination of Gupte, Feiz, and Liu teaches the limitations of parent claim 1, and Gupte further teaches wherein selecting the corrective action comprises using stored rules that specify different corrective actions to perform for different classifications in the predetermined set of classifications. (“anomaly detector 110 may be configured to generate a plurality of scores, where each score is indicative of a particular type of visual anomaly…The action performed may depend on the determined type of visual anomaly detected…For instance, if another application window is positioned over the application window (i.e., application window 114A) that includes content intended to be displayed via multi-display system 108, anomaly detector 110 may provide a command (shown as command 120) to the computing device (i.e., computing device 102A) displaying that application window. The command causes the problematic window to be minimized” [Gupte ¶ 0039]). 
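The stored rules of claim 8, mapped above to Gupte ¶ 0039 (the action performed depends on the determined type of visual anomaly), amount to a lookup from classification to corrective action. A hypothetical sketch, with all classification names and actions illustrative only and not drawn from the record:

```python
# Hypothetical sketch of claim 8's stored rules: a lookup that specifies a
# different corrective action for each classification in the predetermined
# set (cf. Gupte ¶ 0039). All names and actions here are illustrative only.
CORRECTIVE_ACTIONS = {
    "window_occlusion": "minimize_obstructing_window",  # cf. Gupte ¶ 0039
    "broken_ui": "restart_device",                      # cf. Gupte ¶ 0039
    "setup_screen": "restore_default_settings",
    "normal": None,  # normal state: no corrective action needed
}

def select_corrective_action(classification):
    """Return the stored corrective action for a classification, if any."""
    return CORRECTIVE_ACTIONS.get(classification)

print(select_corrective_action("window_occlusion"))  # minimize_obstructing_window
```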
Regarding claim 9, the combination of Gupte, Feiz, and Liu teaches the limitations of parent claim 1, and Gupte further teaches tracking a status of the display device over time to verify whether normal operation of the display device occurs after instructing the corrective action to be performed (“Each of agents 116A-116N may generate the image and provide the image to cloud services platform 122 in accordance with a predetermined frequency (e.g., once every 1 minute, 5 minutes, etc.)” [Gupte ¶ 0032]; Agents can be configured to continue periodically sending screenshots of contents generated by respective computing devices to the cloud services platform, wherein the cloud services platform comprises the anomaly detector [see Fig. 1] which is configured to identify presence or absence of visual anomalies (i.e., verify normal operation); the system thereby implicitly tracks status of computing devices over time). Regarding claim 10, the combination of Gupte, Feiz, and Liu teaches the limitations of parent claim 1, and Gupte further teaches for each of multiple display devices: receiving a series of different screen capture images obtained at different times; (“each of computing device(s) 102A-102N may comprise a respective agent (e.g., agents 116A-116N). Agent 116A may be configured to periodically generate an image or screenshot of the contents generated by computing device 102A…and agent 116N may be configured to periodically generate an image or screenshot of the contents generated by computing device 102N” [Gupte ¶ 0031]; “Each of agents 116A-116N may generate the image and provide the image to cloud services platform 122 in accordance with a predetermined frequency (e.g., once every 1 minute, 5 minutes, etc.)” [Gupte ¶ 0032]) determining a classification for each of the screen capture images using the machine learning model; (“For instance, anomaly detector 110 may comprise a plurality of classification models 106. 
Each of classification models 106 is configured to generate a score for images 118A-118N received from a particular computing device of computing device(s) 102A-102N” [Gupte ¶ 0035]; “The generated score indicates a likelihood that an image processed thereby comprises a visual anomaly in accordance with the computing device identifier received for that image… The score may comprise a value between 0.0 and 1.0, where higher the number, the greater the likelihood that an image comprises a visual anomaly” [Gupte ¶ 0036]; “Anomaly detector 110 may determine that an anomaly is present in an image if the score has a predetermined relationship with a predetermined threshold. For example, anomaly detector 110 may compare the score with a predetermined threshold to determine whether or not the value exceeds the predetermined threshold (e.g., a score of 0.85)” [Gupte ¶ 0037]) and tracking status of the display device by storing records indicating the classifications determined for the screen capture images. (“Agent 116A may provide images (shown as image 118A) to a storage system (not shown) communicatively coupled to anomaly detector 110, e.g., via the network by which its computing device 102A and cloud services platform 122 are coupled” [Gupte ¶ 0032]). Regarding claim 11, the combination of Gupte, Feiz, and Liu teaches the limitations of parent claim 1, and Gupte further teaches wherein the machine learning model is configured to provide, in response to receiving input image data, a set of scores comprising a score for each of the classifications in the predetermined set of classifications (“For instance, anomaly detector 110 may comprise a plurality of classification models 106. 
Each of classification models 106 is configured to generate a score for images 118A-118N received from a particular computing device of computing device(s) 102A-102N” [Gupte ¶ 0035]; “The generated score indicates a likelihood that an image processed thereby comprises a visual anomaly in accordance with the computing device identifier received for that image… The score may comprise a value between 0.0 and 1.0, where higher the number, the greater the likelihood that an image comprises a visual anomaly” [Gupte ¶ 0036]). Regarding claims 14-15, 17, and 19, they are system/apparatus claims that correspond to the method of claims 2-3, 5, and 7, which are already taught by the combination of Gupte, Feiz, and Liu as detailed above. Consequently, they are rejected for the same reasons as claims 2-3, 5, and 7. Claims 4 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Gupte, Feiz, and Liu, as applied to claims 1 and 13 above, further in view of Ryou et al., (“Automatic Detection of Visibility Faults by Layout Changes in HTML5 Web Pages”, available conference April 2018), hereinafter Ryou. Regarding claim 4, the combination of Gupte, Feiz, and Liu teaches the limitations of parent claim 1. However, the combination does not expressly teach providing an application programming interface (API) that enables remote devices to request classification of image data using the API; wherein receiving the image data comprises receiving the image data using the API; and wherein providing the output indicating the selected classification comprises providing the output using the API. In the same field of endeavor, Ryou teaches a method of automatically detecting visual display issues (“In this paper, we first define the problem that functionalities of HTML5 web pages may become unusable due to layout changes, and propose a technique to detect the problem automatically. 
We show that our implementation detects such problems in real-world HTML5 web pages” [Ryou Abstract]) that provid[es] an application programming interface (API) that enables remote devices to request classification of image data using the API; (“When a user touches a button of a dialogue page, the document may change its layout still remaining in the same page by using Document Object Model (DOM) APIs. DOM is a platform and language-independent interface that enables programs to access and update the content, structure, and style of HTML documents dynamically [2]. Because JavaScript can run on any devices with a browser, it facilitates development of portable web pages” [Ryou page 1 Introduction]; “Web pages can handle events by registering JavaScript functions called event handlers and executing them when their corresponding events are triggered. An event handler may be registered to a DOM element by following ways: The DOM API function addEventListener registers a given event handler to a given DOM element” [Ryou page 3 Event Objects]; “Definition 4: A UI state is a set of event objects. We denote an event object and its visibility as a pair [equation] For example, the bottom-right layout in Figure 2 can be represented as follow: {btn1, normal, btn2, normal btnYES, full-cover, btnNO, full-cover}. Execution of event handlers may change the visibility of event objects. 
In Figure 2, if a user triggers a click event on btn2 in the top-left layout, divdialog rises above other elements changing its display property from none to block…Definition 5: A transition denotes that an event changes a UI state to another UI state by executing its event handler…Now that we have defined UI states and their transitions, we define UI state graphs with UI states as vertices and transitions as directed edges for a given web page and a browser… we assume that execution of an event handler always results in a unique UI state” [Ryou page 4 UI State Graphs]; The DOM API, which registers event listeners to elements on a web page, enables a click event in a browser (i.e., event on a remote client device) to trigger (i.e., request) creation of a unique UI state, wherein a UI state classifies elements in a web page (i.e., image data) based on their visibility). wherein receiving the image data comprises receiving the image data using the API; and wherein providing the output indicating the selected classification comprises providing the output using the API (see Fig. 5: Two roles of Proxy [Ryou page 6]; “This section presents a tool that analyzes HTML5 web pages and reports possible VFs in them. As Figure 4 illustrates, the tool consists of two parts: UI State Graph Crawler and Bug Detector” [Ryou page 5 Automatic Detection of Visibility Faults]; “UI State Graph Crawler consists of two parts: UI State Graph Builder that takes a web page and a browser, and Proxy that manages connections between the page and the browser” [Ryou page 5 UI State Graph Crawler]; “UI State Graph Builder: For a given web page and a browser, UI State Graph Builder collects the UI states and transitions between them to build their UI state graph. 
In order to collect UI states, it collects all the event objects and their visibility in the web page and the browser… To collect event objects, UI State Graph Builder keeps track of DOM API function calls that register or remove event handlers such as addEventListener and removeEventListener. It monitors such behaviors by defining wrappers of DOM APIs, which record which event handlers are registered to or removed from which DOM element and event type” [Ryou pages 5-6 UI State Graph Builder]; “Proxy: Since HTML5 web pages may request resources like images even after the pages have been fully loaded, possibly changing layouts, we run UI State Graph Builder on a browser connected to network using Proxy… Figure 5(a) illustrates that when a browser requests the source code of a web page for the first time, Proxy intercepts the request, receives the requested page, and returns the web page with inserted UI State Graph Builder to the browser. In order to load wrappers for DOM APIs before executing the target web page, Proxy inserts the wrappers and relevant code right after the <head> tag of the HTML document.” [Ryou page 6 Proxy]; “As for a Proxy, we used a simple Python script on mitmproxy [56]” [Ryou page 8 Implementation and Experiment Setup]; Proxy, which manages connection between the web page (e.g., web server) and the browser (e.g., client device), inserts UI State Graph Builder into the web page returned to the browser (i.e., when providing outputted image data), wherein UI State Graph Builder classifies elements (based on visibility via UI states) by using DOM API to keep track of event handlers registered to each element. Proxy further uses the Python API of mitmproxy to manage the connection [see attached reference MIT “mitmproxy” [page 2 Python API]] between the web page and browser (i.e., uses API to receive requests from browser (i.e., receive image data) and to return web pages (i.e., provide output))). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated providing an application programming interface (API) that enables remote devices to request classification of image data using the API; wherein receiving the image data comprises receiving the image data using the API; and wherein providing the output indicating the selected classification comprises providing the output using the API as taught by Ryou into the combination because both Gupte and Ryou are directed towards methods of automatically detecting display issues. Although Gupte does not explicitly discuss APIs, it does teach application of the claimed visual anomaly detection method to web pages (“Examples of content items, include, but are not limited to, images, photos, videos, web pages, documents, and/or any content that may be included in each of application windows 114A-114N” [Gupte ¶ 0025]), and further teaches connection of claimed elements over a network such as the Internet (“Each of computing device(s) 102A-102N, multi-display system 108, cloud services platform 122, and/or admin computing device 112 may be communicatively coupled via a network, which may comprise one or more networks such as local area networks (LANs), wide area networks (WANs), enterprise networks, the Internet, etc., and may include one or more of wired and/or wireless portions” [Gupte ¶ 0022]; “Cloud services platform 122 may comprise a group or collection of one or more servers or nodes (e.g., computing devices) that are each hosted on a network such as the Internet (e.g., in a “cloud-based” embodiment) to store, manage, and process data” [Gupte ¶ 0023]). 
Therefore, a person of ordinary skill in the art would recognize the value of incorporating the teachings of Ryou (e.g., web APIs) to further enable implementation of a “cloud-based” embodiment of the visual anomaly detection method of Gupte, thereby expanding its application to Internet-based systems. Regarding claim 16, it is a system/apparatus claim that corresponds to the method of claim 4, which is already taught by the combination of Gupte, Feiz, Liu, and Ryou as detailed above. Consequently, it is rejected for the same reasons as claim 4. Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Gupte, Feiz, and Liu, as applied to claim 1 above, further in view of Neuhauser et al., (Pub. No. US 20150039637 A1, “Systems Apparatus and Methods for Determining Computer Apparatus Usage Via Processed Visual Indicia”, published 02/05/2015), hereinafter Neuhauser. Regarding claim 12, the combination of Gupte, Feiz, and Liu teaches the limitations of parent claim 1. However, the combination does not expressly teach wherein the received image data is a down-sampled version of a screen capture image generated by the display device. In the same field of endeavor, Neuhauser teaches a method of visually monitoring screens of devices via screen captures (“The present disclosure is directed to monitoring processor-based devices, such as cell phones, computer tablets, personal computers, laptops, and the like for device usage. More specifically, the present disclosure is directed to visually monitoring screens of devices to determine usage and/or media exposure” [Neuhauser ¶ 0001]; “Using a device such as the one disclosed in FIG. 1, a configuration may be arranged within applications module 124, or any other suitable module downloaded and/or stored in memory 118, to automatically generate screenshots on device 100. 
A screenshot, also known as screen dump, screen capture, screen grab, or print screen may be considered an image taken by the device to record the visible items displayed on the screen, monitor, television, or another visual output device… Once saved, the image file may be processed on the device 100 or transmitted externally for further processing” [Neuhauser ¶ 0029]) wherein the received image data is a down-sampled version of a screen capture image generated by the display device. (“Starting from 301, a screenshot image may be processed by convolving the image with a Gaussian kernel and successively down-sampling each direction by a predetermined amount (e.g., 1/2)” [Neuhauser ¶ 0039]) It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated wherein the received image data is a down-sampled version of a screen capture image generated by the display device as taught by Neuhauser into the combination because both Gupte and Neuhauser are directed towards methods of visually monitoring screens of devices via screen captures. As is commonly understood in the art, down-sampling images is an effective technique for reducing noise and highlighting important features (“Preferably, multiscale images are produced for edge detection using Gaussian pyramids which successively low-pass filter and down-sample the original image…As regions containing text will have significantly higher values of average edge density, strength and variance of orientations than those of non-text regions, these characteristics may be used to generate a feature map 305 which suppresses the false regions and enhances true candidate text regions” [Neuhauser ¶ 0041]) while also minimizing computational complexity due to the reduced dimensionality of the data. 
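The down-sampling step Neuhauser's quoted passage describes, convolving with a Gaussian kernel and then halving each dimension, can be illustrated with a minimal pure-Python sketch. The 3x3 kernel and function name below are illustrative assumptions; Neuhauser's Gaussian pyramid would simply repeat this blur-and-halve step once per pyramid level:

```python
# Minimal sketch (assumed kernel and names, not Neuhauser's code) of one
# Gaussian-pyramid level: blur a grayscale image with a 3x3 Gaussian
# approximation, then keep every other row and column (1/2 per direction).

GAUSSIAN_3X3 = [
    [1/16, 2/16, 1/16],
    [2/16, 4/16, 2/16],
    [1/16, 2/16, 1/16],
]

def gaussian_downsample(image):
    """Blur `image` (a list of lists of floats) and halve each dimension."""
    h, w = len(image), len(image[0])

    def blurred(y, x):
        total = 0.0
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                # Clamp indices at the borders (replicate-edge padding).
                yy = min(max(y + dy, 0), h - 1)
                xx = min(max(x + dx, 0), w - 1)
                total += GAUSSIAN_3X3[dy + 1][dx + 1] * image[yy][xx]
        return total

    # Sample the blurred image at every other pixel in each direction.
    return [[blurred(y, x) for x in range(0, w, 2)] for y in range(0, h, 2)]
```

Because the kernel coefficients sum to 1, a uniform image passes through unchanged; a production system would typically use an optimized routine (e.g., an image library's pyramid primitive) rather than this nested-loop sketch.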
Therefore, a person of ordinary skill in the art would recognize the value of incorporating the down-sampling techniques taught by Neuhauser into Gupte to achieve these known benefits.

Response to Arguments

The remarks filed 11/04/2025 have been fully considered. Applicant’s remarks [Remarks pages 8-10] traversing the non-eligible subject matter rejections under 35 U.S.C. 101 set forth in the office action mailed 08/04/2025, in view of claims 1, 7-9, 13, 17, and 19-20 as amended, have been considered but are not persuasive. Applicant alleges that the claims are not directed to an abstract idea but to a specific, practical application that improves the functioning of computer systems for managing networked display devices. The examiner respectfully disagrees. Applicant is directed towards the grounds of rejection under 35 U.S.C. 101 with respect to amended claims 1, 7-9, 13, 17, and 19-20 set forth above. Applicant’s arguments are further summarized and addressed below.

Applicant argues that the claimed invention employs classification as a crucial input to a technical control loop that directly and automatically alters the operational state of a physical, remote machine, and that the manipulation of the state of a machine moves the claims beyond the realm of abstract thought. In response, the examiner notes that claims can recite a mental process even if they are claimed as being performed on a computer or in a computer environment (see MPEP § 2106.04(a)(2)(III)(C)). While the claims at issue do invoke generic computer hardware to perform limitations within a display environment, the claims as a whole are not materially directed towards any type of specific or technical computer implementation, but rather towards a procedure of observing image capture of a device display, making a determination on status of the device based on the display, and coming up with possible actions for resolving any identified issues. 
As explained in the rejection of record, this procedure of analysis falls within the scope of what a human would be able to perform in the human mind based on pure observation and reasoning. While the claims do recite the idea of an eventual solution achieved via interaction between devices in a computer environment, they fail to recite any specific, technical details to explain how such a solution is accomplished beyond mere execution of a generic machine learning model and mere transmission of data. Generic steps of computer implementation thereby do not absolve the claims from still being abstract in nature. The examiner further notes that although the current claimed procedure ends with “send[ing] instructions to a device to perform the selected corrective action”, and does not expressly include the actual execution of received instructions on the device itself, even simply appending such a step of execution (absent specific, technical details of implementation) would likewise be considered as nothing more than a tangential post-solution implementation step of “applying” the results of an abstract procedure within a conventional computing environment.

Applicant argues that the claimed invention addresses inefficiency and delay in manual display monitoring by transforming a general-purpose computer into an automated, self-correcting device-management and diagnostic tool. The recited machine learning model, trained specifically on visual screen states to automatically trigger targeted, hardware-level remediation actions on a remote device, is allegedly a non-conventional arrangement due to the ordered combination of receiving image data, using a specifically trained ML model for status classification, and initiating an automated, closed-loop remediation command to remote hardware. 
In response, the examiner notes that mere invocation of a generic machine learning model trained on input data to “automatically” perform what could otherwise be performed mentally is inadequate to suggest improvement. Merely claiming the improved speed or efficiency that is inherent to invoking a machine learning model on computer hardware, without any details to explain how the machine learning model at issue is specifically leveraged to achieve such an efficiency (beyond mere convention of utilizing training data to learn class predictions) or specifically improve processes of an underlying computer beyond mere execution, fails to lift the claims from abstraction or recite anything beyond what would be recognized as routine and conventional by one of ordinary skill in the art.

Applicant has not presented further arguments with respect to the dependent claims. As such, amended claims 1, 7-9, 13, 17, and 19-20 stand rejected under 35 U.S.C. 101.

Applicant’s remarks [Remarks pages 10-12] traversing the anticipation rejections under 35 U.S.C. 102 set forth in the office action mailed 08/04/2025, with respect to claims 1, 7-9, 13, 17, and 19-20 as amended, have been considered, but are moot because the new grounds of rejection set forth above do not rely on the reference(s) applied in the prior rejection of record for the subject matter being specifically challenged in applicant's argument.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Chiatti et al. (“Guess What’s On My Screen? Clustering Smartphone Screenshots with Active Learning”, available arXiv 10 Jan 2019) discloses a framework for combining K-Means clustering with Active Learning for efficient leveraging of labeled and unlabeled samples, with the goal of discovering latent classes and describing a large collection of screenshot data. Packevičius et al. 
(“Automated Visual Testing of Application User Interfaces Using Static Analysis of Screenshots”, published 2021) discloses an automated visual testing method for user interfaces that utilizes a classification scheme for visual defects, and tests applications on different devices with varying hardware and software parameters. Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. Any inquiry concerning this communication or earlier communications from the examiner should be directed to VIJAY M BALAKRISHNAN whose telephone number is (571) 272-0455. The examiner can normally be reached 10am-5pm EST Mon-Thurs. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, JENNIFER WELCH can be reached on (571) 272-7212. 
The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /V.M.B./ Examiner, Art Unit 2143 /JENNIFER N WELCH/Supervisory Patent Examiner, Art Unit 2143

Prosecution Timeline

Jul 28, 2022: Application Filed
Jul 31, 2025: Non-Final Rejection (§101, §102, §103)
Nov 04, 2025: Response Filed
Feb 06, 2026: Final Rejection (§101, §102, §103) (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12585912: GATED LINEAR CONTEXTUAL BANDITS (granted Mar 24, 2026; 2y 5m to grant)
Patent 12468967: METHOD AND SYSTEM FOR GENERATING A SOCIO-TECHNICAL DECISION IN RESPONSE TO AN EVENT (granted Nov 11, 2025; 2y 5m to grant)

Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 43%
With Interview: 99% (+85.7% lift)
Median Time to Grant: 3y 12m
PTA Risk: Moderate

Based on 14 resolved cases by this examiner. Grant probability derived from career allow rate.
