Prosecution Insights
Last updated: April 19, 2026
Application No. 18/579,756

Machine Learning for Automated Navigation of User Interfaces

Non-Final OA (§103, §112)
Filed: Jan 16, 2024
Examiner: SILVERMAN, SETH ADAM
Art Unit: 2172
Tech Center: 2100 (Computer Architecture & Software)
Assignee: Google LLC
OA Round: 2 (Non-Final)
Grant Probability: 73% (Favorable)
Expected OA Rounds: 2-3
Time to Grant: 2y 4m
Grant Probability With Interview: 88%

Examiner Intelligence

Career Allow Rate: 73% (327 granted / 449 resolved), +17.8% vs TC avg (above average)
Interview Lift: +14.8% on resolved cases with an interview (moderate)
Typical Timeline: 2y 4m average prosecution
Currently Pending: 47 applications
Career History: 496 total applications across all art units
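A quick arithmetic check of the panel's headline numbers (values taken from the figures above; the dashboard rounds to whole percentages):

```python
granted, resolved = 327, 449        # examiner's career totals shown above
allow_rate = granted / resolved     # career allow rate (~0.728)
interview_lift = 0.148              # reported allow-rate lift with an interview

print(round(allow_rate * 100))                       # 73
print(round((allow_rate + interview_lift) * 100))    # 88
```

The displayed 73% and 88% figures are consistent with the underlying 327/449 counts and the +14.8% interview lift.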

Statute-Specific Performance

§101: 8.9% (-31.1% vs TC avg)
§103: 58.5% (+18.5% vs TC avg)
§102: 20.1% (-19.9% vs TC avg)
§112: 9.4% (-30.6% vs TC avg)

Tech Center averages are estimates. Based on career data from 449 resolved cases.

Office Action

Rejections: §103, §112
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement

The information disclosure statements (IDS) submitted on 1/16/2024, 4/19/2024, 6/26/2024, and 8/7/2025 were filed before the first office action. The submissions are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.

Response to Arguments

Applicant's arguments, filed 1/14/2026, with respect to the rejection of claim 1 under 35 USC 102 have been fully considered and are persuasive. Therefore, the rejection has been withdrawn. However, upon further consideration, a new ground of rejection under 35 USC 103 is made in view of the combination of Harries and Williams, wherein Williams has been added to cure the deficiencies of Harries. Accordingly, this action has been reset to a Non-Final Office action.

Response to Amendment

The previous 112(b) rejections to claim and 10 have been removed in view of the applicant's amendments.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claim 20 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 20 is rejected for indefinite language that neither elaborates on the computation/determination of the loss, nor on how the determination of this loss affects the navigation of the machine-learned interface navigation model. In line with the wording of present claim 20, the masking operation may not have any effect at all.

Claim Rejection Notes

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3, 4, 6, 9, 11-14, 16-19, and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Harries et al. ("DRIFT: Deep Reinforcement Learning for Functional Software Testing", 33rd Deep Reinforcement Learning Workshop (NeurIPS 2019), 12/8/2019, pages 1-10, XP093006546, Vancouver, Canada, DOI: 10.48550/arxiv.2007.08220, URL: https://www.microsoft.com/en-us/research/uploads/prod/2020/02/DRIFT_26_CameraReadySubmission_NeurlPS_DRL.pdf), in view of Williams et al. (US 20070033172 A1, published 2/8/2007).

Claim 1.
(Currently Amended): Harries teaches a computing system configured to navigate user interfaces using machine learning, the computing system comprising: one or more processors; and one or more non-transitory computer-readable media that collectively store instructions that, when executed by the one or more processors, cause the computing system to perform operations (this computing system performs a method referred to as Deep Reinforcement Learning for Functional Software Testing (DRIFT), as can be seen from the title and the abstract; the method is described in more detail in section 3 on pages 2-6, and its computer implementation follows, e.g., from "PyTorch" in the second paragraph of section 4 on page 6 [Harries]), the operations comprising: obtaining, by the computing system, user interface data descriptive of a user interface that comprises a plurality of user interface elements ("UITree" in the first two paragraphs of section 3.1 and FIG. 1 [Harries]); generating, by the computing system and based on the user interface data, a plurality of element embeddings respectively for the plurality of user interface elements (the first two paragraphs of the subsection "Modeling the state using Graph Neural Networks" of section 3.2 on page 4, from which it follows that the element embeddings are represented as respective element embedding vectors [Harries]); processing, by the computing system, the plurality of element embeddings with a machine-learned interface navigation model to generate a selected action as an output of the machine-learned interface navigation model ("and then apply the graph neural network" in the first paragraph of the subsection "Modeling the state using Graph Neural Networks" of section 3.2 on page 4; also the third paragraph spanning pages 4 and 5 [Harries]), wherein the machine-learned interface navigation model selects the selected action from a predefined action space comprising a plurality of predefined candidate actions (paragraph spanning pages 5 and 6 [Harries]); and performing, by the computing system, the selected action on the user interface (paragraph spanning pages 5 and 6 [Harries]).

Harries does not teach wherein the machine-learned interface navigation model is further configured to receive data descriptive of a query as an input alongside the plurality of element embeddings, wherein the query indicates a desired result of machine interaction with the user interface. However, Williams teaches this limitation (by highlighting various commands and user interface elements as the user enters query text, and by providing additional distinctive highlighting in response to navigation within the menu containing search results, the present invention provides the user with a quick mechanism for learning how to find commands and other user interface elements [Williams, 0087]). Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the machine-learned element-embedding user interface invention of Harries to include the query features of Williams. One would have been motivated to make this modification to allow both novice and advanced users to navigate a user interface, and to provide a mechanism that avoids the need for users to memorize the location of commands in a user interface (Williams, 0006).

Claim 21, having similar elements to claim 1, is likewise rejected.

Claim 3.
(Currently Amended): The combination of Harries and Williams teaches the computing system of claim 12. Harries further teaches wherein the query does not reference any of the plurality of predefined candidate actions (the query input is a scalar reward, and this reward does not reference any of the plurality of predefined actions [Harries, section 2, par. 3, pg. 2]).

Claim 4 (Currently Amended): The combination of Harries and Williams teaches the computing system of claim 1. Harries further teaches wherein the query comprises a single instruction, and wherein the computing system is configured to perform a plurality of actions from the predefined action space in response to the single instruction (the agent has a policy π which selects an action given the state, π(a|s) = P[A = a | S = s]; the action is passed to the environment, and the state is updated using the transition function T [Harries, section 2, paragraph beginning with "Reinforcement Learning", pg. 2]).

Claim 6: The combination of Harries and Williams teaches the computing system of claim 1. Harries further teaches wherein the user interface data comprises structural metadata descriptive of a structure of the user interface ("UITree" [Harries, first two paragraphs of section 3.1, pg. 1, FIG. 1]).

Claim 9: The combination of Harries and Williams teaches the computing system of claim 1. Harries further teaches wherein processing, by the computing system, the plurality of element embeddings with the machine-learned interface navigation model to generate the selected action further comprises processing, by the computing system, the plurality of element embeddings with the machine-learned interface navigation model to further generate an element index and an argument as an output of the machine-learned interface navigation model, wherein the element index identifies one of the plurality of user interface elements as a target of the selected action ("node identified", "action type" [Harries, fourth paragraph of section 3.1, pg. 3]).

Claim 11: The combination of Harries and Williams teaches the computing system of claim 1. Harries further teaches wherein the machine-learned interface navigation model comprises: a first attention model configured to perform self-attention on the plurality of element embeddings to generate a plurality of first intermediate embeddings; a second attention model configured to perform attention between a query embedding and the plurality of first intermediate embeddings to generate one or more second intermediate embeddings ([Harries, paragraphs spanning pgs. 4-5, and paragraphs that follow]); and one or more prediction heads configured to process the one or more second intermediate embeddings to generate one or more predictions, the one or more prediction heads comprising at least an action prediction head configured to select the selected action from the plurality of predefined candidate actions (Examiner's Official Notice: in the field it is common practice to try out different structures for a neural network at hand, and neural networks, in particular so-called Transformers, whose structure is characterized by an attention mechanism, are frequently encountered).

Claim 12: The combination of Harries and Williams teaches the computing system of claim 1.
Harries further teaches wherein the machine-learned interface navigation model comprises a reinforcement learning agent ([Harries, last paragraph of section 1, pg. 2]).

Claim 13 (Currently Amended): Harries teaches a computer-implemented method to train a machine-learned interface navigation model to navigate interfaces (algorithm 1 on page 5; in the last paragraph of section 1 on page 2, the interface navigation model is referred to as an agent, and the method's computer implementation follows from "PyTorch" in the second paragraph of section 4 on page 6 [Harries]), the method comprising: obtaining, by a computing system comprising one or more computing devices, user interface data descriptive of a user interface that comprises a plurality of user interface elements ("UITree" in the first two paragraphs of section 3.1 and FIG. 1 [Harries]); generating, by the computing system and based on the user interface data, a plurality of element embeddings respectively for the plurality of user interface elements (the first two paragraphs of the subsection "Modeling the state using Graph Neural Networks" of section 3.2 on page 4, from which it follows that the element embeddings are represented as respective element embedding vectors [Harries]); processing, by the computing system, the plurality of element embeddings with the machine-learned interface navigation model to generate a selected action as an output of the machine-learned interface navigation model ("and then apply the graph neural network" in the first paragraph of the subsection "Modeling the state using Graph Neural Networks" of section 3.2 on page 4; also the third paragraph spanning pages 4 and 5 [Harries]), wherein the machine-learned interface navigation model selects the selected action from a predefined action space comprising a plurality of predefined candidate actions (paragraph spanning pages 5 and 6 [Harries]); determining, by the computing system, a reward based at least in part on the selected action ("get_reward" in algorithm 1 [Harries, pg. 5]); and modifying, by the computing system, one or more values of one or more parameters of the machine-learned interface navigation model based at least in part on the reward ("train(batch, …)" in algorithm 1 [Harries, pg. 5]).

Claim 14 (Currently Amended): The combination of Harries and Williams teaches the computer-implemented method of claim 13. Williams further teaches wherein the query corresponds to a user utterance (voice command [Williams, 0049]).

Claim 16: The combination of Harries and Williams teaches the computer-implemented method of claim 13. Harries further teaches wherein: processing, by the computing system, the plurality of element embeddings with the machine-learned interface navigation model to generate the selected action further comprises processing, by the computing system, the plurality of element embeddings with the machine-learned interface navigation model to further generate an element index and an argument as an output of the machine-learned interface navigation model, wherein the element index identifies one of the plurality of user interface elements as a target of the selected action; and performing, by the computing system, the selected action comprises performing the selected action on the identified user interface element in accordance with the argument ("node identified" and "action type" [Harries, section 3.1, fourth paragraph, pg. 3]).

Claim 17: The combination of Harries and Williams teaches the computer-implemented method of claim 13.
Harries further teaches wherein the user interface data comprises augmented user interface data generated by performance of one or more augmentation operations on existing user interface training data ([Harries, section 3.1, second paragraph, pg. 3]; Examiner's Note: GUIs obtained from other GUIs by performing one or more augmentation operations on the latter are frequently encountered).

Claim 18: The combination of Harries and Williams teaches the computer-implemented method of claim 17. Harries further teaches wherein the one or more augmentation operations comprise modifying texts or locations of one or more user interface elements that have been classified as irrelevant ([Harries, section 3.1, second paragraph, pg. 3]; Examiner's Note: GUIs obtained from other GUIs by performing one or more augmentation operations on the latter are frequently encountered).

Claim 19: The combination of Harries and Williams teaches the computer-implemented method of claim 13. Harries further teaches wherein determining, by the computing system, a reward based at least in part on the selected action comprises comparing, by the computing system, the selected action to a demonstration action that was included in a human demonstration (function call "episode_meets_objective(…)" in algorithm 1 [Harries, pg. 5]).

Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Harries et al. ("DRIFT: Deep Reinforcement Learning for Functional Software Testing", 33rd Deep Reinforcement Learning Workshop (NeurIPS 2019), 12/8/2019, pages 1-10, XP093006546, Vancouver, Canada, DOI: 10.48550/arxiv.2007.08220, URL: https://www.microsoft.com/en-us/research/uploads/prod/2020/02/DRIFT_26_CameraReadySubmission_NeurlPS_DRL.pdf) and Williams et al. (US 20070033172 A1, published 2/8/2007), and further in view of Chen et al. ("From UI Design Image to GUI Skeleton: A Neural Machine Translator to Bootstrap Mobile GUI Implementation", Proceedings of ICSE '18: 40th International Conference on Software Engineering, 27 May 2018, XP093006555, DOI: 10.1145/3180155.3180222, retrieved from the Internet: URL: https://ieeexplore.ieee.org/stampPDF/getPDF.jsp?tp=&arnumber=8453135&ref=aHROCHM6Ly9pZWVIeHBsb3JILmIIZWUub3JUnL2RvY3ViZW50LZgONTMxMzU=).

Claim 5: The combination of Harries and Williams teaches the computing system of claim 1. Harries further teaches wherein: the user interface data comprises imagery that depicts the user interface; and generating, by the computing system, the plurality of element embeddings comprises one or more of the following: performing optical character recognition on the imagery; processing the imagery with an icon recognition model; and processing the imagery with an image detection model (as part of a disadvantageous modification of the system [Harries, third paragraph of section 3.1, pg. 3]). Harries does not explicitly teach an icon recognition model. However, Chen teaches an icon recognition model ([Chen, FIG. 3]). Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the machine-learned element-embedding user interface invention of the combination of Harries and Williams to include the icon recognition model features of Chen. One would have been motivated to make this modification to further narrow the machine-learned elements of Harries with the narrower machine learning of icon recognition, which would improve said machine learning by expanding its abilities.

Claims 7, 8, and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Harries et al.
("DRIFT: Deep Reinforcement Learning for Functional Software Testing", 33rd Deep Reinforcement Learning Workshop (NeurIPS 2019), 12/8/2019, pages 1-10, XP093006546, Vancouver, Canada, DOI: 10.48550/arxiv.2007.08220, URL: https://www.microsoft.com/en-us/research/uploads/prod/2020/02/DRIFT_26_CameraReadySubmission_NeurlPS_DRL.pdf) and Williams et al. (US 20070033172 A1, published 2/8/2007), and further in view of Eskonen et al. ("Automating GUI Testing with Image-Based Deep Reinforcement Learning", 2020 IEEE International Conference on Autonomic Computing and Self-Organizing Systems (ACSOS), 1 August 2020, pages 160-167, XP093006561, DOI: 10.1109/ACSOS49614.2020.00038, ISBN: 978-1-7281-7277-4, retrieved from the Internet: URL: https://ieeexplore.ieee.org/stampPDF/getPDF.jsp?tp=&arnumber=9196452&ref=).

Claim 7: The combination of Harries and Williams teaches the computing system of claim 1. Harries does not teach wherein the selected action comprises a macro action that comprises a sequence of two or more component actions. However, Eskonen teaches this limitation (first paragraph on page 163, left column, and second paragraph on page 165, left column [Eskonen]). Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the machine-learned element-embedding user interface invention of the combination of Harries and Williams to include the macro action features of Eskonen. One would have been motivated to make this modification to improve the machine learning process of the former.

Claim 8: The combination of Harries and Williams teaches the computing system of claim 7. Harries does not teach wherein the macro action comprises a focus and type action in which an argument is entered into a data entry field of the user interface. However, Eskonen teaches this limitation (first paragraph on page 163, left column, and second paragraph on page 165, left column [Eskonen]). Therefore, it would have been obvious, for the same reasons given for claim 7, to include the macro action features of Eskonen in the combination of Harries and Williams.

Claim 15: The combination of Harries and Williams teaches the computer-implemented method of claim 13. Harries does not teach wherein the selected action comprises a macro action that comprises a sequence of two or more component actions. However, Eskonen teaches this limitation (first paragraph on page 163, left column, and second paragraph on page 165, left column [Eskonen]). It would likewise have been obvious to include the macro action features of Eskonen for the reasons given for claim 7.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SETH A SILVERMAN, whose telephone number is (571) 272-9783. The examiner can normally be reached Mon-Thur, 8AM-4PM MST. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Adam Queler, can be reached at (571) 272-4140. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Seth A Silverman/
Primary Examiner, Art Unit 2172
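For readers less familiar with the cited art, the training method recited in claim 13 (embed the UI elements, select an action from a predefined action space, determine a reward from the selected action, then modify the model's parameters based on that reward) follows a standard reinforcement learning pattern. The sketch below is a minimal REINFORCE-style toy, not the DRIFT implementation or the application's disclosed model; the action names, dimensions, pooling step, and reward rule are all invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

DIM, N_ELEMENTS = 8, 5                           # toy sizes, chosen arbitrarily
ACTIONS = ["click", "scroll", "focus", "type"]   # hypothetical predefined action space

def policy_probs(theta, elements):
    """pi(a|s): mean-pool the element embeddings, then score each candidate action."""
    pooled = elements.mean(axis=0)
    logits = theta @ pooled
    exp = np.exp(logits - logits.max())
    return exp / exp.sum(), pooled

theta = np.zeros((len(ACTIONS), DIM))            # parameters of the toy navigation model
elements = rng.normal(size=(N_ELEMENTS, DIM))    # one embedding per UI element
lr = 0.5

for _ in range(300):
    probs, pooled = policy_probs(theta, elements)
    a = rng.choice(len(ACTIONS), p=probs)        # select an action from the action space
    reward = 1.0 if ACTIONS[a] == "click" else 0.0  # toy reward: pretend "click" meets the objective
    # REINFORCE: grad of log pi(a|s) w.r.t. theta is (one_hot(a) - probs) outer pooled
    grad = -np.outer(probs, pooled)
    grad[a] += pooled
    theta += lr * reward * grad                  # modify parameters based on the reward

probs, _ = policy_probs(theta, elements)
print(ACTIONS[int(np.argmax(probs))])            # the trained policy now favors "click"
```

The loop mirrors the "get_reward" / "train(batch, …)" steps the examiner cites from Harries' algorithm 1 only in outline; the real system uses graph neural networks over a UITree rather than mean pooling.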

Prosecution Timeline

Jan 16, 2024 - Application Filed
Oct 09, 2025 - Non-Final Rejection (§103, §112)
Jan 08, 2026 - Examiner Interview Summary
Jan 08, 2026 - Applicant Interview (Telephonic)
Jan 14, 2026 - Response Filed
Feb 26, 2026 - Non-Final Rejection (§103, §112) (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12587581 - SYSTEMS, METHODS, AND MEDIA FOR CAUSING AN ACTION TO BE PERFORMED ON A USER DEVICE (granted Mar 24, 2026; 2y 5m to grant)
Patent 12579201 - INFORMATION PROCESSING SYSTEM (granted Mar 17, 2026; 2y 5m to grant)
Patent 12578200 - NAVIGATIONAL USER INTERFACES (granted Mar 17, 2026; 2y 5m to grant)
Patent 12572269 - PERFORMING A CONTROL OPERATION BASED ON MULTIPLE TOUCH POINTS (granted Mar 10, 2026; 2y 5m to grant)
Patent 12572261 - SPATIAL NAVIGATION AND CREATION INTERFACE (granted Mar 10, 2026; 2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 2-3
Grant Probability: 73%
With Interview: 88% (+14.8%)
Median Time to Grant: 2y 4m
PTA Risk: Moderate

Based on 449 resolved cases by this examiner. Grant probability derived from the career allow rate.
