Prosecution Insights
Last updated: April 19, 2026
Application No. 18/967,935

LEARNED COMPUTER CONTROL USING POINTING DEVICE AND KEYBOARD ACTIONS

Status: Final Rejection — §103
Filed: Dec 04, 2024
Examiner: LUBIT, RYAN A
Art Unit: 2626
Tech Center: 2600 — Communications
Assignee: Deepmind Technologies Limited
OA Round: 2 (Final)

Grant Probability: 63% (Moderate)
Expected OA Rounds: 3-4
Time to Grant: 2y 4m
Grant Probability With Interview: 99%

Examiner Intelligence

Career Allow Rate: 63% — grants 63% of resolved cases (476 granted / 756 resolved; +1.0% vs TC avg)
Interview Lift: +38.6% — strong lift, measured over resolved cases with an interview
Typical Timeline: 2y 4m average prosecution; 18 applications currently pending
Career History: 774 total applications across all art units

Statute-Specific Performance

§101: 4.3% (-35.7% vs TC avg)
§102: 19.9% (-20.1% vs TC avg)
§103: 45.3% (+5.3% vs TC avg)
§112: 23.1% (-16.9% vs TC avg)
Black line = Tech Center average estimate • Based on career data from 756 resolved cases

Office Action

§103
DETAILED ACTION

Status of the Application

1. Applicant's Amendment to the Claims filed January 23, 2026 is received and entered.

2. Claim 1 is cancelled. Claims 2, 7, 11, 16, and 22 are amended. Claims 2 – 26 are pending and are under examination in this action.

3. The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments / Amendments

4. The rejections of claims 7 and 11 – 26 under 35 USC 112(b) are WITHDRAWN in view of the Amendment.

5. The rejections of claims 11 – 26 under 35 USC 101 are WITHDRAWN in view of the Amendment.

6. On page 8 of the Response, Applicant argues that the newly added subject matter of claims 2, 11, and 22 is not taught or suggested by the prior art of record. Applicant's arguments have been fully considered and are persuasive in view of the newly added subject matter. However, upon further consideration, a new ground of rejection is made in view of Shimizu et al. (U.S. Pub. 2022/0398497).

Claim Rejections - 35 USC § 103

7. The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

8. Claims 2 – 5, 7 – 8, 10 – 14, 16 – 17, 19 – 23, and 25 – 26 are rejected under 35 U.S.C. 103 as being unpatentable over Croxford et al. (U.S. Pub. 2022/0308667) in view of Shimizu et al. (U.S. Pub. 2022/0398497).
Regarding claim 2, Croxford teaches: a computer-implemented method for controlling a particular computer to execute a task, the method comprising:

receiving a control input that represents at least a current state of the particular computer (FIG. 3; paragraphs [0062], [0063], [0077], [0084]; console 32 receives instructions from controller 1 based on user actions [control inputs] to control a current state of a game being executed by console 32. Console 32 and controller 1 together are interpreted as the “particular computer”);

processing the control input using a neural network to generate one or more control outputs that are used to control the particular computer to execute the task, wherein the one or more control outputs comprise an action type output that specifies a set of possible actions, the set of possible actions comprising game controller button press actions (FIG. 1; paragraphs [0023], [0042], [0048]; user actions [control inputs] applied to controller 1 are processed using a neural network to infer gaming inputs. A variety of types of user actions [control inputs] may be applied to control the console 32 [part of the particular computer] to perform tasks. Specifically, gaming controller 1 may include a four-way controller 11, a joystick 12, triggers, touch screens, clickers, keyboard, mouse, eye-trackers, etc., for detecting user actions [control inputs]. These different types of inputs are part of a “set of possible actions” which, due to being used with a gaming controller 1, are “game controller button press actions”); and

executing the one or more actions to control the particular computer (FIG. 6; paragraphs [0074] – [0077]; user actions predicted as accurate by a combination of user input and the neural network are executed in step s65 to perform corresponding functions by the console 32).
Croxford fails to explicitly disclose: the action type output specifies a probability distribution over a set of possible actions; and determining one or more actions based on the probability distribution specified by the action type output.

However, in a related field of endeavor, Shimizu discloses using a neural network to determine a selected action (Abstract). With regard to claim 2, Shimizu teaches: the action type output specifies a probability distribution over a set of possible actions; and determining one or more actions based on the probability distribution specified by the action type output (paragraph [0096]; outputs from a neural network are converted to a probability distribution which is used to select a particular action based on the probability distribution).

It would have been obvious to a person of ordinary skill in the art before the effective filing date of Applicant's claimed invention to combine the known teachings of Croxford and Shimizu to yield predictable results. Specifically, it would have been obvious to modify the neural network processing of Croxford to include the utilization of a probability distribution to select a particular action corresponding to an applied input, as taught by Shimizu. Such a modification of Croxford only requires using a known feature of neural networks, i.e., a probability distribution, in a known manner to yield predictable results. Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of Applicant's claimed invention to combine the known teachings of Croxford and Shimizu to yield the aforementioned predictable results.

Regarding claim 11, Croxford teaches: an apparatus comprising a particular computer (FIG. 3; console 32 and controller 1 together are interpreted as an “apparatus” that comprises a “particular computer”) and one or more storage devices storing instructions that, when executed by the particular computer, cause the particular computer to be configured to (FIG. 2; paragraphs [0040], [0044]; the processes disclosed herein may be implemented by execution [such as via processor 21] of a program stored on a computer-readable storage medium [such as memory 22]):

determine a control input that represents at least a current state of the particular computer (FIG. 3; paragraphs [0062], [0063], [0077], [0084]; console 32 receives instructions from controller 1 based on user actions [control inputs] to control a current state of a game being executed by console 32. Console 32 and controller 1 together are interpreted as the “particular computer”);

send the control input to a neural network system implemented by one or more computers (FIGS. 6, 9; paragraphs [0051], [0074], [0095]; at step s62, the user actions [control inputs] are sent to a recurrent neural network [RNN] to predict user input. The RNN can be implemented on cloud server 91 [one or more remote computers]);

receive, from the neural network system, one or more control outputs, wherein the one or more control outputs comprise an action type output that specifies a set of possible actions comprising one or more game controller button press actions (FIGS. 1, 6; paragraphs [0023], [0042], [0048], [0074] – [0077]; user actions [control inputs] applied to controller 1 are processed using a neural network to infer gaming inputs. A variety of types of user actions [control inputs] may be applied to control the console 32 [part of the particular computer] to perform tasks. Specifically, gaming controller 1 may include a four-way controller 11, a joystick 12, triggers, touch screens, clickers, keyboard, mouse, eye-trackers, etc., for detecting user actions [control inputs].
These different types of inputs are part of a “set of possible actions” which, due to being used with a gaming controller 1, are “game controller button press actions”. The RNN outputs a predicted action in step s64 for execution in step s65); and

execute the one or more actions to control the particular computer (FIG. 6; paragraphs [0074] – [0077]; user actions predicted as accurate by a combination of user input and the neural network are executed in step s65 to perform corresponding functions by the console 32).

Croxford fails to explicitly disclose: the action type output specifies a probability distribution over a set of possible actions; and determine one or more actions based on the probability distribution specified by the action type output.

However, Shimizu teaches: the action type output specifies a probability distribution over a set of possible actions; and determine one or more actions based on the probability distribution specified by the action type output (paragraph [0096]; outputs from a neural network are converted to a probability distribution which is used to select a particular action based on the probability distribution). It would have been obvious to a person of ordinary skill in the art before the effective filing date of Applicant's claimed invention to combine the known teachings of Croxford and Shimizu to yield predictable results for at least the reasons set forth above with regard to claim 2.

Regarding claim 22, Croxford teaches: a mobile device comprising a touchscreen, one or more processors, and one or more storage devices (FIG. 10; paragraph [0100]; smartphone 104 inherently includes a touch screen, a processor, and a memory of some sort) storing instructions that, when executed by the one or more processors, cause the one or more processors to be configured to (FIG. 2; paragraphs [0040], [0044]; the processes disclosed herein may be implemented by execution [such as via processor 21] of a program stored on a computer-readable storage medium [such as memory 22]):

determine a control input that represents at least a current state of the mobile device (FIG. 10; paragraphs [0042], [0100]; since smartphone 104 includes a touchscreen, touch gestures applied thereto represent a current state of the smartphone 104);

send the control input to a neural network system implemented by one or more computers (FIG. 10; paragraphs [0051], [0095], [0112]; the user's inputs, including applied touch gestures, are sent to a recurrent neural network [RNN] to predict user input. As set forth above, the RNN can be implemented on cloud server 91 [one or more remote computers]);

receive, from the neural network system, one or more control outputs, wherein the one or more control outputs comprise an action type output that specifies a set of possible actions comprising one or more touchscreen actions (FIGS. 6, 10; paragraphs [0023], [0042], [0048], [0074] – [0077], [0112]; the process of FIG. 6 is applicable to all user inputs applied to the extended reality system 100 of FIG. 10. As set forth above, user actions [control inputs] applied to a device, such as smartphone 104, are processed using a neural network to infer inputs. Since smartphone 104 inherently includes a touch screen for accepting user-applied touch gestures, a particular touch gesture is a touchscreen action that is part of a “set of possible actions”); and

execute the one or more actions to control the mobile device (FIGS. 6, 10; paragraphs [0074] – [0077], [0112]; the process of FIG. 6 is applicable to all user inputs applied to the extended reality system 100 of FIG. 10.
Accordingly, user actions predicted as accurate by a combination of touch screen gesture input and the neural network are executed in step s65 to perform corresponding functions by the smartphone 104).

Croxford fails to explicitly disclose: the action type output specifies a probability distribution over a set of possible actions; and determine one or more actions based on the probability distribution specified by the action type output.

However, Shimizu teaches: the action type output specifies a probability distribution over a set of possible actions; and determine one or more actions based on the probability distribution specified by the action type output (paragraph [0096]; outputs from a neural network are converted to a probability distribution which is used to select a particular action based on the probability distribution). It would have been obvious to a person of ordinary skill in the art before the effective filing date of Applicant's claimed invention to combine the known teachings of Croxford and Shimizu to yield predictable results for at least the reasons set forth above with regard to claim 2.

Regarding claims 3 and 12, Croxford teaches: wherein the set of possible actions comprises one or more analogue stick actions (FIG. 1; paragraphs [0023], [0042], [0048]; as set forth above, the “set of possible actions” includes input via a joystick 12 [analogue stick actions]).

Regarding claims 4 and 13, Croxford teaches: wherein the set of possible actions comprises one or more touchpad actions (FIG. 1; paragraphs [0023], [0042], [0048]; as set forth above, the “set of possible actions” includes input via a touchscreen [touchpad actions]).

Regarding claims 5 and 14, Croxford teaches: wherein the set of possible actions comprises one or more eye tracking device actions (FIG. 1; paragraphs [0023], [0042], [0048]; as set forth above, the “set of possible actions” includes input via an eye-tracking sensor [eye tracking device actions]).
Regarding claims 7 and 16, Croxford teaches: wherein the control input comprises a visual input that comprises one or more screen frames of a computer display (paragraph [0036]; the user actions include predicting advance frames of display [visual input] based on the predicted user actions), wherein a screen frame in the visual input is an image that represents a step in a process of executing the task on the particular computer (paragraph [0036]; the predictive rendering of advance frames of display [visual input] is based on the predicted action(s) and is therefore a step in the process of executing the predicted user action).

Regarding claims 8, 17, and 23, Croxford teaches: wherein the control input further comprises one or more language inputs, one or more previous controls, or both (paragraph [0079]; the user actions [control inputs] are based on previous predicted user actions [previous controls]).

Regarding claims 10, 20, and 25, Croxford teaches: wherein the one or more computers are remote from the particular computer (FIG. 9; paragraph [0095]; the RNN can be implemented remotely on cloud server 91 [one or more computers]).

Regarding claim 19, Croxford teaches: wherein the apparatus is a game console (FIG. 3; console 32).

Regarding claims 21 and 26, Croxford teaches: wherein the control input comprises a visual input that comprises one or more screen frames of a computer display (paragraph [0036]; the user actions include predicting advance frames of display [visual input] based on the predicted user actions).

9. Claims 6 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Croxford in view of Shimizu, as applied to claims 2 and 11 above, as evidenced by Weng et al. (U.S. Pub. 2014/0058584).

Regarding claims 6 and 15, neither Croxford nor Shimizu explicitly discloses: wherein the set of possible actions comprises one or more infrared pointer actions.
However, it was well-known and conventional for user input devices to include game-pad inputs, joystick inputs, keypads, infrared pointers, etc. For evidence, please see paragraph [0023] of Weng. Additionally, gaming controllers having infrared pointers have been well-known since at least the Nintendo Wii, released on November 19, 2006.

It would have been obvious to a person of ordinary skill in the art before the effective filing date of Applicant's claimed invention to modify the combination of the known teachings of Croxford and Shimizu to yield predictable results. Specifically, it would have been obvious to modify the gaming controller of Croxford to include an infrared pointer as another source of user input. Such a modification only requires using well-known and conventional teachings in the art of a user input device including an infrared pointer or a gaming controller including an infrared pointer. This combination would merely increase the number of potential user input sources of the gaming controller of Croxford in a predictable manner using well-known teachings. Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of Applicant's claimed invention to modify the combination of the known teachings of Croxford and Shimizu to yield the aforementioned predictable results.

10. Claims 9, 18, and 24 are rejected under 35 U.S.C. 103 as being unpatentable over Croxford in view of Shimizu, as applied to claims 8, 17, and 22 above, in view of Kessler et al. (U.S. Pub. 2021/0027774).

Regarding claims 9, 18, and 24, neither Croxford nor Shimizu explicitly discloses: wherein the one or more language inputs comprise a voice instruction input. However, in a related field of endeavor, Kessler discloses a computing device that receives input from a variety of sources (microphone, camera, keyboard, mouse, touchscreen, etc.) and processes the input using a trained convolutional neural network [CNN] (paragraphs [0036], [0084]). With regard to claims 9, 18, and 24, Kessler teaches: wherein the one or more language inputs comprise a voice instruction input (FIGS. 1, 4, 9; paragraphs [0036], [0083], [0105]; in step 902, a “control input” is received which is an utterance [voice/language input] of a user).

It would have been obvious to a person of ordinary skill in the art before the effective filing date of Applicant's claimed invention to combine the known teachings of Croxford, Shimizu, and Kessler to yield predictable results. Specifically, it would have been obvious to modify the gaming controller / mobile device of Croxford to include a microphone as another source of user input, specifically voice instruction input. Such a modification only requires using another user input device in combination with a neural network, as taught by both Croxford and Kessler. This combination would merely increase the number of potential user input sources of the gaming controller / mobile device of Croxford in a predictable manner using well-known teachings of Kessler. Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of Applicant's claimed invention to combine the known teachings of Croxford, Shimizu, and Kessler to yield the aforementioned predictable results.

Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.
In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to RYAN A LUBIT whose telephone number is (571) 270-3389. The examiner can normally be reached M - F, ~6am - 3pm. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Temesghen Ghebretinsae, can be reached at 571-272-3017. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/RYAN A LUBIT/
Primary Examiner, Art Unit 2626

Prosecution Timeline

Dec 04, 2024
Application Filed
Dec 16, 2024
Response after Non-Final Action
Oct 06, 2025
Non-Final Rejection — §103
Dec 12, 2025
Examiner Interview Summary
Dec 12, 2025
Applicant Interview (Telephonic)
Jan 23, 2026
Response Filed
Feb 23, 2026
Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602119: STYLUS MOVEMENT TRANSLATION SWITCHING (2y 5m to grant; granted Apr 14, 2026)
Patent 12578817: DISPLAY PANEL, DRIVING METHOD THEREOF, AND ELECTRONIC TERMINAL (2y 5m to grant; granted Mar 17, 2026)
Patent 12566499: EYE CENTER OF ROTATION DETERMINATION WITH ONE OR MORE EYE TRACKING CAMERAS (2y 5m to grant; granted Mar 03, 2026)
Patent 12562098: DISPLAY APPARATUS AND DISPLAY PANEL (2y 5m to grant; granted Feb 24, 2026)
Patent 12560833: DISPLAY DEVICE (2y 5m to grant; granted Feb 24, 2026)
Study what changed to get past this examiner. Based on 5 most recent grants.

AI Strategy Recommendation

Get an AI-powered prosecution strategy using examiner precedents, rejection analysis, and claim mapping.

Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 63%
With Interview: 99% (+38.6%)
Median Time to Grant: 2y 4m
PTA Risk: Moderate
Based on 756 resolved cases by this examiner. Grant probability derived from career allow rate.
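The headline figures above follow from the examiner's career counts by simple arithmetic. A minimal sketch, assuming the counts shown on this page (476 granted of 756 resolved) and reading "interview lift" as a percentage-point difference between the with-interview and without-interview cohorts; the function names and the ~60% without-interview rate are illustrative assumptions, not values from the page:

```python
# Sketch: reproducing the dashboard's allow-rate and interview-lift figures.
# Inputs come from the page; the cohort split is an assumed illustration.

def allow_rate(granted: int, resolved: int) -> float:
    """Career allow rate: share of resolved cases that ended in a grant."""
    return granted / resolved

def interview_lift(rate_with: float, rate_without: float) -> float:
    """Interview lift as a percentage-point difference between cohorts."""
    return rate_with - rate_without

career = allow_rate(476, 756)
print(f"Career allow rate: {career:.0%}")  # rounds to the 63% shown above

# If ~99% of interviewed cases granted and ~60.4% of non-interviewed cases
# did, the difference is the +38.6% lift the page reports.
print(f"Interview lift: {interview_lift(0.99, 0.604):+.1%}")
```

Note this treats the lift as additive over cohorts; the page does not state how its 99% with-interview probability is computed, so the sketch only shows one consistent reading of the numbers.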
