Prosecution Insights
Last updated: April 19, 2026
Application No. 18/378,272

APPARATUS AND METHOD FOR INTERACTION BETWEEN WORKER AND AUTONOMOUS VEHICLE

Status: Final Rejection (§103)
Filed: Oct 10, 2023
Examiner: YANOSKA, JOSEPH ANDERSON
Art Unit: 3664
Tech Center: 3600 — Transportation & Electronic Commerce
Assignee: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
OA Round: 2 (Final)

Grant Probability: 38% (At Risk)
Expected OA Rounds: 3-4
Median Time to Grant: 2y 11m
Grant Probability With Interview: 99%

Examiner Intelligence

Career Allow Rate: 38% (10 granted / 26 resolved; -13.5% vs TC avg)
Interview Lift: +60.1% allowance among resolved cases with an interview
Avg Prosecution: 2y 11m
Currently Pending: 34 applications
Total Applications: 60 (across all art units)
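
As a sanity check, the headline figures above follow directly from the granted/resolved counts. The sketch below is an editor's illustration, not the analytics vendor's code; the Tech Center baseline is a hypothetical value back-solved from the reported delta.

```python
# Reproducing the Examiner Intelligence headline figures. Only the
# granted/resolved counts come from the page; the 52% TC baseline is an
# assumption implied by the reported -13.5% delta.
granted, resolved = 10, 26

allow_rate = granted / resolved   # 0.3846... -> shown as 38%
tc_average = 0.52                 # assumed; back-solved from the -13.5% delta
delta = allow_rate - tc_average   # -0.1354 -> shown as -13.5%

print(f"Career allow rate: {allow_rate:.1%}")  # 38.5%
print(f"vs TC avg:        {delta:+.1%}")       # -13.5%
```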

Statute-Specific Performance

Statute   Allow Rate   vs TC Avg
§101      28.5%        -11.5%
§103      47.1%        +7.1%
§102      15.6%        -24.4%
§112       7.8%        -32.2%

Tech Center average is an estimate. Based on career data from 26 resolved cases.
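
One point worth noting: back-solving each row's delta gives the same baseline (28.5 + 11.5 = 47.1 - 7.1 = 15.6 + 24.4 = 7.8 + 32.2 = 40.0), so the Tech Center estimate appears to be a single ~40% figure applied to every statute. A short sketch reproducing the rows under that inferred assumption (the 40% baseline is not a published figure):

```python
# Rebuild the statute table from the examiner's per-statute allow rates.
# TC_AVG is inferred: every row's rate minus its reported delta equals 0.400.
TC_AVG = 0.40

examiner_rates = {"§101": 0.285, "§103": 0.471, "§102": 0.156, "§112": 0.078}

for statute, rate in examiner_rates.items():
    print(f"{statute}: {rate:.1%} ({rate - TC_AVG:+.1%} vs TC avg)")
# §101: 28.5% (-11.5% vs TC avg) ... matching the table above
```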

Office Action

Final Rejection under 35 U.S.C. §103 (mailed Oct 29, 2025)

Detailed Office Action

Notice of Pre-AIA or AIA Status: The present application, filed on or after March 16, 2013, is being examined under the first-inventor-to-file provisions of the AIA.

Information Disclosure Statement: The information disclosure statement (IDS) submitted on 06/13/2025 is being considered by the examiner.

Status of Claims: This Office Action is in response to the Applicant's amendments and remarks filed 07/30/2025. Claims 1-15 are presently pending and are presented for examination.

Reply to Applicant's Remarks: Applicant's remarks filed 07/30/2025 have been fully considered and are addressed as follows.

Claim Rejections Under 35 U.S.C. 103: Applicant's arguments, see Arguments/Remarks filed 07/30/2025, with regard to the rejections of Claims 1, 8, and 11 under 35 U.S.C. 103 have been fully considered; however, the arguments are respectfully not persuasive.

Regarding the applicant's argument that Shamma does not teach "wherein the display controller is configured to control a message indicating the at least one control operation to be displayed on the display in response to receiving the fifth signal and further configured to control a message indicating that the autonomous vehicle is pointed out to be displayed on the display in response to receiving the third signal," the examiner respectfully disagrees.

Shamma teaches "wherein the display controller is configured to control a message indicating the at least one control operation to be displayed on the display in response to receiving the fifth signal" (see at least Shamma [¶ 30, 44-45] The base station 225 can also include a display, which shows the video captured by the UAV 205, as well as information about the operation of the UAV 205, such as flight speed, heading, etc…a display that can output information about the flight path of the UAV, as well as display images captured by the UAV’s imager....The base station software also includes a repeater node 1030, which can forward data and signals received from the UAV to the laser pointer and vice-versa). Shamma discloses a vehicle that can be controlled via a light-emission gesture and a display that outputs information about the operation and flight path of the vehicle. When a command is sent via a light-emission gesture for the vehicle to travel a certain flight path, the command is shown on the display. This is analogous to displaying a control operation after receiving a signal.

Shamma further teaches "and further configured to control a message indicating that the autonomous vehicle is pointed out to be displayed on the display in response to receiving the third signal" (see at least Shamma [¶ 30, 44-45] … a display that can output information about the flight path of the UAV, as well as display images captured by the UAV’s imager…). Based on the provided specification, the examiner is interpreting the term "pointed out" to mean that the vehicle detects a laser emitted from outside the vehicle (see Specification: "in response to sensing a laser emitted from a light rod outside the autonomous vehicle, controlling a message indicating that the autonomous vehicle is pointed out to be displayed on a display"). Because the display in Shamma displays information about the control and surroundings of the vehicle after the vehicle detects a light-emission gesture performed outside the vehicle, the examiner asserts that this is analogous to displaying a message that the autonomous vehicle is "pointed out."
Regarding the applicant's argument that "the display" of Shamma is different from "the display" of Claims 1, 8, and 11, the examiner respectfully disagrees. The current claim language recited in Claims 1, 8, and 11 covers only a generic "display" and does not specify where the display is installed; for instance, the claims do not recite whether the display is installed on the exterior of the vehicle or on a base station. Please see the detailed rejection below.

Claim Interpretation

The following is a quotation of 35 U.S.C. 112(f): "(f) Element in Claim for a Combination. - An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof."

The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph: "An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof."

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.

As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:

(A) the claim limitation uses the term "means" or "step" or a term used as a substitute for "means" that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;

(B) the term "means" or "step" or the generic placeholder is modified by functional language, typically, but not always, linked by the transition word "for" (e.g., "means for") or another linking word or phrase, such as "configured to" or "so that"; and

(C) the term "means" or "step" or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.

Use of the word "means" (or "step") in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. That presumption is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. Conversely, absence of the word "means" (or "step") in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph; that presumption is rebutted when the claim limitation recites function without reciting sufficient structure, material, or acts to entirely perform the recited function.

Claim limitations in this application that use the word "means" (or "step") are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word "means" (or "step") are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.

This application includes one or more claim limitations that do not use the word "means" but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the limitations use a generic placeholder coupled with functional language without reciting sufficient structure to perform the recited function, and the generic placeholder is not preceded by a structural modifier. Such claim limitations are:

"Controller" in Claim 1, interpreted as a generic processor. Per the specification: "The controller 330 may further include a command processor 334."

"Gesture Analyzer" in Claim 1, interpreted as a generic processor. Per the specification: "The controller 330 may further include a command processor 334 communicatively connected to the gesture analyzer 332."

Because these claim limitations are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, they are interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof. If applicant does not intend to have these limitations interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitations to avoid such interpretation (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitations recite sufficient structure to perform the claimed function.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action: "A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made."

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1-7, 9, and 11-14 are rejected under 35 U.S.C. 103 as being unpatentable over Dev (US 20200225662 A1) in view of Taylor et al. (US 20180251219 A1), Ren (CN 106249889 A), and Shamma et al. (WO 2019211680 A1), hereafter referred to as Dev, Taylor, Ren, and Shamma, respectively.

Regarding Claim 1, Dev teaches an apparatus for an interaction between a worker and an autonomous vehicle (see at least Dev [¶ 20] the intersection management system may determine a gesture of the one or more traffic police based on the correlation matrix, and at least one of the video and the one or more images. Further, the intersection management system may determine navigation information for the autonomous vehicle based on the correlation matrix and the determined gesture, for navigating the autonomous vehicle), the apparatus comprising:

a sensor unit comprising...a light sensing module (see at least Dev [¶ 25] the autonomous vehicle 101 may be configured with the one or more sensors 103. As an example, the one or more sensors 103 may include, but not limited to, Light Detection and Ranging (LIDAR) system, Radio Detection and Ranging (RADAR), Inertial Measurement Unit (IMU), Ultrasonic Sensors, image capturing devices such as stereoscopic depth cameras);

a controller communicatively connected to the sensor unit and comprising a gesture analyzer, a command processor, and a display controller (see at least Dev [¶ 20, 26, 99] the sensor data may include, but not limited to, depth of an object with respect to the autonomous vehicle, one or more images and a video of environment surrounding the autonomous vehicle... the intersection management system may determine a gesture of the one or more traffic police based on the correlation matrix, and at least one of the video and the one or more images...The intersection management system 105 includes a processor...The User interface 406 may facilitate display, execution, interaction, manipulation, or operation of program components through textual or graphical facilities. For example, user interfaces 406 may provide computer interaction interface elements on a display system operatively connected to the computer system);

a display communicatively connected to the display controller (see at least Dev [¶ 99] the User interface 406 may facilitate display, execution, interaction, manipulation, or operation of program components through textual or graphical facilities. For example, user interfaces 406 may provide computer interaction interface elements on a display system operatively connected to the computer system);

a storage connected to the sensor unit and the controller (see at least Dev [¶ 34, 101] the data 203 stored in the memory 113 may be processed by the modules 205 of the intersection management system 105. The modules 205 may be stored within the memory 113);

wherein the command processor is configured to perform at least one control operation for performing the command of the worker in response to receiving the fourth signal, and further configured to output a fifth signal indicating the at least one control operation (see at least Dev [¶ 61, 64] the navigation information determining module 229 may determine navigation information 215 for the autonomous vehicle 101 based on the correlation matrix, the detected object data and the determined gesture, for navigating the autonomous vehicle 101. In some embodiments, the navigation information 215 may include, but not limited to, a plurality of directing commands.
As an example, the plurality of directing commands may include, but not limited to, “START”, “STOP”, “TURN LEFT”, “TURN RIGHT”, “CHANGE LANE”, “SLOW DOWN”, “STAY IDLE”, or “KEEP STRAIGHT”...As an example, when the value of the determined gesture is determined as “LEFT”, the navigation information 215 is determined to be “TURN LEFT” i.e. the autonomous vehicle 101 is provided with a directing command “TURN LEFT”).

However, Dev does not explicitly teach a sensor unit comprising an infrared (IR) camera module, wherein the IR camera module is configured to output video frames by capturing IR light incident on the IR camera module, wherein the video frames are stored in the storage, and further configured to output a first signal indicating…sensing of IR light emitted from a light rod outside the autonomous vehicle.

Taylor, in the same field of endeavor, teaches a sensor unit comprising an infrared (IR) camera module, wherein the IR camera module is configured to output video frames by capturing IR light incident on the IR camera module, wherein the video frames are stored in the storage, and further configured to output a first signal indicating…sensing of IR light emitted from a light rod outside the autonomous vehicle (see at least Taylor [¶ 19, 28, 43, 48] the system detects a gesture from a ground crew member…the gesture may be detected by an image sensor of a UAV… the image sensor may comprise one or more of a camera, an optical sensor, a night vision camera, and an infrared camera…the gesture may be detected based on the position and/or movement of one or more visual identifiers carried by the ground crew member, such as batons…light emitters…In some embodiments, the gesture may be detected based on one or more images of the ground crew member...In step 602, the system activates video analytics from the UAV camera. In step 603, the UAV picks up a human hand gesture with an onboard camera. In step 604, the system analyzes the hand gestures captured by the camera...the control circuit 221 may be configured to determine a flight path and/or flight pattern for the UAV 120 based on one or more of a task profile stored in the memory 224, real-time information detected by one or more onboard sensors, information received from a remote central computer, and gestures detected via the image sensor 227....the system may include a UAV with the ability to recognize…infrared signatures).

Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the system set forth in Dev to contain an IR camera module configured to output video frames and further configured to output a signal when it senses IR light emitted from outside the autonomous vehicle, with reasonable expectation of success. One of ordinary skill in the art would have been motivated to make such a modification for the benefit of improving control of the autonomous vehicle via gestures made outside of the vehicle by expanding the number of visual identifiers that can be detected by the vehicle, as discussed in Taylor (see at least Taylor [¶ 24] In some embodiments, the gesture may be detected based on one or more visual identifiers carried by the ground crew member).
However, the combination of Dev and Taylor does not explicitly teach an IR camera module further configured to output a first signal indicating a point in time at which sensing of IR light emitted from a light rod outside the autonomous vehicle starts and a second signal indicating a point in time at which the sensing of the IR light emitted from the light rod ends.

Ren, in the same field of endeavor, teaches an IR camera module further configured to output a first signal indicating a point in time at which sensing of IR light emitted from a light rod outside the autonomous vehicle starts and a second signal indicating a point in time at which the sensing of the IR light emitted from the light rod ends (see at least Ren [English Translation ¶ 15, 56, 39-41, 139, 186] the infrared light-emitting component preset in the wearable device to emit infrared light…the camera unit obtains infrared images…Acquire one or more frames of images from the preview image acquired by the camera unit; Determining a gesture area described by infrared light in the multiple frames of images; Gesture feature data is extracted based on the gesture area, and the data is matched with preset gesture instruction type description data to determine the corresponding gesture instruction type…The video obtained by the recognition device through the camera unit can be regarded as consisting of multiple frames of images…The "identification device" used here can be a portable, transportable mobile smart device installed in a vehicle).

Ren teaches that an infrared-emitting device can be used with an infrared camera module installed in a vehicle, and further teaches that multiple frames of images taken by the camera and stored in memory can be extracted to recognize gestures. Extracting the frames in which an infrared gesture is made is analogous to detecting the points in time at which the sensing of the light starts and ends.

Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the system set forth in the combination of Dev and Taylor to contain an IR camera further configured to identify points in time when the sensing of the IR light starts and ends, with reasonable expectation of success. One of ordinary skill in the art would have been motivated to make such a modification for the benefit of improving the distinction of the intended control gestures by using the infrared-emitting device to enhance the gesture's visibility against the background of the camera's view (see at least Ren [¶ 120] the present invention drives the infrared light-emitting component preset in the wearable device to emit infrared light, effectively enhancing the distinction between the gesture area and the background area, thereby enabling the recognition device to capture gestures according to infrared imaging and generate corresponding gesture interaction events…It can reduce the usage of computing resources, shorten the response time required for the operator to perform gesture recognition in complex backgrounds or dim light conditions, and improve the recognition rate and real-time performance of the operator's human-computer interaction).
Further, the combination of Dev and Taylor does not explicitly teach wherein the gesture analyzer is configured to search for first video frames stored in the storage from the point in time indicated by the first signal to the point in time indicated by the second signal, in response to receiving the first signal and the second signal, configured to identify a command of the worker by analyzing the found first video frames, and configured to output a fourth signal indicating the command of the worker.

Ren, in the same field of endeavor, teaches wherein the gesture analyzer is configured to search for first video frames stored in the storage from the point in time indicated by the first signal to the point in time indicated by the second signal, in response to receiving the first signal and the second signal, configured to identify a command of the worker by analyzing the found first video frames, and configured to output a fourth signal indicating the command of the worker (see at least Ren [English Translation ¶ 180, 292, 39-40] The recognition device usually includes a camera unit, a processor, a storage device, etc… One or more cameras 707, at least one of which has an infrared imaging function, are connected to the processor 704 and controlled by the processor 704. Images captured by the cameras 707 can be stored in the memory 702…Determining a gesture area described by infrared light in the multiple frames of images; Gesture feature data is extracted based on the gesture area, and the data is matched with preset gesture instruction type description data to determine the corresponding gesture instruction type).

Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the system set forth in the combination of Dev and Taylor to contain a system for accessing the stored video frames in the memory to identify gestures, with reasonable expectation of success. One of ordinary skill in the art would have been motivated to make such a modification for the benefit of improving the user experience of the gesture-controlled vehicle by improving recognition rate and performance (see at least Ren [¶ 262] Through the disclosure of the recognition device of the present invention, it can be known that the implementation of the present invention greatly improves the recognition rate and real-time performance of human-computer interaction during gesture recognition by the operator, thereby improving the user experience).

Further, the combination of Dev, Taylor, and Ren does not explicitly teach wherein the light sensing module is configured to output a third signal in response to sensing a laser emitted from the light rod, wherein the display controller is further configured to control a message indicating that the autonomous vehicle is pointed out to be displayed on the display in response to receiving the third signal, and wherein the display controller is configured to control a message indicating the at least one control operation to be displayed on the display in response to receiving the fifth signal.
Shamma, in the same field of endeavor, teaches wherein the light sensing module is configured to output a third signal in response to sensing a laser emitted from the light rod (see at least Shamma [¶ 7] The vehicle is guided from the position proximate to the target position using the optical system of the vehicle responsive to detection of a dot of the laser beam on the target by an optical system of the vehicle).

Shamma also teaches wherein the display controller is configured to control a message indicating the at least one control operation to be displayed on the display in response to receiving the fifth signal (see at least Shamma [¶ 30, 44-45] The base station 225 can also include a display, which shows the video captured by the UAV 205, as well as information about the operation of the UAV 205, such as flight speed, heading, etc…a display that can output information about the flight path of the UAV, as well as display images captured by the UAV’s imager....The base station software also includes a repeater node 1030, which can forward data and signals received from the UAV to the laser pointer and vice-versa). Shamma discloses a vehicle that can be controlled via a light-emission gesture and a display that outputs information about the operation and flight path of the vehicle. When a command is sent via a light-emission gesture for the vehicle to travel a certain flight path, the command is shown on the display. This is analogous to displaying a control operation after receiving a signal.

Shamma further teaches a display controller configured to control a message indicating that the autonomous vehicle is pointed out to be displayed on the display in response to receiving the third signal (see at least Shamma [¶ 30, 44-45] … a display that can output information about the flight path of the UAV, as well as display images captured by the UAV’s imager…). Based on the provided specification, the examiner is interpreting the term "pointed out" to mean that the vehicle detects a laser emitted from outside the vehicle (see Specification: "in response to sensing a laser emitted from a light rod outside the autonomous vehicle, controlling a message indicating that the autonomous vehicle is pointed out to be displayed on a display"). Because the display in Shamma displays information about the control and surroundings of the vehicle after the vehicle detects a light-emission gesture performed outside the vehicle, the examiner asserts that this is analogous to displaying a message that the autonomous vehicle is "pointed out."

Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the system set forth in the combination of Dev, Taylor, and Ren to contain a system wherein the light sensing module is configured to output a third signal in response to sensing a laser emitted from the light rod, wherein the display controller is further configured to control a message indicating that the autonomous vehicle is pointed out to be displayed on the display in response to receiving the third signal, and wherein the display controller is configured to control a message indicating the at least one control operation to be displayed on the display in response to receiving the fifth signal, with reasonable expectation of success.
One of ordinary skill in the art would have been motivated to make such a modification for the benefit of improving the experience of the user pointing the laser by allowing the vehicle to know when the laser is being pointed at it and by outputting useful information onto the display regarding the interaction and operation between the laser and the vehicle.

Regarding Claim 2, Dev in view of Taylor, Ren, and Shamma teaches all limitations of the system of Claim 1, as set forth above. Dev further teaches wherein the sensor unit further comprises a camera module (see at least Dev [¶ 25-26, 36] the one or more sensors 103 may include, but not limited to, Light Detection and Ranging (LIDAR) system, Radio Detection and Ranging (RADAR), Inertial Measurement Unit (IMU), Ultrasonic Sensors, image capturing devices such as stereoscopic depth cameras), wherein the camera module is configured to capture an outside of the autonomous vehicle and output second video frames, and the second video frames are stored in the storage (see at least Dev [¶ 25-26, FIG 2B] The sensor data includes at least one of depth of an object with respect to the autonomous vehicle, and one or more images and a video of environment surrounding the autonomous vehicle...Further, the processor 109 may store the sensor data in the memory 113 coupled with the processor 109).

Regarding Claim 3, Dev in view of Taylor, Ren, and Shamma teaches all limitations of the system of Claim 2, as set forth above. However, Dev alone does not explicitly teach wherein the gesture analyzer is configured to further search for the second video frames stored in the storage from the point in time indicated by the first signal to the point in time indicated by the second signal, in response to receiving the first signal and the second signal, and further configured to output the fourth signal based on an analysis of the found first video frames and the found second video frames.

Ren, in the same field of endeavor, teaches this limitation (see at least Ren [English Translation ¶ 180, 292, 39-40] The recognition device usually includes a camera unit, a processor, a storage device, etc… One or more cameras 707, at least one of which has an infrared imaging function, are connected to the processor 704 and controlled by the processor 704. Images captured by the cameras 707 can be stored in the memory 702…Determining a gesture area described by infrared light in the multiple frames of images; Gesture feature data is extracted based on the gesture area, and the data is matched with preset gesture instruction type description data to determine the corresponding gesture instruction type). Ren teaches that an infrared-emitting device can be used with an infrared camera module installed in a vehicle, and further teaches that multiple frames of images taken by the camera and stored in memory can be extracted to recognize gestures. Extracting the frames in which an infrared gesture is made is analogous to detecting the points in time at which the sensing of the light starts and ends.
The gesture identification device can repeat this action for multiple gestures made, allowing it to search for second video frames. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the system set forth in the combination of Dev, Taylor, Ren, and Shamma to search for second video frames from when the IR light is detected on to when it is detected off in order to identify a gesture made, with reasonable expectation of success. One of ordinary skill in the art would have been motivated to make such a modification for the benefit of improving the user experience of the gesture-controlled vehicle by improving recognition rate and performance (see at least Ren [¶ 262] Through the disclosure of the recognition device of the present invention, it can be known that the implementation of the present invention greatly improves the recognition rate and real-time performance of human-computer interaction during gesture recognition by the operator, thereby improving the user experience).

Regarding Claim 4, Dev in view of Taylor, Ren, and Shamma teaches all limitations of the system of Claim 2, as set forth above. Dev further teaches wherein the IR camera module, the light sensing module, and the camera module are installed on a front side of the autonomous vehicle (see at least Dev [¶ 8, FIG 2B] the sensor data comprises at least one of depth of an object with respect to the autonomous vehicle, and one or more images and a video of environment surrounding the autonomous vehicle. Further, the instructions cause the processor to detect one or more traffic police and one or more auxiliary objects associated with each of the one or more traffic police from a plurality of objects of interest present in the one or more images, when the autonomous vehicle is within a predefined distance from an intersection of roads). Because the cameras are used to detect objects and persons in front of the vehicle, it is clear that the light sensing equipment and camera modules are installed on the front side of the autonomous vehicle; FIG. 2B illustrates this further.

Regarding Claim 5, Dev in view of Taylor, Ren, and Shamma teaches all limitations of the system of Claim 2, as set forth above. However, Dev does not explicitly teach wherein the display is a light-emitting diode (LED) display. Ren, in the same field of endeavor, teaches wherein the display is a light-emitting diode (LED) display (see at least Ren [English Translation ¶ 287] The display unit may include a display panel. Optionally, the display panel may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED), etc). Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the system set forth in the combination of Dev, Taylor, Ren, and Shamma to contain a display that is an LED display, with reasonable expectation of success. One of ordinary skill in the art would have been motivated to make such a modification for the benefit of using display hardware that is commonly used in the art.

Regarding Claim 6, Dev in view of Taylor, Ren, and Shamma teaches all limitations of the system of Claim 1, as set forth above.
Dev further teaches wherein the at least one control operation comprises at least one of stopping, driving, or generating a new movement path (see at least Dev [¶ 60-61, 64] the dynamic hand gestures may include, but not limited to, hand movements which indicate actions such as “STOP”, “MOVE”, “START”, “RIGHT”, “LEFT” and the like…As an example, when the value of the determined gesture is determined as “STOP”, the navigation information 215 is determined to be “STOP” i.e. the autonomous vehicle 101 is provided with a directing command “STOP”).

Regarding Claim 7, Dev in view of Taylor, Ren, and Shamma teaches all limitations of the system of Claim 1, as set forth above. However, Dev does not explicitly teach wherein the display controller is further configured to control a message indicating the command of the worker to be displayed on the display in response to receiving the fourth signal.

Shamma, in the same field of endeavor, teaches wherein the display controller is further configured to control a message indicating the command of the worker to be displayed on the display in response to receiving the fourth signal (see at least Shamma [¶ 30, 44-45] The base station 225 can also include a display, which shows the video captured by the UAV 205, as well as information about the operation of the UAV 205, such as flight speed, heading, etc…a display that can output information about the flight path of the UAV, as well as display images captured by the UAV’s imager....The base station software also includes a repeater node 1030, which can forward data and signals received from the UAV to the laser pointer and vice-versa). Shamma discloses a vehicle that can be controlled via a light-emission gesture and a display that outputs information about the operation and flight path of the vehicle. If a command is sent via a light-emission gesture for the vehicle to travel a certain flight path, the command is shown on the display.

Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the system set forth in the combination of Dev, Taylor, and Ren to contain a display controller further configured to control a message indicating the command of the worker to be displayed on the display in response to receiving the fourth signal, with reasonable expectation of success. One of ordinary skill in the art would have been motivated to make such a modification for the benefit of improving the user experience by outputting useful information onto the display regarding the interaction and operation between the user making the laser gesture and the vehicle.

Regarding Claim 9, Dev in view of Shamma teaches all limitations of Claim 8, as set forth above. However, the combination does not explicitly teach wherein the identifying of the command of the worker based on the video frames comprises: in response to generating a first signal indicating a point in time at which sensing of infrared (IR) light emitted from a light rod outside the autonomous vehicle starts and a second signal indicating a point in time at which the sensing of the IR light emitted from the light rod ends, searching for the stored video frames from the point in time indicated by the first signal to the point in time indicated by the second signal; and identifying the command of the worker by analyzing the found video frames.
Taylor, in the same field of endeavor, teaches the sensing of infrared (IR) light emitted from a light rod outside the autonomous vehicle (see at least Taylor [¶ 19, 28, 43, 48], quoted in the discussion of Claim 1 above). Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the system set forth in Dev and Shamma to contain a method for sensing infrared (IR) light emitted from a light rod outside the autonomous vehicle, with reasonable expectation of success. One of ordinary skill in the art would have been motivated to make such a modification for the benefit of improving control of the autonomous vehicle via gestures made outside of the vehicle by expanding the number of visual identifiers that can be detected by the vehicle, as discussed in Taylor (see at least Taylor [¶ 24] In some embodiments, the gesture may be detected based on one or more visual identifiers carried by the ground crew member).

Further, the combination of Dev, Shamma, and Taylor does not explicitly teach, in response to generating a first signal indicating a point in time at which sensing of infrared (IR) light emitted from a light rod outside the autonomous vehicle starts and a second signal indicating a point in time at which the sensing of the IR light emitted from the light rod ends, searching for the stored video frames from the point in time indicated by the first signal to the point in time indicated by the second signal; and identifying the command of the worker by analyzing the found video frames.
Ren, in the same field of endeavor, teaches this limitation (see at least Ren [English Translation ¶ 15, 56, 39-41, 139, 141, 186] the infrared light-emitting component preset in the wearable device to emit infrared light…the camera unit obtains infrared images…Acquire one or more frames of images from the preview image acquired by the camera unit; Determining a gesture area described by infrared light in the multiple frames of images; Gesture feature data is extracted based on the gesture area, and the data is matched with preset gesture instruction type description data to determine the corresponding gesture instruction type…The video obtained by the recognition device through the camera unit can be regarded as consisting of multiple frames of images…The "identification device" used here can be a portable, transportable mobile smart device installed in a vehicle…Gesture recognition: refers to the process of identifying various gestures through computers according to certain rules, instructing the computer to convert them into corresponding control commands). Ren teaches that an infrared-emitting device can be used with an infrared camera module installed in a vehicle, and further teaches that multiple frames of images taken by the camera and stored in memory can be extracted to recognize gestures. Extracting the frames in which an infrared gesture is made is analogous to detecting the points in time at which the sensing of the light starts and ends.

Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the system set forth in the combination of Dev, Taylor, and Shamma to contain a method for, in response to generating a first signal indicating a point in time at which sensing of IR light emitted from a light rod outside the autonomous vehicle starts and a second signal indicating a point in time at which the sensing of the IR light ends, searching for the stored video frames from the point in time indicated by the first signal to the point in time indicated by the second signal, and identifying the command of the worker by analyzing the found video frames, with reasonable expectation of success.
One of ordinary skill in the art would have been motivated to make such a modification for the benefit of improving the distinction of the intended control gestures by using the infrared-emitting device to enhance the gesture's visibility against the background of the camera's view (see at least Ren [¶ 120], quoted in the discussion of Claim 1 above).

Regarding Claim 11, Dev teaches a method performed in an autonomous vehicle for an interaction with a worker (see at least Dev [¶ 20] the intersection management system may determine a gesture of the one or more traffic police based on the correlation matrix, and at least one of the video and the one or more images. Further, the intersection management system may determine navigation information for the autonomous vehicle based on the correlation matrix and the determined gesture, for navigating the autonomous vehicle), the method comprising:

storing video frames output by capturing an outside of the autonomous vehicle (see at least Dev [¶ 21, 25-26, FIG 2B] the present disclosure uses Convolutional Neural Network (CNN) techniques to determine dynamic hand gestures of the one or more traffic police, by learning spatiotemporal features from consecutive frames of at least one of the video and the one or more images…The sensor data includes at least one of depth of an object with respect to the autonomous vehicle, and one or more images and a video of environment surrounding the autonomous vehicle...Further, the processor 109 may store the sensor data in the memory 113 coupled with the processor 109);

performing at least one control operation for performing the command of the worker (see at least Dev [¶ 61, 64] the navigation information determining module 229 may determine navigation information 215 for the autonomous vehicle 101 based on the correlation matrix, the detected object data and the determined gesture, for navigating the autonomous vehicle 101. In some embodiments, the navigation information 215 may include, but not limited to, a plurality of directing commands. As an example, the plurality of directing commands may include, but not limited to, “START”, “STOP”, “TURN LEFT”, “TURN RIGHT”, “CHANGE LANE”, “SLOW DOWN”, “STAY IDLE”, or “KEEP STRAIGHT”…As an example, when the value of the determined gesture is determined as “LEFT”, the navigation information 215 is determined to be “TURN LEFT” i.e. the autonomous vehicle 101 is provided with a directing command “TURN LEFT”).

However, Dev does not explicitly teach, in response to generating a first signal indicating a point in time at which sensing of infrared (IR) light emitted from the light rod outside the autonomous vehicle starts and a second signal indicating a point in time at which the sensing of the IR light emitted from the light rod ends within a predetermined time after the laser is sensed, identifying the command of the worker based on the video frames.
Taylor, in the same field of endeavor, teaches the sensing of infrared (IR) light emitted from a light rod outside the autonomous vehicle (see at least Taylor [¶ 19, 28, 43, 48], quoted in the discussion of Claim 1 above). Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the system set forth in Dev to contain a method for sensing infrared (IR) light emitted from a light rod outside the autonomous vehicle, with reasonable expectation of success. One of ordinary skill in the art would have been motivated to make such a modification for the benefit of improving control of the autonomous vehicle via gestures made outside of the vehicle by expanding the number of visual identifiers that can be detected by the vehicle, as discussed in Taylor (see at least Taylor [¶ 24] In some embodiments, the gesture may be detected based on one or more visual identifiers carried by the ground crew member).

Further, the combination of Dev and Taylor does not explicitly teach, in response to generating a first signal indicating a point in time at which sensing of infrared (IR) light emitted from the light rod outside the autonomous vehicle starts and a second signal indicating a point in time at which the sensing of the IR light emitted from the light rod ends within a predetermined time after the laser is sensed, identifying the command of the worker based on the video frames.
Ren, in the same field of endeavor, teaches, in response to generating a first signal indicating a point in time at which sensing of infrared (IR) light emitted from the light rod outside the autonomous vehicle starts and a second signal indicating a point in time at which the sensing of the IR light emitted from the light rod ends within a predetermined time after the laser is sensed, identifying the command of the worker based on the video frames (see at least Ren [English Translation ¶ 15, 20, 56, 39-41, 139, 141, 186] the infrared light-emitting component preset in the wearable device to emit infrared light…the camera unit obtains infrared images…Acquire one or more frames of images from the preview image acquired by the camera unit; Determining a gesture area described by infrared light in the multiple frames of images; Gesture feature data is extracted based on the gesture area, and the data is matched with preset gesture instruction type description data to determine the corresponding gesture instruction type…The video obtained by the recognition device through the camera unit can be regarded as consisting of multiple frames of images…The "identification device" used here can be a portable, transportable…
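Across all three independent claims, the rejection maps the same signal-driven flow onto the references: the first and second signals bracket the interval in which IR light from the light rod is sensed, the gesture analyzer searches only the stored frames in that window, the command processor executes the identified command and emits a fifth signal, and the display controller posts messages for the third and fifth signals. Below is a minimal sketch of that flow for orientation only; every name, type, and the classifier stub are the editor's hypothetical illustrations of the claim language, not code from the application or the cited references.

```python
# Hedged sketch of the claimed signal flow (editor's illustration).
from dataclasses import dataclass

@dataclass
class Frame:
    timestamp: float
    pixels: bytes  # stand-in for IR image data

class GestureAnalyzer:
    def __init__(self, storage):
        self.storage = storage  # frames written continuously by the IR camera module

    def on_first_and_second_signals(self, t_start, t_end):
        # First/second signals mark when sensing of the rod's IR light starts
        # and ends; only stored frames in that window are analyzed.
        window = [f for f in self.storage if t_start <= f.timestamp <= t_end]
        return self.identify_command(window)  # the "fourth signal"

    @staticmethod
    def identify_command(frames):
        # Placeholder classifier; the claim leaves the analysis method open.
        return "STOP" if frames else "NONE"

class CommandProcessor:
    def on_fourth_signal(self, command):
        # Perform the control operation, then emit the "fifth signal".
        return f"EXECUTING {command}"

class DisplayController:
    def on_third_signal(self):
        # Third signal: the light sensing module detected a laser from the rod.
        print("Message: this vehicle has been pointed out")

    def on_fifth_signal(self, operation):
        print(f"Message: control operation {operation}")

# Wiring the flow end to end:
frames = [Frame(t / 10, b"") for t in range(100)]
command = GestureAnalyzer(frames).on_first_and_second_signals(2.0, 4.5)
operation = CommandProcessor().on_fourth_signal(command)
display = DisplayController()
display.on_third_signal()           # laser sensed -> "pointed out" message
display.on_fifth_signal(operation)  # control-operation message
```

Read against this flow, the dispute reduces largely to whether Shamma's base-station display reads on the claimed "display," which the examiner notes the claims leave generic as to location.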

Prosecution Timeline

Oct 10, 2023: Application Filed
May 12, 2025: Non-Final Rejection (§103)
Jul 30, 2025: Response Filed
Oct 29, 2025: Final Rejection (§103, current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12600502: NEURAL NETWORK-GUIDED PASSIVE SENSOR DRONE INSPECTION SYSTEM (granted Apr 14, 2026; 2y 5m to grant)
Patent 12548454: CONTROLLING DRONE NOISE BASED UPON HEIGHT (granted Feb 10, 2026; 2y 5m to grant)
Patent 12530031: VIRTUAL OFF-ROADING GUIDE (granted Jan 20, 2026; 2y 5m to grant)
Patent 12447969: LIMITED USE DRIVING OPERATIONS FOR VEHICLES (granted Oct 21, 2025; 2y 5m to grant)
Patent 12366859: TROLLING MOTOR AND SONAR DEVICE DIRECTIONAL CONTROL (granted Jul 22, 2025; 2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.

AI Strategy Recommendation

Get an AI-powered prosecution strategy using examiner precedents, rejection analysis, and claim mapping.

Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 38%
With Interview: 99% (+60.1%)
Median Time to Grant: 2y 11m
PTA Risk: Moderate

Based on 26 resolved cases by this examiner. Grant probability derived from career allow rate.
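
The projection arithmetic appears straightforward: the 38% baseline is the career allow rate, and the with-interview figure is consistent with simply adding the +60.1-point interview lift. A sketch under that additive assumption (the combination rule is inferred from the displayed numbers, not documented by the tool):

```python
# Projection figures re-derived from the examiner statistics above.
base = 10 / 26            # career allow rate: 0.3846... shown as 38%
interview_lift = 0.601    # reported interview lift, in percentage points

with_interview = base + interview_lift   # 0.9856 -> shown as 99%

print(f"Grant probability: {base:.0%}")           # 38%
print(f"With interview:    {with_interview:.0%}") # 99%
```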

Free tier: 3 strategy analyses per month