Prosecution Insights
Last updated: April 19, 2026
Application No. 18/832,374

VOICE RECOGNITION METHOD AND VOICE RECOGNITION DEVICE

Non-Final OA: §101, §102, §103
Filed
Jul 23, 2024
Examiner
WONG, LINDA
Art Unit
2655
Tech Center
2600 — Communications
Assignee
Nissan Motor Co., Ltd.
OA Round
1 (Non-Final)
85%
Grant Probability
Favorable
1-2
OA Rounds
3y 0m
To Grant
99%
With Interview

Examiner Intelligence

Grants 85% — above average
85%
Career Allow Rate
602 granted / 709 resolved
+22.9% vs TC avg
Strong +16% interview lift
Without
With
+15.5%
Interview Lift
resolved cases with interview
Typical timeline
3y 0m
Avg Prosecution
17 currently pending
Career history
726
Total Applications
across all art units
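The headline numbers in this panel are simple ratios over the examiner's resolved cases. A minimal sketch of the arithmetic, assuming the displayed figures are exact (the Tech Center average is back-derived from the stated +22.9% delta, not an independent figure):

```python
# Career allow rate: granted / resolved, from the figures above.
granted, resolved = 602, 709
allow_rate = granted / resolved * 100   # ~84.9%, displayed as 85%

# The "+22.9% vs TC avg" delta implies a Tech Center average of roughly:
tc_avg = round(allow_rate, 1) - 22.9    # hypothetical back-derivation

print(round(allow_rate), round(tc_avg, 1))
```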

Statute-Specific Performance

§101
7.2%
-32.8% vs TC avg
§103
44.5%
+4.5% vs TC avg
§102
22.3%
-17.7% vs TC avg
§112
16.5%
-23.5% vs TC avg
Black line = Tech Center average estimate • Based on career data from 709 resolved cases
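The per-statute deltas are internally consistent: subtracting each delta from the examiner's rate recovers the same Tech Center average (the black line). A quick check, assuming the displayed figures are exact:

```python
# (examiner rejection rate %, delta vs TC avg %) per statute, from the chart above
rates = {"101": (7.2, -32.8), "103": (44.5, +4.5),
         "102": (22.3, -17.7), "112": (16.5, -23.5)}

# Implied TC average per statute: rate - delta
implied = {s: round(r - d, 1) for s, (r, d) in rates.items()}
print(implied)  # every statute implies the same ~40.0% TC average line
```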

Office Action

§101 §102 §103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Drawings

The drawings were received on 7/23/2024. These drawings are accepted.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-16 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea in the form of a mental process without significantly more. The claims recite voice recognition within a vehicle based on sensors and a control signal, detecting the position or state of an object (such as a window, blinker, etc.) associated with a target object such as a person, and estimating a candidate target object state or position. Under the broadest reasonable interpretation of the claim language, such language is directed to actions that can be performed mentally by a human. For example, a person in a vehicle requests lowering a window; another person can mentally observe where the requester is located and lower the window according to its position or state. The sensors and device are merely generic devices used to provide information and are considered pre-solution activity. This judicial exception is not integrated into a practical application because the recited limitations fail to include positively recited language integrating the abstract idea into a practical application. For example, what practical application is achieved in light of the recited limitations?
The claims do not include additional elements sufficient to amount to significantly more than the judicial exception because the recited language fails to include positively recited language indicating significantly more than the judicial exception. For example, is there a specific method or manner in which the recited limitations are performed, as supported by the specification, that would indicate significantly more than the judicial exception?

Claims 2-9 and 12-15 merely add to the judicial exception but fail to include positively recited language indicating significantly more and/or integrating the judicial exception into a practical application. Claims 10 and 11 recite language adding to the judicial exception, such as a storage device and a device mounted on the vehicle; these are merely generic devices providing data or information to perform the abstract idea, and the recited language fails to include positively recited language indicating significantly more and/or integration into a practical application. Claim 16 recites language regarding generic devices providing data to perform the abstract idea and likewise fails to include such language.

Claim 17 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea in the form of a mental process without significantly more. The claim recites voice recognition within a vehicle based on sensors and a control signal, detecting the position or state of an object (such as a window, blinker, etc.) associated with a target object such as a person, and estimating a candidate target object state or position. Under the broadest reasonable interpretation of the claim language, such language is directed to actions that can be performed mentally by a human.
For example, a person in a vehicle requests lowering a window; another person can mentally observe where the requester is located and lower the window according to its position or state. The sensors, controller, and device are merely generic devices used to provide information and are considered pre-solution activity. This judicial exception is not integrated into a practical application because the recited limitations fail to include positively recited language integrating the abstract idea into a practical application. For example, what practical application is achieved in light of the recited limitations? The claim does not include additional elements sufficient to amount to significantly more than the judicial exception because the recited language fails to include positively recited language indicating significantly more than the judicial exception. For example, is there a specific method or manner in which the recited limitations are performed, as supported by the specification, that would indicate significantly more than the judicial exception?

Claim Rejections - 35 USC § 102

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action: A person shall be entitled to a patent unless – (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1, 2, 7, 8, 9, 11, 12, 14, 15, 16, and 17 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Bhattacharya et al. (US Publication No. 2021/0347328).

Claim 1: Bhattacharya et al. discloses a voice recognition method for acquiring utterance content of a user of a vehicle and estimating a target object mentioned in the utterance content (paragraph 3 discloses receiving a voice command; paragraphs 29-32 disclose determining a positional region to determine a candidate of the target object based on the user's voice command), the voice recognition method comprising: acquiring, as an input signal, at least one of a control signal of a device mounted on the vehicle (paragraph 39 discloses controller 536, and Fig. 5A, label 536, shows the controller mounted on the vehicle) and an output signal of a sensor mounted on the vehicle (Fig. 5A, label 596; paragraph 3 discloses receiving the voice command via a sensor such as a microphone); recognizing an expression representing a state or a position from the utterance content (paragraph 3 discloses the system receives the voice command and recognizes the expression "lower Sally's window" spoken in the voice command, where the expression represents the state or position of the request; paragraph 22 discloses the processing circuitry may perform speech recognition algorithms to parse the received first sensor data into recognizable words in a specific language); determining whether or not the state or position recognized from the utterance content fits the state or position of the candidate of the target object detected based on the input signal (paragraph 31 discloses a voice command of "lower Sally's window"; paragraph 32 discloses that Sally's positional region is determined (the state or position of the target object) and that the window within that region is determined (Fig. 3, label 316), the window being the candidate target object; paragraph 32 discloses the state or position of the window (e.g., 316 or 315) is determined via the positional region and a neural network, based on sensor data such as camera data (paragraphs 23, 29, 30) and sensor data used during operations as a training data set to identify occupants and surrounding objects such as chairs and windows (paragraph 29); by determining the positional region of the object, Sally, and the window within that positional region, the system determines whether the window mentioned in the voice command fits the window within Sally's positional region); and estimating a candidate of the target object fitting a state or a position recognized from the utterance content to be a target object mentioned in the utterance content (paragraph 32 discloses "the processing circuitry may lower the window 316 that is within a positional region 314 of the object (Sally - occupant 312). The processing circuitry may implement machine learning (e.g. a neural network) to determine which operation is to be performed given the positional region of the object and the voice command."; determining that the operation is to lower a window within a positional region indicates estimation of the candidate of the target object (the window in the region of the object, Sally) fitting the state or position recognized from the utterance "lower Sally's window").

Claim 2: Bhattacharya et al. discloses that a candidate of the target object is a device controlled by the control signal acquired as the input signal (paragraph 39 discloses "the controller(s) 536 may provide the signals for controlling one or more components and/or systems of the vehicle 500 in response to sensor data received from one or more sensors (e.g. sensor inputs).", where the windows shown in Fig. 3 are among those components and/or systems of the vehicle); and that the voice recognition method detects a control state controlled by the control signal as a state of a candidate of the target object.

Claim 7: Bhattacharya et al. discloses storing the acquired input signal (paragraph 32 discloses that historical or previous voice commands, i.e., the acquired input signal, are used to determine the candidate of the target object; paragraphs 81 and 101 disclose data stores that may store at least one bit of data, where previously acquired input signals or voice commands are considered data); and detecting a state of a candidate of the target object based on the input signal stored in the past and the input signal currently being acquired (paragraph 22 discloses "a machine learning model (e.g., a neural network) that is trained on non-lexical utterances and corresponding actions following in short temporal proximity"; paragraph 32 discloses that the decision on the candidate (window) of the target object (passenger) is determined based on the target object's voice commands pertaining to the candidate, indicating determination of the state of a candidate of the target object based on past voice commands and the current input signal (paragraph 31)).

Claim 8: Bhattacharya et al. discloses outputting information relating to a target object mentioned in the utterance content (paragraph 32 discloses lowering the window 316 within a positional region 314 of the object (target object Sally) mentioned in the utterance "lower Sally's window" (paragraph 31); lowering the window 316 in the proximity of the target object, Sally, is an indication of outputting information relating to the target object).
Claim 9: Bhattacharya et al. discloses outputting information relating to a state of a target object mentioned in the utterance content (paragraph 32 discloses lowering the window 316 within a positional region 314 of the object (target object Sally) mentioned in the utterance "lower Sally's window" (paragraph 31); lowering the window 316 in the proximity of the target object, Sally, is an indication of outputting information relating to the state of the target object).

Claim 11: Bhattacharya et al. discloses that a candidate of the target object is a device mounted on the vehicle (paragraphs 31-32 disclose the window as the candidate of the target object, Sally, where a window is mounted on the vehicle), and the voice recognition method comprises acquiring, as the input signal, an output signal of a sensor detecting a state of an inside of the vehicle (paragraph 33 discloses a camera sensor providing images detecting the state of the inside of the vehicle, such as occupants, windows, and seats) and detecting a state or a position of the device (paragraph 33 discloses images to detect the state of windows, seats, etc.) based on the acquired output signal (paragraph 33 discloses the images from the sensor, i.e., the acquired output signal).

Claim 12: Bhattacharya et al. discloses acquiring, as the input signal, an output signal of a sensor detecting a seating position of a passenger of the vehicle (paragraph 23 discloses a pressure sensor on a seat within the vehicle to detect the seating position of a passenger, and a camera sensor indicating the occupants in the vehicle, such as the seated person shown in Fig. 3, labels 312 and 314); detecting a window serving as a candidate of the target object to be a window in a vicinity of the seating position (paragraph 32 discloses detecting the window as the candidate of the target object based on the seating of the occupant and an image of the inside of the vehicle); recognizing, from the utterance content including an opening/closing instruction to open or close a window of the vehicle, an expression representing a position of a window to be opened or closed (paragraph 31 discloses recognizing that the command includes the expression "lower Sally's window", which is a form of opening/closing a window); and, when a position of a window recognized from the utterance content indicates a vicinity of the seating position, estimating a window in a vicinity of the seating position as the target object (paragraph 32 discloses selecting window 315 to lower as opposed to 316 based on the vicinity of the target object, Sally).

Claim 14: Bhattacharya et al. discloses that a candidate of the target object is a surrounding object of the vehicle (paragraph 45 discloses sensors such as cameras whose field of view will detect obstacles and/or paths in the environment in front of the vehicle; depending on the voice command and sensors, the candidate of the target object, such as an obstacle or path, is found in the environment of the vehicle), and the voice recognition method comprises acquiring, as the input signal, an output signal of a sensor detecting the surrounding object (paragraph 45 discloses cameras with a field of view that includes portions of the environment in front of the vehicle) and detecting a state or a position of the surrounding object based on the acquired output signal (paragraph 45 discloses "Cameras with a field of view that includes portions of the environment in front of the vehicle 500 … may be used for surround view, to help identify forward-facing paths and obstacles, as well aid in, with the help of one or more controllers 536 and/or control SoCs, providing information critical to generating an occupancy grid and/or determining the preferred vehicle paths."; this disclosure indicates that obstacles or paths, as candidates of the target object surrounding the vehicle, are detected, and that their position or state is determined from the cameras' field of view).

Claim 15: Bhattacharya et al. discloses acquiring, as the input signal, a captured image generated by a camera capturing surroundings of the vehicle (paragraph 43 discloses "one or more camera(s) … may record and provide image data (e.g., video) simultaneously."); and recognizing, based on the captured image, an object coming close to the vehicle as a candidate of the target object (paragraph 45 discloses detecting or identifying paths and obstacles within the field of view, i.e., close to the vehicle; paragraph 47 discloses "camera lenses … and an image processing chip that may measure the distance from the vehicle to the target object and use the generated information … to activate the autonomous emergency braking …", which indicates recognizing an object coming close to, or within the field of view of, the vehicle as a candidate of the target object, such as an obstacle).

Claim 16: Bhattacharya et al. discloses wherein the sensor includes one of a pressure sensor, a seat belt sensor, a camera, a range sensor, a microphone, and a biosensor (paragraph 23 discloses a pressure sensor, microphones, a camera, etc.).

Claim 17 recites similar limitations as claim 1 and is rejected on the same grounds as claim 1.
Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 3, 4, and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Bhattacharya et al. (US Publication No. 2021/0347328) in view of Lee et al. (US Publication No. 2022/0134880).

Claim 3: Bhattacharya et al. discloses voice command-based vehicle control (Fig. 8) but fails to disclose the recited limitations. Lee et al. discloses that the input signal is a control signal of a visual information presentation device (paragraphs 105-106 disclose a warning light being shown; Fig. 1, label 130; paragraph 51), the visual information presentation device being installed inside the vehicle (paragraph 51 discloses "instrument panel (cluster 130) is disposed on the dashboard, and may include various indicator lights and warning lights.") and presenting visual information to the user (Fig. 1, label 130 shows the dashboard displaying information to the user, where the information includes various indicator lights and warning lights per paragraph 51); and that the control state is a display state of the visual information (paragraph 90 discloses "extract the characteristic of the identified sign (403). The processor 210 may extract at least one of a color, a shape, a location, or a boundary of the identified sign as a characteristic of the sign."). It would have been obvious to one skilled in the art before the effective filing date of the application to modify Bhattacharya et al.'s voice-controlled vehicle by incorporating control signals as disclosed by Lee et al. so as to improve safety while driving the vehicle.

Claim 4: Lee et al. discloses that the visual information presentation device is a warning lamp (paragraph 90 discloses extracting a characteristic of the identified sign, such as the signs shown in Fig. 1, label 130; examples of signs are shown in Fig. 2) and that the control state is a turned-on state or a turned-off state of the warning lamp (Fig. 11 shows a table of the different states of the warning lamps or lights, each entry indicating whether the lamp or light is turned off or on; Fig. 14 is a chart pairing a question regarding a warning light or lamp with a response, where the question indicates whether the light is turned on or off).

Claim 10: Bhattacharya et al. discloses voice command-based vehicle control (Fig. 8) but fails to disclose the recited limitations. Lee et al. discloses storing a coping method matching a state of a candidate of the target object in a predetermined storage device (Fig. 14 shows a coping method stored with the matching target object mentioned in the utterance (the question) and the state of the candidate of the target object (the response; e.g., the target object is the engine warning light, and the response indicates the warning light is red and the engine oil pressure is low)); and outputting information relating to the coping method matching a state of a target object mentioned in the utterance content (the response column of Fig. 14 indicates outputting information relating to the coping method matching the state of a target object mentioned in the utterance content; e.g., utterance: "what is the engine warning light" (target object); response: a warning light in the shape of a red kettle, with the engine oil warning light turning on when engine oil pressure is low (state of the target object)). It would have been obvious to one skilled in the art before the effective filing date of the application to modify Bhattacharya et al.'s voice recognition and response to the user's input by incorporating a coping method as disclosed by Lee et al. so as to readily provide a response to the user's input, such as a query.

Claims 5 and 6 are rejected under 35 U.S.C. 103 as being unpatentable over Bhattacharya et al. (US Publication No. 2021/0347328) in view of Tsuchiya (US Publication No. 2019/0266908).

Claim 5: Bhattacharya et al. discloses voice command-based vehicle control (Fig. 8) but fails to disclose all the recited limitations.
Tsuchiya discloses that the input signal is a control signal of an audio information presentation device (paragraph 69 discloses a question about a terrible sound from the car, which indicates the input signal is a control signal (response to the user's query) of an audio information presentation device (the device emitting the sound)), the audio information presentation device (paragraph 69 discloses an intrusion-detection automatic alarm as the audio information presentation device) being installed inside the vehicle (paragraph 68 discloses the intrusion-detection automatic alarm is provided in the vehicle, which indicates the alarm is installed inside the vehicle) and presenting audio information to the user (paragraph 69 discloses the user question "I want to stop the terrible sound from the car", which indicates the user can hear the sound, hence indicating presentation of the audio information to the user); and that the control state is a notification state of the audio information (paragraph 69 discloses a response of "the intrusion-detection automatic alarm is activated. Unlock the door or turn on the power switch if you are inside the car.", which indicates the control state as a notification state of the audio information). It would have been obvious to one skilled in the art before the effective filing date of the application to modify Bhattacharya et al.'s voice control of the vehicle by incorporating the control signal as disclosed by Tsuchiya so as to improve driving safety and allow the user to access information regarding the condition of the vehicle hands-free.

Claim 6: Tsuchiya discloses that the audio information presentation device is an alarm device (paragraph 69 discloses the intrusion-detection automatic alarm as an alarm device) and that the control state is an output state or a suspended state of an alarm (paragraph 69 discloses the response stating the output state ("intrusion-detection automatic alarm is activated"), indicating the control state).
Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Bhattacharya et al. (US Publication No. 2021/0347328) in view of Jaegal et al. (US Publication No. 2017/0021768).

Claim 13: Jaegal et al. discloses acquiring, as the input signal, an output signal of a sensor detecting sound information of an abnormal sound from the vehicle (paragraph 164 discloses determining a state of the vehicle based on information on a sound generated from a component of the vehicle or a frictional sound between a wheel and a road surface, e.g., "the controller 180 can detect whether an engine and a brake are normal, based on information noise generated from the engine and information on noise generated from the brake.", which describes using detected sound information to determine the condition or state of the engine as normal or abnormal; paragraph 45 discloses one or more sensors, including a sound sensor, to sense an audio signal inside or outside the vehicle); and, by estimating, based on the sound information, a device serving as a sound source of the abnormal sound (paragraph 164 discloses detecting whether a brake and an engine are normal or abnormal based on sound information, with the engine and/or brake serving as the sound source of the abnormal sound), detecting a state in which the device serving as a candidate of the target object is generating the abnormal sound (paragraphs 164-165 disclose determining the state of the vehicle as abnormal). It would have been obvious to one skilled in the art before the effective filing date of the application to modify Bhattacharya et al. by incorporating detection of an abnormal sound, and of the device emitting the abnormal sound, as disclosed by Jaegal et al. so as to inform the driver of an abnormal state of the vehicle, hence improving vehicle and driving safety.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to LINDA WONG, whose telephone number is (571) 272-6044. The examiner can normally be reached 9-5.
Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Andrew C. Flanders, can be reached at 571-272-7516. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/LINDA WONG/ Primary Examiner, Art Unit 2655

Prosecution Timeline

Jul 23, 2024
Application Filed
Mar 16, 2026
Non-Final Rejection — §101, §102, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12596877
COMPUTER-IMPLEMENTED CONTRACT RISK ASSESSMENT PLATFORM LEVERAGING TRANSFORMERS
2y 5m to grant · Granted Apr 07, 2026
Patent 12573368
RESIDUAL ADAPTERS FOR FEW-SHOT TEXT-TO-SPEECH SPEAKER ADAPTATION
2y 5m to grant · Granted Mar 10, 2026
Patent 12567426
MACHINE LEARNING-BASED KEY GENERATION FOR KEY-GUIDED AUDIO SIGNAL TRANSFORMATION
2y 5m to grant · Granted Mar 03, 2026
Patent 12566925
DIALOGUE STATE AWARE DIALOGUE SUMMARIZATION
2y 5m to grant · Granted Mar 03, 2026
Patent 12562824
SYSTEMS AND METHODS FOR WIRELESS SIGNAL CONFIGURATION BY A NEURAL NETWORK
2y 5m to grant · Granted Feb 24, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

1-2
Expected OA Rounds
85%
Grant Probability
99%
With Interview (+15.5%)
3y 0m
Median Time to Grant
Low
PTA Risk
Based on 709 resolved cases by this examiner. Grant probability derived from career allow rate.
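How the with-interview figure is combined from the two stats above is not stated; one plausible reading, treating the lift as additive percentage points and capping the displayed value at 99% (both assumptions), is:

```python
base = 85.0            # career-derived grant probability, %
interview_lift = 15.5  # percentage-point lift for resolved cases with interview

# Additive combination with an assumed display cap, since 85 + 15.5 > 100
with_interview = min(base + interview_lift, 99.0)
print(with_interview)  # 99.0, matching the displayed "99% With Interview"
```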
