DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant’s arguments, see Remarks page 7, filed 02/12/2026, with respect to the rejections of claims 4, 11, and 18 under 35 U.S.C. 112(b) have been fully considered and are persuasive. The rejections of claims 4, 11, and 18 under 35 U.S.C. 112(b) have been withdrawn.
Applicant’s arguments, see Remarks pages 7-9, filed 02/12/2026, with respect to the rejection of amended claims 1, 8, and 15 under 35 U.S.C. 103 have been fully considered and are moot in view of the new grounds of rejection (detailed in the rejections below) necessitated by Applicant’s amendment to the claims.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-5, 8-12, and 15-18 are rejected under 35 U.S.C. 103 as being unpatentable over Yang et al. (Audio Classification of Accelerating Vehicles), hereinafter referenced as Yang, in view of Wang et al. (CN107730902A), hereinafter referenced as Wang, and Hohenacker (US2018174453A1).
Regarding claim 1, Yang discloses: A method for training a neural network to identify an electric vehicle based on audio (Yang: Figure 1; Abstract: “In this work, using careful feature extraction and various deep learning architectures, we demonstrate that vehicles can be effectively classified by recording vehicle acceleration with a cellphone.”), the method comprising:
generating video data from a camera, wherein the camera has a field of view of a roadway; generating audio data from a microphone, wherein the audio data is associated with vehicles traveling across the roadway (Yang: 3.1 Data Collection: “audio was recorded at an isolated stop sign. A cellphone (Samsung S9+) was fastened to a light-post directly adjacent to the stop sign, allowing direct vehicle visibility and clear audio recording of the accelerating vehicle. In total, 936 audio clips were extracted from 5 hrs of video, and divided into 7 classes and numerous manufacturers.”);
segmenting the audio data into a plurality of audio segments, wherein each audio segment has a start time and a finish time associated with that of a respective video segment (Yang: 3.2 Feature Extraction: “The audio track was directly imported into Audacity software for splitting and labeling, and no further audio processing was performed prior to analysis. Labeling was performed by matching the audio track to the simultaneously taken video.”; Wherein the audio data was segmented and labeled based on vehicle classification within video segments);
training a neural network to identify electric vehicles based on the audio segments and the labels of the respective video segments; and based on the training, outputting a trained neural network configured to identify electric vehicles based on audio (Yang: 3.1 Data Collection: “A cellphone (Samsung S9+) was fastened to a light-post directly adjacent to the stop sign, allowing direct vehicle visibility and clear audio recording of the accelerating vehicle. In total, 936 audio clips were extracted from 5 hrs of video, and divided into 7 classes and numerous manufacturers.”;
3.2 Feature Extraction: “Labeling was performed by matching the audio track to the simultaneously taken video.”;
4.1 Fully-connected Neural Networks (FCNN): “For the FCNN models, either no hidden layers or one hidden layer with 20 neurons (with ReLU as the activation function) were studied. Training examples with 678 features generated from the raw .wav file and FFT were sent to the model and batch gradient descent was used in the backpropagation process…In the output layer, we used softmax as the activation function to determine the result for the full 5-class and 7-class problems (Figure 5a).”).
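For illustration only, the following is a minimal sketch of the FCNN architecture Yang describes in Section 4.1: 678 input features (generated from the raw .wav file and FFT), one hidden layer of 20 ReLU neurons, a softmax output over the 7-class problem, and full-batch gradient descent. The sketch is in Python/PyTorch; any detail not stated by Yang (e.g., the learning rate) is an assumption.
```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(678, 20),  # one hidden layer with 20 neurons, per Yang 4.1
    nn.ReLU(),           # ReLU activation, per Yang 4.1
    nn.Linear(20, 7),    # logits for the full 7-class problem
)
# CrossEntropyLoss applies log-softmax internally, matching the softmax
# output layer described by Yang.
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # assumed learning rate

def train_step(features, labels):
    """One full-batch gradient descent step over all training examples."""
    optimizer.zero_grad()
    loss = loss_fn(model(features), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```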
Yang does not disclose expressly: segmenting the video data into a plurality of video segments, wherein each video segment has a start time and a finish time that corresponds to a respective vehicle traveling across the roadway within the field of view; based on the respective vehicle in each video segment, labeling each video segment with a label indicating the respective vehicle as either an electric vehicle (EV) or a non-electric internal combustion vehicle; and segmenting the audio data into a plurality of audio segments, wherein each audio segment has a start time and a finish time associated with that of a respective one of the video segments.
Wang discloses: a method comprising: generating video data from a camera, wherein the camera has a field of view of a roadway; generating audio data from a microphone, wherein the audio data is associated with vehicles traveling across the roadway (Wang: 0031: “This solution pre-sets a recording start area, a recording end area, and a license plate recognition area within the camera's field of view. When the camera captures a vehicle entering the recording start area, recording begins.”
0044: “the camera device may also include a camera unit, RF (Radio Frequency) circuitry, sensors, audio circuitry, WiFi module, and so on.”; Wherein the camera device, containing audio circuitry, implies that the video recording performed by the camera includes audio.);
segmenting the video data into a plurality of video segments, wherein each video segment has a start time and a finish time that corresponds to a respective vehicle traveling across the roadway within the field of view; and segmenting the audio data into a plurality of audio segments, wherein each audio segment has a start time and a finish time associated with that of a respective one of the video segments (Wang: 0041: “This invention provides a method for recording vehicle videos. Within the camera's field of view, a recording start area, a recording end area, and a license plate recognition area are preset. When the camera detects a vehicle entering the recording start area, recording begins…When the vehicle leaves the recording end area, the camera stops recording and stores the recorded vehicle video based on the license plate information.”; Wherein the recording of live video based on vehicle detection constitutes the segmentation of video and audio data.);
based on the respective vehicle in each video segment, labeling each video segment with a label indicating the respective vehicle’s license plate information (Wang: 0073: “when the vehicle enters the license plate recognition area, license plate recognition is performed and the recognized license plate information is output. When the vehicle leaves the recording end area, the camera stops recording and the recorded vehicle video is stored in association with the license plate information.”).
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to implement the camera-based video recording triggered by vehicle detection, as taught by Wang, for the capturing of audio and video data disclosed by Yang. The suggestion/motivation for doing so would have been “This method eliminates the need for external detection equipment, effectively reducing costs.” (Wang: 0041; Wherein the vehicle detection and video segmenting are performed by the camera.). Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
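For illustration only, the following is a minimal sketch of the event-driven segmentation Wang describes: a segment’s start time is set when a vehicle enters the recording start area and its finish time when the vehicle leaves the recording end area, with the audio segment inheriting the same start and finish times as its video segment. The detection events are taken as given, and all function and field names are hypothetical.
```python
from dataclasses import dataclass

@dataclass
class Segment:
    start: float   # seconds into the recording (vehicle enters start area)
    finish: float  # seconds into the recording (vehicle leaves end area)

def segment_recording(enter_times, exit_times):
    """Pair enter/exit detection events into per-vehicle segments."""
    return [Segment(s, f) for s, f in zip(enter_times, exit_times)]

def clip(samples, rate, seg):
    """Cut an audio segment using the same start/finish times as the video,
    so each audio segment is time-aligned with its respective video segment."""
    return samples[int(seg.start * rate):int(seg.finish * rate)]
```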
Yang in view of Wang does not disclose expressly: based on the respective vehicle in each video segment, labeling each video segment with a label indicating the respective vehicle as either an electric vehicle (EV) or a non-electric internal combustion vehicle.
Hohenacker discloses: a method for monitoring the statuses of parking space areas based on an image capture system (Hohenacker: 0003-0006: “This object is satisfied by a method in accordance with claim 1 and in particular by a system composed of at least one street-lighting device, a camera system mounted at the street-lighting device, a recognition unit, a transmission unit and a mobile display unit, wherein the camera system is configured for delivering image indications from within parking space areas located within a parking space zone, and wherein the recognition unit is configured to…associate a respective occupation status in dependence on the image indications with the parking space areas, said occupation status marking whether a respective parking space area is free or occupied;”). Wherein, based on the respective vehicle in each image, each image is labeled with a label indicating the respective vehicle as either an electric vehicle (EV) or a non-electric internal combustion vehicle (Hohenacker: 0046: “The recognition unit in accordance with the invention can be configured to classify motor vehicles detected by the camera system using typical features and thus to distinguish electric models from models with an internal combustion engine.”).
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to substitute the algorithms for detecting vehicles based on camera frame feature information disclosed by Yang in view of Wang with the algorithms for classifying vehicles as electric or internal combustion taught by Hohenacker. The suggestion/motivation for doing so would have been “The recognition unit in accordance with the invention can be configured to classify motor vehicles detected by the camera system using typical features and thus to distinguish electric models from models with an internal combustion engine” (Hohenacker: 0046; Wherein the classification of vehicles can be automated/combined with the detection of vehicles.). Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine Yang in view of Wang with Hohenacker to obtain the invention as specified in claim 1.
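For illustration only, the following is a minimal sketch of the combined labeling step mapped above: each video segment is classified as electric or internal combustion (per Hohenacker’s recognition unit), and the label is carried over to the time-aligned audio segment to form a training pair for the neural network of claim 1. The function names and the 0/1 label encoding are hypothetical, not taken from the cited references.
```python
def build_training_pairs(video_segments, audio_segments, classify_ev):
    """classify_ev(video_segment) -> 1 for electric, 0 for internal combustion
    (an assumed image-based classifier in the role of Hohenacker's
    recognition unit)."""
    pairs = []
    for video_seg, audio_seg in zip(video_segments, audio_segments):
        label = classify_ev(video_seg)    # label derived from the video segment
        pairs.append((audio_seg, label))  # audio segment inherits the video label
    return pairs
```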
Regarding claim 2, Yang in view of Wang and Hohenacker discloses: The method of claim 1, further comprising: associating each audio segment with a respective one of the labels (Wang: 0044: “the camera device may also include a camera unit, RF (Radio Frequency) circuitry, sensors, audio circuitry, WiFi module, and so on.”;
0073: “when the vehicle enters the license plate recognition area, license plate recognition is performed and the recognized license plate information is output. When the vehicle leaves the recording end area, the camera stops recording and the recorded vehicle video is stored in association with the license plate information.”; Wherein the labeling of the video captured by the camera constitutes the labeling of an audio segment with the label of its respective video segment.);
wherein the training includes training the neural network based on each audio segment and its respective label (Yang: 3.1 Data Collection: “A cellphone (Samsung S9+) was fastened to a light-post directly adjacent to the stop sign, allowing direct vehicle visibility and clear audio recording of the accelerating vehicle. In total, 936 audio clips were extracted from 5 hrs of video, and divided into 7 classes and numerous manufacturers.”;
4.1 Fully-connected Neural Networks (FCNN): “For the FCNN models, either no hidden layers or one hidden layer with 20 neurons (with ReLU as the activation function) were studied. Training examples with 678 features generated from the raw .wav file and FFT were sent to the model and batch gradient descent was used in the backpropagation process…In the output layer, we used softmax as the activation function to determine the result for the full 5-class and 7-class problems (Figure 5a).”).
Regarding claim 3, Yang in view of Wang and Hohenacker discloses: The method of claim 1, wherein the trained neural network is configured to identify electric vehicles based on audio and not video (Yang: Abstract: “Audio identification of vehicles is promising because it requires simple and cheap recording devices and much lower quantities of data than other technologies. Importantly, audio-based technologies don’t suffer from low visibility and are equally effective in low-light conditions. In this work, using careful feature extraction and various deep learning architectures, we demonstrate that vehicles can be effectively classified by recording vehicle acceleration with a cellphone.”).
Regarding claim 4, Yang in view of Wang and Hohenacker discloses: The method of claim 1, wherein the start time and the finish time of each audio segment are identical to the start time and finish time of its respective video segment (Wang: 0044: “the camera device may also include a camera unit, RF (Radio Frequency) circuitry, sensors, audio circuitry, WiFi module, and so on.”;
0073: “when the vehicle enters the license plate recognition area, license plate recognition is performed and the recognized license plate information is output. When the vehicle leaves the recording end area, the camera stops recording and the recorded vehicle video is stored in association with the license plate information.”; Wherein the audio, captured by the audio circuitry of the same camera device that records the video, starts and stops with the video recording, such that each audio segment shares the start time and finish time of its respective video segment.).
Regarding claim 5, Yang in view of Wang and Hohenacker discloses: The method of claim 1, wherein the microphone is installed adjacent to the camera (Wang: 0044: “the camera device may also include a camera unit, RF (Radio Frequency) circuitry, sensors, audio circuitry, WiFi module, and so on.”; Wherein the camera device including the audio circuitry constitutes a microphone installed adjacent to the camera).
As per claim 8, arguments made in rejecting claim 1 are analogous. Section 3.1 (Data Collection) of Yang discloses the collection of video and audio data via a cellphone. In addition, Section 5 (Results and Discussion) of Yang discloses the training of neural networks for audio classification, thus implying the disclosure of “A system for training a neural network…comprising: an image sensor…an audio sensor…and a processor in communication with the image sensor and the audio sensor.”
As per claim 9, arguments made in rejecting claim 2 are analogous.
As per claim 10, arguments made in rejecting claim 3 are analogous.
As per claim 11, arguments made in rejecting claim 4 are analogous.
As per claim 12, arguments made in rejecting claim 5 are analogous.
As per claim 15, arguments made in rejecting claim 1 are analogous. Section 3.1 (Data Collection) of Yang discloses the collection of video and audio data via a cellphone. In addition, Section 5 (Results and Discussion) of Yang discloses the training of neural networks for audio classification, thus implying the disclosure of “A non-transitory computer-readable storage medium storing executable instructions that, when executed by one or more processors, cause the processor to: generate video data from a camera…generate audio data from a microphone.”
As per claim 16, arguments made in rejecting claim 2 are analogous.
As per claim 17, arguments made in rejecting claim 3 are analogous.
As per claim 18, arguments made in rejecting claim 4 are analogous.
Claims 6-7, 13-14, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Yang in view of Wang and Hohenacker, and further in view of Szegedy et al. (Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning), hereinafter referenced as Szegedy.
Regarding claim 6, Yang in view of Wang and Hohenacker discloses: The method of claim 1, wherein the start time and the finish time associated with each video segment are associated with the respective vehicle entering the field of view and exiting the field of view, respectively (Wang: 0073: “when the vehicle enters the license plate recognition area, license plate recognition is performed and the recognized license plate information is output. When the vehicle leaves the recording end area, the camera stops recording and the recorded vehicle video is stored in association with the license plate information.”; Wherein the start of recording when the vehicle enters the preset recording start area and the stop of recording when the vehicle leaves the recording end area constitute the start time and the finish time associated with the vehicle entering and exiting the field of view, respectively.).
Yang in view of Wang and Hohenacker does not disclose expressly: further comprising: executing an object detection and classification machine learning model to identify and classify the vehicles.
Szegedy discloses: deep learning convolutional networks, capable of executing object detection and classification tasks (Szegedy: Abstract: “Here we give clear empirical evidence that training with residual connections accelerates the training of Inception networks significantly. There is also some evidence of residual Inception networks outperforming similarly expensive Inception networks without residual connections by a thin margin. We also present several new streamlined architectures for both residual and non-residual Inception networks.”;
1. Introduction: “Since the 2012 ImageNet competition [11] winning entry by Krizhevsky et al [8], their network “AlexNet” has been successfully applied to a larger variety of computer vision tasks, for example to object-detection [4], segmentation [10], human pose estimation [17], video classification [7], object tracking [18], and super-resolution [3]. These examples are but a few of all the applications to which deep convolutional networks have been very successfully applied ever since. In this work we study the combination of the two most recent ideas: Residual connections introduced by He et al. in [5] and the latest revised version of the Inception architecture [15].”).
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to substitute the algorithms for detecting and classifying vehicles disclosed by Yang in view of Wang and Hohenacker with the Inception-v4 model taught by Szegedy. The suggestion/motivation for doing so would have been “Very deep convolutional networks have been central to the largest advances in image recognition performance in recent years. One example is the Inception architecture that has been shown to achieve very good performance at relatively low computational cost.” (Szegedy: Abstract). Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine Yang in view of Wang and Hohenacker with Szegedy to obtain the invention as specified in claim 6.
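For illustration only, the following is a minimal sketch of substituting a pretrained Inception-style network as the object detection and classification model of claim 6. The sketch assumes the third-party Python package timm, which provides an Inception-v4 implementation; the package choice, the 2-way classifier head (electric vs. internal combustion), and the dummy input are assumptions, not part of the cited references.
```python
import timm
import torch

# Load an ImageNet-pretrained Inception-v4 and replace the classifier head
# with a 2-way output (assumed encoding: electric vs. internal combustion).
model = timm.create_model("inception_v4", pretrained=True, num_classes=2)
model.eval()

with torch.no_grad():
    frame = torch.randn(1, 3, 299, 299)  # Inception-v4 expects 299x299 input
    logits = model(frame)                # classification scores for the frame
    label = logits.argmax(dim=1).item()  # 0 or 1 per the assumed head
```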
Regarding claim 7, Yang in view of Wang, Hohenacker, and Szegedy discloses: The method of claim 6, wherein the object detection and classification machine learning model generates the labels of each video segment based upon the classification of the vehicles (Hohenacker: 0046: “The recognition unit in accordance with the invention can be configured to classify motor vehicles detected by the camera system using typical features and thus to distinguish electric models from models with an internal combustion engine.”);
(Szegedy: Abstract: “We also present several new streamlined architectures for both residual and non-residual Inception networks. These variations improve the single-frame recognition performance on the ILSVRC 2012 classification task significantly.”).
As per claim 13, arguments made in rejecting claim 6 are analogous.
As per claim 14, arguments made in rejecting claim 7 are analogous.
As per claim 19, arguments made in rejecting claim 6 are analogous.
As per claim 20, arguments made in rejecting claim 7 are analogous.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ANTHONY J RODRIGUEZ whose telephone number is (703) 756-5821. The examiner can normally be reached Monday-Friday, 10am-7pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sumati Lefkowitz, can be reached at (571) 272-3638. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/ANTHONY J RODRIGUEZ/Examiner, Art Unit 2672
/SUMATI LEFKOWITZ/Supervisory Patent Examiner, Art Unit 2672