Prosecution Insights
Last updated: May 29, 2026
Application No. 18/768,409

DEEP REINFORCEMENT ACTIVE MACHINE LEARNING SYSTEM FOR AUDIO EVENT DETECTION AND CLASSIFICATION

Non-Final OA §103
Filed: Jul 10, 2024
Examiner: SIDDO, IBRAHIM
Art Unit: 2681
Tech Center: 2600 (Communications)
Assignee: Robert Bosch GmbH
OA Round: 1 (Non-Final)
Grant Probability: 84% (Favorable)
OA Rounds: 1-2
Est. Remaining: 2m
With Interview: 97%

Examiner Intelligence

Grants 84%, above average
Career Allowance Rate: 84% (400 granted / 477 resolved; +21.9% vs TC avg)
Interview Lift: +13.1% (moderate lift; based on resolved cases with interview)
Avg Prosecution: 2y 1m (fast prosecutor; 23 currently pending)
Total Applications: 494 (career history, across all art units)

Statute-Specific Performance

§101: 0.9% (-39.1% vs TC avg)
§103: 86.7% (+46.7% vs TC avg)
§102: 7.2% (-32.8% vs TC avg)
§112: 1.4% (-38.6% vs TC avg)
Tech Center averages are estimates. Based on career data from 477 resolved cases.
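The Tech Center baseline behind these figures is not shown directly, but it can be backed out from each displayed rate and its delta. The sketch below assumes each "vs TC avg" value is a simple difference in percentage points (rate minus Tech Center average); that reading is an assumption, not something the page states. The names `statute_stats` and `implied_tc_average` are hypothetical. Under that assumption, all four statutes imply the same 40.0% baseline.

```python
# Displayed (rate %, delta vs TC avg) per statute, copied from the panel above.
statute_stats = {
    "101": (0.9, -39.1),
    "103": (86.7, 46.7),
    "102": (7.2, -32.8),
    "112": (1.4, -38.6),
}


def implied_tc_average(rate, delta):
    """Back out the Tech Center average, ASSUMING delta = rate - tc_average."""
    return round(rate - delta, 1)


implied = {
    statute: implied_tc_average(rate, delta)
    for statute, (rate, delta) in statute_stats.items()
}

for statute, baseline in implied.items():
    print(f"Section {statute}: implied TC average {baseline}%")
```

A single shared baseline suggests the deltas are measured against one flat Tech Center estimate rather than a per-statute average, which is worth keeping in mind when reading the comparisons.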

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1-3, 5-11, 13-18 and 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Mahnoosh (NPL “Active learning for classifying long-duration audio recordings of the environment”) in view of Gomez (US 2023/0173683).

With respect to claim 9 (similarly claims 1 and 17), Mahnoosh teaches a system (e.g. the pool-based active learning (AL) process and algorithm of Fig 1-2, section 2.1) for active machine learning for audio event detection and classification (e.g. for active machine learning for audio event detection and classification, see the abstract), comprising: a memory storing a labeled training pool of audio samples (e.g. the memory storing the labelled data, Fig 1); an audio event classifier trained using the labeled training pool (e.g. the classification model of Fig 1, section 2.5-2.6); a reinforcement learning agent configured to select a batch of audio samples from an unlabeled pool for annotation (e.g. the reinforcement learning agent that selects and annotates a set of information instances in Fig 1); and a processor (e.g. inherently the process has a processor) configured to: calculate one or more environment states for each audio sample using outputs of the audio event classifier (e.g. query strategies that measure sampling criterion for each instance, section 2.2), add an annotated batch of audio samples to the labeled training pool (e.g. add new labelled instances, Fig 1), retrain the audio event classifier using an updated labeled training pool (e.g. retrain the classification model using the added/updated labelled training pool, see Fig 1-2 where the algorithm is repeated until stopping criterion is met), update the environment states using the retrained audio event classifier (e.g. update the environment states of section 2.2 using the retrained classification model, as suggested in Fig 1), update an exploration-exploitation parameter of the reinforcement learning agent (e.g. the query strategies are updated after each iteration until a stopping criterion is met in the algorithm of Fig 2 section 2.2, 3.1-3.2), retrain the reinforcement learning agent using the updated environment states and, and detecting an audio event and classifying the audio event in response to the retrained reinforcement learning agent (e.g. section 3.1-3.2 Table 3-5 disclose the results on Test set 1 and 2 which include retraining the reinforcement learning agent using the updated environment states and, and detecting an audio event and classifying the audio event in response to the retrained reinforcement learning agent).
Even though Mahnoosh teaches the annotated batch, Mahnoosh fails to teach calculate a reward for each audio sample in the annotated batch. Gomez teaches calculate a reward for each audio sample in raw data (e.g. a system that extracts feature information from raw data and estimates a reward for each audio sample in raw data, see Fig 7 [0128]-[0132]). Mahnoosh and Gomez are analogous art because they all pertain to performing reinforcement learning. Therefore, it would have been obvious to people having ordinary skill in the art before the effective filing date of the claimed invention to modify the process and algorithm of Mahnoosh with the teachings of Gomez in Fig 7 to include: retrain the reinforcement learning agent using the updated environment states and rewards, and detecting an audio event and classifying the audio event in response to the retrained reinforcement learning agent, as modified by the estimated reward in [0128]-[0132] of Gomez. The benefit of the modification would be to perform autonomous learning through interaction between the device and the human and to reduce the number of evaluations of the human required to obtain an optimal operation and, particularly, the number of mistakes (unexpected behaviors), Gomez [0025]-[0026].

With respect to claim 10 (similarly claims 2 and 18), Mahnoosh teaches the system of claim 9 including the audio event classifier i.e. the classification model of Fig 1. However, Mahnoosh fails to teach wherein the audio event classifier is a deep learning model. Gomez teaches an audio event classifier is a deep learning model (e.g. a system that performs deep reinforcement learning using raw data in a comparative example, see Fig 6 [0032] and [0125]). Mahnoosh and Gomez are analogous art because they all pertain to performing reinforcement learning.
Therefore, it would have been obvious to people having ordinary skill in the art before the effective filing date of the claimed invention to modify the process and algorithm of Mahnoosh with the teachings of Gomez in Fig 6 to include: wherein the audio event classifier is a deep learning model, as taught by Gomez. The benefit of the modification would be to perform autonomous learning through interaction between the device and the human and to reduce the number of evaluations of the human required to obtain an optimal operation and, particularly, the number of mistakes (unexpected behaviors), Gomez [0025]-[0026].

With respect to claim 11 (similarly claims 3 and 18), Mahnoosh in view of Gomez teaches the system of claim 9 wherein the reinforcement learning agent uses a deep Q-network algorithm (Gomez e.g. the reinforcement learning agent uses a deep Q-network algorithm, as suggested in Fig 6-7 [0125], [0128]-[0132]).

With respect to claim 13 (similarly claims 5 and 19), Mahnoosh in view of Gomez teaches the system of claim 9 wherein a reinforcement learning agent action space comprises a binary choice of requesting or not requesting an annotation for each audio sample (Mahnoosh e.g. the reinforcement learning agent action space of Fig 1 comprises a binary choice of requesting or not requesting an annotation for each audio sample/set of informative instance, see Fig 1, see also the results of Table 4).

With respect to claim 14 (similarly claims 6 and 20), Mahnoosh in view of Gomez teaches the system of claim 13 wherein the reward is positive if the reinforcement learning agent selected an audio sample for annotation that was misclassified by the audio event classifier (Mahnoosh e.g. the reward is positive if the reinforcement learning agent selected an audio sample/unlabelled for annotation that was misclassified by the audio event classifier/classification model of Fig 1, as modified by Gomez in Fig 7 [0128]-[0132]).

With respect to claim 15 (similarly claims 7 and 20), Mahnoosh in view of Gomez teaches the system of claim 9 wherein the processor is further configured to initialize a reinforcement learning agent policy using transfer learning from a related audio event detection task (Mahnoosh e.g. section 2.4, feature extraction, suggest initializing a reinforcement learning agent policy using transfer learning from a related audio event detection task).

With respect to claim 16 (similarly claim 8), Mahnoosh in view of Gomez teaches the system of claim 9 wherein the audio samples are represented as mel-frequency cepstral coefficients or log-mel spectrograms (Mahnoosh e.g. section 2.4 mentions Fast-Fourier Transform and Fourier coefficients).

Claim(s) 4, 12 and 19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Mahnoosh (NPL “Active learning for classifying long-duration audio recordings of the environment”) in view of Gomez (US 2023/0173683) and further in view of Sriram (US 2018/0336884).

With respect to claim 12 (similarly claims 4 and 19), Mahnoosh in view of Gomez teaches the system of claim 9 including the environment states. However, Mahnoosh fails to teach wherein the environment states are determined from logit outputs of the audio event classifier concatenated with softmax or sigmoid outputs of the audio event classifier. Sriram teaches states which are determined from logit outputs of the audio event classifier concatenated with softmax or sigmoid outputs of the audio event classifier (e.g. The logit output $r_t^{CF}$ is eventually fed into a softmax layer 540 for the generation of a probability over outputs for model training, $\hat{P}(y_t \mid x, y_{<t})$ Fig 5 [0054], see also Fig 6 [0063]). Mahnoosh and Sriram are analogous art because they all pertain to determining states.
Therefore, it would have been obvious to people having ordinary skill in the art before the effective filing date of the claimed invention to modify Mahnoosh with the teachings of Sriram to include: wherein the environment states are determined from logit outputs of the audio event classifier concatenated with softmax or sigmoid outputs of the audio event classifier, as suggested by Sriram in Fig 5 [0054]. The benefit of the modification would be to determine the output probability with precision.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to IBRAHIM SIDDO whose telephone number is (571)272-4508. The examiner can normally be reached 9:00-5:30PM.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Akwasi Sarpong can be reached at 5712703438. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/IBRAHIM SIDDO/Primary Examiner, Art Unit 2681
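
For readers mapping the rejection to the claim language, the following is a minimal Python sketch of the limitations discussed for claims 12 to 14 and the exploration-exploitation update of claim 9. It is an illustration only, not the applicant's implementation and not anything disclosed by Mahnoosh, Gomez, or Sriram. All function names are hypothetical. The penalty of -1 for selecting a correctly classified sample, the zero reward for skipping, and the multiplicative decay schedule are assumptions, since the office action only says the reward is positive for a selected, misclassified sample.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def environment_state(logits):
    """Claim 12 style state: classifier logits concatenated with softmax outputs."""
    return list(logits) + softmax(logits)


def reward(action, predicted_label, true_label):
    """Claim 14 style reward: positive when a selected sample was misclassified.

    ASSUMPTION: -1.0 for selecting a correctly classified sample, 0.0 for skipping.
    """
    if action == 1:
        return 1.0 if predicted_label != true_label else -1.0
    return 0.0


def select_action(q_values, epsilon, rng):
    """Epsilon-greedy choice over the binary action space of claim 13.

    Action 0 = do not request annotation, action 1 = request annotation.
    """
    if rng.random() < epsilon:
        return rng.randrange(2)
    return 0 if q_values[0] >= q_values[1] else 1


def decay_epsilon(epsilon, rate=0.9, floor=0.05):
    """ASSUMED schedule for updating the exploration-exploitation parameter."""
    return max(floor, epsilon * rate)


rng = random.Random(0)
state = environment_state([2.0, 0.0])
action = select_action([0.1, 0.9], epsilon=0.0, rng=rng)
print(state, action, reward(action, predicted_label=0, true_label=1))
```

The sketch leaves out the Q-network itself and the classifier retraining step; it only shows how the state, action, and reward pieces named in the claims could fit together.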

Prosecution Timeline

Jul 10, 2024: Application Filed
Mar 06, 2026: Non-Final Rejection mailed, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12640151
VOICE CONTROL WITH CONTEXTUAL KEYWORDS
2y 8m to grant; granted May 26, 2026
Patent 12634401
INSPECTION SYSTEM AND METHOD OF CONTROLLING THE SAME, AND STORAGE MEDIUM
2y 4m to grant; granted May 19, 2026
Patent 12622505
SYSTEMS, DEVICES, AND METHODS FOR SEGMENT-BASED GUIDANCE OF PRODUCT APPLICATION
2y 10m to grant; granted May 12, 2026
Patent 12614550
ELECTRONIC DEVICE, METHOD, AND NON-TRANSITORY COMPUTER READABLE STORAGE MEDIUM CONTROLLING EXECUTABLE OBJECT BASED ON VOICE SIGNAL
2y 4m to grant; granted Apr 28, 2026
Patent 12608166
Automated Data Handling
2y 2m to grant; granted Apr 21, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 84%
With Interview: 97% (+13.1%)
Median Time to Grant: 2y 1m (~2m remaining)
PTA Risk: Low
Based on 477 resolved cases by this examiner. Grant probability derived from career allowance rate.
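
The headline figures can be reproduced from the counts shown on this page. The sketch below assumes the "With Interview" number is the allowance rate plus the interview lift in percentage points, capped at 100%; the page does not spell out that formula, so treat it as one plausible reading. Variable names are hypothetical.

```python
# Counts and lift copied from the examiner panel on this page.
granted = 400
resolved = 477
interview_lift_points = 13.1  # ASSUMED to be additive percentage points

# Career allowance rate, which the page says the grant probability is derived from.
allowance_rate = 100 * granted / resolved

# ASSUMED combination rule: add the lift, never exceed 100%.
with_interview = min(100.0, allowance_rate + interview_lift_points)

print(f"Allowance rate: {allowance_rate:.1f}% (displayed as {round(allowance_rate)}%)")
print(f"With interview: {with_interview:.1f}% (displayed as {round(with_interview)}%)")
```

Both rounded values match the 84% and 97% shown above, which is consistent with the additive reading but does not prove it.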
