Prosecution Insights
Last updated: April 19, 2026
Application No. 18/532,871

System and Method for Secure Speech Feature Extraction

Non-Final OA — §103, §DP

Filed: Dec 07, 2023
Examiner: OPSASNICK, MICHAEL N
Art Unit: 2658
Tech Center: 2600 — Communications
Assignee: Microsoft Technology Licensing, LLC
OA Round: 3 (Non-Final)

Grant Probability: 82% (Favorable)
Expected OA Rounds: 3-4
Median Time to Grant: 3y 3m
With Interview: 92%

Examiner Intelligence

Career Allow Rate: 82% — above average (737 granted / 900 resolved; +19.9% vs TC avg)
Interview Lift: +10.5% among resolved cases with interview (moderate lift)
Avg Prosecution: 3y 3m typical timeline; 46 currently pending
Total Applications: 946 across all art units (career history)

Statute-Specific Performance

§101: 17.7% (-22.3% vs TC avg)
§103: 33.0% (-7.0% vs TC avg)
§102: 29.9% (-10.1% vs TC avg)
§112: 6.3% (-33.7% vs TC avg)
TC averages are estimates • Based on career data from 900 resolved cases

Office Action

§103, §DP
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 02/27/2026 has been entered.

Double Patenting

The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the "right to exclude" granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).

A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).

The filing of a terminal disclaimer by itself is not a complete reply to a nonstatutory double patenting (NSDP) rejection. A complete reply requires that the terminal disclaimer be accompanied by a reply requesting reconsideration of the prior Office action. Even where the NSDP rejection is provisional, the reply must be complete. See MPEP § 804, subsection I.B.1. For a reply to a non-final Office action, see 37 CFR 1.111(a). For a reply to a final Office action, see 37 CFR 1.113(c). A request for reconsideration, while not provided for in 37 CFR 1.113(c), may be filed after final for consideration. See MPEP §§ 706.07(e) and 714.13.

The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The actual filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission.
For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/apply/applying-online/eterminal-disclaimer.

Claims 1, 4, 5, 14, 17-19, 22-25, 27-30, 32, and 33 are provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-6, 11-14, 17, 18, and 20 of copending Application No. 18/532,900 (reference application), in view of Li et al (20190287515). Claims 1, 4, 5, 14, 17-19, and 21-33 are provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1 and 8-20 of copending Application No. 18/532,893 (reference application), in view of Li et al (20190287515).

Although the claims at issue are not identical, they are not patentably distinct from each other because, as to the '900 application, the extra decoding steps are not necessary to realize the functionality of the claims in the instant invention; as to the '893 application, the further detailing of the second set of features/embeddings toward acoustic information is more detailed than the claim scope of the instant invention. Furthermore, the '900 and '893 applications are silent as to the student-teacher neural network relationship, wherein the teacher model is speaker invariant and the student model is trained such that its network parameter results are similar to the teacher model. However, Li et al (20190287515) teaches speaker recognition defining at least two artificial networks (abstract), with the relationship defined as a student-teacher model, wherein the teacher model is trained as a full model, with the learned parameters transferred to a smaller "student" model (abstract); wherein the teacher model is speaker invariant (see para 0016); the teacher model is based on clean speech, and is speaker invariant as well (para 0026), and is used to train the smaller student model, which handles noisy speech environments (para 0016); and the system uses an adversarial network to train the student model to be more similar/in line with the teacher model (para 0017). Therefore, it would have been obvious to one of ordinary skill in the art of speaker recognition to further define the neural networks as described in the '900/'893 applications as a student-teacher model relationship, as taught by Li et al (20190287515), because it would advantageously carry over the effectiveness of the teacher model while using the smaller student model, reducing the divergence between the teacher and student models (Li et al (20190287515) – para 0019, 0020).

This is a provisional nonstatutory double patenting rejection because the patentably indistinct claims have not in fact been patented. The claims of each patent application are reproduced below in their entirety, for ease of evaluating all claim features of each application, in the event claim amendments are contemplated and so as not to accidentally trigger further obviousness-type double patenting issues. Examiner notes that the table below maps method claims 1, 4, 5, 32, and 33, and claim 14; the remaining claims are similar in scope and content and are not repeated below. For those further mapping relationships, see the presentation under the 35 USC 103 rejection.
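Before the reproduced claims, a practical aside for the technical reader: the student-teacher relationship at the center of this rejection reduces to a distillation loop. Below is a minimal PyTorch sketch under stated assumptions — `student`, `teacher`, and `distillation_step` are hypothetical names, the two modules are assumed to map batches of speech features to embedding vectors, and a plain MSE objective stands in for the adversarial alignment Li describes.

```python
import torch
import torch.nn.functional as F

def distillation_step(student, teacher, speech, augmented_speech, optimizer):
    """One hedged training step: push the student's embeddings toward
    those of a frozen, speaker-invariant teacher."""
    teacher.eval()
    with torch.no_grad():
        # The teacher sees the original signal; its embeddings are
        # treated as the speaker-invariant target.
        target = teacher(speech)

    # The student sees the augmented (perturbed) signal.
    pred = student(augmented_speech)

    # Reduce divergence between student and teacher outputs.
    # (MSE is a stand-in for the adversarial objective in Li.)
    loss = F.mse_loss(pred, target)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Run over multiple iterations, this is the "training ... a student neural network to mimic a trained teacher neural network" of claim 1, in schematic form only.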
Application No. 18/532,871 (instant application)

1. A computer-implemented method, executed on a computing device, comprising: receiving a speech signal comprising acoustic features of a speaker content; altering the acoustic features in the speech signal to obtain an augmented speech signal, the augmented speech signal being a speaker invariant speech signal; and training, over multiple iterations, a student neural network to mimic a trained teacher neural network based on training data that includes the augmented speech signal and the speech signal, the trained teacher neural network being an automated speech recognition (ASR) model trained for speaker invariant speech recognition.

32. The computer-implemented method of claim 1, wherein altering a component of the speaker information comprises adding a perturbation to the speech signal.

33. The computer-implemented method of claim 1, wherein the perturbation includes at least one of voice conversion, pitch shifting, and vocal tract length normalization.

4. The computer-implemented method of claim 1, wherein altering the acoustic features in the speech signal to obtain the augmented speech signal includes adding a loss function constraint to a processing of the received speech signal.

5. The computer-implemented method of claim 4, wherein the loss function constraint includes at least one of speaker dispersion, speaker identification, and content clustering.

14. A computing system comprising: at least one processor; a memory storing programming instructions for execution by the at least one processor, the programming instructions, upon execution by the at least one processor, causing the system to perform the following operations, receiving a speech signal comprising acoustic features of a speaker; altering the acoustic features in the speech signal to obtain an augmented speech signal, the augmented speech signal being a speaker invariant speech signal; and training, over multiple iterations, a student neural network to mimic a trained teacher neural network based on training data that includes the augmented speech signal and the speech signal, the trained teacher neural network being an automated speech recognition (ASR) model trained for speaker invariant speech recognition.

Application No. 18/532,893

1. A computer-implemented method, executed on a computing device, comprising: receiving a speech signal; extracting background information from the speech signal to generate a background acoustics embedding; extracting speaker information from the speech signal to generate a speaker acoustics embedding; applying a first loss factor to the background acoustics embedding to decrease speaker information therein to generate a processed background acoustics embedding using machine learning; applying a second loss factor to the speaker acoustics embedding to decrease background information therein to generate a processed speaker acoustics embedding using machine learning; and outputting at least one of the processed background acoustics embedding and the processed speaker acoustics embedding to a speech processing system.

2. The computer-implemented method of claim 1, further including identifying at least one background acoustics metric based on the background acoustics embedding.

3. The computer-implemented method of claim 2, wherein the first loss factor is based on the at least one background acoustics metric.

4. The computer-implemented method of claim 1, further including identifying at least one speaker acoustics metric based on the speaker acoustics embedding.

5. The computer-implemented method of claim 4, wherein the second loss factor is based on the at least one speaker acoustics metric.

6. The computer-implemented method of claim 3, wherein the at least one background acoustics metric comprises measures of at least one of reverberation, noise, and sound quality.

7. The computer-implemented method of claim 5, wherein the at least one speaker acoustics metric comprises measures of at least one of pitch, vocal tract length, gender, accent, language, and age.

8. The computer-implemented method of claim 1, further including combining features of the background information and the speaker information prior to generating the background acoustics embedding and the speaker acoustics embedding.

9. The computer-implemented method of claim 1, further including training a speech processing system with at least one of the processed background acoustics embedding and the processed speaker acoustics embedding.

10. The computer-implemented method of claim 1, further including applying a clustering constraint in the generation of at least one of the background acoustics embedding and the speaker acoustics embedding.

11. A computing system comprising: a memory; and a processor to: receive a speech signal; extract first feature information from the speech signal; generate a first acoustics embedding from the first feature information; generate a second acoustics embedding from the first feature information; identify at least one first acoustics metric based on the first acoustics embedding using machine learning; identify at least one second acoustics metric based on the second acoustics embedding using machine learning; and outputting at least one of the first acoustics metric and the second acoustics metric with a speech processing system.

12. The computing system of claim 11, further including training a speech processing system with at least one of the first acoustics metric and the second acoustics metric.

13. The computing system of claim 11, further including generating a loss factor based on the first acoustics embedding.

14. The computing system of claim 13, further including applying the loss factor to the first acoustics embedding to maximize first information therein and to minimize second information therein.

15. The computing system of claim 14, wherein the loss factor is further based on the second acoustics embedding.

16. The computing system of claim 11, wherein the at least one first acoustics metric comprises background information of the speech signal.

17. The computing system of claim 15, further including applying the loss factor to the second acoustics embedding to maximize second information therein and to minimize first information therein.

18. The computing system of claim 11, wherein the at least one second acoustics metric comprises speaker information of the speech signal.

19. The computing system of claim 18, wherein the background information of the speech signal comprises at least one of reverberation, noise, and sound quality.

20. A computer program product residing on a non-transitory computer readable medium having a plurality of instructions stored thereon which, when executed by a processor, cause the processor to perform operations comprising: receiving a speech signal; extracting first feature information from the speech audio signal; extracting second feature information from the speech signal; generating a first acoustics embedding from the first feature information; generating a second acoustics embedding from the first feature information; identifying at least one first acoustics metric based on the first acoustics embedding using machine learning; identifying at least one second acoustics metric based on the second acoustics embedding using machine learning; and training a speech processing system with at least one of the first acoustics metric and the second acoustics metric.

Application No. 18/532,900

1. A computer-implemented method, executed on a computing device, comprising: receiving, at an encoder, a speech signal comprising a content component and a speaker component, resulting in a received speech signal; processing, using machine learning, the speaker component of the speech signal to generate a representation of speaker information in the speaker component; processing, using machine learning and based at least on the representation of the speaker information, the content component of the audio signal, to generate a representation of content information in the content component having minimized speaker information; transmitting the representation of content information in the content component to a decoder; and decoding the representation of content information in the content component to generate at least a portion of the received speech signal.

2. The computer-implemented method of claim 1, wherein processing the speaker component of the speech signal to generate a representation of speaker information in the speaker component comprises generating a speaker embedding.

3. The computer-implemented method of claim 2, wherein processing the content component of the speech signal to generate a representation of content information in the content component comprises generating a content embedding.

4. The computer-implemented method of claim 3, further comprising generating an estimate of the speaker embedding from the content embedding.

5. The computer-implemented method of claim 4, further comprising comparing the estimate of the speaker embedding to the speaker embedding to generate a loss factor.

6. The computer-implemented method of claim 5, further comprising using the loss factor when generating the content embedding to minimize speaker information within the content embedding.

7. The computer-implemented method of claim 6, further comprising quantizing the content embedding prior to transmitting the content embedding to the decoder.

8. The computer-implemented method of claim 6, further comprising scrambling the content embedding prior to transmitting the content embedding to the decoder.

9. The computer-implemented method of claim 6, further comprising applying a neural watermark including the speaker information to the content embedding.

10. The computer-implemented method of claim 9, wherein the speaker information is encoded as side data of the content embedding.

11. A computing system comprising: a memory; and a processor to: receive, at an encoder, a speech signal comprising a content component and a speaker component of a first voice; process, using machine learning, the speaker component of the speech signal to generate a speaker embedding; process, using machine learning and based at least on the speaker embedding, the content component of the speech signal, to generate a content embedding having minimized speaker information; and transmit the content embedding to a decoder.

12. The computing system of claim 11, further comprising generating an estimate of the speaker embedding from the content embedding.

13. The computing system of claim 12, further comprising comparing the estimate of the speaker embedding to the speaker embedding to generate a loss factor.

14. The computing system of claim 13, further comprising using the loss factor when generating the content embedding to minimize speaker information within the content embedding.

15. The computing system of claim 14, further comprising quantizing the content embedding prior to transmitting the content embedding to the decoder.

16. The computing system of claim 13, further comprising applying a neural watermark including speaker information of the speaker component to the content embedding.

17. The computing system of claim 14, further comprising decoding the speaker information and the content embedding to generate the speech signal.

18. The computing system of claim 16, wherein the speaker information is encoded as side data of the content embedding.

19. The computing system of claim 13, further comprising performing voice conversion on the speech signal wherein the content embedding is transmitted with speaker information of a second voice different from the first voice.

20. A computer program product residing on a non-transitory computer readable medium having a plurality of instructions stored thereon which, when executed by a processor, cause the processor to perform operations comprising: receiving, at an encoder, a speech signal comprising a content component and a speaker component of a first voice, resulting in a received speech signal; processing, using machine learning, the speaker component of the speech signal to generate a speaker embedding; processing, using machine learning and based at least on the speaker embedding, the content component of the voice signal, to generate a content embedding having minimized speaker information, by using a loss factor generated from the speaker embedding when generating the content embedding; and decoding the representation of content information in the content component to generate the received speech signal.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status. The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1, 4, 5, 14, 17-19, 22-25, 27-30, 32, and 33 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang et al (20200365166) in view of Li et al (20190287515).

As per claim 1, Zhang et al (20200365166) teaches a computer-implemented method, executed on a computing device, comprising: receiving a speech signal comprising content and acoustic features of a speaker (as receiving speech, speaker identity, and content – fig. 2, left side); altering the acoustic features in the speech signal to obtain an augmented speech signal being a speaker invariant speech signal (as operating on and modifying the speech spectrogram – para 0029; and performing the conversion so that it is speaker independent – para 0038); and training, over multiple iterations (as repeated training – para 0058, the full sentence "An autoencoder system is operated…"), a neural network on the signals.

Zhang et al (20200365166) teaches the above features, especially the concept of using a smaller trained neural network model based on a larger model (as mapped above); however, Zhang et al (20200365166) does not explicitly teach: that the trained teacher neural network model recognizes content invariant to the voice of a particular speaker, processes the speech signal so that the teacher neural network generates speaker invariant embeddings (based on content) that are invariant to the acoustic features of the speaker, and using the teacher invariant embeddings to train a student neural network to generate representative speaker invariant embeddings as well, at a level of similarity to the speaker invariant embeddings generated by the trained neural network. However, Li et al (20190287515) teaches speaker recognition defining at least two artificial networks (abstract), with the relationship defined as a student-teacher model, wherein the teacher model is trained as a full model, with the learned parameters transferred to a smaller "student" model (abstract); wherein the teacher model is speaker invariant (see para 0016); the teacher model is based on clean speech, and is speaker invariant as well (para 0026), and is used to train the smaller student model, which handles noisy speech environments (para 0016); and the system uses an adversarial network to train the student model to be more similar/in line with the teacher model (para 0017). Therefore, it would have been obvious to one of ordinary skill in the art of speaker recognition to further define the neural networks as described in Zhang et al (20200365166) as a student-teacher model relationship, along with adversarial training of the student model, as taught by Li et al (20190287515), because it would advantageously carry over the effectiveness of the teacher model while using the smaller student model, reducing the divergence between the teacher and student models (Li et al (20190287515) – para 0019, 0020).
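The claim-1 mapping turns on producing an "augmented speech signal" whose speaker-specific acoustics are altered while the content survives. As a hedged illustration only — librosa is an assumed dependency, the function name and parameter values are invented for this sketch, and voice conversion and vocal tract length normalization are omitted — one common perturbation is a pitch shift plus light additive noise:

```python
import numpy as np
import librosa

def augment_speech(y: np.ndarray, sr: int, n_steps: float = 2.0,
                   noise_scale: float = 0.005) -> np.ndarray:
    """Perturb a mono waveform so speaker-specific acoustics change
    while linguistic content is preserved. Values are illustrative,
    not drawn from the claims or the cited references."""
    # Pitch shifting alters a speaker-dependent acoustic feature.
    shifted = librosa.effects.pitch_shift(y, sr=sr, n_steps=n_steps)
    # A small random perturbation, in the spirit of claim 32's
    # "adding a perturbation to the speech signal".
    return shifted + noise_scale * np.random.randn(len(shifted))

# Usage sketch (hypothetical file name):
# y, sr = librosa.load("utterance.wav", sr=16000)
# y_aug = augment_speech(y, sr)
```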
As per claim 4, the combination of Zhang et al (20200365166) in view of Li et al (20190287515) teaches the computer-implemented method of claim 1, wherein altering the acoustic features in the speech signal to obtain the augmented speech signal includes adding a loss function constraint to a processing of the received speech signal (as, Zhang et al (20200365166), using a loss function to modify the signal – para 0037-0038).

As per claim 5, the combination of Zhang et al (20200365166) in view of Li et al (20190287515) teaches the computer-implemented method of claim 4, wherein the loss function constraint includes at least one of speaker dispersion, speaker identification, and content clustering (as, Zhang et al (20200365166), the loss can also be a representation of speaker dispersion – see para 0051, wherein the loss maximizes similarities between a speaker's utterances while minimizing the similarity among different speakers – i.e., a dispersion measurement).

As per claims 32 and 33, the combination of Zhang et al (20200365166) in view of Li et al (20190287515) teaches the method of claim 1, wherein altering the acoustic features in the speech signal to obtain an augmented speech signal includes: adding perturbations to the speech signal (claim 32); and at least one of performing speaker conversion on the speech signal, performing pitch shifting or flattening on the speech signal, or performing vocal tract length normalization on the speech signal (claim 33); see Zhang et al (20200365166), as altering the speaker information with a style content vector – para 0034, using a style vector to represent the voice quality, and converting to a destination style S2 – para 0034, 0036 – these modify/change the speech features of the speaker into a target speaker style.

Claims 14, 17, 18, and 27-30 are computing system claims that perform the method steps found throughout claims 1, 4, 5, 32, and 33 above, and as such are similar in scope and content to those claims; therefore, claims 14, 17, 18, and 27-30 are rejected under similar rationale as presented against claims 1, 4, 5, 32, and 33 above. Furthermore, Zhang et al (20200365166) teaches a processor accessing machine readable instructions stored in memory to perform the steps (para 0006).

Additionally, in further detail, as to claim 28, Zhang et al (20200365166) teaches altering the phonetics/prosody of the speaker information to a target speaker (para 0005). As to claim 29, Zhang et al (20200365166) teaches the content including phonetic and prosodic information (para 0029). The "content" is modified in Zhang et al (20200365166), see paragraphs 0029, 0030, wherein the content preserved in X1 matches the speaker characteristics of speaker U2. See also para 0034, further defining the style vector modifications to a target speaker style. Examiner notes that it is old and notoriously well known, in the art of speaker characteristic conversion, to perform pitch shifting (claim 29) and vocal tract length normalization (claim 30). As evidence of this, see Gibson et al (US Patent 6336092, issued Jan 1, 2002), teaching modification toward a target voice (col. 1 lines 4-6), with shifting of the pitch (Figure 4, subblock labeled "Shift Pitch"; col. 2 lines 51-55), and vocal tract lengthening (col. 1 lines 45-48).
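Claim 5's "speaker dispersion" constraint, as the examiner reads Zhang's para 0051, maximizes similarity among one speaker's utterances while minimizing similarity across speakers. A rough PyTorch reading of such a loss — the function name, tensor shapes, and centroid formulation are assumptions, not Zhang's actual objective — might look like this:

```python
import torch
import torch.nn.functional as F

def speaker_dispersion_loss(emb: torch.Tensor) -> torch.Tensor:
    """emb: [n_speakers, n_utterances, dim], with at least two speakers.
    Encourages each speaker's utterances to cluster around their own
    centroid while pushing different speakers' centroids apart."""
    emb = F.normalize(emb, dim=-1)
    centroids = F.normalize(emb.mean(dim=1), dim=-1)       # [S, D]

    # Intra-speaker similarity: each utterance vs. its own centroid.
    intra = (emb * centroids.unsqueeze(1)).sum(-1).mean()

    # Inter-speaker similarity: off-diagonal centroid pairs.
    sim = centroids @ centroids.T                          # [S, S]
    n = sim.size(0)
    inter = sim[~torch.eye(n, dtype=torch.bool)].mean()

    # Minimizing raises intra-speaker similarity, lowers inter-speaker.
    return (1.0 - intra) + inter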
Claims 19 and 22-25 are non-transitory computer readable product claims performing steps found throughout claims 1, 4, 5, 32, and 33 above, and as such are similar in scope and content to those claims; therefore, claims 19 and 22-25 are rejected under similar rationale as presented against claims 1, 4, 5, 32, and 33 above. Furthermore, Zhang et al (20200365166) teaches a processor accessing machine readable instructions stored in memory to perform the steps (para 0006).

Additionally, in further detail, as to claim 23, Zhang et al (20200365166) teaches altering the phonetics/prosody of the speaker information to a target speaker (para 0005). As to claim 24, Zhang et al (20200365166) teaches the content including phonetic and prosodic information (para 0029). The "content" is modified in Zhang et al (20200365166), see paragraphs 0029, 0030, wherein the content preserved in X1 matches the speaker characteristics of speaker U2. See also para 0034, further defining the style vector modifications to a target speaker style. Examiner notes that it is old and notoriously well known, in the art of speaker characteristic conversion, to perform pitch shifting (claim 24) and vocal tract length normalization (claim 25). As evidence of this, see Gibson et al (US Patent 6336092, issued Jan 1, 2002), teaching modification toward a target voice (col. 1 lines 4-6), with shifting of the pitch (Figure 4, subblock labeled "Shift Pitch"; col. 2 lines 51-55), and vocal tract lengthening (col. 1 lines 45-48).

Response to Arguments

Applicant's arguments with respect to the claims have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument. Examiner notes the introduction of the Li et al (20190287515) reference to teach the defined concept of teacher-student neural network models with speaker invariant properties of the teacher model, using adversarial networks to train the student model to output similar results, to modify Zhang et al (20200365166). Examiner also notes Gibson et al (US Patent 6336092) to teach the well known techniques in prosody modification for speaker/voice conversion.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. In further detail, the following references were found to be pertinent to applicant's specification/claim elements:

Kim et al (20200357384) teaches a student/teacher model using a discriminator model to minimize any output differential between the two (para 0022, 0048, 0043).
Meng et al (20200334538) teaches minimizing loss between the teacher/student models with different domains for each model (para 0032, 0033).
Rebryk (20220157316) teaches the combination of audio encoding and speaker encoding embeddings (first and second) for speech synthesis (see fig. 2), using neural network structures (fig. 3).
Wang et al (20230100259) teaches parallel tracks of speaker embeddings and speech feature embeddings, combined for a final similarity calculation and output (see Fig. 4).

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Michael Opsasnick, telephone number (571) 272-7623, who is available Monday-Friday, 9am-5pm. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Mr. Richemond Dorvil, can be reached at (571) 272-7602.
The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

/Michael N Opsasnick/
Primary Examiner, Art Unit 2658
03/20/2026
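Both Li (relied upon in the rejection) and Kim (cited as pertinent art above) describe closing the teacher-student gap with an adversarial discriminator. The sketch below shows the generic GAN-style pattern that description suggests; the class and function names are hypothetical, and nothing here is drawn from the references' actual implementations.

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """Toy discriminator that tries to tell teacher embeddings from
    student embeddings; the student is trained to fool it."""
    def __init__(self, dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(),
                                 nn.Linear(dim, 1))

    def forward(self, x):
        return self.net(x)

def adversarial_losses(disc, teacher_emb, student_emb):
    """Returns (discriminator loss, student loss). The discriminator
    labels teacher=1 / student=0; the student is rewarded when its
    embeddings are scored as 'teacher', shrinking the differential."""
    bce = nn.functional.binary_cross_entropy_with_logits
    d_real = disc(teacher_emb.detach())
    d_fake = disc(student_emb.detach())
    d_loss = bce(d_real, torch.ones_like(d_real)) + \
             bce(d_fake, torch.zeros_like(d_fake))
    s_score = disc(student_emb)      # gradients flow to the student here
    s_loss = bce(s_score, torch.ones_like(s_score))
    return d_loss, s_loss
```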

Prosecution Timeline

Dec 07, 2023
Application Filed
Jul 18, 2025
Non-Final Rejection — §103, §DP
Sep 05, 2025
Examiner Interview Summary
Sep 05, 2025
Applicant Interview (Telephonic)
Sep 24, 2025
Response Filed
Dec 20, 2025
Final Rejection — §103, §DP
Jan 06, 2026
Applicant Interview (Telephonic)
Jan 09, 2026
Examiner Interview Summary
Feb 27, 2026
Request for Continued Examination
Mar 02, 2026
Response after Non-Final Action
Mar 20, 2026
Non-Final Rejection — §103, §DP (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602554
SYSTEMS AND METHODS FOR PRODUCING RELIABLE TRANSLATION IN NEAR REAL-TIME
2y 5m to grant • Granted Apr 14, 2026
Patent 12592246
SYSTEM AND METHOD FOR EXTRACTING HIDDEN CUES IN INTERACTIVE COMMUNICATIONS
2y 5m to grant • Granted Mar 31, 2026
Patent 12586580
System For Recognizing and Responding to Environmental Noises
2y 5m to grant • Granted Mar 24, 2026
Patent 12579995
Automatic Speech Recognition Accuracy With Multimodal Embeddings Search
2y 5m to grant • Granted Mar 17, 2026
Patent 12567432
VOICE SIGNAL ESTIMATION METHOD AND APPARATUS USING ATTENTION MECHANISM
2y 5m to grant • Granted Mar 03, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 82%
With Interview: 92% (+10.5%)
Median Time to Grant: 3y 3m
PTA Risk: High
Based on 900 resolved cases by this examiner. Grant probability derived from career allow rate.
