Last updated: May 29, 2026

Application No. 18/176,252

METHODS AND APPARATUS FOR REAL-TIME VOICE TYPE DETECTION IN AUDIO DATA

Non-Final OA §102§103

Filed

Feb 28, 2023

Examiner

SIDDO, IBRAHIM

Art Unit

2681

Tech Center

2600 — Communications

Assignee

Intel Corporation

OA Round

1 (Non-Final)

Interview Optional

Interview lift (+13.1%) is below the 15.0% threshold, so a written response is recommended.

Based on 477 resolved cases, 2023–2026

Examiner Intelligence

SIDDO, IBRAHIM

Grants 84% — above average

Career Allowance Rate

400 granted / 477 resolved

+21.9% vs TC avg

Moderate +13% lift

+13.1%

Interview Lift

resolved cases with interview

Fast prosecutor

2y 1m

Avg Prosecution

23 currently pending

Career history

494

Total Applications

across all art units

Statute-Specific Performance

§101

0.9%

-39.1% vs TC avg

§103

86.7%

+46.7% vs TC avg

§102

7.2%

-32.8% vs TC avg

§112

1.4%

-38.6% vs TC avg

Based on career data from 477 resolved cases

Office Action

§102 §103

DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Allowable Subject Matter

Claims 5, 12 and 19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Claim Rejections - 35 USC § 102

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-4, 6-11, 13-18 and 20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Kwon (NPL: “Voice Frequency Synthesis using VAW GAN based Amplitude Scaling for Emotion Transformation”).

With respect to claim 8 (similarly claims 1 and 15), Kwon teaches an apparatus (e.g. the apparatus used to conduct the experiment, section 4.2 the first 4 lines): audio interface circuitry (e.g. the audio interface of Fig 2 where data sources are received, see also the speech data used in this study, section 4.1 the first 3 lines); and one or more processors (e.g. the processors of section 4.2 the first 4 lines) to execute instructions to: identify a first vocal effort of a first audio segment of first audio data and a second vocal effort of a second audio segment of the first audio data (e.g. the emotion classes that the speech represents are: angry and disgust expressions amongst other expressions, see section 4.1 the fourth line, which suggests identifying angry/a first vocal effort of a first expression/audio segment of the speech and disgust/second vocal effort of a second expression/audio segment of the speech); train a neural network including training data (e.g. training and voice transformation Fig 2, train VAW-GAN-based generation model, section 3.2 including training data used in this study, see section 3.1 and/or 4.1), the training data including the first vocal effort, the first audio segment, the second audio segment, and the second vocal effort (e.g. the training data including angry, the first expression, the second expression and disgust, as suggested in section 4.1 fourth line); and deploy the neural network (e.g. deploy the VAW-GAN-based generation model of section 3.2, as suggested in section 4.2, for evaluation), the neural network to distinguish between the first vocal effort and the second vocal effort (e.g. the VAW-GAN-based generation model to distinguish between angry and disgust, as suggested in section 3.1, Fig 3 which shows SP and F0, the features of the speech data with different emotion classes in the same sentence).

With respect to claim 9 (similarly claims 2 and 16), Kwon teaches the apparatus of claim 8, wherein the one or more processors executes the instructions to preprocess the first audio segment by extracting linear predictive coefficients from the first audio segment (e.g. the structure of voice frequency analysis of Fig 2 includes data pre-processing and feature extraction with the use of GAN, which is applied in section 4.2 where Mel-spectral coefficients (MCEP) seem to be extracted, see equation 2 of section 4.2, suggesting the processors executes the instructions to preprocess the first audio segment by extracting linear predictive coefficients from the first audio segment), the linear predictive coefficients including a time-frequency representation of the first audio segment (e.g. Fig 7 section 4.1 shows the linear predictive coefficients including a time-frequency representation of the first audio segment i.e. angry).

With respect to claim 10 (similarly claims 3 and 17), Kwon teaches the apparatus of claim 8, wherein the training data further includes a third audio segment including a regular vocal effort, a fourth audio segment including a loud vocal effort, and a fifth audio segment including a yelled vocal effort (e.g. the emotion classes that the speech represents are happy, calm, sad, fearful, angry, surprise, and disgust expressions, section 4.1 fourth line, which expressions include a third audio segment including a regular vocal effort, a fourth audio segment including a loud vocal effort, and a fifth audio segment including a yelled vocal effort in addition to angry and disgust selected in the rejection of claim 8 as the first and second vocal efforts).

With respect to claim 11 (similarly claims 4 and 18), Kwon teaches the apparatus of claim 8, wherein the one or more processors executes the instructions to: analyze, via the neural network (e.g. analyze via VAW-GAN generation model, as suggested in section 4), a third audio segment (e.g. a third expression, as suggested in section 4.1 fourth line) to determine a third vocal effort of the third audio segment (e.g. to determine a third vocal effort of the third expression whereby the third vocal effort is among the unselected expressions of section 4.1 fourth line, like surprise in Fig 6); and output, via the neural network, metadata including an indication corresponding to the third vocal effort (e.g. output via VAW-GAN generation model, metadata including an indication corresponding to the third vocal effort, as suggested in Fig 3 section 3.1 for neutral, angry and disgust), the indication having a first value when the third vocal effort is a whispered vocal effort, the indication having a second value when the third vocal effort is a soft vocal effort, the indication having a third value when the third vocal effort is neither the soft vocal effort or the whispered vocal effort (e.g. Fig 3 section 3.1 and Fig 6 disclosing the indication having values when the vocal effort is angry, disgust, surprise and neutral suggest the indication having a first value when the third vocal effort is a whispered vocal effort, the indication having a second value when the third vocal effort is a soft vocal effort, the indication having a third value when the third vocal effort is neither the soft vocal effort or the whispered vocal effort when the remaining expressions of section 4.1 fourth line are processed similar to those processed in Figs 3 and 6).

With respect to claim 14 (similarly claims 7 and 20), Kwon teaches the apparatus of claim 8, wherein the one or more processors executes the instructions to identify the first vocal effort by identifying a presence of harmonics indicative of a whispered vocal effort in the first audio segment (e.g. Figs 3 and 6 identify the first vocal effort by identifying a presence of harmonics indicative of a whispered vocal effort in the first audio segment) and the identification of the second vocal effort including identifying an absence of harmonics indicative of a soft vocal effort in the second audio segment (e.g. Figs 3 and 6 suggest the identification of the second vocal effort including identifying an absence of harmonics indicative of a soft vocal effort in the second audio segment).

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 6 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Kwon (NPL: “Voice Frequency Synthesis using VAW GAN based Amplitude Scaling for Emotion Transformation”) in view of Guez (US 5,293,456).

With respect to claim 13 (similarly claim 6), Kwon teaches the apparatus of claim 8 including the neural network. However, Kwon fails to teach wherein the neural network is a feed-forward fully layered neural network. Guez teaches a neural network that is a feed-forward fully layered neural network (e.g. a conventional, three-layer, fully-connected, feedforward neural network trained using back-propagation techniques is shown generally at 10 in FIG. 1, see col 1 ln 21-23). Kwon and Guez are analogous art because they all pertain to voice/object recognition systems. Therefore, it would have been obvious to people having ordinary skill in the art before the effective filing date of the claimed invention to modify Kwon with the three-layer, fully-connected, feedforward neural network of Guez, as disclosed in Fig 1 col 1 ln 21-23. The benefit would be to reduce the training time in neural networks having large input arrays, Guez col 2 ln 8-10.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to IBRAHIM SIDDO whose telephone number is (571)272-4508. The examiner can normally be reached 9:00-5:30PM.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Akwasi Sarpong can be reached at 5712703438. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/IBRAHIM SIDDO/
Primary Examiner, Art Unit 2681

Application/Control Number: 18/176,252, Art Unit: 2681

Prosecution Timeline

Feb 28, 2023

Application Filed

Jun 15, 2023

Response after Non-Final Action

Mar 27, 2026

Non-Final Rejection mailed — §102, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

18/368,333

Patent 12640151

VOICE CONTROL WITH CONTEXTUAL KEYWORDS

2y 8m to grant Granted May 26, 2026

18/417,104

Patent 12634401

INSPECTION SYSTEM AND METHOD OF CONTROLLING THE SAME, AND STORAGE MEDIUM

2y 4m to grant Granted May 19, 2026

18/345,339

Patent 12622505

SYSTEMS, DEVICES, AND METHODS FOR SEGMENT-BASED GUIDANCE OF PRODUCT APPLICATION

2y 10m to grant Granted May 12, 2026

18/538,632

Patent 12614550

ELECTRONIC DEVICE, METHOD, AND NON-TRANSITORY COMPUTER READABLE STORAGE MEDIUM CONTROLLING EXECUTABLE OBJECT BASED ON VOICE SIGNAL

2y 4m to grant Granted Apr 28, 2026

18/423,276

Patent 12608166

Automated Data Handling

2y 2m to grant Granted Apr 21, 2026

Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

1-2

Expected OA Rounds

84%

Grant Probability

97%

With Interview (+13.1%)

2y 1m (~0m remaining)

Median Time to Grant

Low

PTA Risk

Based on 477 resolved cases by this examiner. Grant probability derived from career allowance rate.
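
The projection figures above can be reproduced with a little arithmetic. The following is a minimal sketch, not the page's actual method: it assumes the grant probability is simply granted cases divided by resolved cases, and that the interview lift is added in percentage points and capped at 100. The function names are hypothetical, and the "interview" branch of the recommendation is an assumption, since the page only states what happens when the lift is below the 15.0% threshold.

```python
def allowance_rate(granted: int, resolved: int) -> float:
    """Career allowance rate as a percentage of resolved cases."""
    if resolved <= 0:
        raise ValueError("resolved must be positive")
    return 100.0 * granted / resolved


def with_interview(base_pct: float, lift_pct: float) -> float:
    """Add the interview lift in percentage points, capped at 100 (assumed rule)."""
    return min(100.0, base_pct + lift_pct)


def suggested_response(lift_pct: float, threshold_pct: float = 15.0) -> str:
    """Pick a response type from the interview lift.

    Only the below-threshold outcome is stated on the page; the
    other branch is an assumption made for illustration.
    """
    return "written response" if lift_pct < threshold_pct else "interview"


# Figures taken from the page: 400 granted / 477 resolved, +13.1% lift.
base = allowance_rate(400, 477)
projected = with_interview(base, 13.1)
print(f"{base:.1f}% -> {projected:.1f}%")  # prints "83.9% -> 97.0%"
print(suggested_response(13.1))            # prints "written response"
```

Rounded to whole percentages, these values match the 84% grant probability and 97% with-interview figure shown above.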