Last updated: May 29, 2026
Application No. 17/915,624
METHOD OF GENERATING AUDIO DATA WITH ELECTROACOUSTIC EFFECT, DEVICE, AND STORAGE MEDIUM

Non-Final OA §103§112
Filed
Mar 02, 2023
Priority
Aug 24, 2021 — CN 202110978065.3 +1 more
Examiner
SUBRAMANI, NANDINI
Art Unit
2656
Tech Center
2600 — Communications
Assignee
BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
OA Round
2 (Non-Final)
Office Action

§103 §112
DETAILED ACTION
Introduction
This office action is in response to Applicant’s submission filed on 11/10/2025. Claims 1-2, 4-9, 19-20 and 22-31 are pending in the application and have been examined.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statements filed 09/29/2022, 08/02/2023 and 01/10/2024 fail to comply with 37 CFR 1.98(a)(3)(i) because they do not include a concise explanation of the relevance, as it is presently understood by the individual designated in 37 CFR 1.56(c) most knowledgeable about the content of the information, of each reference listed that is not in the English language. They have been placed in the application file, but the information referred to therein has not been considered.
Response to Amendment
The response filed on 11/10/2025 has been correspondingly accepted and considered in this Office Action. Claims 1-2, 4-9, 19-20 and 22-31 have been examined. Claim 3 has been cancelled. Claim 31 has been added. 
Applicant’s amendments to claims 1, 19 and 20 and the Applicant Response filed 11/10/2025 on pages 15-17 overcome the 35 U.S.C. 101 rejections previously set forth in the Non-Final Office Action mailed 08/14/2025. The dependent claims 2, 4-9 and 22-30 overcome the 35 U.S.C. 101 rejections previously set forth in the Non-Final Office Action mailed 08/14/2025 based on their dependency on the amended claims 1, 19 and 20 respectively. Therefore, the above referenced rejections under 35 U.S.C. 101 are withdrawn.
Applicant’s modification to the Title of the Specifications (lines 1-4) filed 11/10/2025, overcomes the Specification objection to the title of the invention previously set forth in the Non-Final Office Action mailed 08/14/2025.
Applicant’s replacement Drawing of Fig. 4 filed 11/10/2025, overcomes the objection to the Drawing previously set forth in the Non-Final Office Action mailed 08/14/2025.

Response to Arguments
Applicant's arguments filed 11/10/2025 have been fully considered as follows:
Applicant’s further arguments with respect to Claims 8 and 22-25  state that
“Claims 8 and 22-25 were rejected under 35 U.S.C. §112 as allegedly being indefinite. Applicant traverses.  Without agreeing or acceding to the propriety or the merits of the rejection, and merely to expedite prosecution, Applicant has amended claims 8 and 22-25 to reflect the quantization described in the application. Thus, this rejection should be withdrawn”

The examiner respectfully disagrees. The formulae for the second fundamental frequency and the third fundamental frequency are similar but named differently; hence, the claims are indefinite because they fail to distinctly claim what the inventors regard as the invention. Further, the claim amendments refer to a rounded scale; however, this was not supported in the Specification as originally filed on 09/29/2022. Therefore, the rejections of claims 8 and 22-25 under 35 U.S.C. 112(b) are sustained and updated accordingly.
Applicant’s arguments with respect to claim 1 (also representative of claims 19 and 20)  state that
“The cited portions of Nguyen do not disclose or teach "correcting the original fundamental frequency to obtain a first fundamental frequency" as claimed.... In summary, the cited portions of Nguyen do not disclose the specific features related to electroacoustic processing as claimed. The claimed technical features reflect the innovation of the invention, which generates high-quality electroacoustic effects through fundamental frequency correction, electroacoustic parameter adjustment, and quantization processing. ”

Applicant’s arguments above with respect to claim 1 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
With respect to the rejections of the remaining dependent claims under 35 U.S.C. 103, to the extent those claims are discussed or argued on at least the same rationale presented in the Remarks filed 11/10/2025, Examiner respectfully notes as follows. For completeness, should those claims be traversed for reasons similar to those given for independent claims 1, 19 and 20, Examiner respectfully directs Applicant to the response regarding claims 1, 19 and 20 provided above. For at least the same reasons, Examiner likewise respectfully disagrees; Applicant's arguments have been fully considered but are not persuasive.

Specification
The amendment filed 11/10/2025 is objected to under 35 U.S.C. 132(a) because it introduces new matter into the disclosure. 35 U.S.C. 132(a) states that no amendment shall introduce new matter into the disclosure of the invention. The added material which is not supported by the original disclosure is as follows: the material added to paragraph [0066] of the Specification at lines 6-10 of the Specification amendment filed 11/10/2025, which is not supported in the original Specification filed on 9/29/2022.
Applicant is required to cancel the new matter in the reply to this Office Action.
Further, the Specification amendments filed 11/10/2025 at lines 13-20, directed to paragraphs [0080]-[0082], are redundant because their content is already indicated in the amended Drawing sheet filed 11/10/2025 and in the original Specification filed 09/29/2022. Examiner recommends not including these amendments in the Specification.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 8 and 22-25 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
It is unclear how the third fundamental frequency (F0”) is different from the second fundamental frequency (F0’) as defined in the following equations:
[Equation image: media_image1.png]
and
[Equation image: media_image2.png]

The formulae for the second fundamental frequency and the third fundamental frequency are similar but named differently; hence, the claims are rejected as indefinite because they fail to distinctly claim what the inventors regard as the invention. Further, the amendment refers to a rounded scale; however, this was not supported in the Specification as originally filed on 09/29/2022.
Claim Rejections - 35 USC § 103
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.

Claims 1, 2, 19, 20 and 30-31  are rejected under 35 U.S.C. 103 as being unpatentable over Cohen-Hadria, Alice, et al. "Voice anonymization in urban sound recordings." 2019 IEEE 29th international workshop on machine learning for signal processing (Mlsp). IEEE, 2019 (cited in IDS) in view of Nakano et. al. US Patent 8,244,546.

[Figure image: media_image3.png]
Regarding claim 1, Cohen-Hadria teaches a method of processing audio data, the method comprising: decomposing using a neural network, original audio data to obtain voice audio data and background audio data (see Cohen-Hadria, sect 2.1. U-Net source separation(neural network), the voice is extracted from mix of voice and background using a U-net;); performing electroacoustic processing on the voice audio data to obtain electroacoustic voice data(see Cohen-Hadria, sect 2.2 Voice blurring (electroacoustic processing of voice data) to remove the identifiable information); and combining the electroacoustic voice data and the background audio data to obtain target audio data with electroacoustic effect (see Cohen-Hadria, Fig. 1, Resynthesis, sect 1.3 The blurred voice is recombined with the background signal, resulting in the anonymized resynthesis (blurred voice : audio with electroacoustic effect)).
However, Cohen-Hadria fails to teach wherein the performing electroacoustic processing on the voice audio data to obtain electroacoustic voice data comprises: extracting an original fundamental frequency of the voice audio data; correcting the original fundamental frequency to obtain a first fundamental frequency; adjusting, according to a pre-determined electroacoustic parameter, the first fundamental frequency to obtain a second fundamental frequency; performing quantization processing on the second fundamental frequency to obtain a third fundamental frequency, wherein the quantization processing comprises determining a frequency range based on a piano key frequency; and determining the electroacoustic voice data according to the third fundamental frequency.
However, Nakano teaches extracting an original fundamental frequency of the voice audio data (see Nakano, col 13 lines 61-64 The input singing voice audio signal analysis section 5 in this embodiment has the following four functions: a first function of estimating a fundamental frequency Fo from the audio signal of input singing voice in a predetermined cycle); correcting the original fundamental frequency to obtain a first fundamental frequency(see Nakano, col 27 lines 17-24 Off-pitch correction" and "pitch transposing" functions that alter the pitch of the audio signal of input singing voice are implemented by using the off-pitch estimating section 17 and the pitch compensating section 19 as follows. (corrected to obtain the first frequency/pitch correction)); adjusting, according to a pre-determined electroacoustic parameter, the first fundamental frequency to obtain a second fundamental frequency (see Nakano, col 17 lines 7-50 discusses adjusting the pitch to the predetermined note ( electroacoustic parameter) to obtain the corrected pitch (second frequency)); performing quantization processing on the second fundamental frequency to obtain a third fundamental frequency, wherein the quantization processing comprises determining a frequency range based on a piano key frequency and calculating the third fundamental frequency based on the frequency range (see Nakano, col 18 line 61- col 19 line 43 describes the DYN which is the frequency range based on piano key frequency(see Nakano, col 16 lines 34-45) and calculating the singing synthesis parameter ( third fundamental frequency) ); and determining the electroacoustic voice data according to the third fundamental frequency(see Nakano, col 19 lines 35-40 the singing synthesis parameter data is estimated by the singing synthesis parameter data estimating section 13. The temporary audio signal of synthesized singing voice is thereby obtained by the singing synthesis section 101).
Cohen-Hadria and Nakano are considered to be analogous to the claimed invention because they relate to voice transformation. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of  Cohen-Hadria on using a neural network framework for source audio signal separation and reconstruction with the estimation of singing synthesis parameter data from an audio signal of a user's input singing voice teachings of Nakano to support music production which uses singing synthesis ( see Nakano, col 1 lines 6-12).
Regarding claim 2, Cohen-Hadria in view of Nakano teaches the method according to claim 1.  Cohen-Hadria further teaches wherein the decomposing original audio data to obtain voice audio data and background audio data comprises: determining original Mel-spectrogram data corresponding to the original audio data(see Cohen-Hadria, sect 2.1 the input X ∈ R T ×F + is a magnitude spectrogram of mix); determining, by using a neural network, background Mel-spectrogram data corresponding to the original Mel-spectrogram data and voice Mel-spectrogram data corresponding to the original Mel-spectrogram data(see Cohen-Hadria, sect 2.1, From X, the U-Net(deep neural network) learns a compressed, encoded representation, which is decoded to reconstruct a target magnitude spectrogram, which in our case, is derived from an isolated vocal signal provided as supervision. More precisely, the U-Net with parameters θ computes a continuous (soft) mask fθ : R T ×F + → [0, 1]T ×F , and the separated magnitude spectrogram is estimated by the element-wise product of X with the mask: X ⊗ fθ(X)    After estimating the vocal magnitude spectrogram, we combine it with the phase of the input spectrogram to reconstruct, with an ISTFT, an temporal signal of the separated voice); and generating the background audio data according to the background Mel- spectrogram data, and generating the voice audio data according to the voice Mel- spectrogram data (see Cohen-Hadria, sect 2, We extract the voices from the mix of voice and background using a deep neural network called U-Net [5, 6], described in Section 2.1. This step estimates two separated audio signals: the voice and the residual background from the spectrogram of the mix).
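For illustration only, the soft-mask separation quoted above from Cohen-Hadria sect. 2.1 (the element-wise product X ⊗ fθ(X)) can be sketched as follows. The mask values here are supplied by hand rather than computed by a U-Net, and taking the background as the complementary product X ⊗ (1 − fθ(X)) is an assumption of this sketch, not something the cited passage states. The function name `apply_soft_mask` is hypothetical.

```python
def apply_soft_mask(mix_mag, mask):
    """Split a magnitude spectrogram using a soft mask with values in [0, 1].

    mix_mag and mask are T x F nested lists (time frames by frequency bins).
    The voice estimate is the element-wise product of the mix and the mask.
    The background estimate uses the complementary mask (an assumption).
    """
    voice = [[x * m for x, m in zip(x_row, m_row)]
             for x_row, m_row in zip(mix_mag, mask)]
    background = [[x * (1.0 - m) for x, m in zip(x_row, m_row)]
                  for x_row, m_row in zip(mix_mag, mask)]
    return voice, background


# One time frame, two frequency bins, hand-picked mask values.
voice, background = apply_soft_mask([[2.0, 4.0]], [[0.5, 0.25]])
```

With a complementary mask the two estimates sum back to the mix magnitude in every bin; the phase handling and ISTFT step described in the reference are outside this sketch.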
Claim 19 is directed to an electronic device claim corresponding to the method claim presented in claim 1 and is rejected under the same grounds stated above regarding claim 1.
Claim 20 is directed to a non-transitory computer readable medium claim corresponding to the method claim presented in claim 1 and is rejected under the same grounds stated above regarding claim 1.
Claim 30 is directed to an electronic device claim corresponding to the method claim presented in claim 2 and is rejected under the same grounds stated above regarding claim 2.
Claim 31 is directed to a non-transitory computer readable medium claim corresponding to the method claim presented in claim 2 and is rejected under the same grounds stated above regarding claim 2.
Claims 4-7 are rejected under 35 U.S.C. 103 as being unpatentable over Cohen-Hadria, Alice, et al. "Voice anonymization in urban sound recordings." 2019 IEEE 29th international workshop on machine learning for signal processing (Mlsp). IEEE, 2019 (cited in IDS) in view of Nakano et. al. US Patent 8,244,546 further in view of Bachu, Rajesh G., et al. "Voiced/unvoiced decision for speech signals based on zero-crossing rate and energy." Advanced techniques in computing sciences and software engineering. Dordrecht: Springer Netherlands, 2009. 279-282.
Regarding claim 4, Cohen-Hadria in view of Nakano teaches the method according to claim 1. However, Cohen-Hadria in view of Nakano fails to teach dividing the voice audio data into a plurality of audio segments; determining, for each audio segment of the plurality of audio segments, an energy of the audio segment and a zero-crossing rate of the audio segment; determining, according to the energy of the audio segment and the zero-crossing rate of the audio segment, whether the audio segment is a voiced audio segment or not; and correcting a fundamental frequency of the voiced audio segment by using a linear interpolation algorithm.

[Figure image: media_image4.png]
However, Bachu teaches dividing the voice audio data into a plurality of audio segments(see Bachu, Fig. 2, the speech signal is segmented into a non-overlapping frame of samples(audio segments, At the first stage, speech signal is divided into intervals in frame by frame without overlapping. It is given with Fig.2.)); determining, for each audio segment of the plurality of audio segments, an energy of the audio segment and a zero-crossing rate of the audio segment (see Bachu, Fig. 1, short-time energy, ZCR, Bachu, sect II B, C ); determining, according to the energy of the audio segment and the zero-crossing rate of the audio segment, whether the audio segment is a voiced audio segment or not; and correcting a fundamental frequency of the voiced audio segment by using a linear interpolation algorithm(see Bachu, Fig. 1, decision based on if ZCR is small and E is high, see Bachu sect. III).
Cohen-Hadria, Nakano and Bachu are considered to be analogous to the claimed invention because they relate to speech signal processing. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Cohen-Hadria in view of Nakano on using a neural network framework for source audio signal separation and reconstruction with the classification of speech into voiced/unvoiced teachings of Bachu to improve the speed of the classification using simple and efficient classification methods (see Bachu, sect. I).
Regarding claim 5, Cohen-Hadria in view of Nakano further in view of Bachu teaches the method according to claim 4. Bachu further teaches wherein the audio segment comprises a plurality of sampling points(see Bachu, sec IIC, The choice of the window determines the nature of the short-time energy representation. In our model, we used Hamming window (sampling points)), and the determining an energy of the audio segment comprises determining the energy of the audio segment according to a value of each sampling point in the audio segment(see Bachu, sec II C, describes the calculation of the short term energy based on the hamming window). The same motivation to combine as claim 4 applies here.
Regarding claim 6, Cohen-Hadria in view of Nakano further in view of Bachu teaches the method according to claim 4. Bachu further teaches wherein the audio segment comprises a plurality of sampling points, and the determining a zero-crossing rate of the audio segment comprises: determining, for every set of two adjacent sampling points in the audio segment, whether a value of one of the two adjacent sampling points has a sign opposite to a sign of a value of the other one of the two adjacent sampling points(see Bachu, sec II B, describes the sign of the sampling point , equations (2) and Fig. 3 & 4); and determining a ratio of a number of sets of two adjacent sampling points having values of opposite signs to a total number of the sampling points in the audio segment as the zero-crossing rate(see Bachu, Sec II B, equation (1)). The same motivation to combine as claim 4 applies here.
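As a reading aid, the segment energy and zero-crossing rate recited in claims 5 and 6 can be sketched as below. The zero-crossing rate follows the claim wording (number of adjacent sample pairs with opposite signs divided by the total number of sampling points). The sum-of-squares energy, the non-overlapping framing helper, and both thresholds are assumptions of this sketch; Bachu's short-time energy uses a Hamming window, which is omitted here. All function names are hypothetical.

```python
def split_frames(samples, frame_len):
    """Divide samples into non-overlapping frames, dropping any short tail."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def short_time_energy(frame):
    """Energy computed from the value of each sampling point (sum of squares)."""
    return sum(s * s for s in frame)


def zero_crossing_rate(frame):
    """Adjacent pairs with opposite signs, divided by the number of samples.

    A pair containing an exact zero is not counted as a crossing.
    """
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0)
    return crossings / len(frame)


def is_voiced(frame, energy_threshold, zcr_threshold):
    """Voiced when energy is high and the zero-crossing rate is low.

    Both thresholds are hypothetical tuning values, not taken from the record.
    """
    return (short_time_energy(frame) > energy_threshold
            and zero_crossing_rate(frame) < zcr_threshold)
```

For example, the frame `[1, -1, 1, -1]` has three sign changes over four samples, so its zero-crossing rate under this definition is 0.75.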
Regarding claim 7, Cohen-Hadria in view of Nakano further in view of Bachu teaches the method according to claim 4. Nakano further teaches wherein the pre-determined electroacoustic parameter comprises an electroacoustic degree parameter and/or an electroacoustic tone parameter, and the adjusting, according to a pre-determined electroacoustic parameter (see Nakano, col 17 lines 7-50 discusses adjusting the pitch to the predetermined note (electroacoustic parameter) to obtain the corrected pitch), the first fundamental frequency to obtain a second fundamental frequency comprises: determining, according to the fundamental frequency of the voiced audio segment, a fundamental frequency variance and/or a fundamental frequency mean value (see Nakano, col 13 lines 61-64 The input singing voice audio signal analysis section 5 in this embodiment has the following four functions: a first function of estimating a fundamental frequency Fo from the audio signal of input singing voice in a predetermined cycle); determining a corrected fundamental frequency variance according to the electroacoustic degree parameter and the fundamental frequency variance, and/or determining a corrected fundamental frequency mean value according to the electroacoustic tone parameter and the fundamental frequency mean value (see Nakano, col 27 lines 17-24 "Off-pitch correction" and "pitch transposing" functions that alter the pitch of the audio signal of input singing voice are implemented by using the off-pitch estimating section 17 and the pitch compensating section 19 as follows (corrected to obtain the first frequency/pitch correction)); and adjusting, according to the corrected fundamental frequency variance and/or the corrected fundamental frequency mean value, the first fundamental frequency to obtain the second fundamental frequency (see Nakano, col 17 lines 7-50 discusses adjusting the pitch to the predetermined note (electroacoustic parameter) to obtain the corrected pitch (second frequency)). The same motivation to combine as claim 1 applies here.
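The mean and variance adjustment recited in claim 7 can be pictured with the hypothetical sketch below. Neither the claim language nor the cited Nakano passages specify how the degree and tone parameters map onto the corrected variance and mean, so the mapping used here (variance scaled by `degree`, mean offset by `tone_shift` in Hz) is purely an assumption for illustration, as are all names.

```python
import math


def adjust_f0(f0_voiced, degree=1.0, tone_shift=0.0):
    """Re-standardize a voiced F0 contour to a corrected mean and variance.

    Assumed mapping (not from the record):
      corrected variance = degree * variance, with degree >= 0
      corrected mean     = mean + tone_shift
    """
    n = len(f0_voiced)
    mean = sum(f0_voiced) / n
    var = sum((f - mean) ** 2 for f in f0_voiced) / n
    new_var = degree * var
    new_mean = mean + tone_shift
    if var == 0:
        # A flat contour has no spread to rescale; only the mean moves.
        return [new_mean for _ in f0_voiced]
    scale = math.sqrt(new_var / var)
    return [(f - mean) * scale + new_mean for f in f0_voiced]


# Quarter the variance and raise the mean by 10 Hz.
second_f0 = adjust_f0([100.0, 200.0], degree=0.25, tone_shift=10.0)
```

Under these assumptions a smaller `degree` flattens the contour toward its mean, which is one plausible way to picture a stronger pitch-flattening effect.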
Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Cohen-Hadria, Alice, et al. "Voice anonymization in urban sound recordings." 2019 IEEE 29th international workshop on machine learning for signal processing (Mlsp). IEEE, 2019 (cited in IDS) in view of Nakano et. al. US Patent 8,244,546 further in view of John S. Allen, “Equal temperaments as mathematical series” available at https://www.bikexprt.com/tunings/tunings0.htm Contents © 1997 John S. Allen, Last revised 6 May 2003.
Regarding claim 8, Cohen-Hadria in view of Nakano teaches the method according to claim 1. However, Cohen-Hadria in view of Nakano fail to teach wherein the performing quantization processing on the second fundamental frequency to obtain a third fundamental frequency comprises: determining a frequency range according to: 
[Equation image: media_image1.png]
wherein scale is the frequency range, and F0’ is the second fundamental frequency; rounding the frequency range: scale = round (scale) and determining, based on the rounded frequency range, the third fundamental frequency according to: 
[Equation image: media_image2.png]
wherein F0” is the third fundamental frequency and scale is the rounded frequency range.
However, John S. Allen teaches wherein the performing quantization processing on the second fundamental frequency to obtain a third fundamental frequency comprises: determining a frequency range according to:
[Equation image: media_image1.png]
wherein scale is the frequency range, and F0’ is the second fundamental frequency (see John S. Allen, pg. 3, control of pitch calculation based on voltage control (scale), where Fm is the fundamental frequency of the musical tone and Fr is 27.5 Hz, the lowest note on a piano (A0)); rounding the frequency range: scale = round (scale) (see John S. Allen, pg. 6, discusses non-integer equal interval scales (rounding the frequency range)); and determining, based on the frequency range, the third fundamental frequency according to:
[Equation image: media_image2.png]
wherein F0” is the third fundamental frequency and scale is the rounded frequency range (see John S. Allen, pg. 2, discusses the calculation of Fm (F0”) based on Fr (A0, 27.5 Hz)).
Cohen-Hadria and Nakano teach the source audio separation and processing of the voice audio data; however, they do not teach quantization processing on the second fundamental frequency to obtain a third fundamental frequency based on the particular equations. John S. Allen teaches the method of computing musical tone equal temperament and the control of the pitch for synthesizers per octave. Using the known technique of computing the musical tone equal temperament and pitch control per octave of John S. Allen (see John S. Allen, sect. Standard 12-tone equal temperament) to provide the quantization processing of the second fundamental frequency to obtain a third fundamental frequency in the references Cohen-Hadria in view of Nakano, and to process voice transformation, would have been obvious to one of ordinary skill in the art.
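The claimed equations appear in this record only as images and are not reproduced here. As a generic illustration of the kind of computation discussed for claim 8, the sketch below snaps a frequency to the nearest note of standard 12-tone equal temperament, using the 27.5 Hz reference (A0, the lowest piano note) mentioned in the discussion of the Allen reference. It is an assumption that this resembles the claimed formulae; the constants and function names are hypothetical.

```python
import math

A0_HZ = 27.5  # lowest piano note, used as the reference frequency


def nearest_semitone_index(freq_hz, ref_hz=A0_HZ):
    """Number of equal-tempered semitones above the reference, rounded."""
    return round(12 * math.log2(freq_hz / ref_hz))


def quantize_to_equal_temperament(freq_hz, ref_hz=A0_HZ):
    """Snap a frequency to the nearest 12-tone equal temperament pitch."""
    n = nearest_semitone_index(freq_hz, ref_hz)
    return ref_hz * 2 ** (n / 12)


# 450 Hz lies closest to A4 (48 semitones above A0), i.e. 440 Hz.
quantized = quantize_to_equal_temperament(450.0)
```

The rounding step is what produces the stepped pitch contour: every input between two semitone midpoints maps to the same output frequency.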
Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Cohen-Hadria, Alice, et al. "Voice anonymization in urban sound recordings." 2019 IEEE 29th international workshop on machine learning for signal processing (Mlsp). IEEE, 2019 (cited in IDS) in view of Nakano et. al. US Patent 8,244,546 further in view of Nguyen, Binh Phu. "Studies on spectral modification in voice transformation." (2009).
Regarding claim 9, Cohen-Hadria in view of Nakano teaches the method according to claim 1. However, Cohen-Hadria in view of Nakano fails to teach further comprising determining a spectral envelope and an aperiodic parameter according to the voice audio data and the first fundamental frequency, wherein the determining the electroacoustic voice data according to the third fundamental frequency comprises determining the electroacoustic voice data according to the third fundamental frequency, the spectral envelope and the aperiodic parameter. However, Nguyen teaches determining a spectral envelope and an aperiodic parameter according to the voice audio data and the first fundamental frequency (see Nguyen, Fig. 3.11, spectral envelope, Gain, AP based on Straight analysis), wherein the determining the electroacoustic voice data according to the third fundamental frequency comprises: determining the electroacoustic voice data according to the third fundamental frequency, the spectral envelope and the aperiodic parameter (see Nguyen, pg. 63, Fig. 3.11, processing using Straight synthesis of the reconstructed spectral envelope, AP and F0 (third fundamental frequency)).
Cohen-Hadria, Nakano and Nguyen are considered to be analogous to the claimed invention because they relate to voice transformation. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of  Cohen-Hadria in view of Nakano on using a neural network framework for source audio signal separation and reconstruction and the estimation of singing synthesis parameter data from an audio signal of a user's input singing voice with the spectral modeling and modifications for the speech modification teachings of Nguyen to improve the effectiveness of the spectral modifications of the speech signals( see Nguyen, sect. 3.3.4).
Claims 22-25 are rejected under 35 U.S.C. 103 as being unpatentable over Cohen-Hadria, Alice, et al. "Voice anonymization in urban sound recordings." 2019 IEEE 29th international workshop on machine learning for signal processing (Mlsp). IEEE, 2019 (cited in IDS) in view of Nakano et. al. US Patent 8,244,546 further in view of Bachu, Rajesh G., et al. "Voiced/unvoiced decision for speech signals based on zero-crossing rate and energy." Advanced techniques in computing sciences and software engineering. Dordrecht: Springer Netherlands, 2009. 279-282 further in view of John S. Allen, “Equal temperaments as mathematical series” available at https://www.bikexprt.com/tunings/tunings0.htm Contents © 1997 John S. Allen, Last revised 6 May 2003.
Claims 22-25 are directed to method claims corresponding to the method claim presented in claim 8 but dependent on different dependent claims, and are rejected under the same grounds stated above regarding claim 8.
Cohen-Hadria, Nakano and Bachu teach the source audio separation and processing of the voice audio data; however, they do not teach quantization processing on the second fundamental frequency to obtain a third fundamental frequency based on the particular equations. John S. Allen teaches the method of computing musical tone equal temperament and the control of the pitch for synthesizers per octave. Using the known technique of computing the musical tone equal temperament and pitch control per octave of John S. Allen (see John S. Allen, sect. Standard 12-tone equal temperament) to provide the quantization processing of the second fundamental frequency to obtain a third fundamental frequency in the references Cohen-Hadria in view of Nakano further in view of Bachu, and to process voice transformation, would have been obvious to one of ordinary skill in the art.
Claims 26-29 are rejected under 35 U.S.C. 103 as being unpatentable over Cohen-Hadria, Alice, et al. "Voice anonymization in urban sound recordings." 2019 IEEE 29th international workshop on machine learning for signal processing (Mlsp). IEEE, 2019 (cited in IDS) in view of Nakano et. al. US Patent 8,244,546 further in view of Bachu, Rajesh G., et al. "Voiced/unvoiced decision for speech signals based on zero-crossing rate and energy." Advanced techniques in computing sciences and software engineering. Dordrecht: Springer Netherlands, 2009. 279-282 further in view of Nguyen, Binh Phu. "Studies on spectral modification in voice transformation." (2009).
Claims 26-29 are directed to method claims corresponding to the method claim presented in claim 9 but dependent on different dependent claims, and are rejected under the same grounds stated above regarding claim 9.
Cohen-Hadria, Nakano, Bachu and Nguyen are considered to be analogous to the claimed invention because they relate to voice transformation. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of  Cohen-Hadria in view of Nakano further in view of Bachu on using a neural network framework for source audio signal separation and reconstruction and the estimation of singing synthesis parameter data from an audio signal of a user's input singing voice with the spectral modeling and modifications for the speech modification teachings of Nguyen to improve the effectiveness of the spectral modifications of the speech signals( see Nguyen, sect. 3.3.4).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Cheng, Corey. "Design of a pitch quantization and pitch correction system for real-time music effects signal processing." Proceedings of The 2012 Asia Pacific Signal and Information Processing Association Annual Summit and Conference. IEEE, 2012 (cited in IDS) teaches simple, low-complexity methods that can be used to create a practical pitch quantization system which can produce pitch corrected, pitch shifted, and/or reharmonized audio with quality approaching currently available commercial methods.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NANDINI SUBRAMANI, whose telephone number is (571) 272-3916. The examiner can normally be reached Monday - Friday, 12:00 pm - 5:00 pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh M Mehta, can be reached at (571) 272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/NANDINI SUBRAMANI/
Examiner, Art Unit 2656

/BHAVESH M MEHTA/
Supervisory Patent Examiner, Art Unit 2656
Prosecution Timeline

Mar 02, 2023: Application Filed
Aug 14, 2025: Non-Final Rejection mailed (§103, §112)
Nov 10, 2025: Response Filed
Jan 15, 2026: Final Rejection mailed (§103, §112)
Mar 06, 2026: Response after Non-Final Action
Precedent Cases

Applications granted by this same examiner with similar technology

17/666,286 (Patent 12633290): Domain and User Intent Specific Disambiguation of Transcribed Speech. 4y 3m to grant; granted May 19, 2026.
17/573,651 (Patent 12562177): CONFERENCE ROOM SYSTEM AND AUDIO PROCESSING METHOD. 4y 1m to grant; granted Feb 24, 2026.
17/649,183 (Patent 12561629): IDENTIFYING REGULATORY DATA CORRESPONDING TO EXECUTABLE RULES. 4y 0m to grant; granted Feb 24, 2026.
17/708,679 (Patent 12505302): SYSTEMS AND METHODS RELATING TO MINING TOPICS IN CONVERSATIONS. 3y 8m to grant; granted Dec 23, 2025.
17/364,074 (Patent 12468884): Machine Learning-Based Argument Mining and Classification. 4y 4m to grant; granted Nov 11, 2025.
Study what changed to get past this examiner. Based on 5 most recent grants.
Prosecution Projections

Expected OA Rounds: 2-3
Grant Probability: 64%
With Interview (+49.2%): 99%
Median Time to Grant: 3y 0m (~0m remaining)
PTA Risk: Moderate
Based on 93 resolved cases by this examiner. Grant probability derived from career allowance rate.