Last updated: May 29, 2026

Application No. 18/747,007

METHOD OF ENCODING/DECODING SPEECH SIGNAL AND DEVICE FOR PERFORMING THE SAME

Non-Final OA §102

Filed

Jun 18, 2024

Priority

Jun 27, 2023 — RE 10-2023-0082941

Examiner

GODBOLD, DOUGLAS

Art Unit

2655

Tech Center

2600 — Communications

Assignee

UIF (University Industry Foundation), Yonsei University

OA Round

1 (Non-Final)

Interview Optional

Interview lift is +10.6%, below the 15.0% threshold. A written response is recommended.

Based on 1089 resolved cases, 2023–2026

Examiner Intelligence

GODBOLD, DOUGLAS

Grants 83% — above average

Career Allowance Rate

906 granted / 1089 resolved

+21.2% vs TC avg

+10.6% (moderate)

Interview Lift

resolved cases with interview vs. without

Typical timeline

2y 9m

Avg Prosecution

18 currently pending

Career history

1106

Total Applications

across all art units

Statute-Specific Performance

§101

6.5%

-33.5% vs TC avg

§103

76.6%

+36.6% vs TC avg

§102

7.0%

-33.0% vs TC avg

§112

4.7%

-35.3% vs TC avg

Based on career data from 1089 resolved cases

Office Action

§102

DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This Office Action is in response to correspondence filed 18 June 2024 in reference to application 18/747,007. Claims 1-14 are pending and have been examined.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claim(s) 1-14 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Skordilis et al. (US PAP 2021/0074308).

Consider claim 1, Skordilis teaches A method of encoding a speech signal (abstract), the method comprising: 
outputting, based on a first input speech signal of a previous timepoint and a second input speech signal of a current timepoint, a predicted signal that predicts the second input speech signal from the first input speech signal (0055, LTP engine generates a prediction for a current portion of the signal based on a previous portion of the signal); and 
obtaining, based on the second input speech signal and the predicted signal, a residual signal by removing a correlation between the first input speech signal and the second input speech signal from the second input speech signal (0057, residual signal encodes what is remaining in the signal after predicted components are removed, i.e. the correlation between the first and second signal).
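The mapping for claim 1 can be illustrated with a minimal sketch of single-tap long-term prediction (LTP): the current segment is predicted from the segment one lag earlier, and the residual is what remains after subtracting that prediction. The equation in paragraph 0055 of Skordilis is not reproduced in this action, so the single-tap form, the function names, and the toy signal below are all assumptions made for illustration. This is not the method of the reference or of the application.

```python
def ltp_predict(signal, start, length, lag, gain):
    """Predict signal[start:start+length] from the samples `lag` positions earlier.

    Hypothetical helper: `lag` plays the role of the pitch lag T and `gain`
    the role of the weight g that the Office Action refers to.
    """
    if lag <= 0 or start - lag < 0:
        raise ValueError("lag must be positive and reach into available history")
    return [gain * signal[n - lag] for n in range(start, start + length)]


def ltp_residual(signal, start, length, lag, gain):
    """Residual = current segment minus its long-term prediction."""
    predicted = ltp_predict(signal, start, length, lag, gain)
    current = signal[start:start + length]
    return [c - p for c, p in zip(current, predicted)]


# Toy example: a perfectly periodic signal with a period of 4 samples.
period = [0.0, 1.0, 0.0, -1.0]
signal = period * 4
residual = ltp_residual(signal, start=8, length=4, lag=4, gain=1.0)
print(residual)  # all zeros: the previous period predicts the current one exactly
```

With a perfectly periodic input the residual vanishes. Real speech is only quasi-periodic, so the residual would carry the part of the current segment that the previous segment does not explain.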

Consider claim 2, Skordilis teaches The method of claim 1, wherein the first input speech signal has a same signal length as the second input speech signal, and a greatest correlation with the second input speech signal (0055, length is based on pitch period, which is the same as the prediction length in the equation in 0055, where signal repeats itself, i.e. correlation, also see 0093-97, where correlation is calculated explicitly).

Consider claim 3, Skordilis teaches The method of claim 1, wherein the outputting of the predicted signal comprises: 
extracting feature information for predicting the second input speech signal, based on the first input speech signal and the second input speech signal (0053-55, 0063-0064, various features which may represent the signal); 
predicting a kernel based on the feature information (equation in 0055, term “g”); and 
generating the predicted signal based on the kernel and the first input speech signal, wherein the kernel is a weight applied to the first input speech signal when predicting the second input speech signal (equation in 0055, term g is a gain or weight which is applied to the components from the previous portion to predict the current portion).

Consider claim 4, Skordilis teaches the method of claim 3, further comprising outputting a bitstream, wherein the bitstream comprises: 
a first bitstream encoding the feature information (0054, sending LP coefficients to decoder, 0068-69, feature coding); 
a second bitstream encoding a delay value (0069, parameters encoded including pitch lag); and 
a third bitstream encoding the residual signal (0057, sending residual codebook index to decoder, also see 0068-69), 
wherein the delay value indicates a degree to which the first input speech signal is delayed from the second input speech signal (0093-97, where delay is calculated based on correlation, denoted in 0055 as T, which also corresponds to pitch period, or pitch lag).
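The action reads the claimed delay value onto a pitch lag chosen by correlation. A minimal sketch of one common way to do that follows: search a range of candidate lags, score each past segment by its normalized cross-correlation with the current segment, and keep the best. The scoring rule, the least-squares gain, and every name below are assumptions for illustration. Paragraphs 0093-97 of Skordilis are not quoted in this action and may compute the lag differently.

```python
import math


def best_lag(signal, start, length, min_lag, max_lag):
    """Return the lag whose past segment correlates best with the current one."""
    current = signal[start:start + length]
    best, best_score = None, -math.inf
    for lag in range(min_lag, max_lag + 1):
        if start - lag < 0:
            break  # not enough history for this lag or any longer one
        past = signal[start - lag:start - lag + length]
        energy = sum(p * p for p in past)
        if energy == 0.0:
            continue  # a silent segment cannot predict anything
        cross = sum(c * p for c, p in zip(current, past))
        score = cross / math.sqrt(energy)
        if score > best_score:
            best, best_score = lag, score
    return best


def optimal_gain(signal, start, length, lag):
    """Least-squares gain for predicting the current segment from the lagged one."""
    current = signal[start:start + length]
    past = signal[start - lag:start - lag + length]
    energy = sum(p * p for p in past)
    cross = sum(c * p for c, p in zip(current, past))
    return cross / energy if energy else 0.0


# Toy signal: two full-amplitude periods followed by two half-amplitude periods.
period = [0.0, 1.0, 0.0, -1.0]
signal = period * 2 + [0.5 * x for x in period] * 2
lag = best_lag(signal, start=8, length=4, min_lag=2, max_lag=8)
gain = optimal_gain(signal, start=8, length=4, lag=lag)
print(lag, gain)
```

On ties the search keeps the shortest lag, which is why the toy signal resolves to one period rather than two.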

Consider claim 5, Skordilis teaches The method of claim 4, wherein the outputting of the bitstream comprises: 	
quantizing the feature information and the residual signal (0069-70 quantizing feature vectors); 
outputting the first bitstream by encoding quantized feature information (0069-70 quantizing feature vectors and encoding); and 
generating the third bitstream by encoding a quantized residual signal (0057-58, 0069, codebooks are used to encode the residual signal).

Consider claim 6, Skordilis teaches a method of decoding a speech signal (abstract), the method comprising: 
receiving bitstreams from an encoder (0058, decoding, 0072, features sent to decoder); 
outputting, based on a first bitstream and a second bitstream, a predicted signal that predicts a second input speech signal of a current timepoint from a first input speech signal of a previous timepoint (0058, 0101, LTP decoding predicts segments based on LTP features); and 
outputting a restored speech signal obtained by restoring the second input speech signal, based on the predicted signal and a third bitstream (0058, 0106-0108, generating a predicted speech signal based on LTP decoding and decoding of other parameters such as LP encodings), 
wherein the first bitstream encodes feature information for predicting the second input speech signal (0057 and 0069, parameters encoded including pitch lag and index needed for LTP decoding), 
wherein the second bitstream encodes a delay value indicating a degree to which the first input speech signal is delayed from the second input speech signal (0093-97, where delay is calculated based on correlation, denoted in 0055 as T, which also corresponds to pitch period, or pitch lag), and 
wherein the third bitstream encodes a residual signal obtained by removing a correlation between the first input speech signal and the second input speech signal from the second input speech signal (0057, sending residual codebook index to decoder, also see 0068-69).
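On the decoder side, the mapping for claim 6 amounts to rebuilding the current segment from three decoded quantities: a delay, a weight, and a residual. A minimal sketch under the same single-tap assumption follows. The function name and toy values are hypothetical, and quantization and bitstream parsing are omitted.

```python
def ltp_decode(history, residual, lag, gain):
    """Restore the current segment as gain * lagged sample + residual.

    `history` holds previously restored samples. Samples are rebuilt one at a
    time, so a lag shorter than the segment can reuse samples just restored.
    """
    if lag <= 0 or lag > len(history):
        raise ValueError("lag must be positive and within the available history")
    out = list(history)
    for r in residual:
        out.append(gain * out[len(out) - lag] + r)
    return out[len(history):]


# Round trip on toy data: the encoder side forms residual = current - prediction.
history = [0.0, 1.0, 0.0, -1.0]
current = [0.0, 0.5, 0.0, -0.5]
lag, gain = 4, 1.0
residual = [c - gain * h for c, h in zip(current, history)]
restored = ltp_decode(history, residual, lag, gain)
print(restored)
```

Because the decoder applies the same prediction the encoder subtracted, an unquantized residual restores the segment exactly. With a quantized residual the restored signal would differ by the quantization error.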

Consider claim 7, Skordilis teaches the method of claim 6, wherein the first input speech signal has a same signal length as the second input speech signal, and a greatest correlation with the second input speech signal (0055, length is based on pitch period, which is the same as the prediction length in the equation in 0055, where signal repeats itself, i.e. correlation, also see 0093-97, where correlation is calculated explicitly).

Consider claim 8, Skordilis teaches The method of claim 6, wherein the outputting of the predicted signal comprises: 
obtaining the first input speech signal based on the second bitstream (0058, 0101, LTP decoding predicts frames based on LTP features including pitch lag); and 
generating the predicted signal based on the first bitstream and the first input speech signal (0058, 0101, LTP decoding predicts segments based on LTP features and previous segments).

Consider claim 9, Skordilis teaches The method of claim 1, wherein the outputting of the predicted signal comprises: 
predicting a kernel based on the first bitstream (equation in 0055, term “g”); and 
generating the predicted signal based on the kernel and the first input speech signal, wherein the kernel is a weight applied to the first input speech signal when predicting the second input speech signal (equation in 0055, term g is a gain or weight which is applied to the components from the previous portion to predict the current portion, also used in decoding, 0058).

Consider claim 10, Skordilis teaches A device for encoding a speech signal, the device comprising: 
a memory configured to store one or more instructions (0168, memories); and 
a processor configured to execute the one or more instructions (0169, processors), wherein, when the one or more instructions are executed, the processor is configured to perform a plurality of operations, wherein the plurality of operations comprises: 
outputting, based on a first input speech signal of a previous timepoint and a second input speech signal of a current timepoint, a predicted signal that predicts the second input speech signal from the first input speech signal (0055, LTP engine generates a prediction for a current portion of the signal based on a previous portion of the signal); and 
obtaining, based on the second input speech signal and the predicted signal, a residual signal by removing a correlation between the first input speech signal and the second input speech signal from the second input speech signal (0057, residual signal encodes what is remaining in the signal after predicted components are removed, i.e. the correlation between the first and second signal).

Claim 11 contains limitations similar to claim 2 and is therefore rejected for the same reasons.

Claim 12 contains limitations similar to claim 3 and is therefore rejected for the same reasons.

Claim 13 contains limitations similar to claim 4 and is therefore rejected for the same reasons.

Claim 14 contains limitations similar to claim 5 and is therefore rejected for the same reasons.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Onjanpera (US Patent 7,933,767) teaches a similar method of encoding audio signals.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DOUGLAS C GODBOLD whose telephone number is (571)270-1451. The examiner can normally be reached 6:30am-5pm Monday-Thursday.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Flanders, can be reached at (571)272-7516. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

DOUGLAS GODBOLD
Examiner
Art Unit 2655



/DOUGLAS GODBOLD/           Primary Examiner, Art Unit 2655


Prosecution Timeline

Jun 18, 2024

Application Filed

Mar 02, 2026

Non-Final Rejection mailed — §102 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

18/384,009

Patent 12640138

REAL-TIME VOICE RECOGNITION METHOD, MODEL TRAINING METHOD, APPARATUSES, DEVICE, AND STORAGE MEDIUM

2y 7m to grant Granted May 26, 2026

18/449,237

Patent 12626690

SYSTEMS, METHODS, AND DEVICES FOR LOW-POWER AUDIO SIGNAL DETECTION

2y 9m to grant Granted May 12, 2026

18/365,765

Patent 12614553

METHOD, APPARATUS, ELECTRONIC DEVICE, AND MEDIUM FOR SPEECH PROCESSING

2y 8m to grant Granted Apr 28, 2026

18/429,150

Patent 12614037

LARGE LANGUAGE MODEL INTERFACE FOR COMPLEX DATABASES

2y 2m to grant Granted Apr 28, 2026

18/739,304

Patent 12613919

Error Correcting of Programming Code Generated Through Integration with Generative Artificial Intelligence

1y 10m to grant Granted Apr 28, 2026

Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

1-2

Expected OA Rounds

83%

Grant Probability

94%

With Interview (+10.6%)

2y 9m (~10m remaining)

Median Time to Grant

Low

PTA Risk

Based on 1089 resolved cases by this examiner. Grant probability derived from career allowance rate.
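The projections above can be cross-checked against the figures reported elsewhere on this page. The sketch below assumes the interview lift is added to the career allowance rate as percentage points, and that the remaining time is the examiner's 2y 9m figure minus the whole months elapsed between the filing date and the "Last updated" date. Neither rule is stated by the source, so both are assumptions.

```python
# Figures copied from this page.
granted, resolved = 906, 1089
interview_lift = 0.106           # +10.6%
interview_threshold = 0.150      # 15.0% threshold
median_months_to_grant = 2 * 12 + 9   # 2y 9m

# Career allowance rate, used on the page as the grant probability.
allowance_rate = granted / resolved

# Assumption: the lift is additive in percentage points.
with_interview = allowance_rate + interview_lift

# Assumption: elapsed time counted in calendar months, Jun 2024 to May 2026.
elapsed_months = (2026 - 2024) * 12 + (5 - 6)
remaining_months = median_months_to_grant - elapsed_months

# The page recommends a written response when the lift is under the threshold.
written_response_recommended = interview_lift < interview_threshold

print(f"allowance rate: {allowance_rate:.1%}")
print(f"with interview: {with_interview:.1%}")
print(f"months remaining: {remaining_months}")
print(f"written response recommended: {written_response_recommended}")
```

Under these assumptions the results round to the 83%, 94%, and ~10m shown above.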