DETAILED ACTION
Introduction
1. This office action is in response to Applicant's submission filed on 06/04/2024. The present application, filed on or after March 16, 2013, is being examined under the first-inventor-to-file provisions of the AIA. Claims 1-20 are currently pending and are examined below.
Drawings
2. The drawings filed on 06/04/2024 have been accepted and considered by the Examiner.
Information Disclosure Statement
3. The Information Disclosure Statement (IDS) filed on 06/04/2024 has been considered and is in compliance with the provisions of 37 CFR 1.97.
Priority
4. The Applicant's claim of priority to Korean Patent Application No. 10-2023-0077881, filed on June 19, 2023, has been accepted and considered in this office action.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(2) The claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
5. Claims 1-20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Disch (U.S. Patent Application Publication No. 2017/0256267 A1).
With regards to claim 1, Disch teaches a method of encoding an audio signal, the method comprising generating, based on the audio signal, a linear prediction coding (LPC) bitstream and a frequency-domain signal of the audio signal (Paragraphs 79-84 and Figures 6-7, teach an audio encoder for encoding an audio signal comprising a first encoding processor for encoding a first audio signal portion in a frequency domain, and a prediction analyzer that can be implemented as an LPC analyzer for determining LPC coefficients);
generating, based on the LPC bitstream and the frequency-domain signal, a first
residual signal comprising information on a frequency envelope of the frequency-domain signal (Para 94, teaches an ACELP encoding path and an LPC residual);
outputting a second residual signal by processing the first residual signal through one of a plurality of signal processing paths, wherein the plurality of signal processing paths comprises a first signal processing path that comprises a noise shaping operation (Para 103, teaches a noise shaping block controlled by the quantized LPC coefficients. The quantized LPC coefficients used for noise shaping perform a spectral shaping of the high-resolution spectral values or directly encoded spectral lines, and the result is similar to the spectrum of a signal subsequent to an LPC filtering stage operating in the time domain, such as an LPC analysis filtering block);
and a second signal processing path that does not comprise the noise shaping operation (Para 104, teaches a cross-processor comprising a spectral decoder for calculating a decoded version of the first encoded signal portion. The spectral decoder comprises an inverse noise shaping block, a gap filling decoder, a TNS/TTS synthesis block, and an IMDCT block. The inverse noise shaping block undoes the noise shaping based on the quantized LPC coefficients).
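For illustration, the LPC analysis referenced in the cited paragraphs (a prediction analyzer determining LPC coefficients, with the residual carrying the whitened signal) can be sketched as follows. This is generic textbook code using the autocorrelation method and the Levinson-Durbin recursion, not code from Disch or the present application; all names are illustrative.

```python
import numpy as np

def lpc_coefficients(frame, order=16):
    """Estimate LPC coefficients for one audio frame using the
    autocorrelation method and the Levinson-Durbin recursion.
    Returns the prediction polynomial a (with a[0] == 1) and the
    final residual (prediction error) energy."""
    # Autocorrelation of the frame at lags 0..order.
    r = np.array([np.dot(frame[:len(frame) - k], frame[k:])
                  for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                      # reflection coefficient
        a[1:i + 1] = a[1:i + 1] + k * a[i - 1::-1][:i]
        err *= 1.0 - k * k
    return a, err

def lpc_residual(x, a):
    """Whitened residual e[n] = sum_k a[k] * x[n-k]."""
    return np.convolve(x, a)[:len(x)]
```

A signal well modeled by the predictor yields a residual with much lower energy than the input, which is what makes the later path-selection decision (claims 2-3) meaningful.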
With regards to claim 2, Disch teaches the method of claim 1, wherein the outputting of the second residual signal comprises selecting one of the plurality of signal processing paths based on a prediction gain of the first residual signal (Para 93, teaches that the decision whether the ACELP or the TCX path should be chosen is performed in the switching decision by simulating the ACELP and TCX encoders and switching to the better-performing branch. For this, the SNRs of the ACELP and TCX branches are estimated based on an ACELP and TCX encoder/decoder simulation. The TCX encoder/decoder simulation is performed without TNS/TTS analysis, IGF encoder, quantization-loop/arithmetic coder, or any TCX decoder. Instead, the TCX SNR is estimated using an estimation of the quantizer distortion in the shaped MDCT domain. The ACELP encoder/decoder simulation is performed using only a simulation of the adaptive codebook and innovative codebook. The ACELP SNR is simply estimated by computing the distortion introduced by an LTP filter in the weighted signal domain and scaling this distortion by a constant factor. Thus, the complexity is greatly reduced compared to an approach where TCX and ACELP encoding is executed in parallel. The branch with the higher SNR is chosen for the subsequent complete encoding run).
With regards to claim 3, Disch teaches the method of claim 2, wherein the selecting of one of the plurality of signal processing paths comprises selecting the first signal processing path when the prediction gain of the first residual signal is greater than or equal to a preset threshold value (Para 154, teaches that in the case of stereo channel pairs an additional joint stereo processing is applied. This is done because, for a certain destination range, the signal can be a highly correlated panned sound source. In case the source regions chosen for this particular region are not well correlated, although the energies are matched for the destination regions, the spatial image can suffer due to the uncorrelated source regions. The encoder analyzes each destination region energy band, typically performing a cross-correlation of the spectral values, and if a certain threshold is exceeded, sets a joint flag for this energy band);
and selecting the second signal processing path when the prediction gain of the first residual signal is less than the preset threshold value (Para 154, further teaches that in the decoder the left and right channel energy bands are treated individually if this joint stereo flag is not set. In case the joint stereo flag is set, both the energies and the patching are performed in the joint stereo domain. The joint stereo information for the IGF regions is signaled similarly to the joint stereo information for the core coding, including a flag indicating, in the case of prediction, whether the direction of the prediction is from downmix to residual or vice versa).
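The selection logic recited in claims 2-3 reduces to a comparison of a prediction gain against a preset threshold. A minimal sketch, assuming a hypothetical threshold value and path names (neither is specified in the claim language or in Disch):

```python
import numpy as np

def prediction_gain(signal, residual):
    """Ratio of signal energy to residual energy; a high value means
    the LPC model captures most of the signal's envelope."""
    return np.dot(signal, signal) / max(np.dot(residual, residual), 1e-12)

def select_path(gain, threshold=2.0):
    """Route the first residual signal: the noise-shaping path when the
    prediction gain meets or exceeds the preset threshold, the bypass
    path otherwise. The threshold value is a placeholder."""
    return "noise_shaping" if gain >= threshold else "bypass"
```

Note the "greater than or equal" boundary in claim 3 maps directly onto the `>=` comparison.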
With regards to claim 4, Disch teaches the method of claim 3, wherein the outputting of the second residual signal further comprises outputting flag information indicating whether the noise shaping operation has been performed, based on a type of a signal processing path, among the plurality of signal processing paths, through which the first residual signal is processed (Para 154, teaches that in the case of stereo channel pairs an additional joint stereo processing is applied. This is done because, for a certain destination range, the signal can be a highly correlated panned sound source. In case the source regions chosen for this particular region are not well correlated, although the energies are matched for the destination regions, the spatial image can suffer due to the uncorrelated source regions. The encoder analyzes each destination region energy band, typically performing a cross-correlation of the spectral values, and if a certain threshold is exceeded, sets a joint flag for this energy band. In the decoder the left and right channel energy bands are treated individually if this joint stereo flag is not set. In case the joint stereo flag is set, both the energies and the patching are performed in the joint stereo domain. The joint stereo information for the IGF regions is signaled similarly to the joint stereo information for the core coding, including a flag indicating, in the case of prediction, whether the direction of the prediction is from downmix to residual or vice versa).
With regards to claim 5, Disch teaches the method of claim 4, wherein the first signal processing path is a path through which to output a complex-LPC (C-LPC) bitstream based on the first residual signal (Para 164, teaches that first, TNS calculates a set of prediction coefficients using “forward prediction” in the transform domain, e.g. MDCT);
and output a signal, which is obtained by removing noise from the first residual signal through the noise shaping operation, as the second residual signal, based on the C-LPC bitstream (Para 164, further teaches that these coefficients are then used for flattening the temporal envelope of the signal. As the quantization affects the TNS-filtered spectrum, the quantization noise is also temporally flat. By applying the inverse TNS filtering on the decoder side, the quantization noise is shaped according to the temporal envelope of the TNS filter and therefore gets masked by the transient).
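The noise shaping mechanism cited above (spectral shaping controlled by quantized LPC coefficients, undone at the decoder so that quantization noise follows the signal envelope) can be illustrated generically. The sketch below is a simplified frequency-domain shaping pair, not Disch's implementation; function names and the envelope floor are assumptions.

```python
import numpy as np

def lpc_envelope(a, n_bins):
    """Spectral envelope |1/A(e^jw)| implied by quantized LPC
    coefficients a, sampled on n_bins frequency bins."""
    A = np.fft.rfft(a, 2 * n_bins)[:n_bins]
    return 1.0 / np.maximum(np.abs(A), 1e-9)  # floor avoids division by zero

def noise_shape(spectrum, a):
    """Flatten the spectrum by the LPC envelope before quantization;
    quantization noise added to the flattened spectrum follows the
    envelope after the inverse step and is masked by the signal."""
    return spectrum / lpc_envelope(a, len(spectrum))

def inverse_noise_shape(shaped, a):
    """Decoder-side counterpart: reapply the envelope."""
    return shaped * lpc_envelope(a, len(shaped))
```

Absent quantization, the shape/inverse-shape pair is an exact round trip; with quantization in between, the error is spectrally shaped by the envelope, which is the masking effect described in Para 164.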
With regards to claim 6, Disch teaches the method of claim 5, wherein the second signal processing path is a path through which to output a real part of the first residual signal as the second residual signal (Para 164, teaches that these coefficients are then used for flattening the temporal envelope of the signal. As the quantization affects the TNS-filtered spectrum, the quantization noise is also temporally flat. By applying the inverse TNS filtering on the decoder side, the quantization noise is shaped according to the temporal envelope of the TNS filter and therefore gets masked by the transient. Para 45, teaches that with MDCT, the real part of the complex transform is transmitted).
With regards to claim 7, Disch teaches the method of claim 1, wherein the outputting of the second residual signal further comprises outputting a scale factor for quantizing the second residual signal for each sub-band (Para 170, teaches that the spectrum is subdivided in scale factor bands SCB where there are seven scale factor bands SCB1 to SCB7. The scale factor bands can be AAC scale factor bands which are defined in the AAC standard and have an increasing bandwidth to upper frequencies);
and outputting, based on the scale factor and the second residual signal, a quantization bitstream obtained by quantizing the second residual signal (Para 170, teaches that the core encoder operates in the full range, but encodes a significant amount of zero spectral values, i.e., these zero spectral values are quantized to zero or are set to zero before quantizing or subsequent to quantizing).
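The per-sub-band scale factor quantization recited in claim 7 (and the AAC-style scale factor bands of Para 170) can be sketched as follows. This is an illustrative uniform quantizer with one scale factor per band, not the AAC or Disch quantizer; the bit depth and the choice of the band maximum as scale factor are assumptions.

```python
import numpy as np

def quantize_bands(spectrum, band_edges, bits=8):
    """Quantize each scale factor band with its own scale factor
    (here, the band's peak magnitude), returning the scale factors
    and the integer codes per band."""
    qmax = 2 ** (bits - 1) - 1
    sfs, codes = [], []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        band = spectrum[lo:hi]
        sf = float(np.max(np.abs(band))) or 1.0   # avoid divide-by-zero
        sfs.append(sf)
        codes.append(np.round(band / sf * qmax).astype(int))
    return sfs, codes

def dequantize_bands(sfs, codes, bits=8):
    """Reconstruct the spectrum from per-band scale factors and codes."""
    qmax = 2 ** (bits - 1) - 1
    return np.concatenate([np.asarray(c, float) / qmax * sf
                           for sf, c in zip(sfs, codes)])
```

Bands quantized to all-zero codes correspond to the "significant amount of zero spectral values" the core encoder transmits in Para 170.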
With regards to claim 8, Disch teaches a method of decoding an audio signal, the method comprising generating a second residual signal based on a quantization bitstream obtained by quantizing the second residual signal and a scale factor for quantizing the second residual signal for each sub-band (See Figure 14B, wherein the decoder generates a first decoded signal and a second decoded signal);
and outputting a first residual signal by processing the second residual signal through one of a plurality of signal restoration paths, wherein the plurality of signal restoration paths comprises a first signal restoration path that comprises an inverse noise shaping operation (Figure 14B, also shows the TCX decoder which performs inverse noise shaping);
and a second signal restoration path that does not comprise the inverse noise shaping operation (Figure 14B, also shows the ACELP decoder which does not perform inverse noise shaping).
With regards to claim 9, Disch teaches the method of claim 8, wherein the outputting of the first residual signal comprises selecting one of the plurality of signal restoration paths based on flag information that indicates whether a noise shaping operation has been performed (Paragraphs 31 and 163, teach that joint stereo flags are transmitted that indicate whether L/R or M/S, as an example of general joint stereo coding, shall be used. In the decoder, first, the core signal is decoded as indicated by the joint stereo flags for the core bands. Second, the core signal is stored in both L/R and M/S representation. For the IGF tile filling, the source tile representation is chosen to fit the target tile representation as indicated by the joint stereo information for the IGF bands).
With regards to claim 10, Disch teaches the method of claim 9, wherein the outputting of the first residual signal comprises when the first signal restoration path is selected, outputting a signal, in which noise is synthesized with the second residual signal through the inverse noise shaping operation, as a first residual signal, based on the second residual signal and a complex linear prediction coding (LPC) bitstream (Figure 14B and para 122, show the TCX decoder path which performs inverse noise shaping and outputs the first decoded audio signal portion).
With regards to claim 11, Disch teaches the method of claim 9, wherein the outputting of the first residual signal comprises when the second signal restoration path is selected, outputting the second residual signal as the first residual signal (Figure 14B and para 12, show the ACELP decoder path which outputs the second decoded audio signal portion).
With regards to claim 12, Disch teaches the method of claim 10, wherein the outputting of the first residual signal further comprises restoring the audio signal based on an LPC bitstream and the first residual signal (Para 124, teaches that the ACELP or time domain low band decoder comprises an ACELP decoder stage for obtaining decoded gains and the innovative codebook information. Additionally, an ACELP adaptive codebook stage is provided and a subsequent ACELP post-processing stage and a final synthesis filter such as LPC synthesis filter, which is again controlled by the quantized LPC coefficients obtained from the bitstream demultiplexer corresponding to the encoded signal parser).
With regards to claim 13, Disch teaches the method of claim 12, wherein the restoring of the audio signal comprises outputting a frequency-domain signal of the audio signal based on the LPC bitstream and the first residual signal (Para 124, teaches that the output of the LPC synthesis filter is input into a de-emphasis stage for canceling or undoing the processing introduced by the pre-emphasis stage of the pre-processor. The result is the time domain output signal at a low sampling rate and a low band);
and outputting a time-domain signal of the audio signal based on the frequency-domain signal and the flag information (Para 124, further teaches that in case the frequency domain output is necessitated, the switch is in the indicated position and the output of the de-emphasis stage is introduced into the upsampler and then mixed with the high bands from the time domain bandwidth extension decoder).
With regards to claim 14, Disch teaches the method of claim 13, wherein the outputting of the time-domain signal of the audio signal comprises removing time-domain aliasing (TDA) of the audio signal based on the frequency-domain signal and the flag information (Para 45, teaches that it is advantageous to use complex TNS/TTS filtering. Thereby, the temporal aliasing artifacts of a critically sampled real representation, like MDCT, are avoided. On the decoder-side, however, it is possible to estimate the imaginary part of the transform using MDCT spectra of preceding or subsequent frames so that, on the decoder-side, the complex filter can be again applied in the inverse prediction over frequency and, specifically, the prediction over the border between the source range and the reconstruction range and also over the border between frequency-adjacent frequency tiles within the reconstruction range).
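The time-domain aliasing discussed in Para 45 is a property of the critically sampled MDCT: each inverse transform contains aliasing terms that cancel only in the 50%-overlap-add of adjacent frames. A generic textbook sketch of this time-domain aliasing cancellation (TDAC), not Disch's implementation:

```python
import numpy as np

def mdct(x, N):
    """Forward MDCT: 2N windowed time samples -> N coefficients."""
    n = np.arange(2 * N)
    k = np.arange(N)
    C = np.cos(np.pi / N * np.outer(k + 0.5, n + 0.5 + N / 2))
    return C @ x

def imdct(X, N):
    """Inverse MDCT: N coefficients -> 2N time-aliased samples;
    the aliasing cancels under 50% overlap-add."""
    n = np.arange(2 * N)
    k = np.arange(N)
    C = np.cos(np.pi / N * np.outer(n + 0.5 + N / 2, k + 0.5))
    return (2.0 / N) * (C @ X)

def sine_window(N):
    """Sine window satisfying the Princen-Bradley condition
    w[n]^2 + w[n+N]^2 == 1, required for perfect reconstruction."""
    n = np.arange(2 * N)
    return np.sin(np.pi * (n + 0.5) / (2 * N))
```

Windowing each 2N-sample frame before the forward transform and after the inverse transform, then overlap-adding frames hopped by N samples, reconstructs the interior of the signal exactly; this is the cancellation that a non-critically-sampled or complex representation (as in Disch's complex TNS/TTS filtering) sidesteps.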
With regards to claims 15-20, these are device claims corresponding to method claims 1-7. The two sets of claims are related as a method and an apparatus for performing the same, with each claimed system element's function corresponding to a claimed method step. Accordingly, claims 15-20 are rejected under the same rationale as applied above with respect to method claims 1-7.
Conclusion
6. The following prior art, made of record but not relied upon, is considered pertinent to Applicant's disclosure: Wang (U.S. Patent Application Publication No. 2017/0018277 A1) and Kolesnik (U.S. Patent No. 6,263,312 B1). These references are also listed on the PTO-892 form attached to this office action.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. If you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). In case you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NEERAJ SHARMA whose contact information is given below. The examiner can normally be reached on Monday to Friday 8 am to 5 pm. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre Louis-Desir can be reached on 571-272-7799 (Direct Phone). The fax number for the organization where this application or proceeding is assigned is 571-273-8300.
/NEERAJ SHARMA/
Primary Examiner, Art Unit 2659
571-270-5487 (Direct Phone)
571-270-6487 (Direct Fax)
neeraj.sharma@uspto.gov (Direct Email)