Last updated: May 29, 2026
Application No. 18/351,634
AUDIO ENCODING METHOD AND APPARATUS, AND AUDIO DECODING METHOD AND APPARATUS

Status: Non-Final OA (§103, §112)
Filed: Jul 13, 2023
Priority: Jan 21, 2021 (CN 202110080645.0, +1 more)
Examiner: BOGGS JR., JAMES
Art Unit: 2657
Tech Center: 2600 (Communications)
Assignee: Vivo Mobile Communication Co., Ltd.
OA Round: 3 (Non-Final)
This examiner grants 61% of cases after interview (+35.9% interview lift). A telephonic interview to clarify the technical implementation could significantly improve the outcome.
Based on 112 resolved cases, 2023–2026
Examiner Intelligence

Examiner: BOGGS JR., JAMES
Career Allowance Rate: grants 61% of resolved cases (68 granted / 112 resolved), -1.3% vs TC avg
Interview Lift: +35.9% (resolved cases with interview vs. without)
Typical timeline: 3y 2m average prosecution; 13 currently pending
Career history: 136 total applications across all art units
Statute-Specific Performance

§101: 0.8% (-39.2% vs TC avg)
§103: 87.3% (+47.3% vs TC avg)
§102: 1.6% (-38.4% vs TC avg)
§112: 3.8% (-36.2% vs TC avg)
Based on career data from 112 resolved cases
Office Action

DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on January 12, 2026, has been entered.
Response to Arguments
Applicant’s arguments, filed January 12, 2026, regarding the 35 U.S.C. 101 rejections of claims 1, 4 – 5, 8, 11 – 12, 15 and 18 – 19 have been fully considered and are persuasive.  The 35 U.S.C. 101 rejections of claims 1, 4 – 5, 8, 11 – 12, 15 and 18 – 19 of November 10, 2025, have been withdrawn. 
Applicant’s arguments, filed January 12, 2026, regarding the rejections of claims 1, 4 – 5, 8, 11 – 12, 15 and 18 – 19 under 35 U.S.C. 102(a)(1) and 35 U.S.C. 103 have been considered but they are not persuasive.
On page 11 of Applicant’s response, Applicant argues “ISO/IEC 14496-3 fails to disclose the feature "determining a code number corresponding to each element in the to-be-encoded sequence, wherein the code number is an absolute value of a value corresponding to the element in the to-be-encoded sequence".”
However, "ISO/IEC 14496-3:2001 - Information technology - Coding of audio-visual objects - Part 3: Audio" ("ISO/IEC 14496-3:2001 - Information technology - Coding of audio-visual objects - Part 3: Audio", December 15, 2001, International Organization for Standardization (ISO).), hereinafter "ISO/IEC 14496-3", recites, in section 4.B.11.5, lines 1-8, "The coded spectrum uses one quantizer per scalefactor band. The step sizes of each of these quantizers is specified as a set of scalefactors and a global gain which normalizes these scalefactors. In order to increase compression, scalefactors associated with scalefactor bands that have only zero-valued coefficients are ignored in the coding process and therefore do not have to be transmitted. Both the global gain and scalefactors are quantized in 1.5 dB steps. The global gain is coded as an 8-bit unsigned integer and the scalefactors are differentially encoded relative to the previous scalefactor (or global gain for the first scalefactor) and then Huffman coded. The dynamic range of the global gain is sufficient to represent full-scale values from a 24-bit PCM audio source.", disclosing “determining a code number corresponding to each element in the to-be-encoded sequence”, where Huffman coding the differentially encoded scale factors reads on determining a code number corresponding to each element in the to-be-encoded sequence.  "ISO/IEC 14496-3" further recites, in section 4.B.11.6, lines 1-8, "Huffman coding is used to represent n-tuples of quantized coefficients, with the Huffman code drawn from one of 11 codebooks. The spectral coefficients within n-tuples are ordered (low to high) and the n-tuple size is two or four coefficients. The maximum absolute value of the quantized coefficients that can be represented by each Huffman codebook and the number of coefficients in each n-tuple for each codebook is shown in Table 5.26.  
There are two codebooks for each maximum absolute value, with each representing a distinct probability distribution function. The best fit is always chosen. In order to save on codebook storage (an important consideration in a mass-produced decoder), most codebooks represent unsigned values. For these codebooks the magnitude of the coefficients is Huffman coded and the sign bit of each non-zero coefficient is appended to the codeword.", disclosing “the code number is an absolute value of a value corresponding to the element in the to-be-encoded sequence”, where Huffman coding the magnitude of the coefficients reads on the code number being an absolute value of a value corresponding to the element in the to-be-encoded sequence.
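For readers following the mapping, the two passages quoted above can be sketched in a few lines. This is only an illustration under stated assumptions, not the standard's reference procedure or the applicant's method: the function names and sample values are hypothetical, and the actual Huffman tables are not reproduced. It shows differential encoding of scalefactors relative to the previous value (the global gain for the first one), and then the magnitude of each difference, which is the quantity the examiner reads on the claimed "code number".

```python
def differential_sequence(global_gain, scalefactors):
    """Difference each scalefactor from its predecessor (global gain for the first)."""
    previous = global_gain
    diffs = []
    for sf in scalefactors:
        diffs.append(sf - previous)
        previous = sf
    return diffs


def code_numbers(sequence):
    """Magnitude of each element; the sign is handled separately."""
    return [abs(v) for v in sequence]


# Hypothetical values chosen only for illustration.
global_gain = 120
scalefactors = [118, 121, 121, 115]

diffs = differential_sequence(global_gain, scalefactors)   # [-2, 3, 0, -6]
numbers = code_numbers(diffs)                              # [2, 3, 0, 6]
```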
On page 12 of Applicant’s response, Applicant argues “ISO/IEC 14496-3 fails to teach the usage of the "code number" in encoding the to-be-encoded sequence to obtain a second bitstream, and fails to disclose the feature "encoding, based on the code numbers and preset coding tables corresponding to preset coding orders, the to-be-encoded sequence to obtain a second bitstream".”
However, "ISO/IEC 14496-3" recites, in section 4.B.11.5, lines 1-8, "The coded spectrum uses one quantizer per scalefactor band. The step sizes of each of these quantizers is specified as a set of scalefactors and a global gain which normalizes these scalefactors. In order to increase compression, scalefactors associated with scalefactor bands that have only zero-valued coefficients are ignored in the coding process and therefore do not have to be transmitted. Both the global gain and scalefactors are quantized in 1.5 dB steps. The global gain is coded as an 8-bit unsigned integer and the scalefactors are differentially encoded relative to the previous scalefactor (or global gain for the first scalefactor) and then Huffman coded. The dynamic range of the global gain is sufficient to represent full-scale values from a 24-bit PCM audio source.", disclosing “determining a code number corresponding to each element in the to-be-encoded sequence”, where Huffman coding the differentially encoded scale factors reads on determining a code number corresponding to each element in the to-be-encoded sequence.  "ISO/IEC 14496-3" further recites, in section 4.B.11.6, lines 1-8, "Huffman coding is used to represent n-tuples of quantized coefficients, with the Huffman code drawn from one of 11 codebooks. The spectral coefficients within n-tuples are ordered (low to high) and the n-tuple size is two or four coefficients. The maximum absolute value of the quantized coefficients that can be represented by each Huffman codebook and the number of coefficients in each n-tuple for each codebook is shown in Table 5.26.  There are two codebooks for each maximum absolute value, with each representing a distinct probability distribution function. The best fit is always chosen. In order to save on codebook storage (an important consideration in a mass-produced decoder), most codebooks represent unsigned values. 
For these codebooks the magnitude of the coefficients is Huffman coded and the sign bit of each non-zero coefficient is appended to the codeword.", disclosing “encoding, based on the code numbers and preset coding tables corresponding to preset coding orders, the to-be-encoded sequence to obtain a second bitstream”, where the Huffman code drawn from one of 11 codebooks reads on encoding based on the code numbers and preset coding tables corresponding to preset coding orders, and Huffman coding being used to represent n-tuples of quantized coefficients reads on encoding the to-be-encoded sequence to obtain a second bitstream.
On page 12 of Applicant’s response, Applicant argues “ISO/IEC 14496-3 fails to teach the second bitstream comprising a first sub-bitstream and a second sub-bitstream, the second sub-bitstream is an encoded bitstream corresponding to K coding orders.”
However, Sugiura et al. (US Patent No. 10,840,944), hereinafter Sugiura, recites, in column 5, line 66 - column 6, line 8, “The encoding apparatus 100 of the first embodiment variable-length encodes a sequence of non-negative integer values using a plurality of predetermined code trees according to a predetermined rule to thereby implement an encoding process resulting in a shorter bit length than Golomb-Rice encoding on a sequence of non-negative integer values having a distribution more heavily biased than a Laplacian distribution. Here, the “code tree” refers to a graph representing a predetermined rule as to which code is to be assigned to an inputted non-negative integer value.”, recites, in column 7, lines 11-28, “For example, in the case of K=2 in FIG. 4, the rule applicable to an inputted integer value and a code corresponding to the integer value is such a rule that when an inputted integer value 0 is encoded by the code tree T(0), ‘1’ is obtained as a code corresponding to the inputted integer value 0, when an inputted integer value 1 is encoded by the code tree T(0), ‘100’ is obtained as a code corresponding to the inputted integer value 1, when an inputted integer value 2 is encoded by the code tree T(0), ‘10000’ is obtained as a code corresponding to the inputted integer value 2, when the inputted integer value 0 is encoded by the code tree T(1), no code corresponding to the inputted integer value 0 is obtained, when the inputted integer value 1 is encoded by the code tree T(1), ‘01’ is obtained as a code corresponding to the inputted integer value 1, and when the inputted integer value 2 is encoded by the code tree T(1), ‘0100’ is obtained as a code corresponding to the inputted integer value 2.”, recites, in column 6, lines 13-23, “A sequence of integer values inputted to the encoding apparatus 100 is inputted to the integer encoding part 110 by N samples (N is a natural number) at a time. 
The inputted sequence of integer values is assumed to be an integer sequence x_1, x_2, . . . , x_N. The integer encoding part 110 encodes the integer sequence x_1, x_2, . . . , x_N through an encoding process using the following code tree based on an encoding parameter K which is a natural number equal to or larger than 2 inputted by predetermined means (not shown) to obtain a code and outputs the obtained code as an integer code (S110).”, and recites, in column 21, lines 51-63, “An integer sequence x_1, x_2, . . . , x_N of the sequence of integer values inputted to the encoding apparatus 400 by N samples at a time is inputted to the parameter determination part 420. Based on the inputted integer sequence x_1, x_2, . . . , x_N, the parameter determination part 420 obtains and outputs a Rice parameter r corresponding to the integer sequence and a parameter code which is a code representing the parameter (S420). The parameter code may be obtained by encoding a Rice parameter so that the decoding apparatus 450 decodes the parameter code to thereby obtain the Rice parameter r determined by the parameter determination part 420.”; disclosing “the second bitstream comprising a first sub-bitstream and a second sub-bitstream, the second sub-bitstream is an encoded bitstream corresponding to K coding orders”, where the integer code reads on a first sub-bitstream and the parameter code reads on a second sub-bitstream.
On page 12 of Applicant’s response, Applicant argues “Moreover, ISO/IEC 14496-3 teaches in table 4.A.1 a mapping between indexs and codewords. ISO/IEC 14496-3 also teaches a table 4.A.15 for transition of differential scalefactor to index. That is, ISO/IEC 14496-3 requires two tables to implement mapping of differential scalefactors to codewords.  ISO/IEC 14496-3 fails to teach that the coding table comprises a mapping relationship between code numbers and code values; for any code number, querying the K coding tables to obtain a target code value corresponding to the code number, the target code value being a code value with the smallest code length among K code values obtained by querying the K coding tables; and sorting and packing all the target code values to obtain the first sub-bitstream.”.
However, "ISO/IEC 14496-3" recites, in section 4.B.11.6, lines 1-4, "Huffman coding is used to represent n-tuples of quantized coefficients, with the Huffman code drawn from one of 11 codebooks. The spectral coefficients within n-tuples are ordered (low to high) and the n-tuple size is two or four coefficients. The maximum absolute value of the quantized coefficients that can be represented by each Huffman codebook and the number of coefficients in each n-tuple for each codebook is shown in Table 5.26.", disclosing “determining K preset coding tables corresponding to the K coding orders, wherein the coding tables and the coding orders are in one-to-one correspondence, and the coding table comprises a mapping relationship between code numbers and code values”, where the Huffman code drawn from one of 11 codebooks reads on determining K preset coding tables corresponding to the K coding orders, wherein the coding tables and the coding orders are in one-to-one correspondence, and Huffman coding being used to represent n-tuples of quantized coefficients reads on the coding table comprises a mapping relationship between code numbers and code values.  Sugiura recites, in column 1, lines 10-13, "The present invention relates to a technique for encoding or decoding a sample sequence composed of integer values such as a sample sequence of voice or acoustic time-series digital signals.", recites, in column 1, lines 49-56, "In the above-described reversible encoding, it is a Golomb-Rice code that has been used as one of simplest variable length codes. 
When a sequence of integer values belongs to a Laplacian distribution, that is, when an appearance probability of integer values is exponentially lowered with respect to the magnitude of the values, the Golomb-Rice code is known to achieve a minimum expected bit length (minimum bit length).", recites, in column 21, lines 64-67, "The parameter determination part 420 obtains the Rice parameter r according to equation (5) using, for example, each integer value included in the inputted integer sequence x_1, x_2, . . . , x_N.", and recites, in column 22, lines 8-11, "The Rice parameter r determined in equation (5) minimizes an estimate value of a total bit length at the time of Golomb-Rice encoding estimated from equation (2) for the integer sequence x_1, x_2, . . . , x_N.", disclosing “for any code number, querying the K coding tables to obtain a target code value corresponding to the code number, the target code value being a code value with the smallest code length among K code values obtained by querying the K coding tables”, where determining the Rice parameter r to minimize an estimate value of a total bit length at the time of Golomb-Rice encoding for an integer sequence reads on querying the K coding tables to obtain a target code value corresponding to the code number, the target code value being a code value with the smallest code length among K code values obtained by querying the K coding tables.  "ISO/IEC 14496-3" further recites, in section 4.1.1.1, lines 6-10, "The basic structure of the MPEG-4 GA system is shown in Figure 4.1 and Figure 4.2. The data flow in this diagram is from left to right, top to bottom. 
The functions of the decoder are to find the description of the quantized audio spectra in the bitstream, decode the quantized values and other reconstruction information, reconstruct the quantized spectra, process the reconstructed spectra through whatever tools are active in the bitstream in order to arrive at the actual signal spectra as described by the input bitstream", and shows a "Bitstream Formatter" in Figure 4.1, disclosing “sorting and packing all the target code values to obtain the first sub-bitstream”, where a bitstream formatter generating a bitstream that describes the quantized audio spectra reads on sorting and packing all the target code values to obtain the first sub-bitstream.
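The two selection mechanisms discussed above can be placed side by side in a short sketch. It is an illustration only and rests on several assumptions: each candidate Rice parameter is treated as a stand-in for one coding table, a brute-force search replaces Sugiura's equation (5) (which is not reproduced in this action), the unary convention is chosen arbitrarily, and all names and values are hypothetical. The first helper picks one parameter for the whole sequence by minimizing total bit length; the second queries every candidate for each code number and keeps the shortest code.

```python
def rice_code(n, r):
    """Golomb-Rice code of a non-negative integer n with parameter r, as a bit string."""
    # Assumed convention: quotient in unary (q ones, then a zero), then the r-bit remainder.
    q = n >> r
    remainder = format(n & ((1 << r) - 1), f"0{r}b") if r > 0 else ""
    return "1" * q + "0" + remainder


def best_single_parameter(values, candidates):
    """One parameter for the whole sequence, minimizing the total bit length."""
    return min(candidates, key=lambda r: sum(len(rice_code(v, r)) for v in values))


def shortest_code_per_element(values, candidates):
    """For each value, query every candidate table and keep the shortest code."""
    return [min((rice_code(v, r) for r in candidates), key=len) for v in values]


values = [5, 7, 0, 6]        # hypothetical code numbers
candidates = [0, 1, 2, 3]    # hypothetical table indices (Rice parameters)

single = best_single_parameter(values, candidates)
per_element = shortest_code_per_element(values, candidates)
```

In this sketch a decoder could not recover the per-element choice without extra signalling, which is omitted here.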
On page 13 of Applicant’s response, Applicant argues “ISO/IEC 14496-3 fails to teach obtaining an encoded bitstream based on a magnitude relationship between each element in the to-be-encoded sequence and a first preset value. Also, ISO/IEC 14496-3 fails to teach the claimed second/third bitstream, and consequently fails to teach the feature "sorting and packing the first bitstream, the second bitstream, and a third bitstream to obtain an audio encoded bitstream".”
However, "ISO/IEC 14496-3" recites, in section 4.B.11.5, lines 1-8, "The coded spectrum uses one quantizer per scalefactor band. The step sizes of each of these quantizers is specified as a set of scalefactors and a global gain which normalizes these scalefactors. In order to increase compression, scalefactors associated with scalefactor bands that have only zero-valued coefficients are ignored in the coding process and therefore do not have to be transmitted. Both the global gain and scalefactors are quantized in 1.5 dB steps. The global gain is coded as an 8-bit unsigned integer and the scalefactors are differentially encoded relative to the previous scalefactor (or global gain for the first scalefactor) and then Huffman coded. The dynamic range of the global gain is sufficient to represent full-scale values from a 24-bit PCM audio source.", and recites, in section 4.B.11.6, lines 1-8, "Huffman coding is used to represent n-tuples of quantized coefficients, with the Huffman code drawn from one of 11 codebooks. The spectral coefficients within n-tuples are ordered (low to high) and the n-tuple size is two or four coefficients. The maximum absolute value of the quantized coefficients that can be represented by each Huffman codebook and the number of coefficients in each n-tuple for each codebook is shown in Table 5.26.  There are two codebooks for each maximum absolute value, with each representing a distinct probability distribution function. The best fit is always chosen. In order to save on codebook storage (an important consideration in a mass-produced decoder), most codebooks represent unsigned values. 
For these codebooks the magnitude of the coefficients is Huffman coded and the sign bit of each non-zero coefficient is appended to the codeword.", disclosing “wherein the third bitstream is an encoded bitstream obtained based on a magnitude relationship between each element in the to-be-encoded sequence and a first preset value”, where appending a sign bit of each non-zero coefficient to the codeword, where the magnitude of the coefficients is Huffman coded, reads on the third bitstream being an encoded bitstream obtained based on a magnitude relationship between each element in the to-be-encoded sequence and a first preset value, for a preset value of zero.
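The sign-bit reading above can be illustrated with a minimal sketch. The assumptions are loud ones: the preset value is fixed at zero as in the examiner's mapping, the bit polarity (1 for negative) is chosen arbitrarily, the signs are collected into a separate string rather than appended to each codeword, and the names and sample values are hypothetical.

```python
PRESET_VALUE = 0  # the examiner's reading uses a preset value of zero


def sign_bits(sequence):
    """One bit per non-zero element: '1' if below the preset value, '0' if above."""
    return "".join(
        "1" if v < PRESET_VALUE else "0"
        for v in sequence
        if v != PRESET_VALUE  # zero-valued elements carry no sign bit
    )


sequence = [-2, 3, 0, -6]    # hypothetical to-be-encoded sequence
bits = sign_bits(sequence)   # "101"
```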
Therefore, rejections of claims 1, 4 – 5, 8, 11 – 12, 15 and 18 – 19 under 35 U.S.C. 103 as being unpatentable over "ISO/IEC 14496-3" in view of Sugiura are maintained.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(d):
(d) REFERENCE IN DEPENDENT FORMS.—Subject to subsection (e), a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.

The following is a quotation of pre-AIA 35 U.S.C. 112, fourth paragraph:
Subject to the following paragraph [i.e., the fifth paragraph of pre-AIA 35 U.S.C. 112], a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.

Claims 4, 11 and 18 are rejected under 35 U.S.C. 112(d) or pre-AIA 35 U.S.C. 112, 4th paragraph, as being of improper dependent form for failing to further limit the subject matter of the claim upon which it depends.
Regarding claim 4, claim 4 recites the limitations "wherein the determining, based on audio parameters of a to-be-encoded audio signal, a to-be-encoded sequence and a first bitstream further comprises: determining, based on the first parameter and sorting of the N second parameters, a first target value and N-1 arrays, the first target value being generated based on the second parameter sorted first and the first parameter, and each of the arrays comprising two adjacent second parameters; and sorting and packing the first target value and N-1 second target values to obtain the to-be-encoded sequence, each of the second target values being generated based on two adjacent second parameters in a corresponding array, and the first target value being sorted first in the to-be-encoded sequence" in lines 1-13. However, claim 4 depends from claim 1, and these limitations do not further limit claim 1 because claim 1 already recites the limitations "wherein the determining, based on the audio parameters of the to-be-encoded audio signal, the to-be-encoded sequence and the first bitstream comprises: sorting and packing binary numbers corresponding to the first parameter to obtain the first bitstream; determining, based on the first parameter and sorting of the N second parameters, a first target value and N-1 arrays, the first target value being generated based on the second parameter sorted first and the first parameter, and each of the arrays comprising two adjacent second parameters; and sorting and packing the first target value and N-1 second target values to obtain the to-be-encoded sequence, each of the second target values being generated based on two adjacent second parameters in a corresponding array, and the first target value being sorted first in the to-be-encoded sequence" in lines 16-27.
Regarding claim 11, claim 11 recites the limitations "wherein the determining, based on audio parameters of a to-be-encoded audio signal, a to-be-encoded sequence and a first bitstream further comprises: determining, based on the first parameter and sorting of the N second parameters, a first target value and N-1 arrays, the first target value being generated based on the second parameter sorted first and the first parameter, and each of the arrays comprising two adjacent second parameters; and sorting and packing the first target value and N-1 second target values to obtain the to-be-encoded sequence, each of the second target values being generated based on two adjacent second parameters in a corresponding array, and the first target value being sorted first in the to-be-encoded sequence" in lines 1-13. However, claim 11 depends from claim 8, and these limitations do not further limit claim 8 because claim 8 already recites the limitations "wherein the determining, based on the audio parameters of the to-be-encoded audio signal, the to-be-encoded sequence and the first bitstream comprises: sorting and packing binary numbers corresponding to the first parameter to obtain the first bitstream; determining, based on the first parameter and sorting of the N second parameters, a first target value and N-1 arrays, the first target value being generated based on the second parameter sorted first and the first parameter, and each of the arrays comprising two adjacent second parameters; and sorting and packing the first target value and N-1 second target values to obtain the to-be-encoded sequence, each of the second target values being generated based on two adjacent second parameters in a corresponding array, and the first target value being sorted first in the to-be-encoded sequence" in lines 18-29.
Regarding claim 18, claim 18 recites the limitations "wherein the determining, based on audio parameters of a to-be-encoded audio signal, a to-be-encoded sequence and a first bitstream further comprises: determining, based on the first parameter and sorting of the N second parameters, a first target value and N-1 arrays, the first target value being generated based on the second parameter sorted first and the first parameter, and each of the arrays comprising two adjacent second parameters; and sorting and packing the first target value and N-1 second target values to obtain the to-be-encoded sequence, each of the second target values being generated based on two adjacent second parameters in a corresponding array, and the first target value being sorted first in the to-be-encoded sequence" in lines 1-13. However, claim 18 depends from claim 1, and these limitations do not further limit claim 1 because claim 1 already recites the limitations "wherein the determining, based on the audio parameters of the to-be-encoded audio signal, the to-be-encoded sequence and the first bitstream comprises: sorting and packing binary numbers corresponding to the first parameter to obtain the first bitstream; determining, based on the first parameter and sorting of the N second parameters, a first target value and N-1 arrays, the first target value being generated based on the second parameter sorted first and the first parameter, and each of the arrays comprising two adjacent second parameters; and sorting and packing the first target value and N-1 second target values to obtain the to-be-encoded sequence, each of the second target values being generated based on two adjacent second parameters in a corresponding array, and the first target value being sorted first in the to-be-encoded sequence" in lines 16-27.
Applicant may cancel the claims, amend the claims to place the claims in proper dependent form, rewrite the claims in independent form, or present a sufficient showing that the dependent claims comply with the statutory requirements.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 4 – 5, 8, 11 – 12, 15 and 18 – 19 are rejected under 35 U.S.C. 103 as being unpatentable over "ISO/IEC 14496-3:2001 - Information technology - Coding of audio-visual objects - Part 3: Audio" ("ISO/IEC 14496-3:2001 - Information technology - Coding of audio-visual objects - Part 3: Audio", December 15, 2001, International Organization for Standardization (ISO).), hereinafter "ISO/IEC 14496-3", in view of Sugiura et al. (US Patent No. 10,840,944), hereinafter Sugiura.
Regarding claim 1, "ISO/IEC 14496-3" discloses an audio encoding method, comprising:
determining, based on audio parameters of a to-be-encoded audio signal, a to-be-encoded sequence and a first bitstream, wherein the audio parameters comprise a first parameter and N second parameters, N being a positive integer, the first bitstream is obtained by encoding based on the first parameter, and the to-be-encoded sequence is obtained by encoding based on the first parameter and the N second parameters (Section 4.B.11.5, lines 1-8, "The coded spectrum uses one quantizer per scalefactor band. The step sizes of each of these quantizers is specified as a set of scalefactors and a global gain which normalizes these scalefactors. In order to increase compression, scalefactors associated with scalefactor bands that have only zero-valued coefficients are ignored in the coding process and therefore do not have to be transmitted. Both the global gain and scalefactors are quantized in 1.5 dB steps. The global gain is coded as an 8-bit unsigned integer and the scalefactors are differentially encoded relative to the previous scalefactor (or global gain for the first scalefactor) and then Huffman coded. The dynamic range of the global gain is sufficient to represent full-scale values from a 24-bit PCM audio source."; The global gain reads on a first parameter, the scale factors read on N second parameters where N is a positive integer, coding the global gain as an 8-bit unsigned integer reads on obtaining the first bitstream by encoding based on the first parameter, and differentially encoding the scale factors relative to the previous scale factor or the global gain reads on obtaining the to-be-encoded sequence by encoding based on the first parameter and the N second parameters.);
determining a code number corresponding to each element in the to-be-encoded sequence (Section 4.B.11.5, lines 1-8, "The coded spectrum uses one quantizer per scalefactor band. The step sizes of each of these quantizers is specified as a set of scalefactors and a global gain which normalizes these scalefactors. In order to increase compression, scalefactors associated with scalefactor bands that have only zero-valued coefficients are ignored in the coding process and therefore do not have to be transmitted. Both the global gain and scalefactors are quantized in 1.5 dB steps. The global gain is coded as an 8-bit unsigned integer and the scalefactors are differentially encoded relative to the previous scalefactor (or global gain for the first scalefactor) and then Huffman coded. The dynamic range of the global gain is sufficient to represent full-scale values from a 24-bit PCM audio source."; Huffman coding the differentially encoded scale factors reads on determining a code number corresponding to each element in the to-be-encoded sequence.),
wherein the code number is an absolute value of a value corresponding to the element in the to-be-encoded sequence (Section 4.B.11.6, lines 1-8, "Huffman coding is used to represent n-tuples of quantized coefficients, with the Huffman code drawn from one of 11 codebooks. The spectral coefficients within n-tuples are ordered (low to high) and the n-tuple size is two or four coefficients. The maximum absolute value of the quantized coefficients that can be represented by each Huffman codebook and the number of coefficients in each n-tuple for each codebook is shown in Table 5.26.  There are two codebooks for each maximum absolute value, with each representing a distinct probability distribution function. The best fit is always chosen. In order to save on codebook storage (an important consideration in a mass-produced decoder), most codebooks represent unsigned values. For these codebooks the magnitude of the coefficients is Huffman coded and the sign bit of each non-zero coefficient is appended to the codeword."; Huffman coding the magnitude of the coefficients reads on the code number being an absolute value of a value corresponding to the element in the to-be-encoded sequence.);
encoding, based on the code numbers and preset coding tables corresponding to preset coding orders, the to-be-encoded sequence to obtain a second bitstream (Section 4.B.11.6, lines 1-4, "Huffman coding is used to represent n-tuples of quantized coefficients, with the Huffman code drawn from one of 11 codebooks. The spectral coefficients within n-tuples are ordered (low to high) and the n-tuple size is two or four coefficients. The maximum absolute value of the quantized coefficients that can be represented by each Huffman codebook and the number of coefficients in each n-tuple for each codebook is shown in Table 5.26."; The Huffman code drawn from one of 11 codebooks reads on encoding based on the code numbers and preset coding tables corresponding to preset coding orders, and Huffman coding being used to represent n-tuples of quantized coefficients reads on encoding the to-be-encoded sequence to obtain a second bitstream.);
and sorting and packing the first bitstream, the second bitstream, and a third bitstream to obtain an audio encoded bitstream (Section 4.1.1.1, lines 6-10, "The basic structure of the MPEG-4 GA system is shown in Figure 4.1 and Figure 4.2. The data flow in this diagram is from left to right, top to bottom. The functions of the decoder are to find the description of the quantized audio spectra in the bitstream, decode the quantized values and other reconstruction information, reconstruct the quantized spectra, process the reconstructed spectra through whatever tools are active in the bitstream in order to arrive at the actual signal spectra as described by the input bitstream"; Figure 4.1, "Bitstream Formatter"; A bitstream formatter generating a bitstream that describes the quantized audio spectra reads on sorting and packing the first bitstream, the second bitstream, and a third bitstream to obtain an audio encoded bitstream.);
wherein the third bitstream is an encoded bitstream obtained based on a magnitude relationship between each element in the to-be-encoded sequence and a first preset value (Section 4.B.11.6, lines 1-8, "Huffman coding is used to represent n-tuples of quantized coefficients, with the Huffman code drawn from one of 11 codebooks. The spectral coefficients within n-tuples are ordered (low to high) and the n-tuple size is two or four coefficients. The maximum absolute value of the quantized coefficients that can be represented by each Huffman codebook and the number of coefficients in each n-tuple for each codebook is shown in Table 5.26.  There are two codebooks for each maximum absolute value, with each representing a distinct probability distribution function. The best fit is always chosen. In order to save on codebook storage (an important consideration in a mass-produced decoder), most codebooks represent unsigned values. For these codebooks the magnitude of the coefficients is Huffman coded and the sign bit of each non-zero coefficient is appended to the codeword."; Appending a sign bit of each non-zero coefficient to the codeword, where the magnitude of the coefficients is Huffman coded, reads on the third bitstream being an encoded bitstream obtained based on a magnitude relationship between each element in the to-be-encoded sequence and a first preset value, for a preset value of zero.);
wherein the determining, based on the audio parameters of the to-be-encoded audio signal, the to-be-encoded sequence and the first bitstream comprises: sorting and packing binary numbers corresponding to the first parameter to obtain the first bitstream (Section 4.B.11.5, lines 1-8, "The coded spectrum uses one quantizer per scalefactor band. The step sizes of each of these quantizers is specified as a set of scalefactors and a global gain which normalizes these scalefactors. In order to increase compression, scalefactors associated with scalefactor bands that have only zero-valued coefficients are ignored in the coding process and therefore do not have to be transmitted. Both the global gain and scalefactors are quantized in 1.5 dB steps. The global gain is coded as an 8-bit unsigned integer and the scalefactors are differentially encoded relative to the previous scalefactor (or global gain for the first scalefactor) and then Huffman coded. The dynamic range of the global gain is sufficient to represent full-scale values from a 24-bit PCM audio source."; Section 4.1.1.1, lines 6-10, "The basic structure of the MPEG-4 GA system is shown in Figure 4.1 and Figure 4.2. The data flow in this diagram is from left to right, top to bottom. The functions of the decoder are to find the description of the quantized audio spectra in the bitstream, decode the quantized values and other reconstruction information, reconstruct the quantized spectra, process the reconstructed spectra through whatever tools are active in the bitstream in order to arrive at the actual signal spectra as described by the input bitstream"; Figure 4.1, "Bitstream Formatter"; The global gain reads on a first parameter, coding the global gain as an 8-bit unsigned integer reads on binary numbers corresponding to the first parameter, and a bitstream formatter generating a bitstream that describes the quantized audio spectra reads on sorting and packing binary numbers corresponding to the first parameter to obtain the first bitstream.);
determining, based on the first parameter and sorting of the N second parameters, a first target value and N–1 arrays, the first target value being generated based on the second parameter sorted first and the first parameter, and each of the arrays comprising two adjacent second parameters (Section 4.B.11.5, lines 1-8, "The coded spectrum uses one quantizer per scalefactor band. The step sizes of each of these quantizers is specified as a set of scalefactors and a global gain which normalizes these scalefactors. In order to increase compression, scalefactors associated with scalefactor bands that have only zero-valued coefficients are ignored in the coding process and therefore do not have to be transmitted. Both the global gain and scalefactors are quantized in 1.5 dB steps. The global gain is coded as an 8-bit unsigned integer and the scalefactors are differentially encoded relative to the previous scalefactor (or global gain for the first scalefactor) and then Huffman coded. The dynamic range of the global gain is sufficient to represent full-scale values from a 24-bit PCM audio source."; The global gain reads on a first parameter, the scale factors read on N second parameters, the coded spectrum being divided into scale factor bands reads on sorting of the N second parameters, differentially encoding the first scale factor relative to the global gain reads on the first target value being generated based on the second parameter sorted first and the first parameter, and differentially encoding the scale factors relative to the previous scale factor reads on determining N–1 arrays with each of the arrays comprising two adjacent second parameters, where each scale factor and its previous scale factor read on adjacent second parameters.);
and sorting and packing the first target value and N–1 second target values to obtain the to-be-encoded sequence, each of the second target values being generated based on two adjacent second parameters in a corresponding array, and the first target value being sorted first in the to-be-encoded sequence (Section 4.B.11.5, lines 1-8, "The coded spectrum uses one quantizer per scalefactor band. The step sizes of each of these quantizers is specified as a set of scalefactors and a global gain which normalizes these scalefactors. In order to increase compression, scalefactors associated with scalefactor bands that have only zero-valued coefficients are ignored in the coding process and therefore do not have to be transmitted. Both the global gain and scalefactors are quantized in 1.5 dB steps. The global gain is coded as an 8-bit unsigned integer and the scalefactors are differentially encoded relative to the previous scalefactor (or global gain for the first scalefactor) and then Huffman coded. The dynamic range of the global gain is sufficient to represent full-scale values from a 24-bit PCM audio source."; Section 4.1.1.1, lines 6-10, "The basic structure of the MPEG-4 GA system is shown in Figure 4.1 and Figure 4.2. The data flow in this diagram is from left to right, top to bottom. The functions of the decoder are to find the description of the quantized audio spectra in the bitstream, decode the quantized values and other reconstruction information, reconstruct the quantized spectra, process the reconstructed spectra through whatever tools are active in the bitstream in order to arrive at the actual signal spectra as described by the input bitstream"; Figure 4.1, "Bitstream Formatter"; Differentially encoding the first scale factor relative to the global gain reads on the first target value, where the first target value is sorted first in the to-be-encoded sequence, differentially encoding the scale factors relative to the previous scale factor reads on N–1 second target values, where each of the second target values is generated based on two adjacent second parameters in a corresponding array, and a bitstream formatter generating a bitstream that describes the quantized audio spectra reads on sorting and packing the first target value and N–1 second target values to obtain the to-be-encoded sequence.);
and the encoding, based on the code numbers and preset coding tables corresponding to preset coding orders, the to-be-encoded sequence to obtain a second bitstream (Section 4.6.3.3, lines 16-17, "There are eleven Huffman codebooks for the spectral data, as shown in Table 4.95. The codebooks are shown in Table 4.A.2 through Table 4.A.12.") comprises:
determining K preset coding tables corresponding to the K coding orders, wherein the coding tables and the coding orders are in one-to-one correspondence, and the coding table comprises a mapping relationship between code numbers and code values (Section 4.B.11.6, lines 1-4, "Huffman coding is used to represent n-tuples of quantized coefficients, with the Huffman code drawn from one of 11 codebooks. The spectral coefficients within n-tuples are ordered (low to high) and the n-tuple size is two or four coefficients. The maximum absolute value of the quantized coefficients that can be represented by each Huffman codebook and the number of coefficients in each n-tuple for each codebook is shown in Table 5.26."; The Huffman code drawn from one of 11 codebooks reads on determining K preset coding tables corresponding to the K coding orders, wherein the coding tables and the coding orders are in one-to-one correspondence, and Huffman coding being used to represent n-tuples of quantized coefficients reads on the coding table comprises a mapping relationship between code numbers and code values.);
for any code number, querying the K coding tables to obtain a target code value corresponding to the code number (Section 4.B.11.6, lines 1-4, "Huffman coding is used to represent n-tuples of quantized coefficients, with the Huffman code drawn from one of 11 codebooks. The spectral coefficients within n-tuples are ordered (low to high) and the n-tuple size is two or four coefficients. The maximum absolute value of the quantized coefficients that can be represented by each Huffman codebook and the number of coefficients in each n-tuple for each codebook is shown in Table 5.26."; Huffman coding being used to represent n-tuples of quantized coefficients, where the Huffman code is drawn from one of 11 codebooks, reads on querying the K coding tables to obtain a target code value corresponding to the code number.);
and sorting and packing all the target code values to obtain the first sub-bitstream (Section 4.1.1.1, lines 6-10, "The basic structure of the MPEG-4 GA system is shown in Figure 4.1 and Figure 4.2. The data flow in this diagram is from left to right, top to bottom. The functions of the decoder are to find the description of the quantized audio spectra in the bitstream, decode the quantized values and other reconstruction information, reconstruct the quantized spectra, process the reconstructed spectra through whatever tools are active in the bitstream in order to arrive at the actual signal spectra as described by the input bitstream"; Figure 4.1, "Bitstream Formatter"; A bitstream formatter generating a bitstream that describes the quantized audio spectra reads on sorting and packing all the target code values to obtain the first sub-bitstream.).
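For illustration only, the unsigned-codebook behavior quoted from Section 4.B.11.6 (magnitude is Huffman coded, then a sign bit is appended for each non-zero coefficient) can be sketched as follows. This is a simplified sketch under stated assumptions: the table `TOY_MAGNITUDE_CODES` and the function names are hypothetical and are not taken from ISO/IEC 14496-3, and the sketch codes one value at a time rather than the n-tuples the reference describes. The sign convention ('1' negative, '0' positive) follows the passage quoted from Section 4.6.3.3.

```python
# Hypothetical prefix-free table for magnitudes 0..3 (NOT a table from ISO/IEC 14496-3).
TOY_MAGNITUDE_CODES = {0: "0", 1: "10", 2: "110", 3: "111"}


def encode_unsigned(values):
    """Emit a magnitude codeword per value, followed by a sign bit if the value is non-zero."""
    bits = []
    for v in values:
        bits.append(TOY_MAGNITUDE_CODES[abs(v)])
        if v != 0:
            bits.append("1" if v < 0 else "0")  # '1' = negative, '0' = positive
    return "".join(bits)


def decode_unsigned(bitstring, count):
    """Invert encode_unsigned for `count` values."""
    inverse = {code: mag for mag, code in TOY_MAGNITUDE_CODES.items()}
    values, pos = [], 0
    for _ in range(count):
        code = ""
        while code not in inverse:  # extend until a complete codeword is read
            code += bitstring[pos]
            pos += 1
        value = inverse[code]
        if value != 0:  # a sign bit is present only for non-zero magnitudes
            if bitstring[pos] == "1":
                value = -value
            pos += 1
        values.append(value)
    return values
```

In this sketch the magnitude lookup plays the role the Office Action assigns to the "code number", and the appended sign bits play the role assigned to the comparison against a preset value of zero.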
"ISO/IEC 14496-3" does not specifically disclose: wherein the number of coding orders is K, and in a case that K is greater than 1, the second bitstream comprises a first sub-bitstream and a second sub-bitstream, the second sub-bitstream is an encoded bitstream corresponding to K coding orders; for any code number, querying the K coding tables to obtain a target code value corresponding to the code number, the target code value being a code value with the smallest code length among K code values obtained by querying the K coding tables.
Sugiura teaches:
wherein the number of coding orders is K, and in a case that K is greater than 1, the second bitstream comprises a first sub-bitstream and a second sub-bitstream, the second sub-bitstream is an encoded bitstream corresponding to K coding orders (Column 7, lines 11-28, “For example, in the case of K=2 in FIG. 4, the rule applicable to an inputted integer value and a code corresponding to the integer value is such a rule that when an inputted integer value 0 is encoded by the code tree T(0), ‘1’ is obtained as a code corresponding to the inputted integer value 0, when an inputted integer value 1 is encoded by the code tree T(0), ‘100’ is obtained as a code corresponding to the inputted integer value 1, when an inputted integer value 2 is encoded by the code tree T(0), ‘10000’ is obtained as a code corresponding to the inputted integer value 2, when the inputted integer value 0 is encoded by the code tree T(1), no code corresponding to the inputted integer value 0 is obtained, when the inputted integer value 1 is encoded by the code tree T(1), ‘01’ is obtained as a code corresponding to the inputted integer value 1, and when the inputted integer value 2 is encoded by the code tree T(1), ‘0100’ is obtained as a code corresponding to the inputted integer value 2.”; Column 6, lines 13-23, “A sequence of integer values inputted to the encoding apparatus 100 is inputted to the integer encoding part 110 by N samples (N is a natural number) at a time. The inputted sequence of integer values is assumed to be an integer sequence x_1, x_2, . . . , x_N. The integer encoding part 110 encodes the integer sequence x_1, x_2, . . . , x_N through an encoding process using the following code tree based on an encoding parameter K which is a natural number equal to or larger than 2 inputted by predetermined means (not shown) to obtain a code and outputs the obtained code as an integer code (S110).”; Column 21, lines 51-63, “An integer sequence x_1, x_2, . . . , x_N of the sequence of integer values inputted to the encoding apparatus 400 by N samples at a time is inputted to the parameter determination part 420. Based on the inputted integer sequence x_1, x_2, . . . , x_N, the parameter determination part 420 obtains and outputs a Rice parameter r corresponding to the integer sequence and a parameter code which is a code representing the parameter (S420). The parameter code may be obtained by encoding a Rice parameter so that the decoding apparatus 450 decodes the parameter code to thereby obtain the Rice parameter r determined by the parameter determination part 420.”; The case of K=2 with code tree T(0) and code tree T(1) reads on a case that the number of coding orders K is greater than 1, the integer code reads on a first sub-bitstream and the parameter code reads on a second sub-bitstream, where the second sub-bitstream is an encoded bitstream corresponding to K coding orders.);
for any code number, querying the K coding tables to obtain a target code value corresponding to the code number, the target code value being a code value with the smallest code length among K code values obtained by querying the K coding tables (Column 1, lines 10-13, "The present invention relates to a technique for encoding or decoding a sample sequence composed of integer values such as a sample sequence of voice or acoustic time-series digital signals."; Column 1, lines 49-56, "In the above-described reversible encoding, it is a Golomb-Rice code that has been used as one of simplest variable length codes. When a sequence of integer values belongs to a Laplacian distribution, that is, when an appearance probability of integer values is exponentially lowered with respect to the magnitude of the values, the Golomb-Rice code is known to achieve a minimum expected bit length (minimum bit length)."; Column 21, lines 64-67, "The parameter determination part 420 obtains the Rice parameter r according to equation (5) using, for example, each integer value included in the inputted integer sequence x_1, x_2, . . . , x_N."; Column 22, lines 8-11, "The Rice parameter r determined in equation (5) minimizes an estimate value of a total bit length at the time of Golomb-Rice encoding estimated from equation (2) for the integer sequence x_1, x_2, . . . , x_N."; Figure 1:

[Image: media_image1.png (greyscale), not reproduced]

Determining the Rice parameter r to minimize an estimate value of a total bit length at the time of Golomb-Rice encoding for an integer sequence reads on querying the K coding tables to obtain a target code value corresponding to the code number, the target code value being a code value with the smallest code length among K code values obtained by querying the K coding tables.).
Sugiura is considered to be analogous to the claimed invention because it is in the same field of audio encoding. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified "ISO/IEC 14496-3" to further incorporate the teachings of Sugiura to encode audio, using two code trees, as an integer code and a parameter code, and determine the Rice parameter to minimize an estimate value of a total bit length at the time of Golomb-Rice encoding for an integer sequence.  Doing so would allow for performing encoding and decoding with a small average number of bits (Sugiura; Column 32, lines 62-67).
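For illustration only, the idea relied on from Sugiura (choosing a Rice parameter r that minimizes the total bit length of a Golomb-Rice coded integer sequence) can be sketched as below. This is a hedged sketch, not Sugiura's method: equation (5) of Sugiura is not reproduced in this action, so an exhaustive search over candidate r values is swapped in as a stand-in. The function names, the `max_r` bound, and the unary convention (q ones followed by a terminating zero) are assumptions.

```python
def rice_encode(value, r):
    """Golomb-Rice code of a non-negative integer: unary quotient, then r-bit remainder."""
    quotient = value >> r
    bits = "1" * quotient + "0"  # assumed unary convention: q ones, then a zero
    if r > 0:
        bits += format(value & ((1 << r) - 1), f"0{r}b")
    return bits


def rice_length(value, r):
    """Bit length of rice_encode(value, r) without building the string."""
    return (value >> r) + 1 + r


def best_rice_parameter(values, max_r=15):
    """Exhaustively pick the r in 0..max_r giving the smallest total bit length.

    Brute-force stand-in for a closed-form estimate; ties resolve to the smallest r.
    """
    return min(range(max_r + 1), key=lambda r: sum(rice_length(v, r) for v in values))
```

A decoder would need the chosen r to parse the codes, which corresponds to the parameter code being transmitted alongside the integer code in the cited passage.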
Regarding claim 4, "ISO/IEC 14496-3" in view of Sugiura discloses the method as claimed in claim 1.
"ISO/IEC 14496-3" further discloses:
wherein the determining, based on audio parameters of a to-be-encoded audio signal, a to-be-encoded sequence and a first bitstream further comprises:
determining, based on the first parameter and sorting of the N second parameters, a first target value and N–1 arrays, the first target value being generated based on the second parameter sorted first and the first parameter, and each of the arrays comprising two adjacent second parameters (Section 4.B.11.5, lines 1-8, "The coded spectrum uses one quantizer per scalefactor band. The step sizes of each of these quantizers is specified as a set of scalefactors and a global gain which normalizes these scalefactors. In order to increase compression, scalefactors associated with scalefactor bands that have only zero-valued coefficients are ignored in the coding process and therefore do not have to be transmitted. Both the global gain and scalefactors are quantized in 1.5 dB steps. The global gain is coded as an 8-bit unsigned integer and the scalefactors are differentially encoded relative to the previous scalefactor (or global gain for the first scalefactor) and then Huffman coded. The dynamic range of the global gain is sufficient to represent full-scale values from a 24-bit PCM audio source."; The global gain reads on a first parameter, the scale factors read on N second parameters, the coded spectrum being divided into scale factor bands reads on sorting of the N second parameters, differentially encoding the first scale factor relative to the global gain reads on the first target value being generated based on the second parameter sorted first and the first parameter, and differentially encoding the scale factors relative to the previous scale factor reads on determining N–1 arrays with each of the arrays comprising two adjacent second parameters, where each scale factor and its previous scale factor read on adjacent second parameters.);
and sorting and packing the first target value and N–1 second target values to obtain the to-be-encoded sequence, each of the second target values being generated based on two adjacent second parameters in a corresponding array, and the first target value being sorted first in the to-be-encoded sequence (Section 4.B.11.5, lines 1-8, "The coded spectrum uses one quantizer per scalefactor band. The step sizes of each of these quantizers is specified as a set of scalefactors and a global gain which normalizes these scalefactors. In order to increase compression, scalefactors associated with scalefactor bands that have only zero-valued coefficients are ignored in the coding process and therefore do not have to be transmitted. Both the global gain and scalefactors are quantized in 1.5 dB steps. The global gain is coded as an 8-bit unsigned integer and the scalefactors are differentially encoded relative to the previous scalefactor (or global gain for the first scalefactor) and then Huffman coded. The dynamic range of the global gain is sufficient to represent full-scale values from a 24-bit PCM audio source."; Section 4.1.1.1, lines 6-10, "The basic structure of the MPEG-4 GA system is shown in Figure 4.1 and Figure 4.2. The data flow in this diagram is from left to right, top to bottom. The functions of the decoder are to find the description of the quantized audio spectra in the bitstream, decode the quantized values and other reconstruction information, reconstruct the quantized spectra, process the reconstructed spectra through whatever tools are active in the bitstream in order to arrive at the actual signal spectra as described by the input bitstream"; Figure 4.1, "Bitstream Formatter"; Differentially encoding the first scale factor relative to the global gain reads on the first target value, where the first target value is sorted first in the to-be-encoded sequence, differentially encoding the scale factors relative to the previous scale factor reads on N–1 second target values, where each of the second target values is generated based on two adjacent second parameters in a corresponding array, and a bitstream formatter generating a bitstream that describes the quantized audio spectra reads on sorting and packing the first target value and N–1 second target values to obtain the to-be-encoded sequence.).
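For illustration only, the differential scheme quoted from Section 4.B.11.5 (the first scalefactor coded relative to the global gain, each later scalefactor relative to its predecessor) can be sketched as follows. This is a minimal sketch under stated assumptions: the function names and the sign convention (current value minus previous value) are hypothetical, and quantization and the subsequent Huffman coding step are omitted.

```python
def differential_sequence(global_gain, scalefactors):
    """Return one difference per scalefactor.

    The first difference is taken against the global gain; each later
    difference is taken against the preceding scalefactor.
    """
    previous = global_gain
    diffs = []
    for sf in scalefactors:
        diffs.append(sf - previous)
        previous = sf
    return diffs


def reconstruct_scalefactors(global_gain, diffs):
    """Invert differential_sequence by accumulating differences from the global gain."""
    previous = global_gain
    scalefactors = []
    for d in diffs:
        previous += d
        scalefactors.append(previous)
    return scalefactors
```

With N scalefactors the output has N elements: one derived from the global gain and the first scalefactor, and N–1 derived from adjacent scalefactor pairs, which is the structure the rejection maps to the first target value and the N–1 second target values.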
Regarding claim 5, "ISO/IEC 14496-3" discloses an audio decoding method, comprising:
decoding an audio encoded bitstream corresponding to an audio signal to obtain a first bitstream, a second bitstream, and a third bitstream (Section 4.1.1.1, lines 6-10, "The basic structure of the MPEG-4 GA system is shown in Figure 4.1 and Figure 4.2. The data flow in this diagram is from left to right, top to bottom. The functions of the decoder are to find the description of the quantized audio spectra in the bitstream, decode the quantized values and other reconstruction information, reconstruct the quantized spectra, process the reconstructed spectra through whatever tools are active in the bitstream in order to arrive at the actual signal spectra as described by the input bitstream"; Figure 4.2, "Bitstream deformatter"; A bitstream deformatter decoding the quantized values and other reconstruction information reads on decoding an audio encoded bitstream corresponding to an audio signal to obtain a first bitstream, a second bitstream, and a third bitstream.),
wherein audio parameters of the audio signal comprise a first parameter and N second parameters, N being a positive integer, and the first bitstream is obtained by encoding based on the first parameter (Section 4.B.11.5, lines 1-8, "The coded spectrum uses one quantizer per scalefactor band. The step sizes of each of these quantizers is specified as a set of scalefactors and a global gain which normalizes these scalefactors. In order to increase compression, scalefactors associated with scalefactor bands that have only zero-valued coefficients are ignored in the coding process and therefore do not have to be transmitted. Both the global gain and scalefactors are quantized in 1.5 dB steps. The global gain is coded as an 8-bit unsigned integer and the scalefactors are differentially encoded relative to the previous scalefactor (or global gain for the first scalefactor) and then Huffman coded. The dynamic range of the global gain is sufficient to represent full-scale values from a 24-bit PCM audio source."; The global gain reads on a first parameter, the scale factors read on N second parameters where N is a positive integer, and coding the global gain as an 8-bit unsigned integer reads on obtaining the first bitstream by encoding based on the first parameter.);
determining a value corresponding to the first bitstream as the first parameter (Section 4.6.2.3.2, lines 2-3, "The start value is given explicitly as a 8 bit PCM in the bitstream element global_gain."; Section 4.B.11.5, lines 1-8, "The coded spectrum uses one quantizer per scalefactor band. The step sizes of each of these quantizers is specified as a set of scalefactors and a global gain which normalizes these scalefactors. In order to increase compression, scalefactors associated with scalefactor bands that have only zero-valued coefficients are ignored in the coding process and therefore do not have to be transmitted. Both the global gain and scalefactors are quantized in 1.5 dB steps. The global gain is coded as an 8-bit unsigned integer and the scalefactors are differentially encoded relative to the previous scalefactor (or global gain for the first scalefactor) and then Huffman coded. The dynamic range of the global gain is sufficient to represent full-scale values from a 24-bit PCM audio source."; The global gain reads on a first parameter, and the global gain coded as an 8-bit unsigned integer reads on determining a value corresponding to the first bitstream as the first parameter.);
decoding, based on preset coding tables corresponding to preset coding orders, each code value in the second bitstream to obtain a code number corresponding to each element in a to-be-encoded sequence (Section 4.1.1.1, lines 6-10, "The basic structure of the MPEG-4 GA system is shown in Figure 4.1 and Figure 4.2. The data flow in this diagram is from left to right, top to bottom. The functions of the decoder are to find the description of the quantized audio spectra in the bitstream, decode the quantized values and other reconstruction information, reconstruct the quantized spectra, process the reconstructed spectra through whatever tools are active in the bitstream in order to arrive at the actual signal spectra as described by the input bitstream"; Figure 4.2, "Bitstream deformatter"; Section 4.6.2.3.2, lines 1-8, "For all scalefactors the difference to the preceeding value is coded using the Huffman code book given in Table 4.A.1. See subclause 4.6.3 for a detailed description of the Huffman decoding process. The start value is given explicitly as a 8 bit PCM in the bitstream element global_gain. A scalefactor is not transmitted for scalefactor bands which are coded with the Huffman codebook ZERO_HCB. If the Huffman codebook for a scalefactor band is coded with INTENSITY_HCB or INTENSITY_HCB2, the scalefactor is used for intensity stereo (see subclauses 4.6.3 and 4.6.8.2). In that case a normal scalefactor does not exist (but is initialized to zero to have an valid entry in the array).  The following pseudo code describes how to decode the scalefactors sf[g][sfb]"; Decoding the scale factors with the Huffman codebook reads on decoding each code value in the second bitstream to obtain a code number corresponding to each element in a to-be-encoded sequence based on preset coding tables corresponding to preset coding orders.),
the to-be-encoded sequence being obtained by encoding based on the first parameter and the N second parameters (Section 4.B.11.5, lines 1-8, "The coded spectrum uses one quantizer per scalefactor band. The step sizes of each of these quantizers is specified as a set of scalefactors and a global gain which normalizes these scalefactors. In order to increase compression, scalefactors associated with scalefactor bands that have only zero-valued coefficients are ignored in the coding process and therefore do not have to be transmitted. Both the global gain and scalefactors are quantized in 1.5 dB steps. The global gain is coded as an 8-bit unsigned integer and the scalefactors are differentially encoded relative to the previous scalefactor (or global gain for the first scalefactor) and then Huffman coded. The dynamic range of the global gain is sufficient to represent full-scale values from a 24-bit PCM audio source."; The global gain reads on a first parameter, the scale factors read on N second parameters, and differentially encoding the scale factors relative to the previous scale factor or the global gain reads on obtaining the to-be-encoded sequence by encoding based on the first parameter and the N second parameters.),
wherein the code number is an absolute value of a value corresponding to the element in the to-be-encoded sequence (Section 4.B.11.6, lines 1-8, "Huffman coding is used to represent n-tuples of quantized coefficients, with the Huffman code drawn from one of 11 codebooks. The spectral coefficients within n-tuples are ordered (low to high) and the n-tuple size is two or four coefficients. The maximum absolute value of the quantized coefficients that can be represented by each Huffman codebook and the number of coefficients in each n-tuple for each codebook is shown in Table 5.26.  There are two codebooks for each maximum absolute value, with each representing a distinct probability distribution function. The best fit is always chosen. In order to save on codebook storage (an important consideration in a mass-produced decoder), most codebooks represent unsigned values. For these codebooks the magnitude of the coefficients is Huffman coded and the sign bit of each non-zero coefficient is appended to the codeword."; Huffman coding the magnitude of the coefficients reads on the code number being an absolute value of a value corresponding to the element in the to-be-encoded sequence.);
decoding the third bitstream to obtain a magnitude relationship between each element in the to-be-encoded sequence and a first preset value (Section 4.1.1.1, lines 6-10, "The basic structure of the MPEG-4 GA system is shown in Figure 4.1 and Figure 4.2. The data flow in this diagram is from left to right, top to bottom. The functions of the decoder are to find the description of the quantized audio spectra in the bitstream, decode the quantized values and other reconstruction information, reconstruct the quantized spectra, process the reconstructed spectra through whatever tools are active in the bitstream in order to arrive at the actual signal spectra as described by the input bitstream"; Figure 4.2, "Bitstream deformatter"; Section 4.6.3.1, lines 18-20, "As indicated in Table 4.95, spectrum Huffman codebooks can represent signed or unsigned n-tuples of coefficients. For unsigned codebooks, sign bits for every non-zero coefficient in the n-tuple immediately follow the associated codeword."; Section 4.6.3.3, lines 60-63, "If the Huffman codebook represents signed values, the decoding of the quantized spectral n-tuple is complete after Huffman decoding and translation of codeword index to quantized spectral coefficients. If the codebook represents unsigned values then the sign bits associated with non-zero coefficients immediately follow the Huffman codeword, with a ‘1’ indicating a negative coefficient and a ‘0’ indicating a positive one."; Huffman decoding and translation of codewords to quantized spectral coefficients, where the sign bits associated with non-zero coefficients immediately follow the Huffman codeword, reads on decoding the third bitstream to obtain a magnitude relationship between each element in the to-be-encoded sequence and a first preset value, for a preset value of zero.);
determining the to-be-encoded sequence based on the code numbers corresponding to elements in the to-be-encoded sequence and the magnitude relationship between each element in the to-be-encoded sequence and the first preset value (Section 4.6.3.1, lines 18-20, "As indicated in Table 4.95, spectrum Huffman codebooks can represent signed or unsigned n-tuples of coefficients. For unsigned codebooks, sign bits for every non-zero coefficient in the n-tuple immediately follow the associated codeword."; Section 4.6.3.3, lines 60-63, "If the Huffman codebook represents signed values, the decoding of the quantized spectral n-tuple is complete after Huffman decoding and translation of codeword index to quantized spectral coefficients. If the codebook represents unsigned values then the sign bits associated with non-zero coefficients immediately follow the Huffman codeword, with a ‘1’ indicating a negative coefficient and a ‘0’ indicating a positive one."; Huffman decoding and translation of codewords to quantized spectral coefficients, where the sign bits associated with non-zero coefficients immediately follow the Huffman codeword and a ‘1’ indicates a negative coefficient and a ‘0’ indicates a positive coefficient, reads on determining the to-be-encoded sequence based on the code numbers corresponding to elements in the to-be-encoded sequence and the magnitude relationship between each element in the to-be-encoded sequence and the first preset value.);
and decoding the to-be-encoded sequence based on the first parameter to obtain the N second parameters (Section 4.6.2.3.2, lines 1-8, "For all scalefactors the difference to the preceeding value is coded using the Huffman code book given in Table 4.A.1. See subclause 4.6.3 for a detailed description of the Huffman decoding process. The start value is given explicitly as a 8 bit PCM in the bitstream element global_gain. A scalefactor is not transmitted for scalefactor bands which are coded with the Huffman codebook ZERO_HCB. If the Huffman codebook for a scalefactor band is coded with INTENSITY_HCB or INTENSITY_HCB2, the scalefactor is used for intensity stereo (see subclauses 4.6.3 and 4.6.8.2). In that case a normal scalefactor does not exist (but is initialized to zero to have an valid entry in the array).  The following pseudo code describes how to decode the scalefactors sf[g][sfb]"; Section 4.B.11.5, lines 1-8, "The coded spectrum uses one quantizer per scalefactor band. The step sizes of each of these quantizers is specified as a set of scalefactors and a global gain which normalizes these scalefactors. In order to increase compression, scalefactors associated with scalefactor bands that have only zero-valued coefficients are ignored in the coding process and therefore do not have to be transmitted. Both the global gain and scalefactors are quantized in 1.5 dB steps. The global gain is coded as an 8-bit unsigned integer and the scalefactors are differentially encoded relative to the previous scalefactor (or global gain for the first scalefactor) and then Huffman coded. The dynamic range of the global gain is sufficient to represent full-scale values from a 24-bit PCM audio source."; The global gain reads on a first parameter, the scale factors read on N second parameters, and decoding the scale factors, where the scale factors are differentially encoded relative to the previous scale factor or the global gain, reads on decoding the to-be-encoded sequence based on the first parameter to obtain the N second parameters.);
and the decoding, based on preset coding tables corresponding to preset coding orders, each code value in the second bitstream to obtain a code number corresponding to each element in a to-be-encoded sequence (Section 4.6.3.3, lines 16-17, "There are eleven Huffman codebooks for the spectral data, as shown in Table 4.95. The codebooks are shown in Table 4.A.2 through Table 4.A.12.") comprises:
determining K preset coding tables corresponding to the K coding orders, wherein the coding tables and the coding orders are in one-to-one correspondence, and the coding table comprises a mapping relationship between code numbers and code values (Section 4.B.11.6, lines 1-4, "Huffman coding is used to represent n-tuples of quantized coefficients, with the Huffman code drawn from one of 11 codebooks. The spectral coefficients within n-tuples are ordered (low to high) and the n-tuple size is two or four coefficients. The maximum absolute value of the quantized coefficients that can be represented by each Huffman codebook and the number of coefficients in each n-tuple for each codebook is shown in Table 5.26."; The Huffman code drawn from one of 11 codebooks reads on determining K preset coding tables corresponding to the K coding orders, wherein the coding tables and the coding orders are in one-to-one correspondence, and Huffman coding being used to represent n-tuples of quantized coefficients reads on the coding table comprises a mapping relationship between code numbers and code values.);
and for any code value in the first sub-bitstream, determining a code number corresponding to the code value obtained by querying the K coding tables as a code number for an element corresponding to the code value in the to-be-encoded sequence (Section 4.B.11.6, lines 1-4, "Huffman coding is used to represent n-tuples of quantized coefficients, with the Huffman code drawn from one of 11 codebooks. The spectral coefficients within n-tuples are ordered (low to high) and the n-tuple size is two or four coefficients. The maximum absolute value of the quantized coefficients that can be represented by each Huffman codebook and the number of coefficients in each n-tuple for each codebook is shown in Table 5.26."; Huffman coding being used to represent n-tuples of quantized coefficients, where the Huffman code is drawn from one of 11 codebooks, reads on querying the K coding tables to obtain a code number corresponding to the code value.).
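For orientation only, the two decoding behaviors quoted above from "ISO/IEC 14496-3" can be sketched as follows: scalefactors rebuilt from differences starting at the global gain, and signs attached to unsigned magnitudes from trailing sign bits ('1' negative, '0' positive, no bit for a zero coefficient). This is a minimal sketch under stated assumptions, not the standard's pseudo code. The function names decode_scalefactors and apply_sign_bits are hypothetical, and the differences and magnitudes are assumed to have already been Huffman decoded.

```python
from typing import Iterable, Iterator, List


def decode_scalefactors(global_gain: int, deltas: Iterable[int]) -> List[int]:
    """Rebuild scalefactors from differences (hypothetical helper).

    The first difference is taken relative to global_gain, and every
    later difference relative to the previously rebuilt scalefactor.
    """
    scalefactors: List[int] = []
    previous = global_gain
    for delta in deltas:
        previous += delta
        scalefactors.append(previous)
    return scalefactors


def apply_sign_bits(magnitudes: Iterable[int], sign_bits: Iterator[int]) -> List[int]:
    """Attach signs to unsigned magnitudes (hypothetical helper).

    A sign bit is consumed only for non-zero magnitudes:
    1 marks a negative coefficient, 0 a positive one.
    """
    coefficients: List[int] = []
    for magnitude in magnitudes:
        if magnitude == 0:
            coefficients.append(0)  # zero carries no sign bit
        else:
            bit = next(sign_bits)
            coefficients.append(-magnitude if bit == 1 else magnitude)
    return coefficients


if __name__ == "__main__":
    print(decode_scalefactors(100, [2, -1, 0, 3]))
    print(apply_sign_bits([3, 0, 1, 2], iter([1, 0, 1])))
```

In the terms of the mapping above, global_gain plays the role of the first parameter, the rebuilt list plays the role of the N second parameters, and the sign bit expresses each element's relationship to a preset value of zero.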
"ISO/IEC 14496-3" does not specifically disclose: wherein the number of coding orders is K, and in a case that K is greater than 1, the second bitstream comprises a first sub-bitstream and a second sub-bitstream; the second sub-bitstream is an encoded bitstream corresponding to K coding orders.
Sugiura teaches:
wherein the number of coding orders is K, and in a case that K is greater than 1, the second bitstream comprises a first sub-bitstream and a second sub-bitstream; the second sub-bitstream is an encoded bitstream corresponding to K coding orders (Column 7, lines 11-28, “For example, in the case of K=2 in FIG. 4, the rule applicable to an inputted integer value and a code corresponding to the integer value is such a rule that when an inputted integer value 0 is encoded by the code tree T(0), ‘1’ is obtained as a code corresponding to the inputted integer value 0, when an inputted integer value 1 is encoded by the code tree T(0), ‘100’ is obtained as a code corresponding to the inputted integer value 1, when an inputted integer value 2 is encoded by the code tree T(0), ‘10000’ is obtained as a code corresponding to the inputted integer value 2, when the inputted integer value 0 is encoded by the code tree T(1), no code corresponding to the inputted integer value 0 is obtained, when the inputted integer value 1 is encoded by the code tree T(1), ‘01’ is obtained as a code corresponding to the inputted integer value 1, and when the inputted integer value 2 is encoded by the code tree T(1), ‘0100’ is obtained as a code corresponding to the inputted integer value 2.”; Column 6, lines 13-23, “A sequence of integer values inputted to the encoding apparatus 100 is inputted to the integer encoding part 110 by N samples (N is a natural number) at a time. The inputted sequence of integer values is assumed to be an integer sequence x_1, x_2, . . . , x_N. The integer encoding part 110 encodes the integer sequence x_1, x_2, . . . , x_N through an encoding process using the following code tree based on an encoding parameter K which is a natural number equal to or larger than 2 inputted by predetermined means (not shown) to obtain a code and outputs the obtained code as an integer code (S110).”; Column 21, lines 51-63, “An integer sequence x_1, x_2, . . . , x_N of the sequence of integer values inputted to the encoding apparatus 400 by N samples at a time is inputted to the parameter determination part 420. Based on the inputted integer sequence x_1, x_2, . . . , x_N, the parameter determination part 420 obtains and outputs a Rice parameter r corresponding to the integer sequence and a parameter code which is a code representing the parameter (S420). The parameter code may be obtained by encoding a Rice parameter so that the decoding apparatus 450 decodes the parameter code to thereby obtain the Rice parameter r determined by the parameter determination part 420.”; The case of K=2 with code tree T(0) and code tree T(1) reads on a case that the number of coding orders K is greater than 1, the integer code reads on a first sub-bitstream and the parameter code reads on a second sub-bitstream, where the second sub-bitstream is an encoded bitstream corresponding to K coding orders.).
Sugiura is considered to be analogous to the claimed invention because it is in the same field of audio encoding. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified "ISO/IEC 14496-3" to further incorporate the teachings of Sugiura to encode audio, using two code trees, as an integer code and a parameter code.  Doing so would allow for performing encoding and decoding with a small average number of bits (Sugiura; Column 32, lines 62-67).
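As an illustration of the K=2 example quoted from Sugiura (Column 7, lines 11-28), the integer-to-code rule for code trees T(0) and T(1) can be written as two lookup tables, together with the reverse query from a code value back to a code number. This is a sketch limited to the three integer values given in the quoted passage. The names CODE_TREES, encode_value, and lookup_code are hypothetical, and the sketch does not model how Sugiura selects between the trees or how the parameter code is formed.

```python
from typing import Dict, Optional, Tuple

# Entries copied from the quoted K=2 example; only integer values 0, 1, 2
# appear in that passage, so nothing beyond them is assumed here.
CODE_TREES: Dict[int, Dict[int, str]] = {
    0: {0: "1", 1: "100", 2: "10000"},  # code tree T(0)
    1: {1: "01", 2: "0100"},            # code tree T(1): no code for value 0
}

# Reverse table: code value -> (tree index, integer value).
REVERSE_CODES: Dict[str, Tuple[int, int]] = {
    code: (tree_index, value)
    for tree_index, table in CODE_TREES.items()
    for value, code in table.items()
}


def encode_value(tree_index: int, value: int) -> Optional[str]:
    """Return the quoted code for value under tree T(tree_index), or None."""
    return CODE_TREES[tree_index].get(value)


def lookup_code(code: str) -> Optional[Tuple[int, int]]:
    """Query the tables for a code value; returns (tree index, integer value)."""
    return REVERSE_CODES.get(code)


if __name__ == "__main__":
    print(encode_value(0, 2))
    print(encode_value(1, 0))
    print(lookup_code("01"))
```

The reverse lookup corresponds loosely to the claim language of querying the K coding tables for a code value. Whether that correspondence holds for the claims as amended is a question for the prosecution record, not something the sketch establishes.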
Regarding claim 8, arguments analogous to claim 1 are applicable.  In addition, "ISO/IEC 14496-3" discloses an electronic device, comprising a processor, a memory, and a program or instructions stored in the memory and capable of running on the processor, wherein when the program or instructions are executed by the processor (Section 5.6.8.6.17, lines 2-4, “If the instrument instantiation is running entirely on one CPU, then that CPU shall be measured; if the instrument instantiation is running on multiple CPUs, then the exact measurement procedure is nonnormative.”; Section 5.3.79, “A unit of memory, labelled with a name, that holds intermediate processing results.”), the steps of claim 1 are implemented.
Regarding claim 11, arguments analogous to claim 4 are applicable.
Regarding claim 12, arguments analogous to claim 5 are applicable.  In addition, "ISO/IEC 14496-3" discloses an electronic device, comprising a processor, a memory, and a program or instructions stored in the memory and capable of running on the processor, wherein when the program or instructions are executed by the processor (Section 5.6.8.6.17, lines 2-4, “If the instrument instantiation is running entirely on one CPU, then that CPU shall be measured; if the instrument instantiation is running on multiple CPUs, then the exact measurement procedure is nonnormative.”; Section 5.3.79, “A unit of memory, labelled with a name, that holds intermediate processing results.”), the steps of the method according to claim 5 are implemented.
Regarding claim 15, arguments analogous to claim 1 are applicable.  In addition, "ISO/IEC 14496-3" discloses a non-transitory readable storage medium, wherein the non-transitory readable storage medium stores a program or instructions, and when the program or instructions are executed by a processor (Section 5.6.8.6.17, lines 2-4, “If the instrument instantiation is running entirely on one CPU, then that CPU shall be measured; if the instrument instantiation is running on multiple CPUs, then the exact measurement procedure is nonnormative.”; Section 5.3.79, “A unit of memory, labelled with a name, that holds intermediate processing results.”), the steps of claim 1 are implemented.
Regarding claim 18, arguments analogous to claim 4 are applicable.
Regarding claim 19, arguments analogous to claim 5 are applicable.  In addition, "ISO/IEC 14496-3" discloses a non-transitory readable storage medium, wherein the non-transitory readable storage medium stores a program or instructions, and when the program or instructions are executed by a processor (Section 5.6.8.6.17, lines 2-4, “If the instrument instantiation is running entirely on one CPU, then that CPU shall be measured; if the instrument instantiation is running on multiple CPUs, then the exact measurement procedure is nonnormative.”; Section 5.3.79, “A unit of memory, labelled with a name, that holds intermediate processing results.”), the steps of the method according to claim 5 are implemented.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to James Boggs whose telephone number is (571)272-2968. The examiner can normally be reached M-F 8:00 AM - 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn, can be reached at (571)272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/JAMES BOGGS/Examiner, Art Unit 2657
Prosecution Timeline

Jul 13, 2023: Application Filed
Jun 16, 2025: Non-Final Rejection mailed (§103, §112)
Sep 16, 2025: Response Filed
Nov 10, 2025: Final Rejection mailed (§103, §112)
Jan 12, 2026: Response after Non-Final Action
Feb 06, 2026: Request for Continued Examination
Feb 17, 2026: Response after Non-Final Action
Mar 26, 2026: Non-Final Rejection mailed (§103, §112) (current)
Precedent Cases

Applications granted by this same examiner with similar technology

18/041,710, Patent 12620399: VOICE PROCESSING METHOD AND APPARATUS, ELECTRONIC DEVICE, AND COMPUTER READABLE MEDIUM (3y 2m to grant; granted May 05, 2026)
18/163,848, Patent 12586600: Streaming Vocoder (3y 1m to grant; granted Mar 24, 2026)
17/977,443, Patent 12573406: VOICE AUTHENTICATION BASED ON ACOUSTIC AND LINGUISTIC MACHINE LEARNING MODELS (3y 4m to grant; granted Mar 10, 2026)
18/314,249, Patent 12572752: DYNAMIC CONTENT GENERATION METHOD (2y 10m to grant; granted Mar 10, 2026)
18/483,896, Patent 12562170: BIOMETRIC AUTHENTICATION DEVICE, BIOMETRIC AUTHENTICATION METHOD, AND RECORDING MEDIUM (2y 4m to grant; granted Feb 24, 2026)
Study what changed to get past this examiner. Based on 5 most recent grants.
Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 61%
With Interview (+35.9%): 97%
Median Time to Grant: 3y 2m (~4m remaining)
PTA Risk: High
Based on 112 resolved cases by this examiner. Grant probability derived from career allowance rate.