Last updated: May 29, 2026
Application No. 18/538,006
CONTENT-BASED SWITCHABLE AUDIO CODEC

Final Rejection §103
Filed: Dec 13, 2023
Examiner: PASHA, ATHAR N
Art Unit: 2657
Tech Center: 2600 (Communications)
Assignee: Qualcomm Incorporated
OA Round: 2 (Final)
Interview Optional

An interview has already been conducted in this application's prosecution history. This examiner has a 90% grant rate with a +16.4% interview lift. Since an interview has already been tried, a written response with narrowed claims, based on precedent claim evolution patterns, is recommended.
Based on 156 resolved cases, 2023–2026
Examiner Intelligence

PASHA, ATHAR N
Career Allowance Rate: 90% (above average); 140 granted / 156 resolved; +27.7% vs TC avg
Interview Lift: +16.4% (strong), comparing resolved cases with and without an interview
Avg Prosecution (typical timeline): 2y 6m; 13 currently pending
Career history: 174 total applications across all art units
Statute-Specific Performance

§101: 4.3% (-35.7% vs TC avg)
§103: 89.8% (+49.8% vs TC avg)
§102: 3.0% (-37.0% vs TC avg)
§112: 1.3% (-38.7% vs TC avg)
Based on career data from 156 resolved cases
Office Action

§103
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
Amendments/Arguments received on 4/7/26 have been entered.
In light of the Amendments, the examiner rejects the Application with new references rendering the arguments moot.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1-3, 5, 14-19, 20, 21, 27 and 30 are rejected under 35 U.S.C. 103 as being unpatentable over Krishnan (US 20120065980 A1) in further view of Atti (US 20150106106 A1) and Lee (US 20190164052 A1).
With respect to claim 1, Krishnan teaches a device comprising:
[[a machine-learning audio encoder; 
a waveform-matching audio encoder; ]]
and a controller configured to cause a segment of audio data to be input to the machine-learning audio encoder, to the waveform-matching audio encoder, or to both, based on a classification associated with the segment (Krishnan ¶[0053] For example, if the frame type 126 indicates [indicator] that the frame 110 is transient 106 [non-target audio], then the encoder selection block/module 130 may provide the transient frame 134 to the transient encoder 104 [waveform-matching encoder]. However, if the frame type 126 indicates that the frame 110 is another kind of frame 136 [target audio] that is not transient (e.g., voiced, unvoiced, silent, etc.), then the encoder selection block/module 130 may provide the other frame 136 to another encoder 140 [machine-learning encoder]);
Krishnan does not explicitly disclose, however Atti teaches, a waveform-matching audio encoder (Atti ¶[0005] Various encoding schemes may be used when communicating audio data. For example, depending on the audio frame type, a code-excited linear prediction (CELP) approach or a frequency-domain based modified discrete cosine transform (MDCT) can be used to compactly represent the speech and audio. In order to improve coding efficiency at low bit rates, (e.g., 13.2 kilobits per second (kbps), 24.4 kbps, etc.) when encoding larger bandwidths, e.g., up to 8 kilohertz (kHz) wideband (WB), 16 kHz super-wideband (SWB), or 20 kHz full-band, the lower band core (e.g., up to 6.4 kHz or up to 8 kHz) is typically encoded using waveform-matching coding techniques such as CELP or MDCT.)
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the switching encoder of Krishnan to include the waveform encoder of Atti in order to reconstruct high quality audio ([0005], Atti).
Neither Krishnan nor Atti explicitly discloses, however Lee teaches, a machine-learning audio encoder (Lee ¶[0003] Recently, machine learning has been applied to various fields, and such attempts are also considered in a field of audio signal processing. A machine learning model such as a deep neural network (DNN) may improve the efficiency of coding audio signals).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the switching encoder of Krishnan in view of the waveform encoder of Atti to include the machine learning encoder of Lee in order to improve efficiency of coding audio ([0005], Lee).
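For orientation, the routing limitation quoted for claim 1 (a segment goes to the machine-learning encoder, the waveform-matching encoder, or both, based on its classification) can be pictured with a minimal Python sketch. This is a hypothetical illustration only: it is not the applicant's implementation and is not taken from Krishnan, Atti, or Lee. The names `Classification`, `route_segment`, and the two stub encoders are assumptions introduced here.

```python
from enum import Enum
from typing import Callable, Dict, Sequence


class Classification(Enum):
    """Hypothetical labels; the Office Action only maps frames to target / non-target audio."""
    TARGET = "target"          # e.g. speech
    NON_TARGET = "non_target"  # e.g. transient or other non-speech sound
    BOTH = "both"              # assumed label for segments sent to both encoders


Encoder = Callable[[Sequence[float]], bytes]


def ml_encoder_stub(segment: Sequence[float]) -> bytes:
    """Placeholder standing in for a machine-learning audio encoder."""
    return b"ML:%d" % len(segment)


def waveform_encoder_stub(segment: Sequence[float]) -> bytes:
    """Placeholder standing in for a waveform-matching (e.g. CELP/MDCT) encoder."""
    return b"WM:%d" % len(segment)


def route_segment(
    segment: Sequence[float],
    classification: Classification,
    ml_encoder: Encoder = ml_encoder_stub,
    waveform_encoder: Encoder = waveform_encoder_stub,
) -> Dict[str, bytes]:
    """Send the segment to one encoder or to both, based on its classification."""
    outputs: Dict[str, bytes] = {}
    if classification in (Classification.TARGET, Classification.BOTH):
        outputs["ml"] = ml_encoder(segment)
    if classification in (Classification.NON_TARGET, Classification.BOTH):
        outputs["waveform"] = waveform_encoder(segment)
    return outputs
```

In this sketch the controller is a plain function; whether the cited references suggest such an arrangement is the question the rejection addresses, and the code takes no position on it.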

With respect to claim 2, Krishnan teaches further comprising an audio classifier configured to generate an indicator of the classification based on whether the segment represents audio content of a particular type and configured to provide the indicator to the controller (Krishnan ¶[0053] For example, if the frame type 126 indicates [indicator] that the frame 110 is transient 106 [non-target audio], then the encoder selection block/module 130 may provide the transient frame 134 to the transient encoder 104 [waveform-matching encoder]. However, if the frame type 126 indicates that the frame 110 is another kind of frame 136 [target audio] that is not transient (e.g., voiced, unvoiced, silent, etc.), then the encoder selection block/module 130 may provide the other frame 136 to another encoder 140 [machine-learning encoder]);



With respect to claims 3 and 21, Krishnan teaches further comprising a modem coupled to the machine-learning audio encoder and the waveform-matching audio encoder and configured to represent, in a bitstream, an output of the machine-learning audio encoder, an output of the waveform-matching audio encoder, or both (Krishnan ¶[0062] In a case where the current frame 110 is not a transient frame 134, but is some other kind of frame 136, another encoder 140 (e.g., silence encoder, quarter-rate prototype pitch period (QPPP) encoder, noise excited linear prediction (NELP) encoder, etc.) may be used to encode the frame 136. The other encoder 140 may produce an encoded non-transient speech signal 178).

With respect to claim 5, Krishnan teaches wherein the controller is configured to select the machine-learning audio encoder to process a first set of segments that represent speech based on an indication that the first set of segments include a target audio type and to select the waveform-matching audio encoder to process a second set of segments that represent non-speech sounds based on an indication that the second set of segments include one or more non-target audio types, wherein the target audio type includes speech (Krishnan ¶[0053] For example, if the frame type 126 indicates [indicator] that the frame 110 is transient 106 [non-target audio], then the encoder selection block/module 130 may provide the transient frame 134 to the transient encoder 104 [waveform-matching encoder]. However, if the frame type 126 indicates that the frame 110 is another kind of frame 136 [target audio] that is not transient (e.g., voiced, unvoiced, silent, etc.), then the encoder selection block/module 130 may provide the other frame 136 to another encoder 140 [machine-learning encoder]);


With respect to claim 15 Krishnan teaches wherein the controller is integrated into one or more processors (Krishnan ¶[0006] An electronic device for coding a transient frame is disclosed. The electronic device includes a processor and executable instructions stored in memory that is in electronic communication with the processor.) 
 
With respect to claim 16 Krishnan teaches wherein the controller is integrated into processing circuitry processors (Krishnan ¶[0006] An electronic device for coding a transient frame is disclosed. The electronic device includes a processor and executable instructions stored in memory that is in electronic communication with the processor.) 

With respect to claim 17, Krishnan teaches wherein the machine-learning audio encoder, the waveform-matching audio encoder, or both, are integrated into a processor (Krishnan ¶[0067] For instance, electronic device A 102 may be a digital voice recorder that encodes and stores speech signals 106 in memory, which may then be decoded to produce a decoded speech signal 164. Examiner Note: digital signals require a processor to get data to/from memory)

With respect to claim 18 Atti further teaches wherein the controller, the machine-learning audio encoder, and the waveform-matching audio encoder are integrated in at least one of a mobile phone, a tablet computer device, a wearable electronic device, a camera device, a virtual reality headset, a mixed reality headset, or an augmented reality headset (Atti¶[0105] By using the adjustment parameter to determine the selection, audio frames may be classified as speech frames or music frames and a number of misclassified speech frames may be reduced as compared to conventional classification techniques. Based on the classified audio frames, an encoder (e.g., a speech encoder or a non-speech encoder) may be selected to encode the audio frame. By using the selected encoder to encode the speech frames, artifacts and poor signal quality that result from misclassification of audio frames and from using the wrong encoder to encode the audio frames may be reduced, ¶[0114] The device 700 may include a communication device, an encoder, a decoder, a smart phone, a cellular phone, a mobile communication device, a laptop computer, a computer, a tablet, a personal digital assistant (PDA), a set top box, a video player, an entertainment unit, a display device, a television, a gaming console, a music player, a radio, a digital video player, a digital video disc (DVD) player, a tuner, a camera, a navigation device, a decoder system, an encoder system, a base station, a vehicle, or a combination thereof.)

With respect to claim 19 Atti further teaches the machine-learning audio encoder, and the waveform-matching audio encoder are integrated in a vehicle (Atti¶[0105] By using the adjustment parameter to determine the selection, audio frames may be classified as speech frames or music frames and a number of misclassified speech frames may be reduced as compared to conventional classification techniques. Based on the classified audio frames, an encoder (e.g., a speech encoder or a non-speech encoder) may be selected to encode the audio frame. By using the selected encoder to encode the speech frames, artifacts and poor signal quality that result from misclassification of audio frames and from using the wrong encoder to encode the audio frames may be reduced. ¶[0114] The device 700 may include a communication device, an encoder, a decoder, a smart phone, a cellular phone, a mobile communication device, a laptop computer, a computer, a tablet, a personal digital assistant (PDA), a set top box, a video player, an entertainment unit, a display device, a television, a gaming console, a music player, a radio, a digital video player, a digital video disc (DVD) player, a tuner, a camera, a navigation device, a decoder system, an encoder system, a base station, a vehicle, or a combination thereof.)

With respect to claims 20 and 27, Krishnan teaches obtaining, by one or more processors ([0006] An electronic device for coding a transient frame is disclosed. The electronic device includes a processor and executable instructions stored in memory that is in electronic communication with the processor.), an indication of a type of audio content associated with a segment of audio data; and selectively, based on the indication, causing the segment to be sent as input to a machine-learning audio encoder, a waveform-matching audio encoder, or both, wherein the machine-learning audio encoder is separate from the waveform-matching audio encoder (Krishnan ¶[0053] For example, if the frame type 126 indicates [indicator] that the frame 110 is transient 106 [non-target audio], then the encoder selection block/module 130 may provide the transient frame 134 to the transient encoder 104 [waveform-matching encoder]. However, if the frame type 126 indicates that the frame 110 is another kind of frame 136 [target audio] that is not transient (e.g., voiced, unvoiced, silent, etc.), then the encoder selection block/module 130 may provide the other frame 136 to another encoder 140 [machine-learning encoder]);
Krishnan does not explicitly disclose, however Atti teaches, a waveform-matching audio encoder (Atti ¶[0005] Various encoding schemes may be used when communicating audio data. For example, depending on the audio frame type, a code-excited linear prediction (CELP) approach or a frequency-domain based modified discrete cosine transform (MDCT) can be used to compactly represent the speech and audio. In order to improve coding efficiency at low bit rates, (e.g., 13.2 kilobits per second (kbps), 24.4 kbps, etc.) when encoding larger bandwidths, e.g., up to 8 kilohertz (kHz) wideband (WB), 16 kHz super-wideband (SWB), or 20 kHz full-band, the lower band core (e.g., up to 6.4 kHz or up to 8 kHz) is typically encoded using waveform-matching coding techniques such as CELP or MDCT.)
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the switching encoder of Krishnan to include the waveform encoder of Atti in order to reconstruct high quality audio ([0005], Atti).
Neither Krishnan nor Atti explicitly discloses, however Lee teaches, a machine-learning audio encoder (Lee ¶[0003] Recently, machine learning has been applied to various fields, and such attempts are also considered in a field of audio signal processing. A machine learning model such as a deep neural network (DNN) may improve the efficiency of coding audio signals.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the switching encoder of Krishnan in view of the waveform encoder of Atti to include the machine learning encoder of Lee in order to improve efficiency of coding audio ([0005], Lee).

With respect to claim 30, Krishnan teaches means for obtaining an indication of a type of audio content associated with a segment of audio data; and means for selectively, based on the indication, causing the segment to be sent as input to a machine-learning audio encoder, a waveform-matching audio encoder, or both, wherein the machine-learning audio encoder is separate from the waveform-matching audio encoder (Krishnan ¶[0053] For example, if the frame type 126 indicates [indicator] that the frame 110 is transient 106 [non-target audio], then the encoder selection block/module 130 may provide the transient frame 134 to the transient encoder 104 [waveform-matching encoder]. However, if the frame type 126 indicates that the frame 110 is another kind of frame 136 [target audio] that is not transient (e.g., voiced, unvoiced, silent, etc.), then the encoder selection block/module 130 may provide the other frame 136 to another encoder 140 [machine-learning encoder]);
Krishnan does not explicitly disclose, however Atti teaches, a waveform-matching audio encoder (Atti ¶[0005] Various encoding schemes may be used when communicating audio data. For example, depending on the audio frame type, a code-excited linear prediction (CELP) approach or a frequency-domain based modified discrete cosine transform (MDCT) can be used to compactly represent the speech and audio. In order to improve coding efficiency at low bit rates, (e.g., 13.2 kilobits per second (kbps), 24.4 kbps, etc.) when encoding larger bandwidths, e.g., up to 8 kilohertz (kHz) wideband (WB), 16 kHz super-wideband (SWB), or 20 kHz full-band, the lower band core (e.g., up to 6.4 kHz or up to 8 kHz) is typically encoded using waveform-matching coding techniques such as CELP or MDCT.)
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the switching encoder of Krishnan to include the waveform encoder of Atti in order to reconstruct high quality audio ([0005], Atti).
Neither Krishnan nor Atti explicitly discloses, however Lee teaches, a machine-learning audio encoder (Lee ¶[0003] Recently, machine learning has been applied to various fields, and such attempts are also considered in a field of audio signal processing. A machine learning model such as a deep neural network (DNN) may improve the efficiency of coding audio signals.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the switching encoder of Krishnan in view of the waveform encoder of Atti to include the machine learning encoder of Lee in order to improve efficiency of coding audio ([0005], Lee).


Claims 13-14, 26 and 29 are rejected under 35 U.S.C. 103 as being unpatentable over Krishnan, Atti and Lee in further view of Ishikawa (US 20130090929 A1).
With respect to claims 13, 26 and 29, none of Krishnan, Atti and Lee explicitly discloses, however Ishikawa teaches, wherein the controller is configured to, in response to a determination to transition which audio encoder is provided segments of the audio data, provide at least one segment of the audio data to both the machine-learning audio encoder and the waveform-matching audio encoder (Ishikawa ¶[0141] FIG. 13 illustrates the coding process. The previous frame is coded in the AAC-ELD mode. In order to cancel the aliasing of the previous frame i-1 introduced by the AAC-ELD mode, the current frame i is concatenated with the previous frame i-1 to form a long frame. The processing frame size is 2N, where N is the frame size. The extended frame is coded in the TCX mode as shown in FIG. 13.) Examiner Note: Fig. 13 below shows frame i-1 coded with two different encoders.

[Image: media_image1.png, greyscale (Ishikawa FIG. 13)]
 
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the switching encoder of Krishnan in view of the waveform encoder of Atti in view of the machine learning encoder of Lee to include the indicator of Ishikawa in order to further improve sound quality ([0004], Ishikawa).
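The transition limitation of claims 13, 26 and 29 (at least one segment is provided to both encoders when the selected encoder changes) can be sketched as follows. This is a hypothetical illustration under stated assumptions, not Ishikawa's AAC-ELD/TCX scheme and not the applicant's method: the function name `plan_encoder_inputs` and the labels `"ml"` and `"wm"` are introduced here, and the sketch assumes the segment at the switch point is the one sent to both encoders.

```python
from typing import List, Optional, Tuple


def plan_encoder_inputs(choices: List[str]) -> List[Tuple[str, ...]]:
    """Map each segment's selected encoder to the encoder(s) that receive it.

    Assumption: when the selection differs from the previous segment's,
    that segment is provided to both the old and the new encoder.
    """
    plan: List[Tuple[str, ...]] = []
    previous: Optional[str] = None
    for choice in choices:
        if previous is not None and choice != previous:
            # Transition detected: feed both encoders for this segment.
            plan.append(tuple(sorted({previous, choice})))
        else:
            plan.append((choice,))
        previous = choice
    return plan
```

For the selection sequence `["ml", "ml", "wm", "wm"]`, only the third segment is routed to both encoders.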

With respect to claim 14 Krishnan teaches further comprising a modem coupled to the machine-learning audio encoder and the waveform-matching audio encoder and configured to represent, in a bitstream, an output of the machine-learning audio encoder and an output of the waveform-matching audio encoder (Krishnan ¶[0062] In a case where the current frame 110 is not a transient frame 134, but is some other kind of frame 136, another encoder 140 (e.g., silence encoder, quarter-rate prototype pitch period (QPPP) encoder, noise excited linear prediction (NELP) encoder, etc.) may be used to encode the frame 136. The other encoder 140 may produce an encoded non-transient speech signal 178).



With respect to claim 21 Krishnan teaches further comprising generating a bitstream representing an output of the machine-learning audio encoder, an output of the waveform-matching audio encoder, or both (Krishnan ¶[0062] In a case where the current frame 110 is not a transient frame 134, but is some other kind of frame 136, another encoder 140 (e.g., silence encoder, quarter-rate prototype pitch period (QPPP) encoder, noise excited linear prediction (NELP) encoder, etc.) may be used to encode the frame 136. The other encoder 140 may produce an encoded non-transient speech signal 178).

Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Krishnan, Atti and Lee in further view of Heikkinen (US 20050228648 A1).
With respect to claim 4, Krishnan, Atti and Lee do not explicitly disclose, however Heikkinen teaches, wherein the [[machine-learning]] audio encoder is configured to encode a first input segment that includes a particular number of bits to generate a first output segment that includes a first number of bits to represent the first input segment, wherein the waveform-matching audio encoder is configured to encode a second input segment that includes the particular number of bits to generate a second output segment that includes a second number of bits to represent the second input segment, and wherein the first number is less than the second number (Heikkinen ¶[0006] For lower bit rates (in the range of 4 kbits/s) parametric coders are considered to be a more promising approach for achieving good speech quality, ¶[0005] The most prominent waveform matching codec is the code excited linear prediction (CELP). Typically, good speech quality has been achieved with waveform coders at bit rates approximately above 5 kbits/s. For example the enhanced full rate speech codec (EFR) according to the IS-641 standard approved in 1996 for the north american TDMA digital cellular system (IS-136) is based on an ACELP (algebraic code excited linear prediction) codec, which is an improved code excited linear prediction (CELP) codec and provides a speech coding at a bit rate of 7.4 kbits/s.)
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the switching encoder of Krishnan to include the waveform encoder of Atti in order to reconstruct high quality audio ([0005], Atti).
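The bit-count comparison in claim 4 can be made concrete by converting the bit rates quoted from Heikkinen (about 4 kbits/s for parametric coders, 7.4 kbits/s for the ACELP-based EFR codec) into bits per output segment. The sketch below assumes a 20 ms segment duration, which is not stated in the Office Action and is used purely for illustration; `bits_per_segment` is a hypothetical helper name.

```python
def bits_per_segment(bit_rate_bps: int, segment_ms: int) -> int:
    """Number of output bits an encoder at a given bit rate spends on one segment."""
    return bit_rate_bps * segment_ms // 1000


ASSUMED_SEGMENT_MS = 20  # assumption for illustration; not from the cited references

parametric_bits = bits_per_segment(4_000, ASSUMED_SEGMENT_MS)  # ~4 kbits/s coder
waveform_bits = bits_per_segment(7_400, ASSUMED_SEGMENT_MS)    # 7.4 kbits/s ACELP (EFR)

print(parametric_bits, waveform_bits)  # 80 148
```

Under that assumption, equal-length input segments yield 80 bits from the lower-rate coder and 148 bits from the waveform-matching coder, which is the "first number is less than the second number" relationship the rejection maps onto the quoted bit rates.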

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Krishnan, Atti and Lee in further view of Mittal (US 20110218797 A1).
With respect to claim 6, none of Krishnan, Atti and Lee explicitly discloses, however Mittal teaches, wherein the controller is configured to select a single audio encoder to process each respective segment of the audio data (Mittal ¶[0018] In FIG. 5, an input speech/audio frame sequence 502 comprises sequential speech frames (m-2) and (m-1) and a subsequent generic audio frame (m). The speech frames (m-2) and (m-1) may be coded based in part on LPC analysis windows, both illustrated at 504. A coded speech frame corresponding to the input speech frame (m-1) is illustrated at 506. This frame may be preceded by another coded speech frame, not illustrated, corresponding to the input frame (m-2). The coded speech frames are delayed relative to the corresponding input frames by an interval resulting from algorithmic delay associated with the LPC "look-ahead" processing buffer, i.e., the audio samples ahead of the frame that are required to estimate the LPC parameters that are centered around the end (or near the end) of the coded speech frame.)
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the switching encoder of Krishnan in view of the waveform encoder of Atti in view of the machine learning encoder of Lee to include the single encoder of Mittal in order to reduce gaps in processed audio sound quality ([0003], Mittal).

Claims 7, 8 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Krishnan, Atti and Lee in further view of Gilg (US 20100020954 A1).
With respect to claim 7 none of Krishnan, Atti and Lee explicitly disclose however Gilg teaches wherein, when a particular audio encoder is selected to process two sequential segments of the audio data, encoder state data resulting from processing a first segment of the two sequential segments is used to process a second segment of the two sequential segments, where the second segment is subsequent to the first segment (Gilg ¶ [0043] In an alternate, advantageous embodiment of the switchover method, within the framework of the switchover of the audio data connection from the first encoder to the second encoder, a state of the second encoder is modified such that the encoder parameters of the first encoder are detected and set as encoder parameters for the second encoder. This process is preferably undertaken at the end of a time segment or between two time segments, so that the switchover to the second encoder can already be made during the following time segment.) 
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the switching encoder of Krishnan in view of the waveform encoder of Atti in view of the machine learning encoder of Lee to include the state encoding of Gilg in order to reduce computing outlay and the background noise ([0003], Gilg).

With respect to claim 8, none of Krishnan, Atti and Lee explicitly discloses, however Gilg teaches, wherein, when different audio encoders are selected to process two sequential segments of the audio data, default encoder state data is used to process a second segment of the two sequential segments, where the second segment is subsequent to a first segment of the two sequential segments (Gilg ¶[0042] In an alternate, advantageous embodiment of the switchover method, within the framework of the switchover of the audio data connection from the first encoder to the second encoder, a state of the second encoder is modified such that the encoder parameters of the first encoder are detected and set as encoder parameters for the second encoder. This process is preferably undertaken at the end of a time segment or between two time segments, so that the switchover to the second encoder can already be made during the following time segment).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the switching encoder of Krishnan in view of the waveform encoder of Atti in view of the machine learning encoder of Lee to include the state encoding of Gilg in order to reduce computing outlay and the background noise ([0003], Gilg).

With respect to claim 10, none of Krishnan, Atti and Lee explicitly discloses, however Gilg teaches, wherein, when different audio encoders are selected to process two sequential segments of the audio data, encoder state data used to process a second segment of the two sequential segments is based on processing of a first segment of the two sequential segments, where the second segment is subsequent to the first segment (Gilg ¶[0042] In an alternate, advantageous embodiment of the switchover method, within the framework of the switchover of the audio data connection from the first encoder to the second encoder, a state of the second encoder is modified such that the encoder parameters of the first encoder are detected and set as encoder parameters for the second encoder. This process is preferably undertaken at the end of a time segment or between two time segments, so that the switchover to the second encoder can already be made during the following time segment.)
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the switching encoder of Krishnan in view of the waveform encoder of Atti in view of the machine learning encoder of Lee to include the state encoding of Gilg in order to reduce computing outlay and the background noise ([0003], Gilg).
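Claims 7, 8 and 10 recite three different ways of choosing the encoder state for the second of two sequential segments. A minimal Python sketch of the distinction follows, as an aid to comparing the claim language with the Gilg passage. Everything here is hypothetical: `EncoderState`, `state_for_second_segment`, and the policy names `"default"` and `"derived"` are introduced for illustration and do not come from the application or the cited art.

```python
from dataclasses import dataclass


@dataclass
class EncoderState:
    """Stand-in for real codec memory (filter history, recurrent state, etc.)."""
    frames_seen: int = 0


def state_for_second_segment(
    first_encoder: str,
    second_encoder: str,
    state_after_first: EncoderState,
    switch_policy: str = "default",
) -> EncoderState:
    """Pick the state used to process the second of two sequential segments."""
    if first_encoder == second_encoder:
        # Claim 7 pattern: same encoder, so carry its state forward.
        return state_after_first
    if switch_policy == "default":
        # Claim 8 pattern: different encoder starts from default state.
        return EncoderState()
    if switch_policy == "derived":
        # Claim 10 pattern: new encoder's state is based on processing of
        # the first segment (here simply copied from the other encoder).
        return EncoderState(frames_seen=state_after_first.frames_seen)
    raise ValueError(f"unknown switch policy: {switch_policy}")
```

The Gilg passage quoted above describes setting the second encoder's parameters from the first encoder's, which in this sketch corresponds to the `"derived"` branch.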

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Krishnan, Atti and Lee in further view of Lecomte (US 20110173010 A1).
With respect to claim 9, none of Krishnan, Atti and Lee explicitly discloses, however Lecomte teaches, when different audio encoders are selected to process two sequential segments of the audio data, encoder state data used by a second audio encoder to process a second segment of the two sequential segments is based on a prior state of the second audio encoder, where the second segment is subsequent to a first segment of the two sequential segments (Lecomte ¶[0034] a second encoder for encoding samples in a second encoding domain, the second encoder having a predetermined frame size number of audio samples, and a coding warm-up period number of audio samples, the second encoder having a different second framing rule [prior state of second encoder], a frame of the second encoder being an encoded representation of a number of timely subsequent audio samples, the number being equal to the predetermined frame size number of audio samples; and a controller for switching from the first encoder to the second encoder or vice versa in response to a characteristic of the audio samples, and for modifying the start window or the stop window of the first encoder to the extent that a zero part thereof extends across a first quarter of an MDCT size and cross fade starts in a second quarter of the MDCT size so that the cross fade begins after a MDCT folding axis relative to the zero part, wherein the second framing rule remains unmodified.)
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the switching encoder of Krishnan in view of the waveform encoder of Atti in view of the machine learning encoder of Lee to include the encoder state of the second audio encoder of Lecomte in order to improve switching of audio coding ([0048], Lecomte).

Claims 11, 25 and 28 are rejected under 35 U.S.C. 103 as being unpatentable over Krishnan, Atti and Lee in further view of Harada (US 20040107092 A1).
With respect to claim 11, none of Krishnan, Atti and Lee explicitly discloses, however Harada teaches, wherein the controller is configured to use a first delay when transitioning to causing segments to be input to the waveform-matching audio encoder and is configured to use a second delay when transitioning to causing segments to be input to the machine-learning audio encoder, wherein the first delay is different from the second delay (Harada ¶[0076] In FIG. 6, an example is illustrated in which the delay time T_dB1 involved in the encoding by the second speech codec 8 is greater than the delay time T_dA1 involved in the encoding by the first speech codec).

[Image: media_image2.png, greyscale]

It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the switching encoder of Krishnan in view of the waveform encoder of Atti in view of the machine learning encoder of Lee to include the delays of Harada in order to achieve high efficiency transmission ([0001], Harada).

With respect to claims 25 and 28, Krishnan, Atti and Lee do not explicitly disclose, however Harada teaches, further comprising applying a first delay when transitioning to causing segments to be input to the waveform-matching audio encoder and applying a second delay when transitioning to causing segments to be input to the machine-learning audio encoder, wherein the first delay is different from the second delay (Harada ¶[0076] In FIG. 6, an example is illustrated in which the delay time T_dB1 involved in the encoding by the second speech codec 8 is greater than the delay time T_dA1 involved in the encoding by the first speech codec).

[Image: media_image2.png, greyscale]

It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the switching encoder of Krishnan, in view of the waveform encoder of Atti and the machine learning encoder of Lee, to include the delays of Harada in order to achieve high efficiency transmission ([0001], Harada).
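For readers mapping the claim language to an implementation, the limitation at issue in claims 11, 25 and 28 (a transition delay that depends on which encoder is being switched to) can be sketched roughly as follows. This is a minimal illustration only: the class name `SwitchController`, the encoder labels, and the 5 ms and 20 ms values are hypothetical and are not taken from the application, from Harada, or from any other cited reference.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical labels for the two encoder types named in the claims.
WAVEFORM = "waveform_matching"
ML = "machine_learning"


@dataclass
class SwitchController:
    """Routes segments to an encoder and applies a target-specific transition delay."""

    transition_delay_ms: dict  # delay applied when switching TO each encoder (assumed values)
    current: Optional[str] = None
    schedule: list = field(default_factory=list)

    def route(self, segment_id: int, target: str) -> int:
        # A transition occurs only when the target differs from the encoder in use.
        is_transition = self.current is not None and target != self.current
        delay = self.transition_delay_ms[target] if is_transition else 0
        self.current = target
        self.schedule.append((segment_id, target, delay))
        return delay


# Assumed, illustrative delays: the first delay differs from the second delay.
controller = SwitchController({WAVEFORM: 5, ML: 20})
choices = [WAVEFORM, WAVEFORM, ML, ML, WAVEFORM]
delays = [controller.route(i, target) for i, target in enumerate(choices)]
print(delays)
```

In this sketch a delay is charged only on the segment where the selected encoder changes, which is one possible reading of "when transitioning"; the claims and the cited art may be read differently.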


Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Krishnan, Atti, Lee and Harada in further view of O’Connell (US 20100007773 A1).
With respect to claim 12, none of Krishnan, Atti, Lee and Harada explicitly discloses, but O’Connell teaches, wherein the first delay is fixed and the second delay is variable and is selected based on content of the segments (O’Connell ¶ [0086] The codec 32 is arranged to apply a delay to the audio stream in order to ensure that the video and audio streams are displayed/sounded synchronously at the location that they are sent and to provide echo cancellation. In one embodiment, the delay applied to the audio signal is a variable delay; ¶ [0087] The codec 32 is programmed with a fixed time delay and during transmission of the video and audio streams the codec 318 or 322 periodically transmits to the other codec 322 or 318 a test signal.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the switching encoder of Krishnan, in view of the waveform encoder of Atti, the machine learning encoder of Lee and the delays of Harada, to include the fixed/variable delays of O’Connell in order to reduce latency ([0006], O’Connell).

Claims 22-24 are rejected under 35 U.S.C. 103 as being unpatentable over Krishnan, Atti and Lee Ishikawa in further view of Gilg.
With respect to claim 22, Krishnan, Atti and Lee do not explicitly disclose, but Gilg teaches, when a particular audio encoder is selected to process two sequential segments of the audio data, processing a second segment of the two sequential segments using encoder state data resulting from processing a first segment of the two sequential segments, where the second segment is subsequent to the first segment (Gilg ¶ [0043] In an alternate, advantageous embodiment of the switchover method, within the framework of the switchover of the audio data connection from the first encoder to the second encoder, a state of the second encoder is modified such that the encoder parameters of the first encoder are detected and set as encoder parameters for the second encoder. This process is preferably undertaken at the end of a time segment or between two time segments, so that the switchover to the second encoder can already be made during the following time segment.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the switching encoder of Krishnan, in view of the waveform encoder of Atti and the machine learning encoder of Lee, to include the state encoding of Gilg in order to reduce computing outlay and the background noise ([0003], Gilg).
With respect to claim 23, Krishnan, Atti and Lee do not explicitly disclose, but Gilg teaches, further comprising, when different audio encoders are selected to process two sequential segments of the audio data, processing a second segment of the two sequential segments using encoder state data that is independent of processing a first segment of the two sequential segments, where the second segment is subsequent to the first segment (Gilg ¶ [0042] In an alternate, advantageous embodiment of the switchover method, within the framework of the switchover of the audio data connection from the first encoder to the second encoder, a state of the second encoder is modified such that the encoder parameters of the first encoder are detected and set as encoder parameters for the second encoder. This process is preferably undertaken at the end of a time segment or between two time segments, so that the switchover to the second encoder can already be made during the following time segment.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the switching encoder of Krishnan, in view of the waveform encoder of Atti and the machine learning encoder of Lee, to include the state encoding of Gilg in order to reduce computing outlay and the background noise ([0003], Gilg).

With respect to claim 24, Krishnan, Atti, Lee and Ishikawa do not explicitly disclose, but Gilg teaches, further comprising, when different audio encoders are selected to process two sequential segments of the audio data, processing a second segment of the two sequential segments using encoder state data resulting from processing a first segment of the two sequential segments, where the second segment is subsequent to the first segment (Gilg ¶ [0042] In an alternate, advantageous embodiment of the switchover method, within the framework of the switchover of the audio data connection from the first encoder to the second encoder, a state of the second encoder is modified such that the encoder parameters of the first encoder are detected and set as encoder parameters for the second encoder. This process is preferably undertaken at the end of a time segment or between two time segments, so that the switchover to the second encoder can already be made during the following time segment.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to modify the switching encoder of Krishnan, in view of the waveform encoder of Atti and the machine learning encoder of Lee, to include the state encoding of Gilg in order to reduce computing outlay and the background noise ([0003], Gilg).
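The distinction among claims 22 to 24 turns on whether encoder state produced by one segment is used for the next segment, and whether that changes when the selected encoder changes. The following is a rough sketch of that distinction under stated assumptions: `ToyEncoder`, `encode_stream`, and the running-sum "state" are hypothetical stand-ins and do not model Gilg's parameter transfer or the applicant's actual encoders.

```python
class ToyEncoder:
    """Hypothetical encoder whose 'state' is a running sum of samples seen so far."""

    def __init__(self, name):
        self.name = name

    def encode(self, segment, state):
        # Report the state the segment was encoded with, then return the updated state.
        frame = {"encoder": self.name, "state_in": state}
        return frame, state + sum(segment)


def encode_stream(segments, choices, encoders, carry_across_switch):
    """Encode segments in order; optionally reset state when the encoder changes."""
    state, previous, states_used = 0, None, []
    for segment, choice in zip(segments, choices):
        switched = previous is not None and choice != previous
        if switched and not carry_across_switch:
            state = 0  # state independent of the prior segment (claim 23 style)
        frame, state = encoders[choice].encode(segment, state)
        states_used.append(frame["state_in"])
        previous = choice
    return states_used


encoders = {"a": ToyEncoder("a"), "b": ToyEncoder("b")}
segments = [[1, 2], [3], [4]]
choices = ["a", "a", "b"]  # same encoder twice, then a switch

carried = encode_stream(segments, choices, encoders, carry_across_switch=True)
reset = encode_stream(segments, choices, encoders, carry_across_switch=False)
print(carried, reset)
```

In both runs the second segment reuses state from the first because the same encoder is selected (the claim 22 situation). The runs differ only at the switch: `carried` passes state across it (claim 24 style), while `reset` starts the new encoder from independent state (claim 23 style).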
 
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). 
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
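As a rough illustration of the date arithmetic in the paragraph above, the sketch below computes the three-month and six-month dates from the Final Rejection mailing date of Apr 29, 2026 listed in the Prosecution Timeline. The helper `add_months` is a hypothetical function written for this example. The sketch does not account for weekends, federal holidays, the advisory-action provision, or extension fees, so it is not a substitute for docketing the actual deadlines.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the same day-of-month `months` later, clamped to the month's last day."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


# Mailing date as shown in the Prosecution Timeline (assumed to be accurate).
mailing_date = date(2026, 4, 29)

shortened_period_end = add_months(mailing_date, 3)  # end of three-month shortened period
two_month_mark = add_months(mailing_date, 2)        # first-reply date relevant to advisory actions
statutory_limit = add_months(mailing_date, 6)       # six-month outer limit

print(two_month_mark, shortened_period_end, statutory_limit)
```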

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ATHAR N PASHA, whose telephone number is (408)918-7675. The examiner can normally be reached Monday-Thursday and alternate Fridays, 7:30-4:30 PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn, can be reached at (571)272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/ATHAR N PASHA/Primary Examiner, Art Unit 2657
Prosecution Timeline

Dec 13, 2023: Application Filed
Oct 22, 2025: Non-Final Rejection mailed (§103)
Feb 27, 2026: Interview Requested
Mar 07, 2026: Examiner Interview Summary
Mar 07, 2026: Applicant Interview (Telephonic)
Apr 07, 2026: Response Filed
Apr 29, 2026: Final Rejection mailed (§103, current)
Precedent Cases

Applications granted by this same examiner with similar technology

18/573,622 (Patent 12639516): CLASSIFICATION AND AUGMENTATION OF UNSTRUCTURED DATA FOR AUTOFILL. 2y 5m to grant; granted May 26, 2026.
17/749,578 (Patent 12632445): Applied Artificial Intelligence Technology for Natural Language Generation Using a Story Graph and Configurable Structurer Code. 4y 0m to grant; granted May 19, 2026.
18/264,595 (Patent 12614040): SIMULTANEOUS TRANSLATION DEVICE AND COMPUTER PROGRAM. 2y 8m to grant; granted Apr 28, 2026.
18/747,081 (Patent 12608556): INTENTION RECOGNITION METHOD, DEVICE, ELECTRONIC DEVICE AND STORAGE MEDIUM BASED ON LARGE MODEL. 1y 10m to grant; granted Apr 21, 2026.
18/747,499 (Patent 12608557): CHINESE DIALOGUE SYSTEM FOR COGNITIVELY IMPAIRED ADULTS BASED ON COGNITIVE STIMULATION THERAPY PRINCIPLES. 1y 10m to grant; granted Apr 21, 2026.
Study what changed to get past this examiner. Based on 5 most recent grants.
Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 90%
With Interview (+16.4%): 99%
Median Time to Grant: 2y 6m (~1m remaining)
PTA Risk: Moderate
Based on 156 resolved cases by this examiner. Grant probability derived from career allowance rate.