Prosecution Insights
Last updated: April 19, 2026
Application No. 18/843,231

AUDIO PLAYING METHOD AND APPARATUS, ELECTRONIC DEVICE AND STORAGE MEDIUM

Final Rejection §103
Filed: Aug 30, 2024
Examiner: LIN, JASON K
Art Unit: 2425
Tech Center: 2400 (Computer Networks)
Assignee: BEIJING BYTEDANCE NETWORK TECHNOLOGY CO., LTD.
OA Round: 2 (Final)
Grant Probability: 49% (Moderate)
OA Rounds: 3-4
To Grant: 3y 7m
With Interview: 84%

Examiner Intelligence

Career Allow Rate: 49% (221 granted / 454 resolved; -9.3% vs TC avg)
Interview Lift: +34.8% (strong lift; resolved cases with interview)
Avg Prosecution: 3y 7m (typical timeline; 28 currently pending)
Total Applications: 482 (career history, across all art units)
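
The examiner figures above can be cross-checked with simple arithmetic. The sketch below is only a consistency check under stated assumptions: it assumes the interview lift is added in percentage points to the displayed 49% allow rate, which the page does not state. The function name `summarize` is hypothetical.

```python
# Figures copied from the examiner summary above.
granted = 221
resolved = 454
pending = 28
displayed_allow_rate = 49        # percent, as shown on the page
interview_lift_points = 34.8     # percentage points, as shown on the page


def summarize():
    """Recompute the headline numbers from the raw counts."""
    allow_rate = 100 * granted / resolved  # about 48.7 before rounding
    # Assumption: the "With Interview" figure is the displayed allow rate
    # plus the lift. Starting from the unrounded rate instead would give
    # a slightly lower value.
    with_interview = displayed_allow_rate + interview_lift_points
    return {
        "allow_rate_pct": round(allow_rate),
        "with_interview_pct": round(with_interview),
        "resolved_plus_pending": resolved + pending,
    }


print(summarize())
```

Resolved plus pending cases match the reported total of 482 applications, and the rounded allow rate matches the displayed 49%.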

Statute-Specific Performance

§101: 5.2% (-34.8% vs TC avg)
§103: 61.2% (+21.2% vs TC avg)
§102: 16.0% (-24.0% vs TC avg)
§112: 9.3% (-30.7% vs TC avg)
TC avg is a Tech Center average estimate. Based on career data from 454 resolved cases.
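
Each statute figure is reported together with its difference from the Tech Center average, so the baseline itself can be recovered by subtraction. The sketch below assumes the "vs TC avg" value is a plain difference in percentage points; the page does not say how it is computed, and the helper name `implied_baselines` is hypothetical.

```python
# (examiner percentage, difference vs TC avg) per statute, copied from above.
statute_figures = {
    "101": (5.2, -34.8),
    "103": (61.2, 21.2),
    "102": (16.0, -24.0),
    "112": (9.3, -30.7),
}


def implied_baselines(figures):
    """Return the Tech Center average implied by each (value, delta) pair."""
    # Assumption: delta = examiner value - Tech Center average.
    # Rounding to one decimal hides floating-point noise.
    return {
        statute: round(value - delta, 1)
        for statute, (value, delta) in figures.items()
    }


baselines = implied_baselines(statute_figures)
for statute, baseline in baselines.items():
    print(f"Section {statute}: implied TC average {baseline}%")
```

Under that assumption all four statutes imply the same baseline, which suggests the page compares every statute against a single estimated Tech Center figure rather than a per-statute average.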

Office Action

§103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION

This office action is responsive to application No. 18/843,231 filed on 12/03/2025. Claim(s) 2, 8, 12, and 18 are canceled. Claim(s) 1, 3-7, 9-11, 13-17, and 19-21 is/are pending and have been examined.

Claim Objections

Claim(s) 3 is/are objected to because of the following informalities: Claim 3 recites: “wherein the switching switching from playing the current audio to playing the target audio at the second time during the process of playing the video based on the target audio” There is an additional recitation of “switching”. Please amend to: --wherein the -- Appropriate correction is required.

Response to Arguments

Applicant’s arguments with respect to claim(s) 1, 3-7, 9-11, 13-17, and 19-21 have been considered but are moot in view of the new ground(s) of rejection. Although new ground(s) of rejection have been made, some of Applicant’s arguments need to be addressed.

Applicants assert on P.11 that “Instead, Swaminathan appears to suggest switching to playing the different audio sub-stream whenever the user makes the request to switch, without any regard to I-frames of the video. As such, Swaminathan does not disclose switching from playing the current audio to playing the target audio at the second time during the process of playing the video based on the target audio data.”

In response, the Examiner respectfully disagrees. Swaminathan teaches in Paragraph 0041 that each video segment of sub-stream 120 has a corresponding audio segment of sub-stream 124 for playback, client device 102a refrains from playing back media segment 120a until the audio content of media segment 124a is also received and ready for playback.
Paragraph 0053 teaches where received media segments correspond to at least partially concurrent portions of the multimedia content, a client device may play back the received media segments at least partially concurrently, i.e. the content of those segments may be played back at least partially concurrently. For example, audio and video segments that correspond to substantially the same time period in the multimedia content may be played back together. Figs.2, Paragraph 0060 teaches a pattern of requesting unsent media segments, where unsent media segments are requested and sent in each repetition. Thus, switching of target audio would result in a request for an unsent media segment of the target audio sub-stream, which would be subsequent to the segment of the current audio sub-stream. New reference Hoffmann, in combination, teaches that each segment of content, audio or video content, starts with an I-frame to enable seamless adaptation. Thus, the combination of Swaminathan and Hoffmann teach the claimed limitation(s). Please also see Office Action below.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claim(s) 1, 3-7, 9-11, 13-17, and 19-21 is/are rejected under 35 U.S.C. 103 as being unpatentable over Swaminathan (US 2016/0182600) in view of Hoffmann et al. (US 2021/0258632).
Consider claims 1, 9, and 10, Swaminathan teaches an audio playing method, electronic device, and a storage medium comprising computer-executable instructions, wherein the computer-executable instructions, when executed by a computer processor, cause the computer processor to/comprising: at least one processor; and a storage device with at least one program stored thereon, the at least one program, when executed by the at least one processor, (Fig.1A-B, processor(s) 614, memory 612 - Fig.6, Paragraph 0086-0088) causes the at least one processor to: receive an audio data switching instruction at a first time during a process of playing a video, wherein the audio data switching instruction comprises an instruction to switch from playing current audio associated with the video to playing target audio associated with the video (Paragraph 0033 teaches at least two sub-streams such as two of sub-streams 120, 122, 124, 126, and 128 are included in streaming multimedia content 116 to a client device. For example, the stream may include at least one video sub-stream, e.g. corresponding to sub-stream 120, and at least one audio sub-stream, e.g. corresponding to sub-stream 124. Paragraph 0034 teaches default sub-streams may be changed before or during the stream. Sub-streams in the stream may change during the stream, for example, as selected by the client device and/or the server. Client device may select a lower bitrate for the stream, such that the server switches from sending video segments of sub-stream 120 to sending video segments of sub-stream 122, e.g. client device may request segments of a lower bitrate sub-stream(s). A client device may similarly select between sub-streams 124 and 126 to select the language to be played with the video content. Paragraph 0070 teaches a client device, e.g. 
a user of the client device, could selectively switch to an audio sub-stream corresponding to a different language during a stream); request, upon receiving the audio data switching instruction, target audio data associated with the target audio from a server during the process of playing current the video (Paragraph 0032 teaches sub-stream 124 is an audio sub-stream of multimedia content 116 corresponding to Language A, e.g. English. Sub-stream 126 is an audio sub-stream of multimedia content 116 corresponding to Language B, e.g. Spanish. Paragraph 0034 teaches default sub-streams may be changed before or during the stream. Sub-streams in the stream may change during the stream, for example, as selected by the client device and/or the server. Client device may select a lower bitrate for the stream, such that the server switches from sending video segments of sub-stream 120 to sending video segments of sub-stream 122, e.g. client device may request segments of a lower bitrate sub-stream(s). A client device may similarly select between sub-streams 124 and 126 to select the language to be played with the video content. Paragraph 0037 teaches the client device may use a request comprising a URL provided by the manifest to request a corresponding resource. Paragraph 0041 teaches client device 102a is to play back both sub-streams 120 and 124 substantially concurrently, for example, where sub-stream 124 is an audio sub-stream that accompanies sub-stream 120, which is a video sub-stream. Client device 102a may request media segment 120a and receive a response that includes media segment 120a. Client device 102a may subsequently request media segment 124a and receive a response that includes media segment 124a. Paragraph 0042 teaches a server can receive requests to stream media segments of multimedia content to a client device. Based on each request, the server can send to the client device a plurality of media segments of the multimedia content. 
Paragraph 0070 teaches the sub-streams included in a stream of multimedia content can vary throughout the stream. The client device could select at least one new sub-stream, which could optionally replace another sub-stream in the stream. A client device, e.g. a user of the client device, could selectively switch to an audio sub-stream corresponding to a different language during a stream); determine a video image information of the video corresponding to the first time at which the audio data switching instruction was received; determine a content of the video that occurs after the video image information and that has not yet been played, wherein the content is to be played at a second time during the process of playing the video (Fig.1B, Paragraph 0029 teaches a plurality of sub-streams divided into a sequence of media segments, which can be played back in order by a video player on a client device. Each media segment may correspond to a substantially fixed time period of multimedia content 116. Paragraph 0034 teaches default sub-streams may be changed before or during the stream. Sub-streams in the stream may change during the stream, for example, as selected by the client device and/or the server. Client device may select a lower bitrate for the stream, such that the server switches from sending video segments of sub-stream 120 to sending video segments of sub-stream 122, e.g. client device may request segments of a lower bitrate sub-stream(s). A client device may similarly select between sub-streams 124 and 126 to select the language to be played with the video content. Paragraph 0041 teaches as each video segment of sub-stream 120 has a corresponding audio segment of sub-stream 124 for playback, client device 102a refrains from playing back media segment 120a until the audio content of media segment 124a is also received and ready for playback. 
Paragraph 0052 teaches media segments that are sent can correspond at least partially to concurrent portions of the multimedia content. In particular, these media segments may at least partially temporally overlap in the multimedia content, may completely overlap in the multimedia content, or may correspond to a substantially same time period in the multimedia content. Examples of concurrent portions of multimedia content are media segments in FIG. 1B that share the same letter in their reference signs. Paragraph 0053 teaches where received media segments correspond to at least partially concurrent portions of the multimedia content, a client device may play back the received media segments at least partially concurrently, i.e. the content of those segments may be played back at least partially concurrently. For example, audio and video segments that correspond to substantially the same time period in the multimedia content may be played back together. Paragraph 0061 teaches the requests could instead be for media segments in sub-stream 124 where the media segments of sub-stream 120, or of another sub-stream, are sent in response to the requests. As client(s) play back the content, when requesting to switch audio data, the system would determine the playback time of the content, in order to request subsequent sub-stream(s) corresponding to the desired audio data, in order to receive subsequent audio sub-stream(s) along with the corresponding video sub-stream); and switch from playing the current audio to playing the target audio at the second time during the process of playing the video based on the target audio data (Paragraph 0034 teaches default sub-streams may be changed before or during the stream. Sub-streams in the stream may change during the stream, for example, as selected by the client device and/or the server. 
Client device may select a lower bitrate for the stream, such that the server switches from sending video segments of sub-stream 120 to sending video segments of sub-stream 122, e.g. client device may request segments of a lower bitrate sub-stream(s). A client device may similarly select between sub-streams 124 and 126 to select the language to be played with the video content. Paragraph 0053 teaches where received media segments correspond to at least partially concurrent portions of the multimedia content, a client device may play back the received media segments at least partially concurrently, i.e. the content of those segments may be played back at least partially concurrently. For example, audio and video segments that correspond to substantially the same time period in the multimedia content may be played back together. Paragraph 0070 teaches the sub-streams included in a stream of multimedia content can vary throughout the stream. The client device could select at least one new sub-stream, which could optionally replace another sub-stream in the stream. A client device, e.g. a user of the client device, could selectively switch to an audio sub-stream corresponding to a different language during a stream. Once requested subsequent sub-stream(s) are received, the target audio and video sub-streams are played back together). Swaminathan does not explicitly teach wherein video image information is video image frame; wherein content is an I-frame. 
In an analogous art, Hoffmann teaches wherein video image information is video image frame (Paragraph 0084 teaches AV bitstream may include a sequence of video I-frames and video P-frames); wherein content is an I-frame (Paragraph 0085 teaches each segment of content, audio content or video content, of the AV bitstream starts with an I-frame, to enable seamless adaptation, e.g., splicing or switching, at segment boundaries, e.g., at the start of a segment of the video content of a video elementary stream, indicated by a video I-frame, time-aligned with the start of a segment, indicated by an audio I-frame, of corresponding audio content of each of at least one audio elementary stream). Therefore, it would have been obvious to a person of ordinary skill in the art to modify the system of Swaminathan to include wherein video image information is video image frame; wherein content is an I-frame, as taught by Hoffmann, for the advantage of enabling seamless adaptation at segment boundaries (Hoffmann – Paragraph 0085), providing high level of temporal accuracy, allowing better and finer synchronization of audio/video content. Combination of Swaminathan and Hoffmann teach determine a video image frame of the video corresponding to the first time at which the audio data switching instruction was received; determine an I-frame of the video that occurs after the video image frame and that has not yet been played, wherein the I-frame is to be played at a second time during the process of playing the video. Swaminathan teaches, during playback of content, at a particular time during playback of a segment of the content, requesting a subsequent segment of audio/video for playback, where playback of the subsequent segment of the video occurs at a second time, after the point at which the switching instruction is received, and where the subsequent segment of the video has not yet been played. Hoffmann teaches that each segment of content such as video content starts with an I-frame. Thus, in the combination, segments are made up of frames; the request is received at a certain frame within a segment of content, resulting in a request for the subsequent segment(s) after that particular frame/segment, where the segment that starts with an I-frame is determined and played back at the second time.

Consider claims 3, 13, and 19, Swaminathan and Hoffmann teach wherein the switching switching from playing the current audio to playing the target audio at the second time during the process of playing the video based on the target audio (Swaminathan - Paragraph 0034 teaches default sub-streams may be changed before or during the stream. Sub-streams in the stream may change during the stream, for example, as selected by the client device and/or the server. Client device may select a lower bitrate for the stream, such that the server switches from sending video segments of sub-stream 120 to sending video segments of sub-stream 122, e.g. client device may request segments of a lower bitrate sub-stream(s). A client device may similarly select between sub-streams 124 and 126 to select the language to be played with the video content. Paragraph 0053 teaches where received media segments correspond to at least partially concurrent portions of the multimedia content, a client device may play back the received media segments at least partially concurrently, i.e. the content of those segments may be played back at least partially concurrently. For example, audio and video segments that correspond to substantially the same time period in the multimedia content may be played back together. Paragraph 0070 teaches the sub-streams included in a stream of multimedia content can vary throughout the stream. The client device could select at least one new sub-stream, which could optionally replace another sub-stream in the stream. A client device, e.g. 
a user of the client device, could selectively switch to an audio sub-stream corresponding to a different language during a stream. Hoffmann – Paragraph 0085) comprises: clearing cached data of the current audio after the second time, and reading and playing cached data of the target audio data from the second time (Swaminathan - Paragraph 0027). Consider claims 4, 14, and 20, Swaminathan and Hoffmann teach wherein the requesting, upon receiving the audio data switching instruction, the target audio data from the server comprises: establishing a data obtaining link for the target audio data with the server, and obtaining and caching the target audio data based on the data obtaining link (Swaminathan - Paragraph 0032 teaches sub-stream 124 is an audio sub-stream of multimedia content 116 corresponding to Language A, e.g. English. Sub-stream 126 is an audio sub-stream of multimedia content 116 corresponding to Language B, e.g. Spanish. Paragraph 0034 teaches a client device may similarly select between sub-streams 124 and 126 to select the language to be played with the video content. Paragraph 0037 teaches the client device may use a request comprising a URL provided by the manifest to request a corresponding resource. Paragraph 0041 teaches a client device 102a is to play back both sub-streams 120 and 124 substantially concurrently, for example, where sub-stream 124 is an audio sub-stream that accompanies sub-stream 120, which is a video sub-stream. Client device 102a may request media segment 120a and receive a response that includes media segment 120a. Client device 102a may subsequently request media segment 124a and receive a response that includes media segment 124a. Paragraph 0042 teaches a server can receive requests to stream media segments of multimedia content to a client device. Based on each request, the server can send to the client device a plurality of media segments of the multimedia content. 
Paragraph 0070 teaches the sub-streams included in a stream of multimedia content can vary throughout the stream. The client device could select at least one new sub-stream, which could optionally replace another sub-stream in the stream. A client device, e.g. a user of the client device, could selectively switch to an audio sub-stream corresponding to a different language during a stream. Paragraph 0027 teaches cache 112 can be associated with client device 102a for storing content received from server 106. Each media segment may correspond to an HTTP resource, for example, in implementations where an HTTP protocol is employed). Consider claims 5, 15, and 21, Swaminathan and Hoffmann teach upon receiving the audio data switching instruction (Swaminathan – Paragraph 0070), further comprising: disconnecting a data obtaining link for obtaining initial audio data associated with the current audio from the server (Swaminathan - Paragraph 0034, 0047; Paragraph 0037, 0070). Consider claims 6 and 16, Swaminathan and Hoffmann teach before the playing the current audio during the process of playing the video, further comprising: requesting, according to an audio and video data obtaining instruction of a user, a data obtaining link for video data associated with the audio and video data obtaining instruction and a data obtaining link for at least one piece of audio data corresponding to the video data from the server; and determining, according to an initial audio data determination instruction of the user, initial audio data associated with the current audio from the at least one piece of audio data, and obtaining the video data and the initial audio data as the current audio and video data through data obtaining links respectively corresponding to the video data and the initial audio data (Swaminathan – Fig.1B, Paragraph 0032, 0034, 0037, 0041-0042). 
Consider claims 7 and 17, Swaminathan and Hoffmann teach wherein initial audio data associated with the current audio and the target audio data are dubbing audio data of different languages matching a same video data (Swaminathan – Fig.1B, Paragraph 0029, 0032, 0034, 0041, 0070).

Consider claim 11, Swaminathan and Hoffmann teach a computer program product, comprising a computer program carried on a non-transitory computer-readable storage medium, wherein the computer program comprises program codes for performing the audio playing method according to claim 1 (Swaminathan – Paragraph 0087).

Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JASON K LIN whose telephone number is (571)270-1446. The examiner can normally be reached on Monday-Friday 9AM-5PM. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. 
To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Brian Pendleton can be reached on 571-272-7527. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /JASON K LIN/Primary Examiner, Art Unit 2425
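
The switch timing at issue in the rejection of claims 1, 9, and 10 (receive the instruction at a first time, find the next unplayed I-frame, switch audio at that second time) can be illustrated with a small sketch. It is not taken from the application, Swaminathan, or Hoffmann. The `Player` class, its fields, and the list of I-frame timestamps are hypothetical, and network requests and caching are omitted.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Player:
    # Sorted presentation times (seconds) of video I-frames; hypothetical data.
    iframe_times: List[float]
    current_audio: str
    # (switch_time, target_audio) once a switch has been scheduled.
    pending: Optional[Tuple[float, str]] = None

    def request_switch(self, now: float, target_audio: str) -> Optional[float]:
        """Schedule the switch at the first I-frame strictly after `now`."""
        idx = bisect_right(self.iframe_times, now)
        if idx == len(self.iframe_times):
            return None  # no later I-frame; nothing is scheduled
        switch_time = self.iframe_times[idx]
        self.pending = (switch_time, target_audio)
        return switch_time

    def tick(self, now: float) -> str:
        """Return the audio track that should be playing at time `now`."""
        if self.pending is not None and now >= self.pending[0]:
            self.current_audio = self.pending[1]
            self.pending = None
        return self.current_audio


player = Player(iframe_times=[0.0, 2.0, 4.0, 6.0], current_audio="en")
second_time = player.request_switch(now=2.7, target_audio="es")
print(second_time, player.tick(3.9), player.tick(4.0))
```

In this sketch the instruction arrives at 2.7 s, the current audio keeps playing until the I-frame at 4.0 s, and the target audio takes over from that point.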

Prosecution Timeline

Aug 30, 2024: Application Filed
Sep 09, 2025: Non-Final Rejection (§103)
Dec 03, 2025: Response Filed
Mar 11, 2026: Final Rejection (§103, current)
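
The gaps between these events, and the reply periods described in the office action's conclusion, can be worked out with date arithmetic. The sketch below treats the Mar 11, 2026 timeline entry as the mailing date of the final action, which is an assumption, since the page does not label it as such. It ignores weekend and holiday rules and extension fees. The helper `add_months` is hypothetical.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Return the same day-of-month `months` later, clamped to month end."""
    index = d.month - 1 + months
    year, month = d.year + index // 12, index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


# Events copied from the prosecution timeline above.
events = [
    ("Application Filed", date(2024, 8, 30)),
    ("Non-Final Rejection", date(2025, 9, 9)),
    ("Response Filed", date(2025, 12, 3)),
    ("Final Rejection", date(2026, 3, 11)),
]

# Days elapsed between consecutive events, labeled by the later event.
gaps = [
    (later[0], (later[1] - earlier[1]).days)
    for earlier, later in zip(events, events[1:])
]

# Assumption: the Final Rejection date is the mailing date of the action.
mailing_date = events[-1][1]
two_month_mark = add_months(mailing_date, 2)        # early first-reply window
shortened_period_end = add_months(mailing_date, 3)  # shortened statutory period
outer_limit = add_months(mailing_date, 6)           # six-month maximum

for label, days in gaps:
    print(f"{label}: {days} days after the previous event")
print(two_month_mark, shortened_period_end, outer_limit)
```

The actual deadlines should be taken from the mailed action itself, since the controlling date is the mailing date printed there.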

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12604047
JUST IN TIME CONTENT CONDITIONING
2y 5m to grant Granted Apr 14, 2026
Patent 12593082
JUST IN TIME CONTENT CONDITIONING
2y 5m to grant Granted Mar 31, 2026
Patent 12556760
CREDITING EXPOSURE TO MEDIA IDENTIFIED USING SOURCE FILTERING
2y 5m to grant Granted Feb 17, 2026
Patent 12548455
GROUND-BASED CONTENT CURATION PLATFORM DISTRIBUTING GEOGRAPHICALLY-RELEVANT CONTENT TO AIRCRAFT INFLIGHT ENTERTAINMENT SYSTEMS
2y 5m to grant Granted Feb 10, 2026
Patent 12537993
SMART HOME AUTOMATION USING MULTI-MODAL CONTEXTUAL INFORMATION
2y 5m to grant Granted Jan 27, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 49%
With Interview: 84% (+34.8%)
Median Time to Grant: 3y 7m
PTA Risk: Moderate
Based on 454 resolved cases by this examiner. Grant probability derived from career allow rate.
