Prosecution Insights
Last updated: April 19, 2026
Application No. 18/659,966

AUDIO DECODER, AUDIO ENCODER, METHOD FOR DECODING, METHOD FOR ENCODING AND BITSTREAM, USING A PLURALITY OF PACKETS, THE PACKETS COMPRISING ONE OR MORE SCENE CONFIGURATION PACKETS DEFINING A TEMPORAL EVOLUTION OF A RENDERING SCENARIO AND COMPRISING A TIMESTAMP INFORMATION

Non-Final OA (§103, §112)
Filed: May 09, 2024
Examiner: CAUDLE, PENNY LOUISE
Art Unit: 2657
Tech Center: 2600 (Communications)
Assignee: Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.
OA Round: 1 (Non-Final)
Grant Probability: 67% (Favorable)
OA Rounds: 1-2
To Grant: 3y 2m
With Interview: 82%

Examiner Intelligence

Career Allow Rate: 67% (above average; 46 granted / 69 resolved; +4.7% vs TC avg)
Interview Lift: +15.5% for resolved cases with interview (vs without)
Avg Prosecution: 3y 2m typical timeline; 19 applications currently pending
Career History: 88 total applications across all art units

Statute-Specific Performance

§101: 21.0% (-19.0% vs TC avg)
§103: 43.7% (+3.7% vs TC avg)
§102: 15.8% (-24.2% vs TC avg)
§112: 17.1% (-22.9% vs TC avg)
Deltas are versus an estimated Tech Center average. Based on career data from 69 resolved cases.
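The headline figures above are mutually consistent. A quick arithmetic check using only numbers reported on this page (illustrative only, not part of the report's methodology; small differences are rounding):

```python
# Sanity-check the reported metrics: 67% allow rate from 46/69 resolved cases,
# and an interview lift implied by 82% (with interview) vs the 67% baseline.
granted, resolved = 46, 69
career_allow_rate = granted / resolved        # reported as 67%
grant_probability = 0.67                      # reported baseline probability
with_interview = 0.82                         # reported probability with interview
lift = with_interview - grant_probability     # reported as roughly +15.5%

print(f"career allow rate: {career_allow_rate:.1%}")
print(f"interview lift: {lift * 100:.1f} percentage points")
```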

Office Action

§103 §112
DETAILED ACTION

This examination is in response to the communication filed on 08/28/2024. Claims 1-28 are currently pending; claim 29 has been canceled, and claims 1, 13 and 25-27 are independent.

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement

The information disclosure statements (IDS) submitted on 11/19/2025, 10/17/2025, 07/01/2025, 12/18/2024, 11/11/2024 and 08/07/2024 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.

Priority

Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.

Claim Objections

Claims 1-12, 14, 20, 23, 25 and 27 are objected to because of the following informalities:
- Claims 1, 4, 14, 25 and 27: the term “time stamp” should read “timestamp” for consistency with the other recitations of the term throughout the claims;
- Claims 2-12 depend from claim 1 and are objected to for the same reason as claim 1;
- Claims 2 and 3: the phrase “…or when the audio decoder tunes in into a stream…” should read “…or when the audio decoder tunes in to a stream…”;
- Claims 20 and 23: the term “Wherein” in line two should read “wherein” for consistency; and
- Claim 11: the term “scene configurations packets” in line 2 should read “…scene configuration packets”.

Appropriate correction is required.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1-28 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.

The term “all relevant” in claims 1, 13 and 25-27 is a relative term which renders the claim indefinite. The term “relevant” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. The use of the term “relevant” renders it unclear whether only the minimum information required for the renderer to configure itself is required or whether all possible information is required. For purposes of examination, this limitation is interpreted as requiring the minimum information needed for the renderer to configure itself.

Claim 1 recites the limitation “the packets comprising a plurality of…” in line 6. There is insufficient antecedent basis for this limitation in the claim. For examination purposes this limitation is interpreted as “the plurality of packets comprising a plurality of…”.

Claim 1 recites the limitation “wherein the scene configuration packets provide…” in line 8. There is insufficient antecedent basis for this limitation in the claim. For examination purposes this limitation is interpreted as “wherein the plurality of scene configuration packets provide…”.

Claim 1 recites the limitation “…corresponding to the time stamp…” in line 15. There is insufficient antecedent basis for this limitation in the claim. For examination purposes this limitation is interpreted as “…corresponding to the time stamp information…”.

Claim 8 recites the limitation “the scene configuration packets…” in line 2. There is insufficient antecedent basis for this limitation in the claim.
For examination purposes this limitation is interpreted as “…the plurality of scene configuration packets…”.

Claim 9 recites the limitation “the scene configuration packets…” in line 2. There is insufficient antecedent basis for this limitation in the claim. For examination purposes this limitation is interpreted as “…the plurality of scene configuration packets…”.

Claim 10 recites the limitation “the one or more scene configuration packets from a bitstream” in lines 2-3. There is insufficient antecedent basis for this limitation in the claim. For examination purposes this limitation is interpreted as “…the plurality of scene configuration packets…”.

Claim 11 recites the limitation “the one or more scene configuration packets” in lines 2-3. There is insufficient antecedent basis for this limitation in the claim. For examination purposes this limitation is interpreted as “…the plurality of scene configuration packets…”.

Claim 12 recites the limitation “…on the basis of the timestamp of a first scene configuration packet…” in line 15. There is insufficient antecedent basis for this limitation in the claim.

Claim 13 recites the limitation “the packets comprising a plurality of…” in line 6. There is insufficient antecedent basis for this limitation in the claim. For examination purposes this limitation is interpreted as “the plurality of packets comprising a plurality of…”.

Claim 13 recites the limitation “wherein the scene configuration packets provide…” in line 8. There is insufficient antecedent basis for this limitation in the claim. For examination purposes this limitation is interpreted as “wherein the plurality of scene configuration packets provide…”.

Claim 14 recites the limitation “…provide, in one of the packets…” in line 2. There is insufficient antecedent basis for this limitation in the claim. For examination purposes this limitation is interpreted as “…provide, in one of the plurality of packets…”.
Claim 15 recites the limitation “…the scene configuration packets…” in line 2. There is insufficient antecedent basis for this limitation in the claim. For examination purposes this limitation is interpreted as “…the plurality of scene configuration packets…”.

Claim 16 recites the limitation “…the scene configuration packets…” in line 2. There is insufficient antecedent basis for this limitation in the claim. For examination purposes this limitation is interpreted as “…the plurality of scene configuration packets…”.

Claim 17 recites the limitation “…and the one or more scene configuration packets” in lines 4-5. There is insufficient antecedent basis for this limitation in the claim. For examination purposes this limitation is interpreted as “…and the plurality of scene configuration packets”.

Claim 18 recites the limitation “…and the one or more scene configuration packets…” in lines 4-5. There is insufficient antecedent basis for this limitation in the claim. For examination purposes this limitation is interpreted as “…and the plurality of scene configuration packets…”.

Claim 19 recites the limitation “…repeat the configuration packet” in line 2. There is insufficient antecedent basis for this limitation in the claim. For examination purposes this limitation is interpreted as “…repeat a scene configuration packet…”.

Claim 20 recites the limitation “…repeat the configuration packet…” in line 2. There is insufficient antecedent basis for this limitation in the claim. For examination purposes this limitation is interpreted as “…repeat a scene configuration packet…”.

Claim 21 recites the limitation “…repeat the configuration packet…” in line 2. There is insufficient antecedent basis for this limitation in the claim. For examination purposes this limitation is interpreted as “…repeat a scene configuration packet…”.

Claim 25 recites the limitation “the packets comprising a plurality of…” in line 5.
There is insufficient antecedent basis for this limitation in the claim. For examination purposes this limitation is interpreted as “the plurality of packets comprising a plurality of…”.

Claim 25 recites the limitation “wherein the scene configuration packets provide…” in line 7. There is insufficient antecedent basis for this limitation in the claim. For examination purposes this limitation is interpreted as “wherein the plurality of scene configuration packets provide…”.

Claim 25 recites the limitation “…corresponding to the time stamp…” in line 14. There is insufficient antecedent basis for this limitation in the claim. For examination purposes this limitation is interpreted as “…corresponding to the time stamp information…”.

Claim 26 recites the limitation “the packets comprising a plurality of…” in line 5. There is insufficient antecedent basis for this limitation in the claim. For examination purposes this limitation is interpreted as “the plurality of packets comprising a plurality of…”.

Claim 26 recites the limitation “wherein the scene configuration packets provide…” in line 7. There is insufficient antecedent basis for this limitation in the claim. For examination purposes this limitation is interpreted as “wherein the plurality of scene configuration packets provide…”.

Claim 27 recites the limitation “perform the method for providing…” in line 2. There is insufficient antecedent basis for this limitation in the claim. For examination purposes this limitation is interpreted as “…perform a method for providing…”.

Claim 27 recites the limitation “the packets comprising a plurality of…” in line 6. There is insufficient antecedent basis for this limitation in the claim. For examination purposes this limitation is interpreted as “the plurality of packets comprising a plurality of…”.

Claim 27 recites the limitation “wherein the scene configuration packets provide…” in line 8.
There is insufficient antecedent basis for this limitation in the claim. For examination purposes this limitation is interpreted as “wherein the plurality of scene configuration packets provide…”.

Claim 27 recites the limitation “…corresponding to the time stamp…” in line 15. There is insufficient antecedent basis for this limitation in the claim. For examination purposes this limitation is interpreted as “…corresponding to the time stamp information…”.

Claim 28 recites the limitation “…perform the method for providing…” in line 2. There is insufficient antecedent basis for this limitation in the claim. For examination purposes this limitation is interpreted as “…perform a method for providing…”.

Claim 28 recites the limitation “the packets comprising a plurality of…” in line 6. There is insufficient antecedent basis for this limitation in the claim. For examination purposes this limitation is interpreted as “the plurality of packets comprising a plurality of…”.

Claim 28 recites the limitation “wherein the scene configuration packets provide…” in line 8. There is insufficient antecedent basis for this limitation in the claim. For examination purposes this limitation is interpreted as “wherein the plurality of scene configuration packets provide…”.

Claims 2-12 depend from independent claim 1 and are rejected for the same reasons as claim 1. Claims 14-24 depend from independent claim 13 and are rejected for the same reasons as claim 13.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C.
103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or non-obviousness.

This application currently names joint inventors. In considering patentability of the claims, the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1-8 and 23-28 are rejected under 35 U.S.C. 103 as being unpatentable over Mate et al. (WO 2021/186104 A1; herein “Mate”) (submitted by Applicant in the IDS filed on 08/07/2024) in view of ATSC Standard: A/342 Part 3, MPEG-H System, Doc.
A/342-3:2017, 3 March 2017 (herein “ATSC Part 3”) (submitted by Applicant in the IDS filed on 10/17/2025).

Regarding claims 1, 25 and 27, Mate teaches an audio decoder, method, and non-transitory digital storage medium having a computer program stored thereon to perform a method for providing a decoded audio representation on the basis of an encoded audio representation, wherein the audio decoder is configured to spatially render one or more audio signals (Fig. 1, MPEG-I 6DoF RENDERER); wherein the audio decoder is configured to receive a plurality of packets of different packet types (Fig. 1, SOCIAL VR AUDIO BITSTREAM and MPEG-I AUDIO BITSTREAM (MHAS); these two different bitstreams inherently include different packet types. In addition, the MPEG-I MHAS bitstream includes a plurality of packet types defined by Clause 14 of the ISO/IEC 23008-3 standard, as evidenced by the Draft of ISO/IEC 23008-3:201x 3D Audio, Second Edition, dated 10/31/2016, submitted by Applicant in the IDS filed on 08/07/2024), the packets comprising a plurality of scene configuration packets (Page 1, lines 13-19 teaches “Bitstream content is data which has been created by encoding the 6DoF audio scene description, the raw audio signals and the MPEG-H encoded/decoded audio signal….An example representation of the encoded bitstream may comprise the scene description obtained as “EIF” (Encoder Input Format) and metadata required for 6DoF rendering.” In addition, the MPEG-I MHAS bitstream packet types defined by Clause 14 of the ISO/IEC 23008-3 standard include PACTYP_MPEGH3DACFG, which embeds an MPEG-H audio configuration structure, and PACTYP_MPEGHSCENEINFO, which embeds an MPEG-H audio scene information structure and mae_AudioSceneInfo() in the packet payload, as evidenced by the Draft of ISO/IEC 23008-3:201x 3D Audio, Second Edition, dated 10/31/2016, submitted by Applicant in the IDS filed on 08/07/2024) and a timestamp information (Page 17, lines 9-11 teaches “The currently specified updates may
be done based on a predetermined timestamp” and page 19, line 30 to page 20, line 10 teaches “The proposed update in EIF may be as follows…<timestamp>…In the above, the timestamp can also be a sequence number to enable temporal association with the bitstream content…” In addition, page 38, claim 3 teaches “the dynamic content is received as a MPEG-H stream packet” and claim 4 further teaches “the received dynamic content arrives with a timestamp to enable association of the received dynamic content with a playback timeline, or one or more bitstream content time segments”), wherein the scene configuration packets provide a renderer configuration information (page 13, line 22 to page 15, line 2 teaches “Thus, at this point one may have 1) an audio scene in the bitstream, 2) rendering instructions for dynamic updates also in the bitstream…the renderer 206…may perform the following…1) obtain AudioScene and rendering instructions from the bitstream…” The rendering instructions are interpreted as renderer configuration information.), wherein the renderer configuration information defines a temporal evolution of a rendering scenario (Based on page 33, lines 30-35 of the Specification, “temporal evolution of a rendering scenario” is interpreted as information “defining a usage of scene objects” or information “defining when or under which conditions different scene objects…should be used in a rendering process.” Mate, page 17, line 9 to page 18, line 5 teaches “The currently specified updates may be done based on a predetermined timestamp…,” which describes Scene Updates where the declaration part in a scene.xml file may be followed by any number of <Update> nodes.
They have the following syntax:…Attribute time…Time when update is performed (seconds) Note: Must be less than or equal to the duration attribute of the AudioScene…” and page 18, line 6 to page 19, line 29 teaches “The following updates synchronously move three ObjectSources of a vehicle in motion along a trajectory…The following example turns on the sources of a car when the listener gets close…The scene loops at the rate of the scene duration as specified in the AudioScene attribute. Timed updates are triggered for every loop of the scene” The AudioScene attribute duration and Timed Updates are interpreted as information defining temporal evolution of a rendered scene object), and wherein the audio decoder is configured to evaluate the timestamp information and to set a rendering configuration to a rendering scenario corresponding to the time stamp using the renderer configuration information (Page 20, lines 5-10 teaches “In the above, the timestamp can also be a sequence number to enable temporal association with the bitstream content. For example, the renderer loop will apply the dynamic content to the right temporal segment of the bitstream content. The timestamp is thus used for associating the update message with the appropriate playback timeline” Using the timestamp to determine the appropriate playback timeline is interpreted as evaluating the timestamp information to set a rendering configuration, i.e., the appropriate playback timeline). 
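The claimed timestamp evaluation, as mapped above, amounts to selecting the rendering configuration in effect at a given playout time. A minimal illustrative sketch of that behavior (all names are hypothetical and not drawn from Mate, ATSC Part 3, or the claims):

```python
# Hypothetical model of a decoder that evaluates timestamp information and sets
# the rendering configuration to the scenario corresponding to that timestamp.
from dataclasses import dataclass

@dataclass
class SceneConfigPacket:
    timestamp: float       # playout time from which this configuration applies
    renderer_config: dict  # information the renderer needs to configure itself

def select_config(packets, playout_time):
    """Return the latest configuration whose timestamp is <= playout_time."""
    applicable = [p for p in packets if p.timestamp <= playout_time]
    if not applicable:
        return None  # e.g., tuned in before any scene configuration packet
    return max(applicable, key=lambda p: p.timestamp).renderer_config

packets = [
    SceneConfigPacket(0.0, {"sources": ["car"], "gain": 1.0}),
    SceneConfigPacket(5.0, {"sources": ["car", "horn"], "gain": 0.8}),
]
print(select_config(packets, 7.0))  # the t=5.0 configuration applies
```

This also illustrates the tune-in case discussed for claims 2-3: a decoder joining mid-stream picks the most recent applicable configuration rather than replaying all earlier updates.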
In addition, Mate teaches on page 13, line 22 to page 15, line 2: “Thus, at this point one may have 1) an audio scene in the bitstream, 2) rendering instructions for dynamic updates also in the bitstream…the renderer 206…may perform the following…1) obtain AudioScene and rendering instructions from the bitstream…” The AudioScene and rendering instructions provided in the bitstream, i.e., different data packets, are interpreted as all the relevant information to configure the renderer. However, Mate fails to explicitly teach the renderer configuration information provides all relevant information for a renderer to configure itself.

ATSC Part 3 teaches “MPEG-H Audio uses a set of static metadata, the ‘Metadata Audio Element’ (MAE), to define an ‘Audio Scene.’…Audio Objects are associated with metadata that contains all information necessary for personalization, interactive reproduction, and rendering in flexible reproduction layouts” (Section 4.2.1, Metadata Audio Elements). Therefore, ATSC Part 3 teaches that the MAE and associated Audio Objects include all the information necessary for rendering. Information for a renderer to configure itself is considered part of the information necessary for rendering. Thus, ATSC Part 3 teaches the renderer configuration information provides all relevant information for a renderer to configure itself.

Mate differs from the claimed invention, as defined in claims 1, 25, and 27, in that Mate fails to explicitly disclose that the renderer configuration information provides all relevant information for a renderer to configure itself. Transmitting the necessary information for personalization, interactive reproduction, and rendering in flexible reproduction layouts of audio scene information is known in the art as evidenced by ATSC Part 3.
Therefore, it would have been obvious to one having ordinary skill in the art to have modified Mate to conform to the ATSC Part 3 MPEG-H standard by providing all the necessary information for the renderer to configure itself in the MAE metadata as taught by ATSC Part 3, as it merely constitutes the combination of known structures/processes to achieve the predictable result of being standard compliant.

Regarding claim 2, the combination of Mate and ATSC Part 3 teaches all of the elements of claim 1 (see detailed element mapping above). In addition, ATSC Part 3 further teaches wherein the audio decoder is configured to evaluate the timestamp information when the audio decoder has missed one or more preceding scene configuration packets of a stream (the “or” makes this limitation optional), or when the audio decoder tunes in to a stream (Page 2, Section B.3.1, Tune-in, teaches “A tune-in happens at a channel change of a receiving device. The audio decoder is able to tune in to a new audio stream at every random access point (RAP)…As defined above, the sync sample contains the configuration information (PACTYP_MPEGH3DACFG and PACTYP_AUDIOSCENEINFO) that is used to initialize the audio decoder. After initialization, the audio decoder reads encoded audio frames (PACTYP_MPEGH3DAFRAME) and decodes them”), and wherein the audio decoder is configured to evaluate the timestamp information when the audio decoder tunes in to a stream (Page 2, Section B.3.1, Tune-in, teaches “…As defined above, the sync sample contains the configuration information (PACTYP_MPEGH3DACFG and PACTYP_AUDIOSCENEINFO) that is used to initialize the audio decoder. After initialization, the audio decoder reads encoded audio frames (PACTYP_MPEGH3DAFRAME) and decodes them”; initializing the decoder and decoding the packets inherently includes evaluating the timestamp information).
Mate differs from the claimed invention, as defined in claim 2, in that Mate fails to explicitly disclose that the decoder is configured to evaluate the timestamp information when the audio decoder tunes in to a stream. Initializing and decoding scene information in response to tuning in to a stream is known in the art as evidenced by ATSC Part 3. Therefore, it would have been obvious to one having ordinary skill in the art to have modified Mate to conform to the ATSC Part 3 MPEG-H standard by configuring the decoder to initialize and decode scene information in response to tuning in to a stream as taught by ATSC Part 3, as it merely constitutes the combination of known structures/processes to achieve the predictable result of being standard compliant.

Regarding claim 3, the combination of Mate and ATSC Part 3 teaches all of the elements of claim 1 (see detailed element mapping above). In addition, ATSC Part 3 further teaches wherein the audio decoder is configured to execute a temporal development of a rendering scene up to a playout time defined by the timestamp information when the audio decoder has missed one or more preceding scene configuration packets of a stream (the “or” makes this element optional), or when the audio decoder tunes in to a stream (Page 2, Section B.3.1, Tune-in, teaches “…As defined above, the sync sample contains the configuration information (PACTYP_MPEGH3DACFG and PACTYP_AUDIOSCENEINFO) that is used to initialize the audio decoder. After initialization, the audio decoder reads encoded audio frames (PACTYP_MPEGH3DAFRAME) and decodes them”; initializing the decoder and decoding the packets is part of executing a temporal development of a rendering scene). Mate differs from the claimed invention, as defined in claim 3, in that Mate fails to explicitly disclose that the decoder is configured to evaluate the timestamp information when the audio decoder tunes in to a stream.
Initializing and decoding scene information in response to tuning in to a stream is known in the art as evidenced by ATSC Part 3. Therefore, it would have been obvious to one having ordinary skill in the art to have modified Mate to conform to the ATSC Part 3 MPEG-H standard by configuring the decoder to initialize and decode scene information in response to tuning in to a stream as taught by ATSC Part 3, as it merely constitutes the combination of known structures/processes to achieve the predictable result of being standard compliant.

Regarding claim 4, the combination of Mate and ATSC Part 3 teaches all of the elements of claim 1 (see detailed element mapping above). In addition, Mate further teaches wherein the audio decoder is configured to acquire a time scale information which is comprised in a packet (See the update table spanning pages 17-18; the time attribute is provided in seconds, and seconds is interpreted as time scale information); and wherein the audio decoder is configured to evaluate the time stamp information using the time scale information (The update table spanning pages 17-18 teaches the audio scene update is performed when the specified time is reached).

Regarding claim 5, the combination of Mate and ATSC Part 3 teaches all of the elements of claim 1 (see detailed element mapping above). In addition, Mate further teaches wherein the audio decoder is configured to determine, in dependence on the timestamp information, which scene objects should be used for the rendering (Page 20, lines 5-10 teaches “In the above, the timestamp can also be a sequence number to enable temporal association with the bitstream content. For example, the renderer loop will apply the dynamic content to the right temporal segment of the bitstream content.
The timestamp is thus used for associating the update message with the appropriate playback timeline.” Using the timestamp to identify the right temporal segment of the bitstream content is interpreted as determining which scene objects should be used or updated for rendering. In addition, page 38, claim 3 teaches “the dynamic content is received as a MPEG-H stream packet” and claim 4 further teaches “the received dynamic content arrives with a timestamp to enable association of the received dynamic content with a playback timeline, or one or more bitstream content time segments”).

Regarding claim 6, the combination of Mate and ATSC Part 3 teaches all of the elements of claim 1 (see detailed element mapping above). In addition, Mate further teaches wherein the audio decoder is configured to evaluate a scene configuration packet, which defines an evolution of a rendering scene starting from a point of time which lies before a time defined by the timestamp information (The update table spanning pages 17 to 18 teaches “The update is performed, when ▪ the specified time is reached,” where the time is specified in seconds and must be less than or equal to the duration attribute of the AudioScene. Therefore, the update is performed after the start and before the end of the AudioScene. Thus, the AudioScene starts before the update time); and wherein the audio decoder is configured to derive a scene configuration associated with a point in time defined by the timestamp information on the basis of the information in the scene configuration packet (Page 20, lines 5-10 teaches “In the above, the timestamp can also be a sequence number to enable temporal association with the bitstream content. For example, the renderer loop will apply the dynamic content to the right temporal segment of the bitstream content. The timestamp is thus used for associating the update message with the appropriate playback timeline”).
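The MHAS packet layout referenced throughout these rejections carries a packet type identifier, a packet label, a packet length, and a payload (ISO/IEC 23008-3, Clause 14). A minimal tag-length-value parse illustrates that structure; note this fixed-width header is a deliberate simplification (real MHAS headers use variable-length escaped fields), so it is a sketch, not the standard's exact syntax:

```python
# Illustrative parse of a simplified MHAS-style packet:
# type (1 byte) | label (1 byte) | length (2 bytes, big-endian) | payload.
import struct

def parse_packet(buf):
    ptype, label, length = struct.unpack_from(">BBH", buf, 0)
    payload = buf[4:4 + length]
    return {"type": ptype, "label": label, "length": length, "payload": payload}

# Hypothetical packet: type 1, label 0, 3-byte payload.
raw = struct.pack(">BBH", 1, 0, 3) + b"cfg"
print(parse_packet(raw))
```

The length field is what lets a decoder skip packet types it does not understand, which is why the four fields matter for extracting scene configuration packets from a multiplexed stream.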
Regarding claim 7, the combination of Mate and ATSC Part 3 teaches all of the elements of claim 6 (see detailed element mapping above). In addition, Mate further teaches wherein the audio decoder is configured to derive the scene configuration associated with a point in time defined by the timestamp information using one or more scene update packets (Page 9, lines 29-31 teaches “the positions of the audio objects may be modified whenever the incoming dynamic update modified the position of the anchor objects”; page 38, claim 3 teaches “the dynamic content is received as a MPEG-H stream packet,” i.e., update packets, and page 38, claim 4 further teaches “the received dynamic content arrives with a timestamp to enable association of the received dynamic content with a playback timeline, or one or more bitstream content time segments”; thus the time information associated with the update packets is used to derive the scene configuration).

Regarding claim 8, the combination of Mate and ATSC Part 3 teaches all of the elements of claim 1 (see detailed element mapping above). In addition, Mate further teaches wherein the scene configuration packets are conformant to a MPEG-I MHAS packet definition (Fig. 1, MPEG-I AUDIO BITSTREAM (MHAS) teaches that the decoder is configured to extract the scene configuration information from MPEG-I MHAS conformant packets).

Regarding claim 9, the combination of Mate and ATSC Part 3 teaches all of the elements of claim 1 (see detailed element mapping above). In addition, Mate further teaches wherein the scene configuration packets each comprise a packet type identifier, a packet label, a packet length information and a packet payload (Fig. 1, MPEG-I AUDIO BITSTREAM (MHAS) teaches that the decoder is configured to extract the scene configuration information from MPEG-I MHAS conformant packets.
In addition, the MPEG-I MHAS bitstream packets defined by Clause 14 of the ISO/IEC 23008-3 standard include a packet type identifier, packet label, packet length, and packet payload, as evidenced by table 219 in the Draft of ISO/IEC 23008-3:201x 3D Audio, Second Edition, dated 10/31/2016, submitted by Applicant in the IDS filed on 08/07/2024).

Regarding claim 10, the combination of Mate and ATSC Part 3 teaches all of the elements of claim 1 (see detailed element mapping above). In addition, Mate further teaches wherein the audio decoder is configured to extract the one or more scene configuration packets from a bitstream comprising a plurality of MPEG-H packets, comprising packets representing one or more audio channels to be rendered (Fig. 1, MPEG-I AUDIO BITSTREAM (MHAS) and CHANNELS OBJECTS HOA, and page 16, lines 3-32 teaches “The anchor object related AudioElements may also be a multi-channel ObjectSource…”).

Regarding claim 11, the combination of Mate and ATSC Part 3 teaches all of the elements of claim 1 (see detailed element mapping above). In addition, Mate further teaches wherein the audio decoder is configured to receive the one or more scene configuration packets via a broadcast stream (Page 11, lines 10-11 teaches “For 6DoF stream or broadcast environments based on (such as MPEG-DASH or MPEG-H MMT)….”).

Regarding claim 12, the combination of Mate and ATSC Part 3 teaches all of the elements of claim 11 (see detailed element mapping above). In addition, ATSC Part 3 further teaches wherein the audio decoder is configured to tune into the broadcast stream and to determine a playout time on the basis of the timestamp of a first scene configuration packet identified by the audio decoder after the tune-in (Page 2, Section B.3.1, Tune-in, teaches “A tune-in happens at a channel change of a receiving device.
The audio decoder is able to tune in to a new audio stream at every random access point (RAP)…As defined above, the sync sample contains the configuration information (PACTYP_MPEGH3DACFG and PACTYP_AUDIOSCENEINFO) that is used to initialize the audio decoder. After initialization, the audio decoder reads encoded audio frames (PACTYP_MPEGH3DAFRAME) and decodes them” initializing the decoder and decoding the packets inherently includes evaluating the timestamp information). Mate differs from the claimed invention, as defined in claim 2, in that Mate fails to explicitly disclose that the decoder is configured to evaluate the timestamp information when the audio decoder tunes in to a stream. Initializing and decoding scene information in response to tuning in to a stream is known in the art as evidenced by ATSC Part 3. Therefore, it would have been obvious to one having ordinary skill in the art to have modified Mate to conform to the ATSC Part 3 MPEG-H standard by configuring the decoder to initialize and decode scene information in response to tuning into a stream as taught by ATSC Part 3, as it merely constitutes the combination of known structures/processes to achieve the predictable result of being standard compliant.

Regarding claims 13, 26 and 28, Mate teaches an apparatus (Fig. 2, encoder 202), method, and non-transitory digital storage medium having a computer program stored thereon to perform a method for providing an encoded audio representation wherein the apparatus is configured to provide an information for a spatial rendering of one or more audio signals (Fig. 6, MPEG-I AUDIO ENCODER and MPEG-I AUDIO RENDERER); wherein the apparatus is configured to provide a plurality of packets of different packet types, the packets comprising a plurality of scene configuration packets and a timestamp information (Fig. 6 teaches the Scene Description and Audio signal are encoded in MPEG-I audio encoder.
In addition, the MPEG-I MHAS bitstream includes a plurality of packet types defined by Clause 14 of the ISO/IEC 23008-3 standard as evidenced by the Draft of ISO/IEC 23008-3:201x 3D Audio, Second Edition, dated 10/31/2016 submitted by Applicant in the IDS filed on 08/07/2024), wherein the scene configuration packets provide a renderer configuration information (Fig. 6 teaches the Scene Description and Audio signal are encoded in MPEG-I audio encoder. In addition, the MPEG-I packet types include PACTYP_MPEGH3DACFG and PACTYP_AUDIOSCENEINFO as defined by Clause 14, table 220 of the ISO/IEC 23008-3 standard as evidenced by the Draft of ISO/IEC 23008-3:201x 3D Audio, Second Edition, dated 10/31/2016 submitted by Applicant in the IDS filed on 08/07/2024. As discussed above with respect to the MPEG-I MHAS decoder, the plurality of packet types includes scene configuration packets) wherein the renderer configuration information defines a temporal evolution of a rendering scenario (Fig. 6 teaches the Scene Description and Audio signal are encoded in MPEG-I audio encoder. As discussed above regarding the MPEG-I decoder, the encoded information includes temporal evolution information. See page 33, lines 30-35 of the Specification: “temporal evolution of a rendering scenario” is interpreted as information “defining a usage of scene objects” or information “defining when or under which conditions different scene objects…should be used in a rendering process”. Mate page 17, line 9 to page 18, line 5 teaches “The currently specified updates may be done based on a predetermined timestamp…which describes Scene Updates with the declaration part in a scene.xml file may be followed by any number of <Update> nodes.
They have the following syntax:…Attribute time…Time when update is performed (seconds) Note: Must be less than or equal to the duration attribute of the AudioScene…” and page 18, line 6 to page 19, line 29 teaches “The following updates synchronously move three ObjectSources of a vehicle in motion along a trajectory…The following example turns on the sources of a car when the listener gets close…The scene loops at the rate of the scene duration as specified in the AudioScene attribute. Timed updates are triggered for every loop of the scene” The AudioScene attribute duration and Timed Updates are interpreted as information defining temporal evolution of a rendered scene object). In addition, Mate teaches on page 13, line 22 to page 15, line 2 “Thus, at this point one may have 1) an audio scene in the bitstream, 2) rendering instructions for dynamic updates also in the bitstream…the renderer 206…may perform the following…1) obtain AudioScene and rendering instructions from the bitstream…” The AudioScene and rendering instructions provided in the bitstream, i.e., different data packets, are interpreted as all the relevant information to configure the renderer. However, Mate fails to explicitly teach the renderer configuration information provides all relevant information for a renderer to configure itself. ATSC Part 3 teaches “MPEG-H Audio uses a set of static metadata, the ‘Metadata Audio Element’ (MAE), to define an ‘Audio Scene.’…Audio Objects are associated with metadata that contains all information necessary for personalization, interactive reproduction, and rendering in flexible reproduction layouts” (Section 4.2.1 Metadata Audio Elements). Therefore, ATSC teaches the MAE and associated Audio Objects include all the information necessary for rendering. Information for a renderer to configure itself is considered part of the information necessary for rendering.
Thus, ATSC Part 3 teaches the renderer configuration information provides all relevant information for a renderer to configure itself. Mate differs from the claimed invention, as defined in claims 13, 26, and 28, in that Mate fails to explicitly disclose that the renderer configuration information provides all relevant information for a renderer to configure itself. Transmitting the necessary information for personalization, interactive reproduction, and rendering in flexible reproduction layouts of audio scene information is known in the art as evidenced by ATSC Part 3. Therefore, it would have been obvious to one having ordinary skill in the art to have modified Mate to conform to the ATSC Part 3 MPEG-H standard by providing all the necessary information for the renderer to configure itself in the MAE metadata as taught by ATSC Part 3, as it merely constitutes the combination of known structures/processes to achieve the predictable result of being standard compliant.

Regarding claim 14, the combination of Mate and ATSC Part 3 teaches all of the elements of claim 13 (see detailed element mapping above). In addition, Mate further teaches wherein the apparatus is configured to provide, in one of the packets, a time scale information (See update table spanning pages 17-18, the time attribute is provided in seconds; seconds is interpreted as time scale information), wherein the time stamp information is provided in a representation related to the time scale information (The update table spanning pages 17-18 teaches the audio scene update is performed when the specified time is reached).

Regarding claim 15, the combination of Mate and ATSC Part 3 teaches all of the elements of claim 13 (see detailed element mapping above). In addition, Mate further teaches wherein the apparatus is configured to provide the scene configuration packets such that the scene configuration packets are conformant to a MPEG-H MHAS packet definition (Fig.
6 teaches the Scene Description and Audio signal are encoded in MPEG-I audio encoder).

Regarding claim 16, the combination of Mate and ATSC Part 3 teaches all of the elements of claim 13 (see detailed element mapping above). In addition, Mate further teaches wherein the apparatus is configured to provide the scene configuration packets such that the scene configuration packets each comprise a packet type identifier, a packet label, a packet length information and a packet payload (Fig. 6 teaches the Scene Description and Audio signal are encoded in MPEG-I audio encoder. In addition, the MPEG-I MHAS bitstream packets defined by Clause 14 of the ISO/IEC 23008-3 standard include packet type identifier, packet label, packet length, and packet payload as evidenced by table 219 in the Draft of ISO/IEC 23008-3:201x 3D Audio, Second Edition, dated 10/31/2016 submitted by Applicant in the IDS filed on 08/07/2024).

Regarding claim 17, the combination of Mate and ATSC Part 3 teaches all of the elements of claim 13 (see detailed element mapping above). In addition, Mate further teaches wherein the apparatus is configured to provide a bitstream comprising a plurality of MPEG-H packets, comprising packets representing one or more audio channels to be rendered and the one or more scene configuration packets (Fig. 6 teaches the Scene Description and Audio signal are encoded in MPEG-I audio encoder. In addition, Fig. 1, MPEG-I Audio BITSTREAM (MHAS) and CHANNELS OBJECTS HOA and page 16, lines 3-32 teaches “The anchor object related AudioElements may also be a multi-channel ObjectSource…” therefore, the encoder must be able to generate packets representing one or more audio channels to be rendered).

Regarding claim 18, the combination of Mate and ATSC Part 3 teaches all of the elements of claim 13 (see detailed element mapping above).
In addition, Mate further teaches wherein the apparatus is configured to provide a bitstream comprising a plurality of MPEG-H packets, comprising packets representing one or more audio channels to be rendered and the one or more scene configuration packets in an interleaved manner (Fig. 6 teaches the Scene Description and Audio signal are encoded in MPEG-I audio encoder and Fig. 1, MPEG-I Audio BITSTREAM (MHAS) and CHANNELS OBJECTS HOA and page 16, lines 3-32 teaches “The anchor object related AudioElements may also be a multi-channel ObjectSource…” Interleaving of the plurality of packet types is inherent in the MPEG-H standard).

Regarding claim 23, the combination of Mate and ATSC Part 3 teaches all of the elements of claim 13 (see detailed element mapping above). In addition, Mate further teaches wherein the apparatus is configured to adapt the timestamp information to a playout time (Page 20, lines 5-10 teaches “In the above, the timestamp can also be a sequence number to enable temporal association with the bitstream content. For example, the renderer loop will apply the dynamic content to the right temporal segment of the bitstream content. The timestamp is thus used for associating the update message with the appropriate playback timeline” Using the timestamp to apply the dynamic content to the right temporal segment of the bitstream content is interpreted as determining which scene objects should be used or updated for rendering. In addition, page 38, claim 3 teaches “the dynamic content is received as a MPEG-H stream packet” and claim 4 further teaches “the received dynamic content arrives with a timestamp to enable association of the received dynamic content with a playback timeline, or one or more bitstream content time segments”).

Regarding claim 24, the combination of Mate and ATSC Part 3 teaches all of the elements of claim 13 (see detailed element mapping above).
In addition, Mate further teaches wherein the apparatus is configured to adapt the timestamp information to a playout time of rendering scene information comprised in packets which are provided by the apparatus in a temporal environment of a respective scene configuration packet in which the respective timestamp information is comprised (Page 20, lines 5-10 teaches “In the above, the timestamp can also be a sequence number to enable temporal association with the bitstream content. For example, the renderer loop will apply the dynamic content to the right temporal segment of the bitstream content. The timestamp is thus used for associating the update message with the appropriate playback timeline” Using the timestamp to apply the dynamic content to the right temporal segment of the bitstream content is interpreted as determining which scene objects should be used or updated for rendering. In addition, page 38, claim 3 teaches “the dynamic content is received as a MPEG-H stream packet” and claim 4 further teaches “the received dynamic content arrives with a timestamp to enable association of the received dynamic content with a playback timeline, or one or more bitstream content time segments”).

Allowable Subject Matter

Claims 19-22 would be allowable if rewritten to overcome the rejection(s) under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), 2nd paragraph, set forth in this Office action and to include all of the limitations of the base claim and any intervening claims.

Regarding claim 19, the combination of Mate and ATSC Part 3 teaches all of the elements of claim 13 (see detailed element mapping above). However, the prior art of record fails to disclose or suggest wherein the apparatus is configured to periodically repeat the plurality of scene configuration packets.

Regarding claim 20, the combination of Mate and ATSC Part 3 teaches all of the elements of claim 13 (see detailed element mapping above).
However, the prior art of record fails to disclose or suggest wherein the apparatus is configured to periodically repeat the plurality of scene configuration packets, with one or more scene payload packets and one or more packets representing one or more audio channels to be rendered in between two subsequent scene configuration packets.

Regarding claim 21, the combination of Mate and ATSC Part 3 teaches all of the elements of claim 13 (see detailed element mapping above). However, the prior art of record fails to disclose or suggest wherein the apparatus is configured to periodically repeat the scene configuration packet, with one or more packets representing one or more audio channels to be rendered in between two subsequent scene configuration packets; and wherein the apparatus is configured to provide one or more scene payload packets at request.

Regarding claim 22, the combination of Mate and ATSC Part 3 teaches all of the elements of claim 13 (see detailed element mapping above). However, the prior art of record fails to disclose or suggest wherein the apparatus is configured to provide a plurality of otherwise identical scene configuration packets differing in the timestamp information.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to PENNY L CAUDLE whose telephone number is (703)756-1432. The examiner can normally be reached M-Th 8:00 am to 5:00 pm eastern. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn, can be reached at 571-272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /PENNY L CAUDLE/Examiner, Art Unit 2657 /DANIEL C WASHBURN/Supervisory Patent Examiner, Art Unit 2657
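The MHAS packet header fields cited throughout the rejection (packet type identifier, packet label, packet length) are carried with the variable-length escapedValue() coding of ISO/IEC 23008-3 clause 14. As a rough illustration only, not quoted from this record: the field widths and helper names below are our recollection of the published spec, so treat them as assumptions rather than a definitive implementation.

```python
def read_bits(bits, pos, n):
    """Read n bits, MSB first, from a '0'/'1' string; return (value, new pos)."""
    return int(bits[pos:pos + n], 2), pos + n

def escaped_value(bits, pos, n1, n2, n3):
    """escapedValue() coding: small values fit in n1 bits; an all-ones
    pattern escapes to n2 additional bits, and again to n3 more."""
    value, pos = read_bits(bits, pos, n1)
    if value == (1 << n1) - 1:
        extra, pos = read_bits(bits, pos, n2)
        value += extra
        if extra == (1 << n2) - 1:
            extra, pos = read_bits(bits, pos, n3)
            value += extra
    return value, pos

def parse_mhas_header(bits, pos=0):
    """Parse one MHAS packet header: type, label, payload length (in bytes).
    Field widths assumed per ISO/IEC 23008-3 clause 14 as we recall them."""
    ptype, pos = escaped_value(bits, pos, 3, 8, 8)     # MHASPacketType
    label, pos = escaped_value(bits, pos, 2, 8, 32)    # MHASPacketLabel
    length, pos = escaped_value(bits, pos, 11, 24, 24) # MHASPacketLength
    return {"type": ptype, "label": label, "length": length}, pos
```

For example, a header with type 1, label 1, and a 5-byte payload would be the bit string '001' + '01' + '00000000101'; scene configuration packets such as PACTYP_MPEGH3DACFG would be distinguished by the decoded type value.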

Prosecution Timeline

May 09, 2024
Application Filed
Mar 19, 2026
Non-Final Rejection — §103, §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12592243
METHOD AND ELECTRONIC DEVICE FOR PERSONALIZED AUDIO ENHANCEMENT
2y 5m to grant Granted Mar 31, 2026
Patent 12573371
VOCABULARY SELECTION FOR TEXT PROCESSING TASKS USING POWER INDICES
2y 5m to grant Granted Mar 10, 2026
Patent 12566924
Apparatus for Evaluating and Improving Response, Method and Computer Readable Recording Medium Thereof
2y 5m to grant Granted Mar 03, 2026
Patent 12567433
AUTOMATED EVALUATION OF SYNTHESIZED SPEECH USING CROSS-MODAL AND CROSS-LINGUAL TRANSFER OF LANGUAGE ENCODING
2y 5m to grant Granted Mar 03, 2026
Patent 12554937
FEW SHOT INCREMENTAL LEARNING FOR NAMED ENTITY RECOGNITION
2y 5m to grant Granted Feb 17, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

1-2
Expected OA Rounds
67%
Grant Probability
82%
With Interview (+15.5%)
3y 2m
Median Time to Grant
Low
PTA Risk
Based on 69 resolved cases by this examiner. Grant probability derived from career allow rate.
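The headline projections above follow directly from the career stats shown. A minimal sketch, assuming the displayed figures combine as simply as the labels suggest (base rate = grants divided by resolved cases, interview-adjusted rate = base plus the lift in percentage points, rounded to whole percent):

```python
def career_allow_rate(granted, resolved):
    """Career allow rate as a percentage, e.g. 46 granted / 69 resolved."""
    return 100.0 * granted / resolved

base = career_allow_rate(46, 69)        # examiner's resolved cases above
interview_lift = 15.5                   # displayed lift, percentage points
with_interview = base + interview_lift  # assumed additive, per the labels

print(round(base), round(with_interview))  # -> 67 82
```

This reproduces the 67% grant probability and the 82% with-interview figure shown in the projections panel.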
