DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant’s arguments have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim(s) 2-4, 6-7, 12, 14, 23-25, 30, 32 and 33 is/are rejected under 35 U.S.C. 103 as being unpatentable over Brandenburg (US 20180098173 A1), and further in view of Seefeldt (US 20220360899 A1).
Regarding claim 32, Brandenburg (US 20180098173 A1) discloses an apparatus, for six degrees of freedom rendering, the apparatus comprising:
at least one processor (Brandenburg, ¶ [0060]: “a processor, preferably a microprocessor”); and
at least one memory storing instructions that, when executed with the at least one processor (Brandenburg, ¶ [0013]: “a computer readable storage medium …store a program for use by or in connection with an instruction execution system, apparatus, or device.”), cause the apparatus to:
generate one or more audio signal content sets that provide a plurality of audio sources of at least one audio scene (Brandenburg, ¶ [0142]: “may be static (listener in the center of a 5.1, 7.1 or 22.2 set-up) or dynamic (e.g. head phones where the listener can turn his head)... Scenarios including dynamic spatial listener information”);
obtain positions of two or more audio sources of the plurality of audio sources in the at least one audio scene (Brandenburg, Fig. 1A; ¶ [0083]: “Audio objects provide a spatial description of audio data, including parameters such as the audio source position (using e.g. 3D coordinates) in a multi-dimensional space (e.g. 2D or 3D space), audio source dimensions, audio source directionality, etc.”);
determine two or more subsets (clusters) of the two or more audio sources in the at least one audio scene (Brandenburg, Fig. 1A, items 1121, 1122, O1, O2, O3, O4, O5, O6);
obtain at least one listener position in the at least one audio scene (Brandenburg, ¶ [0021]: “determining spatial listener information, the spatial listener information including one or more listener positions and/or listener orientations of one or more listeners in a three dimensional (3D) space, the 3D space defining an audio space”).
However, Brandenburg fails to disclose selecting at least one subset of the two or more subsets of audio sources based on the at least one listener position and audio source positions for determining spatial audio output.
In an analogous field of endeavor, Seefeldt (US 20220360899 A1) discloses selecting at least one subset of the two or more subsets of audio sources based on the at least one listener position and audio source positions for determining spatial audio output (Seefeldt, ¶ [0055]: “the audio data may be based on spatial zones, each of the spatial zones corresponding to a subset of the listening environment.” ¶ [0171]: “the one or more dynamically configurable functions may be based on proximity of loudspeakers to one or more listeners; proximity of loudspeakers to an attracting force position”).
Therefore, it would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention to combine Seefeldt with Brandenburg to select a subset of speakers depending on the zone to which the user is moving, in order to provide better sound.
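For illustration only, the "select at least one subset" limitation as mapped above can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Brandenburg, Seefeldt, or the application: the names `AudioSource`, `centroid`, and `select_subset`, the cluster labels, and the nearest-centroid rule are all hypothetical choices made to show one way a subset could be picked from a listener position and audio source positions.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioSource:
    """Hypothetical audio source with a 3D position (x, y, z)."""
    name: str
    position: tuple


def centroid(sources):
    """Mean position of a non-empty list of audio sources."""
    n = len(sources)
    return tuple(sum(s.position[i] for s in sources) / n for i in range(3))


def select_subset(subsets, listener_position):
    """Return the label of the subset whose centroid is nearest the listener.

    Nearest-centroid selection is an assumption made for this sketch;
    the cited references do not prescribe this rule.
    """
    return min(
        subsets,
        key=lambda label: math.dist(centroid(subsets[label]), listener_position),
    )


# Two hypothetical subsets (clusters) of audio sources in one audio scene.
subsets = {
    "cluster_a": [AudioSource("O1", (0.0, 0.0, 0.0)), AudioSource("O2", (1.0, 0.0, 0.0))],
    "cluster_b": [AudioSource("O5", (10.0, 0.0, 0.0)), AudioSource("O6", (11.0, 0.0, 0.0))],
}

selected = select_subset(subsets, (9.0, 0.0, 0.0))
print(selected)
```

In this sketch the listener position and the source positions are the only inputs to the selection, which mirrors the structure of the limitation; any real renderer would likely add further criteria.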
Regarding claim 12, the combination of Brandenburg and Seefeldt discloses all the limitations of claim 32.
Brandenburg further discloses, wherein the instructions, when executed with the at least one processor (Brandenburg, ¶ [0062]: “the processor is configured to perform executable operations”), cause the apparatus to at least one of: create a manifest storing the associations between the positions and the subsets of the one or more audio signal content sets (Brandenburg, ¶ [0062]: “storage medium comprises a manifest file comprising audio object metadata, including audio object identifiers… audio object comprising audio data associated with a position in the audio space and an aggregated audio object comprising aggregated audio data of at least a part of the atomic audio objects defined in the manifest file”); or enable the manifest to be accessed with a rendering device.
Regarding claim 14, Brandenburg discloses a method comprising:
generating one or more audio signal content sets that provide a plurality of audio sources of one or more spatial audio scenes (Brandenburg, ¶ [0142]); and
obtaining positions of two or more audio sources (Brandenburg, ¶ [0021]) of the plurality of audio sources in which the one or more spatial audio scenes are audible to a listener (Brandenburg, ¶ [0142]) enabling six degrees of freedom of movement (Brandenburg, ¶ [0021]).
However, Brandenburg fails to disclose determining two or more subsets of the two or more audio sources in the one or more spatial audio scenes; obtaining at least one listener position in the one or more spatial audio scenes; and selecting at least one subset of the two or more subsets of the two or more audio sources based on the at least one listener position and the positions of the two or more audio sources for determining spatial audio output; wherein the method enables six degrees of freedom rendering.
In an analogous field of endeavor, Seefeldt discloses determining two or more subsets of the two or more audio sources (loudspeakers) in the one or more spatial audio scenes (Seefeldt, ¶ [0169]: “the plurality of loudspeakers separately for each of the one or more spatial zones may be based, at least in part, on a loudspeaker participation value for each loudspeaker in each of the one or more spatial zones”; ¶ [0055]: “the audio data may be based on spatial zones, each of the spatial zones corresponding to a subset of the listening environment” (scene));
obtaining at least one listener position in the one or more spatial audio scenes (Seefeldt, ¶ [0176]: “the audio signal is heard by the listener near its intended spatial position”); and
selecting at least one subset of the two or more subsets of the two or more audio sources based on the at least one listener position and the positions of the two or more audio sources for determining spatial audio output (Seefeldt, ¶ [0055]: “the audio data may be based on spatial zones, each of the spatial zones corresponding to a subset of the listening environment.” ¶ [0171]: “the one or more dynamically configurable functions may be based on proximity of loudspeakers to one or more listeners”); wherein
the method enables six degrees of freedom rendering (Seefeldt, ¶ [0029]: “the audio data may be based on spatial zones (3D or 6DoF), each of the spatial zones corresponding to a subset of the listening environment”).
Therefore, it would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention to combine Seefeldt with Brandenburg to select a subset of speakers depending on the zone to which the user is moving, in order to provide better sound.
Regarding claim 30, the combination of Brandenburg and Seefeldt discloses all the limitations of claim 14.
Seefeldt further discloses a non-transitory program storage device readable with an apparatus, tangibly embodying a program of instructions executable with the apparatus for performing the method of claim 14 (Seefeldt, ¶ [0024]).
Regarding claim 33, the combination of Brandenburg and Seefeldt discloses an apparatus as claimed in claim 32.
Seefeldt further discloses, wherein the determining spatial audio output comprises determining one or more spatial metadata parameters (Seefeldt, ¶ [0026]: “the spatial data includes channel data and/or spatial metadata.”).
Regarding claims 2 and 23, the combination of Brandenburg and Seefeldt discloses all the limitations of claims 32 and 14, respectively.
Brandenburg further discloses, wherein the one or more audio signal content sets comprise at least one of: two or more audio channels (Brandenburg, Fig. 1A, item 109; ¶ [0083]); or one or more audio channels and metadata corresponding to the one or more audio channels.
Regarding claims 3 and 24, the combination of Brandenburg and Seefeldt discloses all the limitations of claims 2 and 14, respectively.
Brandenburg further discloses, wherein the instructions, when executed with the at least one processor, provide the one or more audio signal content sets in data structures comprising audio channel data and associated rendering metadata (Brandenburg, ¶ [0016]: “program instructions may be provided to a processor” and ¶ [0025]: “audio object metadata, including audio object identifiers and positions of the audio objects in audio space may be provided to the audio client in a data structure”).
Regarding claims 4 and 25, the combination of Brandenburg and Seefeldt discloses all the limitations of claims 3 and 14, respectively.
Brandenburg further discloses, wherein a plurality of data structures is provided in a track grouping with metadata (Brandenburg, Fig. 6, item 604, 608, 610, 612; ¶ [0138]: “A manifest file generator may generate a data structure referred to as a manifest file (MF) comprising audio object metadata including audio object identifiers or information for determining audio object identifiers for signaling an audio client which audio objects are available for retrieval by an audio client.”) associating the data structures to one or more determined positions (Brandenburg, ¶ [0191]: “associating a coordinate metadata track with audio tracks of different resolutions”).
Regarding claim 6, the combination of Brandenburg and Seefeldt discloses all the limitations of claim 32.
Brandenburg further discloses, wherein the subsets of the one or more audio signal content sets cover a plurality of areas within which a listener associated with the at least one listener position moves, and the size of the areas covered with the subsets of the one or more audio signal content sets is determined with one or more factors comprising speed of movement of the listener (Brandenburg, Figs 1A-1C; ¶ [0083], ¶ [0106]).
Regarding claim 7, the combination of Brandenburg and Seefeldt discloses all the limitations of claim 6.
Brandenburg further discloses, wherein the size of areas covered with one or more subsets of the one or more audio signal content sets is configured to change so that different sized areas are covered at different times (Brandenburg, Fig. 1C, items 118, 119; ¶ [0106]).
Claim(s) 8-9, 11 and 13 is/are rejected under 35 U.S.C. 103 as being unpatentable over Brandenburg (US 20180098173 A1), in view of Seefeldt (US 20220360899 A1), and further in view of Murtaza (US 20200278828 A1).
Regarding claim 8, the combination of Brandenburg and Seefeldt discloses all the limitations of claim 32.
However, the combination of Brandenburg and Seefeldt fails to disclose, wherein the one or more audio signal content sets comprise higher order ambisonic source data, and wherein the higher order ambisonic source data comprises one or more sets of multi-channel audio signals or one or more sets of metadata.
In an analogous field of endeavor, Murtaza (US 20200278828 A1) discloses one or more audio signal content sets comprise higher order ambisonic source data, and wherein the higher order ambisonic source data comprises one or more sets of multi-channel audio signals or one or more sets of metadata (Murtaza, ¶ [0016]: “audio signals that may be represented for example as audio objects, audio channels, scene based audio (Higher Order Ambisonics—HOA), or any combination of all.” And ¶ [0406-0410]).
Therefore, it would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention to add Murtaza to the combination of Brandenburg and Seefeldt to allow correct rendering of the encapsulated audio scenes.
Regarding claim 9, the combination of Brandenburg and Seefeldt discloses all the limitations of claim 32.
However, the combination of Brandenburg and Seefeldt fails to disclose, wherein the at least one subset of the two or more subsets of audio signal content sets that are associated with a position comprises at least one of: audio signal content that enables rendering of the spatial audio scene with a predetermined quality level; or audio signal content that enables rendering of spatial audio scenes at listener positions close to the determined position.
In an analogous field of endeavor, Murtaza discloses at least one subset of the two or more subsets of audio signal content sets that are associated with a position comprises at least one of: audio signal content that enables rendering of the spatial audio scene with a predetermined quality level; or audio signal content that enables rendering of spatial audio scenes at listener positions close to the determined position (Murtaza, Fig. 1a, ¶ [0155]: “relevant streams (e.g., streams coming from audio objects within the current environment) may be delivered by the server system 120 to the client system 102 at the highest bitrate and/or highest quality level (as a consequence of the fact that the less relevant streams are at lower bitrate and or quality level, hence leaving free band for the more relevant streams)”).
Therefore, it would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention to add Murtaza to the combination of Brandenburg and Seefeldt to provide three-dimensional audio via speakers at the listener’s position.
Regarding claim 11, the combination of Brandenburg and Seefeldt discloses all the limitations of claim 32.
However, the combination of Brandenburg and Seefeldt fails to disclose, wherein the instructions, when executed with the at least one processor, cause the apparatus to determine the plurality of obtain the positions with dividing the listening space into a plurality of subspaces such that a subset of the one or more audio signal content sets is associated with a subspace.
In an analogous field of endeavor, Murtaza discloses wherein the instructions, when executed with the at least one processor (Murtaza, ¶ [0102]: “the system comprises a metadata processor configured to manipulate the metadata”), cause the apparatus to determine the plurality of obtain the positions with dividing the listening space into a plurality of subspaces such that a subset of the one or more audio signal content sets is associated with a subspace (Murtaza, ¶ [0097]: “the system may be configured to request and/or obtain the streams on the basis of the user's orientation and/or user's direction of movement”, creating subspaces, ¶ [0406-0407]).
Therefore, it would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention to add Murtaza to the combination of Brandenburg and Seefeldt to provide a system for a virtual reality environment configured to receive video and audio streams to be reproduced in a media device.
Regarding claim 13, the combination of Brandenburg and Seefeldt discloses all the limitations of claim 32.
However, the combination of Brandenburg and Seefeldt fails to disclose, wherein the instructions, when executed with the at least one processor, cause the apparatus to provide the one or more audio signal content sets in an adaptation set comprising a plurality of audio signal content sets and metadata associating the one or more audio signal content sets with one or more positions.
In an analogous field of endeavor, Murtaza discloses the instructions, when executed with the at least one processor, cause the apparatus to provide the one or more audio signal content sets in an adaptation set comprising a plurality of audio signal content sets and metadata associating the one or more audio signal content sets with one or more positions (Murtaza, ¶ [0044]: “adaptation sets to be delivered to the client, the audio streams and/or audio elements and/or adaptation sets being associated to at least one audio scene” and ¶ [0047]: “the request being associated to at least the user's current viewport and/or head orientation and/or movement data and/or interaction metadata and/or virtual positional data and to an audio scene associated to the environment”).
Therefore, it would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention to add Murtaza to the combination of Brandenburg and Seefeldt to provide a system configured to request at least one audio stream and/or one audio element of an audio stream and/or one adaptation set to a server on the basis of at least the user's current viewport and/or head orientation and/or movement data and/or interaction metadata and/or virtual positional data.
Claim(s) 26, 28-29 and 31 is/are rejected under 35 U.S.C. 103 as being unpatentable over Brandenburg (US 20180098173 A1), and further in view of Murtaza (US 20200278828 A1).
Regarding claim 26, Brandenburg discloses an apparatus, comprising:
at least one processor (Brandenburg, ¶ [0060]: “a processor, preferably a microprocessor”); and at least one non-transitory memory storing instructions that, when executed with the at least one processor (Brandenburg, ¶ [0060]: “a computer readable storage medium having computer readable program code embodied therewith”), cause the apparatus at least to:
obtain information declaring available audio signal content sets that provide one or more spatial audio scenes;
determine a plurality of positions in which the one or more spatial audio scenes are audible to a listener (Brandenburg, ¶ [0142]: “The spatial listener location may be determined in two aspects, namely relative to the loudspeaker configuration and relative in the audio scene.”), wherein six degrees of freedom of movement is enabled (Brandenburg, ¶ [0021]: “the spatial listener information including one or more listener positions and/or listener orientations of one or more listeners in a three dimensional (3D) space, the 3D space defining an audio space”; a 3D space inherently has six degrees of freedom).
However, Brandenburg fails to disclose associate a subset of the audio signal content sets with the determined plurality of positions such that a first subset of the audio signal content sets is associated with a first position and a second subset of the audio signal content sets is associated with a second position; and retrieve audio signal content for rendering such that when the listener is at the first position the first subset of the audio signal content sets is retrieved and when the listener is at the second position the second subset of the audio signal content sets is retrieved.
In an analogous field of endeavor, Murtaza (US 20200278828 A1) discloses associating a subset of the audio signal content sets with the determined plurality of positions such that a first subset of the audio signal content sets is associated with a first position and a second subset of the audio signal content sets is associated with a second position; and retrieving audio signal content for rendering such that when the listener is at the first position the first subset of the audio signal content sets is retrieved and when the listener is at the second position the second subset of the audio signal content sets is retrieved (Murtaza, Fig. 3, case 2; ¶ [0213-0222]); wherein the apparatus enables six degrees of freedom rendering (Brandenburg, ¶ [0021]: “the spatial listener information including one or more listener positions and/or listener orientations of one or more listeners in a three dimensional (3D) space, the 3D space defining an audio space”; a 3D space inherently has six degrees of freedom).
Therefore, it would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention to combine Murtaza with Brandenburg so that, when the listener “jumps” from one scene to the next, the audio content also changes (e.g., audio sources which are not audible in one scene may become audible in the next scene).
Regarding claim 28, Brandenburg discloses a method for six degrees of freedom (Brandenburg, ¶ [0021]) rendering, comprising:
obtaining information declaring available audio signal content sets that provide one or more spatial audio scenes (Brandenburg, ¶ [0142]);
determining a plurality of positions in which the one or more spatial audio scenes are audible to a listener (Brandenburg, ¶ [0142]), wherein
six degrees of freedom of movement is enabled (Brandenburg, ¶ [0021]).
However, Brandenburg fails to disclose associating a subset of the audio signal content sets with the determined plurality of positions such that a first subset of the audio signal content sets is associated with a first position and a second subset of the audio signal content sets is associated with a second position; and retrieving audio content for rendering such that when the listener is at the first position the first subset of the audio signal content sets is retrieved and when the listener is at the second position the second subset of the audio signal content sets is retrieved.
In an analogous field of endeavor, Murtaza (US 20200278828 A1) discloses associating a subset of the audio signal content sets with the determined plurality of positions such that a first subset of the audio signal content sets is associated with a first position and a second subset of the audio signal content sets is associated with a second position; and retrieving audio content for rendering such that when the listener is at the first position the first subset of the audio signal content sets is retrieved and when the listener is at the second position the second subset of the audio signal content sets is retrieved (Murtaza, Fig. 3, case 2; ¶ [0213-0222]).
Therefore, it would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention to combine Murtaza with Brandenburg so that, when the listener “jumps” from one scene to the next, the audio content also changes (e.g., audio sources which are not audible in one scene may become audible in the next scene).
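For illustration only, the "associating" and "retrieving" limitations addressed above can be sketched as a lookup from determined positions to subsets of content sets. This is a minimal sketch under stated assumptions and is not taken from Brandenburg, Murtaza, or the application: the `ASSOCIATIONS` table, the content set identifiers, the nearest-position rule, and the `fake_fetch` stand-in for a network request are all hypothetical.

```python
import math

# Hypothetical association of determined positions (x, y, z) with
# subsets of audio signal content set identifiers.
ASSOCIATIONS = {
    (0.0, 0.0, 0.0): {"set_1", "set_2"},  # first position -> first subset
    (5.0, 0.0, 0.0): {"set_2", "set_3"},  # second position -> second subset
}


def nearest_position(listener_position, associations):
    """Return the determined position closest to the listener (assumed rule)."""
    return min(associations, key=lambda p: math.dist(p, listener_position))


def retrieve_for_listener(listener_position, associations, fetch):
    """Retrieve only the content sets associated with the nearest position."""
    position = nearest_position(listener_position, associations)
    return {set_id: fetch(set_id) for set_id in sorted(associations[position])}


def fake_fetch(set_id):
    """Stand-in for a real download; returns placeholder data."""
    return f"<audio data for {set_id}>"


at_first = retrieve_for_listener((0.2, 0.0, 0.0), ASSOCIATIONS, fake_fetch)
at_second = retrieve_for_listener((4.8, 0.0, 0.0), ASSOCIATIONS, fake_fetch)
print(sorted(at_first), sorted(at_second))
```

The sketch retrieves a different subset at each of the two positions, and a content set may belong to more than one subset, as `set_2` does here.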
Regarding claim 29, the combination of Brandenburg and Murtaza discloses all the limitations of claim 28.
Murtaza further discloses a method, wherein when the audio signal content is provided for rendering to the listener, the audio signal content that is retrieved is restricted to the subset of the audio signal content sets that is associated with the position of the listener (Murtaza, Fig. 6, items s1, s2, viewports; ¶ [0172]: “the rendering of each audio source is adapted to the user position”).
Regarding claim 31, the combination of Brandenburg and Murtaza discloses all the limitations of claim 28.
Murtaza further discloses a non-transitory program storage device readable with an apparatus, tangibly embodying a program of instructions executable with the apparatus for performing the method of claim 28 (Murtaza, ¶ [0068] and ¶ [0438]).
Claim(s) 27 is/are rejected under 35 U.S.C. 103 as being unpatentable over Brandenburg (US 20180098173 A1), in view of Murtaza (US 20200278828 A1), and further in view of Seefeldt (US 20220360899 A1).
Regarding claim 27, the combination of Brandenburg and Murtaza discloses all the limitations of claim 26.
However, the combination of Brandenburg and Murtaza fails to disclose wherein when the audio signal content is provided for rendering to the listener, the audio signal content that is retrieved is restricted to the subset of the audio signal content sets that is associated with the position of the listener.
In an analogous field of endeavor, Seefeldt discloses that, when the audio signal content is provided for rendering to the listener, the audio signal content that is retrieved is restricted to the subset of the audio signal content sets that is associated with the position of the listener (Seefeldt, ¶ [0181]: “based on the set of loudspeaker positions with respect to the listener position…This construction yields an optimal set of speaker activations that is sparse, where only speakers in close proximity to the desired audio signal's position are significantly activated, and practically results in a spatial reproduction of the audio signal that is perceptually more robust to listener movement around the set of speakers.”).
Therefore, it would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention to add Seefeldt to the combination of Brandenburg and Murtaza to adjust the speaker loudness and signal content according to the position of the listener in a three-dimensional space.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to FRIEDRICH FAHNERT whose telephone number is (571)270-7797. The examiner can normally be reached 7:00 am-4:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, CAROLYN EDWARDS can be reached at (571)270-7136. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/CAROLYN R EDWARDS/Supervisory Patent Examiner, Art Unit 2692
/FRIEDRICH FAHNERT/
Examiner
Art Unit 2692