DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant’s arguments with respect to claims 1-18 and 21 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Claim Rejections - 35 USC § 103
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-11, 14-18, and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Honma et al. (US 2021/0195363 A1, previously cited, hereafter Honma) in view of Peters et al. (US 2020/0304935 A1, hereafter Peters).
Regarding claim 1, Honma teaches a signal processing device, method, and program in which an audio object and an associated reverb parameter are transmitted with high encoding efficiency (see Honma, abstract, figure 1, and ¶ 0033).
Honma teaches “An apparatus for processing at least one immersive audio signal, the at least one immersive audio signal comprising at least one audio signal associated with a sound source, at least one sound source parameter defining the sound source and at least one scene parameter for acoustically defining a … scene within which the sound source is located” because the signal processing device receives and decodes a bitstream to obtain an audio object signal and audio object information, or metadata, of the audio object signal (see Honma, figure 1, unit 11 and ¶ 0035-0038), and the metadata of the audio object signal includes sound source, or audio object, parameters defining the audio object and scene parameters, such as the audio object’s position in three dimensions and the room_reverb_data (see Honma, figures 3-4 and ¶ 0103-0105, 0118, and 0121-0122).
However, Honma does not appear to teach the “six degrees of freedom audio scene”, because Honma does not appear to teach a moveable listener, or reference, position with respect to the audio objects (i.e., Honma appears to teach that only the audio objects move with respect to azimuth, elevation, and distance, see Honma, ¶ 0040 and 0103). Honma also does not appear to teach or reasonably suggest the features to “determine information, for the sound source, about disabling of a time-varying propagation delay”.
Peters teaches rendering metadata to control user-movement-based audio rendering, where the rendering metadata includes controls for enabling or disabling adaptations of the audio rendering according to movements of the user (see Peters, abstract). In particular, Peters teaches audio rendering for virtual reality (VR) with six degrees of freedom (6DOF) (see Peters, ¶ 0058-0060). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Honma with the teachings of Peters to provide a VR experience with more immersive audio (see Honma, ¶ 0040 in view of Peters, ¶ 0056-0057).
Therefore, the combination of Honma and Peters makes obvious:
“An apparatus for processing at least one immersive audio signal, the at least one immersive audio signal comprising at least one audio signal associated with a sound source, at least one sound source parameter defining the sound source and at least one scene parameter for acoustically defining a six degrees of freedom audio scene within which the sound source is located” because the signal processing device receives and decodes a bitstream to obtain an audio object signal and audio object information, or metadata, of the audio object signal (see Honma, figure 1, unit 11 and ¶ 0035-0038), the metadata of the audio object signal includes sound source, or audio object, parameters defining the audio object and scene parameters, such as the audio object’s position in three dimensions and the room_reverb_data, and Peters makes obvious the 6DOF audio scene for audio object movement relative to a listener’s movement in VR (see Honma, figures 3-4 and ¶ 0103-0105, 0118, and 0121-0122, in view of Peters, ¶ 0056-0057 and 0060),
“the apparatus comprising:
at least one processor” (see Honma, figure 12, unit 501 and ¶ 0220-0222); and
“at least one memory storing instructions that, when executed with the at least one processor” because programs are loaded from one or more mediums to be executed by the processor (see Honma, figure 12, units 502, 508, and 510-511, and ¶ 0224-0227),
“cause the apparatus at least to:
obtain the at least one audio signal associated with the sound source” because the signal processing device obtains an audio object signal by receiving and decoding a bitstream (see Honma, figure 1, unit 11 and ¶ 0035-0038);
“obtain the at least one sound source parameter defining the sound source” because the signal processing device obtains the audio object information, or metadata, of the audio object signal (see Honma, figure 1, unit 11 and ¶ 0035-0038), and the metadata of the audio object signal includes sound source, or audio object, parameters defining the audio object (see Honma, figures 3-4 and ¶ 0103-0105, 0118, and 0121-0122);
“obtain the at least one scene parameter for acoustically defining the six degrees of freedom audio scene within which the sound source is located” because the signal processing device obtains scene parameters, such as room_reverb_data (see Honma, figures 3-4 and ¶ 0103-0105, 0118, and 0121-0122), and Peters makes obvious the 6DOF audio scene for audio object movement relative to a listener’s movement in VR (see Honma, figures 3-4 and ¶ 0103-0105, 0118, and 0121-0122, in view of Peters, ¶ 0056-0057 and 0060);
“determine information, for the sound source, about disabling of a time-varying propagation delay” because the signal processing device determines the distance from the audio object to the viewer or listener for direct sound components, initial reflected sound components, and the room reverberation (see Honma, figures 6 and 8, and ¶ 0144-0146 and 0150-0158), and Peters makes obvious the information that disables the time-varying propagation delay, such as the metadata that disables Doppler adaptations based on a speed of the user in the VR environment (see Peters, ¶ 0103; an illustrative sketch follows this rejection); and
“process the at least one audio signal based on the information, wherein the instructions, when executed by the at least one processor, cause the apparatus to:
determine at least one early reverberation parameter” because the object-specific reverb is determined based on the metadata and the viewer or listener position (see Honma, figures 6, 8, and 9, and ¶ 0172-0175); and
“render the at least one audio signal based on the at least one early reverberation parameter” because the rendering processing unit renders audio based on the direct component information and the object reverb, or early reverberation, parameters to output the audio (see Honma, figure 1, unit 22, figure 6, unit Q12, and ¶ 0042).
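For illustration only, the mechanism at issue in the “disabling of a time-varying propagation delay” limitation may be sketched as follows (hypothetical Python; the names and structure are illustrative assumptions and are not taken from Honma or Peters). When the disable flag is set, the delay applied to the source signal is frozen rather than tracking the source-to-listener distance, which suppresses Doppler-like pitch shifts:

    SPEED_OF_SOUND = 343.0  # meters per second, assumed constant

    def propagation_delay(source_pos, listener_pos, disable_time_varying, static_delay):
        """Return the delay (seconds) applied to a source's audio signal."""
        if disable_time_varying:
            # Delay frozen at a static value: no Doppler-like pitch shift as
            # the listener or source moves (cf. Peters' disable metadata).
            return static_delay
        distance = sum((s - l) ** 2 for s, l in zip(source_pos, listener_pos)) ** 0.5
        return distance / SPEED_OF_SOUND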
Regarding claim 2, see the preceding rejection with respect to claim 1 above. The combination makes obvious the “apparatus as claimed in claim 1, wherein the information comprises at least one of:
an indication of the time-varying propagation delay; or
a propagation delay value” because the metadata includes reverb impulse data, which indicates the delay between the direct sound components and the early reverberation parameters (see Honma, figures 4 and 6, and ¶ 0121-0130, 0144-0146, and 0157-0158), and Peters makes obvious the inclusion of Doppler adaptations based on the speed of the user’s movements (see Peters, ¶ 0103-0107).
Regarding claim 3, see the preceding rejection with respect to claim 1 above. The combination makes obvious the “apparatus as claimed in claim 1, wherein the instructions, when executed with the at least one processor, cause the apparatus to determine a control of propagation delay processing based on the information” because the signal processing device determines the propagation delay based on determined distances between an audio object’s position, which is included in the metadata, and the viewer or listener position (see Honma, figures 6, 8, and 9, and ¶ 0157-0158 and 0172-0175).
Regarding claim 4, see the preceding rejection with respect to claim 3 above. The combination makes obvious the “apparatus as claimed in claim 3, wherein the instructions, when executed with the at least one processor, cause the apparatus to: control processing of the time-varying propagation delay for the at least one audio signal based on the determined control of propagation delay processing” because the signal processing device controls the propagation delay based on the received metadata that controls the delay of the reflections (see Honma, figures 6, 8, and 9, and ¶ 0157-0158 and 0172-0175), and Peters makes obvious the enabling or disabling of the time-varying propagation delay for Doppler effects (see Peters, ¶ 0103-0107).
Regarding claim 5, see the preceding rejection with respect to claim 1 above. The combination makes obvious the “apparatus as claimed in claim 1, wherein the instructions, when executed with the at least one processor, cause the apparatus to:
disable early reverberation based processing of the at least one audio signal” because the metadata includes different gain values for the direct sound, the early reverberation, and the late, or room, reverberation, such that the gain of the early reverberation (wet_gain) is set to zero to disable the early reverberation portion (see Honma, figure 3 and ¶ 0110-0111 and 0150-0156); and
“enable late reverberation based processing of the at least one audio signal, wherein the late reverberation based processing of the at least one audio signal comprises an enabled startup phase” because the late reverberation is enabled with the included metadata, the room reverb data, and a gain is set by room_gain, such that the late reverberation is processed according to the included metadata when that gain is not zero (see Honma, figure 3 and ¶ 0110-0111 and 0150-0156); an illustrative sketch of this gain-based control follows below.
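For illustration only, the gain-based enabling and disabling discussed above may be sketched as follows (hypothetical Python; the frame layout and argument names are illustrative assumptions, with wet_gain and room_gain merely standing in for Honma’s metadata fields). A stage whose gain is zero contributes nothing to the mix and is thereby disabled:

    def render_frame(direct, early, late, dry_gain, wet_gain, room_gain):
        """Mix direct, early-reverberation, and late-reverberation components."""
        return [dry_gain * d + wet_gain * e + room_gain * r
                for d, e, r in zip(direct, early, late)]

    # Early reverberation disabled (wet_gain = 0), late reverberation enabled:
    out = render_frame([1.0, 0.5], [0.3, 0.2], [0.1, 0.1],
                       dry_gain=1.0, wet_gain=0.0, room_gain=0.7)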
Regarding claim 6, see the preceding rejection with respect to claim 5 above. The combination makes obvious the “apparatus as claimed in claim 5, wherein the instructions, when executed with the at least one processor, cause the apparatus to:
obtain a dimension of the six degrees of freedom audio scene based on the at least one scene parameter” because the metadata includes the length of the impulse response (see Honma, figure 4, and ¶ 0121-0125), and Peters makes obvious the 6DOF audio scene for audio object movement relative to a listener’s movement in VR (see Peters, ¶ 0056-0057 and 0060);
“determine at least one time delay for at least one reflection path based on the dimension of the six degrees of freedom audio scene” because the metadata includes the impulse response length and filter coefficients (see Honma, figure 3 and ¶ 0121-0125, in view of Peters, ¶ 0056-0057, 0060, 0099, 0103, and 0107; an illustrative sketch follows this rejection); and
“generate reverberation audio signals based on application of the at least one time delay to at least part of the at least one audio signal associated with the sound source” because the signal processing device and the rendering processing unit render audio based on the direct component information and the object reverb, or early reverberation, parameters to output the audio (see Honma, figure 1, unit 22, figure 6, unit Q12, and ¶ 0042).
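For illustration only, determining a reflection-path delay from a scene dimension may be sketched with a first-order image-source computation (hypothetical Python; neither reference discloses this exact computation, and the geometry is an illustrative assumption). Mirroring the source across a wall at x = wall_x gives the reflection path length, from which the delay follows:

    SPEED_OF_SOUND = 343.0  # meters per second

    def reflection_delay(source, listener, wall_x):
        """Delay (seconds) of the first-order reflection off the plane x = wall_x."""
        image = (2.0 * wall_x - source[0], source[1], source[2])  # mirrored source
        path = sum((i - l) ** 2 for i, l in zip(image, listener)) ** 0.5
        return path / SPEED_OF_SOUND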
Regarding claim 7, see the preceding rejection with respect to claim 1 above. The combination makes obvious the “apparatus as claimed in claim 1, wherein the instructions, when executed with the at least one processor, cause the apparatus to:
enable early reverberation based processing of the at least one audio signal based on the at least one early reverberation parameter using a static propagation delay value, a static sound level value and a static direction of arrival value” because when the metadata describing the audio object position is static (i.e., when the audio object is stationary with respect to the viewer or listener position), the initial reflected sound component, or early reverberation parameter, is static, and the delay value remains static for as long as the audio object position with respect to the viewer or listener position does not change (see Honma, ¶ 0157-0158); and
“enable late reverberation based processing of the at least one audio signal” because the metadata includes a room reverb flag and impulse response data associated with the audio object for processing (see Honma, ¶ 0117-0120 and 0126-0129).
Regarding claim 8, see the preceding rejection with respect to claim 7 above. The combination makes obvious the “apparatus as claimed in claim 7, wherein the instructions, when executed with the at least one processor, cause the apparatus to:
determine a position of the sound source based on the at least one sound source parameter” because the metadata includes the position of the audio object (see Honma, figure 3, and ¶ 0104);
“obtain a dimension of the six degrees of freedom audio scene based on the at least one scene parameter” because the metadata includes the length of the impulse response (see Honma, figure 4, and ¶ 0121-0125, and also see Peters, ¶ 0060 and 0099);
“determine the static propagation delay value, the static sound level value and the static direction of arrival value for a reflection path based on the dimension of the six degrees of freedom audio scene and the position of the sound source” because the included metadata allows the system to reuse the parameters when the audio object is not moving from frame to frame (see Honma, ¶ 0115, 0132, and 0157-0158), Peters makes obvious disabling or enabling delay adaptations (see Peters, ¶ 0060, 0099, and 0107), and the static parameters are determined by the signal processing device (see Honma, figure 9, and ¶ 0172-0175); and
“generate early reverberation audio signals based on the application of the propagation time delay value, the static sound level value and the static direction of arrival value to at least part of the at least one audio signal associated with the sound source” because the signal processing device and the rendering processing unit render audio based on the direct component information and the object reverb, or early reverberation, parameters to output the audio (see Honma, figure 1, unit 22, figure 6, unit Q12, and ¶ 0042, and see Peters, ¶ 0099 and 0107).
Regarding claim 9, see the preceding rejection with respect to claim 1 above. The combination makes obvious the “apparatus as claimed in claim 1, wherein the instructions, when executed with the at least one processor, cause the apparatus to:
enable early reverberation based processing of the at least one audio signal based on the at least one early reverberation parameter using a static propagation delay value, a static sound level value, and a time varying direction of arrival value” because when the metadata describing the audio object position is static (i.e., when the audio object is stationary with respect to the viewer or listener position), the initial reflected sound component, or early reverberation parameter, is static, and the delay value, the sound level or gain, and the direction of arrival remain static for as long as the audio object position with respect to the viewer or listener position does not change (see Honma, ¶ 0157-0158); and
“enable late reverberation based processing of the at least one audio signal” because the metadata includes a room reverb flag and impulse response data associated with the audio object for processing (see Honma, ¶ 0117-0120 and 0126-0129).
Regarding claim 10, see the preceding rejection with respect to claim 9 above. The combination makes obvious the “apparatus as claimed in claim 9, wherein the instructions, when executed with the at least one processor, cause the apparatus to:
determine a static position of the sound source based on the at least one sound source parameter, and a time-varying position of the sound source based on at least one of the at least one sound source parameter or a time-varying position of a listener” because the metadata includes the position of the audio object (see Honma, figure 3, and ¶ 0104), the position remains static when the reuse flag is equal to 1 (see Honma, ¶ 0115, 0132, and 0157-0158), and Peters makes obvious time-varying positions (see Peters, ¶ 0099 and 0103-0107);
“obtain a dimension of the six degrees of freedom audio scene based on the at least one scene parameter” because the metadata includes the length of the impulse response (see Honma, figure 4, and ¶ 0121-0125, and also see Peters, ¶ 0060 and 0099);
“determine the propagation time delay value, and the static sound level value for a reflection path based on the dimension of the six degrees of freedom audio scene and the static position of the sound source” because the included metadata allows the system to reuse the parameters when the audio object is not moving from frame to frame (see Honma, ¶ 0115, 0132, and 0157-0158), Peters makes obvious disabling or enabling delay adaptations (see Peters, ¶ 0060, 0099, and 0107), and the static parameters are determined by the signal processing device (see Honma, figure 9, and ¶ 0172-0175);
“determine the time-varying direction of arrival value for the reflection path based on the dimension of the six degrees of freedom audio scene and at least one of the time-varying position of the sound source or the time-varying position of the listener” because the metadata is updated as necessary when the audio object is moving in the scene (see Honma, ¶ 0116, 0132, and 0157-0158, and see Peters, ¶ 0060, 0099, and 0103-0107); and
“generate early reverberation audio signals based on application of the static propagation delay value, the static sound level value and the time-varying direction of arrival value to at least part of the at least one audio signal associated with the sound source” because the signal processing device and the rendering processing unit render audio based on the direct component information and the object reverb, or early reverberation, parameters to output the audio (see Honma, figure 1, unit 22, figure 6, unit Q12, and ¶ 0042, and see Peters, ¶ 0099, 0103, and 0107).
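For illustration only, pairing a static propagation delay with a time-varying direction of arrival may be sketched as follows (hypothetical Python; the function and variable names are illustrative assumptions). The delay stays fixed while the arrival direction is recomputed from the current positions each frame:

    import math

    def arrival_parameters(reflection_point, listener_pos, static_delay):
        """Static delay with a direction of arrival recomputed per frame."""
        dx, dy, dz = (r - l for r, l in zip(reflection_point, listener_pos))
        azimuth = math.atan2(dy, dx)                    # radians in the horizontal plane
        elevation = math.atan2(dz, math.hypot(dx, dy))  # radians above the horizon
        return static_delay, azimuth, elevation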
Regarding claim 11, see the preceding rejection with respect to claim 1 above. The combination makes obvious the “apparatus as claimed in claim 1, wherein the instructions, when executed with the at least one processor, cause the apparatus to:
enable early reverberation based processing of the at least one audio signal based on the at least one early reverberation parameter using a time-varying propagation delay value, a time-varying sound level value, and a time-varying direction of arrival value” because when the metadata describing the audio object position is not static (i.e., when the audio object is moving with respect to the viewer or listener position), the initial reflected sound component, or early reverberation parameter, is time-varying, and the delay value, the sound level or gain, and the direction of arrival are updated as the audio object position moves with respect to the viewer or listener position (see Honma, ¶ 0157-0158), and Peters makes obvious time-varying positions (see Peters, ¶ 0099 and 0103-0107); and
“enable late reverberation based processing of the at least one audio signal” because the metadata includes a room reverb flag and impulse response data associated with the audio object for processing (see Honma, ¶ 0117-0120 and 0126-0129 and see Peters, ¶ 0099, 0103, and 0107).
Regarding claim 14, see the preceding rejection with respect to claim 1 above. The combination makes obvious the “apparatus as claimed in claim 1, wherein the instructions, when executed with the at least one processor, cause the apparatus to obtain at least one of:
at least one scene geometry parameter; or
at least one scene acoustic material parameter” because the metadata corresponding to the room, or scene geometry, is described in the impulse response (see Honma, ¶ 0126-0129, 0146, and 0157-0159).
Regarding claim 15, see the preceding rejection with respect to claim 1 above. The combination makes obvious the “apparatus as claimed in claim 1, wherein the instructions, when executed with the at least one processor, cause the apparatus to obtain the at least one scene parameter from at least one of:
an encoder input format description;
a content creator;
an augmented reality sensing apparatus;
a camera; or
a light ranging and detection sensor” because Honma teaches that the scene parameters, such as the room reverb parameters, are included in the metadata of the bitstream, where the content creator has included the object audio and the scene parameters together in the bitstream (see Honma, figures 1-4 and ¶ 0035-0045).
Regarding claim 16, see the preceding rejection with respect to claim 1 above. The combination makes obvious the “apparatus as claimed in claim 1, wherein the instructions, when executed with the at least one processor, cause the apparatus to:
determine at least one of:
information indicating, for the sound source, a disabling of dynamic source updating” because the included metadata allows the system to reuse the parameters when the audio object is not moving from frame to frame (see Honma, figure 3 and ¶ 0115, 0132, and 0157-0158);
“a flag within the at least one immersive audio signal indicating the disabling of the dynamic source updating” (see Honma, figure 3 and ¶ 0115, and see Peters, ¶ 0099);
“information within an application programming interface indicating the disabling of the dynamic source updating for the sound source; or
a quality determiner configured to determine a lowering of quality of an output audio signal when the sound source is processed with the dynamic source updating” where Honma and Peters make obvious at least the information, because the information comprises metadata to enable or disable the process (see Honma, figures 1 and 3, and ¶ 0035-0040, 0103, 0115, 0132, and 0157-0158, in view of Peters, ¶ 0099, 0103, and 0107).
Regarding claim 17, see the preceding rejection with respect to claim 1 above. The combination makes obvious the “apparatus as claimed in claim 1, wherein the instructions, when executed with the at least one processor, cause the apparatus to: perform a disabling of the time-varying propagation delay” because the system processes the decoded metadata and reuses the parameters when the audio object is not moving from frame to frame (see Honma, figure 3 and ¶ 0035-0036, 0115, 0132, 0157-0158, and 0220-0227), and Peters makes obvious disabling Doppler processing based on movement speed (see Peters, ¶ 0103).
Regarding claim 18, see the preceding rejection with respect to claim 1 above. The combination makes obvious the apparatus of claim 1, and likewise makes obvious:
“A method for an apparatus for processing at least one immersive audio signal, the at least one immersive audio signal comprising at least one audio signal associated with a sound source, at least one sound source parameter defining the sound source and at least one scene parameter for acoustically defining a six degrees of freedom audio scene within which the sound source is located” because the signal processing device receives and decodes a bitstream to obtain an audio object signal and audio object information, or metadata, of the audio object signal (see Honma, figure 1, unit 11 and ¶ 0035-0038), the metadata of the audio object signal includes sound source, or audio object, parameters defining the audio object and scene parameters, such as the audio object’s position in three dimensions and the room_reverb_data, and Peters makes obvious the 6DOF audio scene for audio object movement relative to a listener’s movement in VR (see Honma, figures 3-4 and ¶ 0103-0105, 0118, and 0121-0122, in view of Peters, ¶ 0056-0057 and 0060),
“the method comprising:
obtaining the at least one audio signal associated with the sound source” because the signal processing device obtains an audio object signal by receiving and decoding a bitstream (see Honma, figure 1, unit 11 and ¶ 0035-0038);
“obtaining the at least one sound source parameter defining the sound source” because the signal processing device obtains the audio object information, or metadata, of the audio object signal (see Honma, figure 1, unit 11 and ¶ 0035-0038), and the metadata of the audio object signal includes sound source, or audio object, parameters defining the audio object (see Honma, figures 3-4 and ¶ 0103-0105, 0118, and 0121-0122);
“obtaining the at least one scene parameter for acoustically defining the six degrees of freedom audio scene within which the sound source is located” because the signal processing device obtains scene parameters, such as room_reverb_data (see Honma, figures 3-4 and ¶ 0103-0105, 0118, and 0121-0122), and Peters makes obvious the 6DOF audio scene for audio object movement relative to a listener’s movement in VR (see Honma, figures 3-4 and ¶ 0103-0105, 0118, and 0121-0122, in view of Peters, ¶ 0056-0057 and 0060);
“determining information, for the sound source, about a disabling of a time-varying propagation delay” because the signal processing device determines the distance from the audio object to the viewer or listener for direct sound components, initial reflected sound components, and the room reverberation (see Honma, figures 6 and 8, and ¶ 0144-0146 and 0150-0158), and Peters makes obvious the information that disables the time-varying propagation delay, such as the metadata that disables Doppler adaptations based on a speed of the user in the VR environment (see Peters, ¶ 0103); and
“processing the at least one audio signal based on the information, wherein processing the at least one audio signal comprises:
determining at least one early reverberation parameter” because the object-specific reverb is determined based on the metadata and the viewer or listener position (see Honma, figures 6, 8, and 9, and ¶ 0172-0175); and
“rendering the at least one audio signal based on the at least one early reverberation parameter” because the rendering processing unit renders audio based on the direct component information and the object reverb, or early reverberation, parameters to output the audio (see Honma, figure 1, unit 22, figure 6, unit Q12, and ¶ 0042).
Regarding claim 21, see the preceding rejection with respect to claim 1 above. The combination makes obvious the apparatus of claim 1, and likewise makes obvious:
“A non-transitory computer readable medium comprising instructions that, when executed with an apparatus, cause the apparatus to perform processing of at least one immersive audio signal, the at least one immersive audio signal comprising at least one audio signal associated with a sound source, at least one sound source parameter defining the sound source and at least one scene parameter for acoustically defining a six degrees of freedom audio scene within which the sound source is located” because the signal processing device receives and decodes a bitstream to obtain an audio object signal and audio object information, or metadata, of the audio object signal (see Honma, figure 1, unit 11 and ¶ 0035-0038), the metadata of the audio object signal includes sound source, or audio object, parameters defining the audio object and scene parameters, such as the audio object’s position in three dimensions and the room_reverb_data, Peters makes obvious the 6DOF audio scene for audio object movement relative to a listener’s movement in VR (see Honma, figures 3-4 and ¶ 0103-0105, 0118, and 0121-0122, in view of Peters, ¶ 0056-0057 and 0060), and the programs are loaded from one or more mediums to be executed by the processor (see Honma, figure 12, units 501-502, 508, and 510-511, and ¶ 0220-0222 and 0224-0227),
“and cause the apparatus to perform at least the following:
obtaining the at least one audio signal associated with the sound source” because the signal processing device obtains an audio object signal by receiving and decoding a bitstream (see Honma, figure 1, unit 11 and ¶ 0035-0038);
“obtaining the at least one sound source parameter defining the sound source” because the signal processing device obtains the audio object information, or metadata, of the audio object signal (see Honma, figure 1, unit 11 and ¶ 0035-0038), and the metadata of the audio object signal includes sound source, or audio object, parameters defining the audio object (see Honma, figures 3-4 and ¶ 0103-0105, 0118, and 0121-0122);
“obtaining the at least one scene parameter for acoustically defining the six degrees of freedom audio scene within which the sound source is located” because the signal processing device obtains scene parameters, such as room_reverb_data (see Honma, figures 3-4 and ¶ 0103-0105, 0118, and 0121-0122), and Peters makes obvious the 6DOF audio scene for audio object movement relative to a listener’s movement in VR (see Honma, figures 3-4 and ¶ 0103-0105, 0118, and 0121-0122, in view of Peters, ¶ 0056-0057 and 0060);
“determining information, for the sound source, about disabling of a time-varying propagation delay” because the signal processing device determines the distance from the audio object to the viewer or listener for direct sound components, initial reflected sound components, and the room reverberation (see Honma, figures 6 and 8, and ¶ 0144-0146 and 0150-0158), and Peters makes obvious the information that disables the time-varying propagation delay, such as the metadata that disables Doppler adaptations based on a speed of the user in the VR environment (see Peters, ¶ 0103); and
“processing the at least one audio signal based on the information, wherein the instructions further cause the apparatus to perform:
determining at least one early reverberation parameter” because the object-specific reverb is determined based on the metadata and the viewer or listener position (see Honma, figures 6, 8, and 9, and ¶ 0172-0175); and
“rendering the at least one audio signal based on the at least one early reverberation parameter” because the rendering processing unit renders audio based on the direct component information and the object reverb, or early reverberation, parameters to output the audio (see Honma, figure 1, unit 22, figure 6, unit Q12, and ¶ 0042).
Claims 12-13 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Honma and Peters as applied to claim 11 above, and further in view of Pang et al. (US 2021/0352425 A1, previously cited, hereafter Pang).
Regarding claim 12, see the preceding rejection with respect to claim 11 above. The combination of Honma and Peters makes obvious the “apparatus as claimed in claim 11, wherein the instructions, when executed with the at least one processor, cause the apparatus to:
determine a time-varying position of the sound source based on at least one of the at least one sound source parameter or a time-varying position of a listener” because the metadata includes the position of the audio object (see Honma, figure 3, and ¶ 0104), the included metadata is updated as the audio object moves from frame to frame (see Honma, ¶ 0132 and 0157-0158), and Peters makes obvious time-varying positions of a listener with respect to the sound sources (see Peters, ¶ 0060, 0099, and 0103-0107);
“obtain a dimension of the six degrees of freedom audio scene based on the at least one scene parameter” because the metadata includes the length of the impulse response (see Honma, figure 4, and ¶ 0121-0125, and also see Peters, ¶ 0060 and 0099);
“determine the time-varying propagation delay value, the time-varying sound level value, and the time-varying direction of arrival value for a reflection path based on the dimension of the six degrees of freedom audio scene and at least one of the time-varying position of the sound source or the time-varying position of the listener” because the included metadata has parameters that change as the audio object moves from frame to frame (see Honma, ¶ 0115, 0132, and 0157-0158), the varying parameters are determined by the signal processing device (see Honma, figure 9, and ¶ 0172-0175), and Peters makes obvious time-varying positions of a listener with respect to the sound sources (see Peters, ¶ 0060, 0099, and 0103-0107); [and]
“generate early reverberation audio signals based on application of the time-varying propagation delay value, the time-varying sound level value, and the time-varying direction of arrival value to at least part of the at least one audio signal associated with the sound source” because the signal processing device and the rendering processing unit render audio based on the direct component information and the object reverb, or early reverberation, parameters to output the audio (see Honma, figure 1, unit 22, figure 6, unit Q12, and ¶ 0042).
However, the combination of Honma and Peters does not appear to teach that the early reverberation processing includes phase modification.
Pang teaches a method and apparatus for processing a stereo signal, where binaural filtering is used to improve localization of virtual sound sources (see Pang, ¶ 0007). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Honma and Peters with the teachings of Pang to improve binaural audio output when personalized head-related transfer functions cannot be easily measured (see Pang, ¶ 0003-0007).
Therefore, the combination of Honma, Peters, and Pang makes obvious the features to:
“further phase modifying the early reverberation audio signals” because Pang makes it obvious to decorrelate the early reverberation audio signals to improve the externalization and localization of virtual sound sources, where the improved method treats the early reverberation separately from the direct sound components (see Pang, ¶ 0175, 0270, and 0278-0279).
Regarding claim 13, see the preceding rejection with respect to claim 12 above. The combination of Honma, Peters, and Pang makes obvious the “apparatus as claimed in claim 12, wherein the instructions, when executed with the at least one processor, cause the apparatus to decorrelate process the early reverberation audio signals” because Pang makes it obvious to decorrelate the early reverberation components by applying filters with randomized phase (see Pang, ¶ 0278-0279).
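For illustration only, the randomized-phase decorrelation attributed to Pang may be sketched as follows (hypothetical Python using NumPy; the tap count and filter construction are illustrative assumptions, not Pang’s disclosed filters). A filter with flat magnitude and random phase alters phase relationships between channels without coloring the spectrum:

    import numpy as np

    def random_phase_decorrelator(n_taps=256, seed=0):
        """FIR filter with unit magnitude response and randomized phase."""
        rng = np.random.default_rng(seed)
        half = n_taps // 2
        phase = rng.uniform(-np.pi, np.pi, half - 1)
        # Hermitian-symmetric spectrum so the impulse response is real-valued.
        spectrum = np.concatenate(([1.0], np.exp(1j * phase), [1.0],
                                   np.exp(-1j * phase[::-1])))
        return np.fft.ifft(spectrum).real

    # Decorrelate one early-reverberation channel by convolution:
    early = np.random.default_rng(1).standard_normal(1024)
    decorrelated = np.convolve(early, random_phase_decorrelator())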
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Fueg et al. (US 2016/0255453 A1, previously cited, and hereafter Fueg) discloses a method and apparatus for processing an audio signal with room impulse response data (see Fueg, abstract and figures 4-9);
Sarkar (US 2019/0373395 A1, previously cited) discloses adjusting audio characteristics for augmented reality (see Sarkar, abstract and figures 1-4); and
Eronen et al. (US 2023/0100071 A1, previously cited, and hereafter Eronen) discloses a rendering reverberation apparatus and method (see Eronen, abstract and figures 1-7, 10, 12, 16a-16d, and 18a-20).
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Daniel R Sellers whose telephone number is (571)272-7528. The examiner can normally be reached Mon - Fri 10:00-4:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Fan S Tsang can be reached at (571)272-7547. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Daniel R Sellers/Primary Examiner, Art Unit 2694