DETAILED ACTION
1. This action is responsive to remarks filed 12/31/2025.
Notice of Pre-AIA or AIA Status
2. The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
3. Claims 1-8, 11-14, 16-21 have been amended.
Response to Arguments
4. Applicant’s arguments have been fully considered but are not persuasive.
Applicant argues in remarks that cited prior art Laaksonen does not teach the limitations of claim 1, of analyzing an ambient part to determine a difference parameter and generating a spatial comfort noise based on the parameter.
Examiner respectfully disagrees.
Laaksonen teaches
analysing an ambient part of the at least two audio signals to determine a difference parameter (pg 13 l. 11 -25 : may define parameters such as: direction index, describing a direction of arrival of the sound at a time-frequency parameter interval; level/phase differences; direct-to-total energy ratio; diffuseness; diffuse-to-total energy ratio, describing an energy ratio of non-directional sound over surrounding directions; distance;
pg 4 l. 8-12: The means configured to generate at least one spatial comfort noise parameter further based on the time window or period may be configured to track the at least one spatial comfort noise parameter more accurately during the time window or period when the source activity determination determines substantially background noise.); and
generating a spatial comfort noise based on the difference parameter (abstract: receives an audio signal and metadata 131 from which a spatial comfort noise parameter is generated 225 in Active mode. In DTX mode, this parameter is used to generate an actual comfort noise signal for external render and output 209. The comfort noise parameter may be a direction, energy ratio or source activity as determined by a statistical model; pg 2 l. 29 – pg 3 l. 2: obtain or update a spatial comfort noise generator model based on the metadata parameters associated with the at least one audio signal; and generate the at least one spatial comfort noise parameter based on spatial comfort noise generator model.;
pg 3 l. 3-10: The means configured to obtain or update the spatial comfort noise generator model based on the metadata parameters associated with the at least one audio signal and during the active mode of operation of the apparatus may be configured to determine: at least one comfort noise directional component, wherein the comfort noise directional component is associated with a frequency band and time window; and at least one comfort noise energy ratio, wherein the at least one comfort noise energy ratio is associated with one of the at least one comfort noise directional component;
pg 3 l. 24-28: generate, during the discontinuous transmission mode of operation of the apparatus, at least one spatial comfort noise audio signal based on the at least one spatial comfort noise parameter; col 15 l. 1-2: consider the spatial dimension of background noise during CNG periods).
Laaksonen teaches an immersive voice audio service ... receives an audio signal and metadata from which a spatial comfort noise parameter is generated ... this parameter is used to generate ... comfort noise (abstract).
Pages 12-13 explicitly teach:
Other input formats may utilize new IVAS encoding tools. One input format proposed for IVAS is the Metadata-assisted spatial audio (MASA) format, where the encoder may utilize, e.g., a combination of mono and stereo encoding tools and metadata encoding tools for efficient transmission of the format. MASA is a parametric spatial audio format suitable for spatial audio processing. Parametric spatial audio processing is a field of audio signal processing where the spatial aspect of the sound (or sound scene) is described using a set of parameters. For example, in parametric spatial audio capture from microphone arrays, it is a typical and an effective choice to estimate from the microphone array signals a set of parameters such as directions of the sound in frequency bands, and the relative energies of the directional and non-directional parts of the captured sound in frequency bands, expressed for example as a direct-to-total ratio or an ambient-to-total energy ratio in frequency bands.
These parameters are known to well describe the perceptual spatial properties of the captured sound at the position of the microphone array. These parameters can be utilized in synthesis of the spatial sound accordingly, for headphones binaurally, for loudspeakers, or to other formats, such as Ambisonics.
For example, there can be two channels (stereo) of audio signals and spatial metadata. The spatial metadata may furthermore define parameters such as: Direction index, describing a direction of arrival of the sound at a time-frequency parameter interval; level/phase differences; Direct-to-total energy ratio, describing an energy ratio for the direction index; Diffuseness; Coherences such as Spread coherence describing a spread of energy for the direction index; Diffuse-to-total energy ratio, describing an energy ratio of non-directional sound over surrounding directions; Surround coherence describing a coherence of the non-directional sound over the surrounding directions; Remainder-to-total energy ratio, describing an energy ratio of the remainder (such as microphone noise) sound energy to fulfil requirement that sum of energy ratios is 1; Distance, describing a distance of the sound originating from the direction index in meters on a logarithmic scale; covariance matrices related to a multi-channel loudspeaker signal, or any data related to these covariance matrices; other parameters for guiding or controlling a specific decoder, e.g., VAD/DTX/CNG/SID parameters. Any of these parameters can be determined in frequency bands.
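For illustration only, the energy-ratio parameters quoted above can be held in a small per-band record. The sketch below is not taken from Laaksonen: the class name BandMetadata and its field names are hypothetical, and it assumes only what the quoted passage states, namely that the remainder-to-total ratio is whatever is needed for the energy ratios to sum to 1.

```python
from dataclasses import dataclass


@dataclass
class BandMetadata:
    """Hypothetical per-band spatial metadata record (names are illustrative)."""
    direction_index: int     # direction of arrival for this time-frequency interval
    direct_to_total: float   # energy ratio for the direction index
    diffuse_to_total: float  # energy ratio of non-directional sound

    def remainder_to_total(self) -> float:
        # The remainder (e.g. microphone noise) fills the gap so the ratios sum to 1.
        remainder = 1.0 - self.direct_to_total - self.diffuse_to_total
        if remainder < -1e-9:
            raise ValueError("direct and diffuse ratios together exceed 1")
        return max(remainder, 0.0)


# Example values are made up for the illustration.
band = BandMetadata(direction_index=12, direct_to_total=0.6, diffuse_to_total=0.3)
remainder = band.remainder_to_total()
```

With these example values the remainder ratio comes out at approximately 0.1, so the three ratios account for all of the band energy.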
Claim 1 only recites analyzing an ambient part of the at least two generated audio signals to determine a (some) difference parameter to generate spatial comfort noise.
The cited portions of Laaksonen present a plethora of parameters that can correspond to an ambient part of the signals, represent some difference in those portions, and generate comfort noise.
Therefore, the claim limitations are currently still broad enough to allow the language of the reference to read on the limitations, and the rejections are maintained.
The additional independent and dependent claims are rejected based on arguments presented above and art rejections below.
Claim Rejections - 35 USC § 102
5. In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
6. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
7. Claims 1, 3-8, 10-14, 17-18, 20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Laaksonen et al (GB 2596138 – provided in IDS).
Regarding claim 1 Laaksonen teaches A method for generating at least two channel audio signals for two or multi-way communication (fig 6; pg 1 l. 5-6: apparatus and methods for decoder spatial comfort noise generation; pg 30 l. 26-30: electronics device, mobile device; l. 31- pg 31 l. 5: processor, memory), the method comprising:
capturing, from at least two microphones, at least two audio signals (pg 13 l. 1: microphone array; pg 13 l. 10: there can be two channels (stereo) of audio signals);
generating at least two audio signals based on the captured at least two audio signals (pg 13 l. 10: there can be two channels (stereo) of audio signals and spatial metadata);
analysing an ambient part of the at least two generated audio signals to determine a difference parameter (pg 13 l. 11 -25 : may define parameters such as: direction index, describing a direction of arrival of the sound at a time-frequency parameter interval; level/phase differences; direct-to-total energy ratio; diffuseness; diffuse-to-total energy ratio, describing an energy ratio of non-directional sound over surrounding directions; distance;
pg 4 l. 8-12: The means configured to generate at least one spatial comfort noise parameter further based on the time window or period may be configured to track the at least one spatial comfort noise parameter more accurately during the time window or period when the source activity determination determines substantially background noise.); and
generating a spatial comfort noise based on the difference parameter (abstract: receives an audio signal and metadata 131 from which a spatial comfort noise parameter is generated 225 in Active mode. In DTX mode, this parameter is used to generate an actual comfort noise signal for external render and output 209. The comfort noise parameter may be a direction, energy ratio or source activity as determined by a statistical model; pg 2 l. 29 – pg 3 l. 2: obtain or update a spatial comfort noise generator model based on the metadata parameters associated with the at least one audio signal; and generate the at least one spatial comfort noise parameter based on spatial comfort noise generator model.;
pg 3 l. 3-10: The means configured to obtain or update the spatial comfort noise generator model based on the metadata parameters associated with the at least one audio signal and during the active mode of operation of the apparatus may be configured to determine: at least one comfort noise directional component, wherein the comfort noise directional component is associated with a frequency band and time window; and at least one comfort noise energy ratio, wherein the at least one comfort noise energy ratio is associated with one of the at least one comfort noise directional component;
pg 3 l. 24-28: generate, during the discontinuous transmission mode of operation of the apparatus, at least one spatial comfort noise audio signal based on the at least one spatial comfort noise parameter; col 15 l. 1-2: consider the spatial dimension of background noise during CNG periods).
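As a non-authoritative illustration of the final limitation, the sketch below produces two-channel noise from a level difference and a diffuse energy ratio. It is not the generator disclosed in Laaksonen or in the application. The function name spatial_comfort_noise, its parameters, and the model (a shared Gaussian source for the coherent portion, independent Gaussian sources for the diffuse portion) are assumptions made for the example.

```python
import math
import random


def spatial_comfort_noise(num_samples, level_diff_db, diffuse_ratio,
                          noise_rms=0.01, seed=0):
    """Return (left, right) noise whose coherent part carries a level difference.

    Assumed model, not taken from the cited art: one shared noise source is
    scaled differently per channel, and independent noise is added per channel.
    """
    if not 0.0 <= diffuse_ratio <= 1.0:
        raise ValueError("diffuse_ratio must lie in [0, 1]")
    rng = random.Random(seed)
    # Split the level difference symmetrically between the two channels,
    # so that 20*log10(gain_left / gain_right) equals level_diff_db.
    gain_left = 10 ** (level_diff_db / 40.0)
    gain_right = 10 ** (-level_diff_db / 40.0)
    coherent = math.sqrt(1.0 - diffuse_ratio)
    diffuse = math.sqrt(diffuse_ratio)
    left, right = [], []
    for _ in range(num_samples):
        shared = rng.gauss(0.0, noise_rms)
        left.append(coherent * gain_left * shared
                    + diffuse * rng.gauss(0.0, noise_rms))
        right.append(coherent * gain_right * shared
                     + diffuse * rng.gauss(0.0, noise_rms))
    return left, right


# Example call with made-up values: 3 dB louder on the left, half diffuse.
left, right = spatial_comfort_noise(480, level_diff_db=3.0, diffuse_ratio=0.5)
```

When diffuse_ratio is 0 the two channels are scaled copies of the same noise; when it is 1 they are independent and the level difference has no effect.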
Regarding claim 3 Laaksonen teaches The method as claimed in claim 1, further comprising: estimating an ambient part noise level, wherein generating the spatial comfort noise based on the difference parameter further comprises: generating the spatial comfort noise based on the ambient part noise level (pg 13 l. 11 -25 : may define parameters such as: direction index, describing a direction of arrival of the sound at a time-frequency parameter interval; level/phase differences; direct-to-total energy ratio; diffuseness; diffuse-to-total energy ratio, describing an energy ratio of non-directional sound over surrounding directions; distance; col 15 l. 1-2: consider the spatial dimension of background noise during CNG periods).
Regarding claim 4 Laaksonen teaches The method as claimed in claim 1, wherein the difference parameter is a directional parameter representing a direction of the ambient part of the at least two generated audio signals (pg 13 l. 11 -25 : may define parameters such as: direction index, describing a direction of arrival of the sound at a time-frequency parameter interval; level/phase differences; direct-to-total energy ratio; diffuseness; diffuse-to-total energy ratio, describing an energy ratio of non-directional sound over surrounding directions; distance; col 15 l. 1-2: consider the spatial dimension of background noise during CNG periods).
Regarding claim 5 Laaksonen teaches The method as claimed in claim 1, wherein analysing the ambient part of the at least two generated audio signals comprises spatially analysing the ambient part and further comprises:
determining a diffuseness of the at least two generated audio signals (pg 13 l. 11-25: may define parameters such as: direction index, describing a direction of arrival of the sound at a time-frequency parameter interval; level/phase differences; direct-to-total energy ratio; diffuseness; diffuse-to-total energy ratio, describing an energy ratio of non-directional sound over surrounding directions; distance; col 15 l. 1-2: consider the spatial dimension of background noise during CNG periods;
pg 13 l. 14: diffuseness); and
determining the difference parameter based on the determined diffuseness (pg 13 l. 11 -25).
Regarding claim 6 Laaksonen teaches The method as claimed in claim 1, wherein analysing the ambient part of the at least two generated audio signals comprises spatially analysing the ambient part to determine the difference parameter and further comprises:
determining for at least one of at least one time or frequency element that the at least two generated audio signals comprise an active non-stationary audio source (pg 14 l. 14-17); and
spatially analysing the ambient part of the at least two generated audio signals for elements other than the determined at least one of at least one time or frequency element in order to determine an ambient directional parameter as a directional parameter (pg 14 l. 14-24: a user is talking in front of a spatial capture device with a busy road behind the device. The spatial audio capture has then constant traffic hum (that is somewhat diffuse), and specific traffic noises (e.g., car horns) coming mainly from behind, and of course the talker’s voice coming from the front; pg 13 l. 11 -25).
Regarding claim 7 Laaksonen teaches The method as claimed in claim 1, wherein the difference parameter is one of:
a directional parameter representing a direction of the ambient part of the at least two generated audio signals (pg 13 l. 11 -25 : may define parameters such as: direction index, describing a direction of arrival of the sound at a time-frequency parameter interval; level/phase differences; direct-to-total energy ratio; diffuseness; diffuse-to-total energy ratio, describing an energy ratio of non-directional sound over surrounding directions);
a phase difference parameter representing a phase difference between the ambient part of the at least two generated audio signals (pg 13 l. 11-25);
a level difference parameter representing a level difference between the ambient part of the at least two generated audio signals (pg 13 l. 11-25); or
a delay or time difference parameter representing a delay or time difference between the ambient part of the at least two generated audio signals (pg 13 l. 11-25).
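For readers unfamiliar with the alternatives listed in claim 7, the sketch below shows one conventional way a level difference and a delay between two channels might be measured. It is offered only as an illustration under stated assumptions: neither Laaksonen nor the application is asserted to use these formulas, the function names are hypothetical, and the phase difference alternative is not shown.

```python
import math


def level_difference_db(a, b):
    """Level of channel a relative to channel b, in dB, from total energies."""
    energy_a = sum(x * x for x in a)
    energy_b = sum(x * x for x in b)
    return 10.0 * math.log10(energy_a / energy_b)


def delay_samples(a, b, max_lag):
    """Lag in samples by which b trails a, via brute-force cross-correlation."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = sum(
            a[n] * b[n + lag]
            for n in range(len(a))
            if 0 <= n + lag < len(b)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Made-up test signals: channel_b is channel_a delayed by 3 samples and halved.
channel_a = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
channel_b = [0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 1.0, 0.5, 0.0, 0.0]
level_db = level_difference_db(channel_a, channel_b)
lag = delay_samples(channel_a, channel_b, max_lag=5)
```

For these signals the measured lag is 3 samples and the level difference is about 6.02 dB, matching the halved amplitude of the second channel.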
Regarding claim 8 Laaksonen teaches An apparatus for generating at least two channel audio signals for two or multi-way communication, the apparatus comprising:
at least one processor; and
at least one memory storing instructions that, when executed with the at least one processor, cause the apparatus at least to:
capture, from at least two microphones, at least two audio signals;
generate at least two audio signals based on the captured at least two audio signals;
analyse an ambient part of the at least two generated audio signals to determine a difference parameter; and
generate a spatial comfort noise based on the difference parameter.
Claim 8 recites limitations similar to claim 1 and is rejected for similar rationale and reasoning.
Claim 10 recites limitations similar to claim 3 and is rejected for similar rationale and reasoning.
Claim 11 recites limitations similar to claim 4 and is rejected for similar rationale and reasoning.
Claim 12 recites limitations similar to claim 5 and is rejected for similar rationale and reasoning.
Claim 13 recites limitations similar to claim 6 and is rejected for similar rationale and reasoning.
Claim 14 recites limitations similar to claim 7 and is rejected for similar rationale and reasoning.
Claim 17 recites limitations similar to claim 3 and is rejected for similar rationale and reasoning.
Regarding claim 18 Laaksonen teaches The apparatus as claimed in claim 11, wherein the instructions, when executed with the at least one processor, cause the apparatus to: spatially analyse the ambient part to determine at least one of a delay or phase or level difference between at least a first pair of the at least two generated audio signals (pg 13 l. 11 -25 : may define parameters such as: direction index, describing a direction of arrival of the sound at a time-frequency parameter interval; level/phase differences; direct-to-total energy ratio; diffuseness; diffuse-to-total energy ratio, describing an energy ratio of non-directional sound over surrounding directions; distance; col 15 l. 1-2: consider the spatial dimension of background noise during CNG periods).
Claim 20 recites limitations similar to claim 6 and is rejected for similar rationale and reasoning.
Claim Rejections - 35 USC § 103
8. The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
9. Claims 2, 9, 21 are rejected under 35 U.S.C. 103 as being unpatentable over Laaksonen in view of Vicinus et al (9,865,274).
Regarding claim 2, Laaksonen does not specifically teach, but Vicinus teaches, The method as claimed in claim 1, wherein generating the at least two audio signals based on the captured at least two audio signals further comprises:
obtaining at least two downlink audio signals (col 2 l. 20-26: remote audio signal; col 7 l. 45-67: remote audio signal; one or more microphones);
determining at least two estimated echo audio signals from the at least two downlink audio signals (abstract; col 2 l. 20-26; col 7 l. 45-67: remote audio signal); and
subtracting the at least two estimated echo audio signals from the captured at least two audio signals to generate the at least two audio signals
(abstract: input ambisonics audio signal includes multiple channels; A remote audio signal made up of audio data representing sound captured by remote meeting equipment is output by a local loudspeaker. Acoustic echo cancellation is performed on the input ambisonic audio signal by removing the remote audio signal from the input ambisonic audio signal.; col 2 l. 27-33; col 3 l. 9: stereo; col 4 l. 33-43; col 11 l. 10-39; fig 8; col 15 l. 36-43: In the example of FIG. 8, each channel of the output encoding is passed to a separate acoustic echo canceller. Specifically, a first channel of the output encoding is passed to the AEC Logic 600, and a second channel of the output encoding is passed to the AEC Logic 604. Each one of the acoustic echo cancellers performs acoustic echo cancellation on the respective one of the output encoding channels, e.g. AEC Logic 600 removes Remote Audio Signal 348 from a first channel of the output encoding (e.g. a “right” channel of a stereo encoding), and AEC Logic 604 removes Remote Audio Signal 348 from a second channel of the output encoding (e.g. a “left” channel of a stereo encoding).).
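The three steps recited in claim 2 (obtain downlink signals, estimate echoes, subtract them per channel) can be illustrated with the sketch below. It is a simplified example, not the AEC Logic of Vicinus: the function names are hypothetical, and the echo path is assumed to be a known, fixed impulse response, whereas a practical echo canceller would typically adapt that estimate over time.

```python
def estimate_echo(downlink, echo_path):
    """Convolve the downlink signal with an assumed echo-path impulse response."""
    echo = []
    for n in range(len(downlink)):
        acc = 0.0
        for k, tap in enumerate(echo_path):
            if n - k >= 0:
                acc += tap * downlink[n - k]
        echo.append(acc)
    return echo


def cancel_echo(captured_channels, downlink_channels, echo_paths):
    """Subtract one estimated echo from each captured channel, channel by channel."""
    cleaned = []
    for captured, downlink, path in zip(captured_channels,
                                        downlink_channels, echo_paths):
        echo = estimate_echo(downlink, path)
        cleaned.append([c - e for c, e in zip(captured, echo)])
    return cleaned


# Made-up two-channel example: near-end sound plus a delayed, attenuated echo.
near_end = [[0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1]]
downlink = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
paths = [[0.0, 0.5], [0.25]]
captured = [
    [s + e for s, e in zip(sig, estimate_echo(dl, p))]
    for sig, dl, p in zip(near_end, downlink, paths)
]
cleaned = cancel_echo(captured, downlink, paths)
```

Because the example uses the same echo paths to build and to cancel the echo, the cleaned channels reproduce the near-end signals up to rounding.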
It would have been obvious to one of ordinary skill in the art before the effective filing date to incorporate the acoustic echo cancellation of Vicinus into the method of Laaksonen in order to remove echo for improved audio communications, with a reasonable expectation of success.
Claim 9 recites limitations similar to claim 2 and is rejected for similar rationale and reasoning.
Regarding claim 21 Laaksonen teaches The apparatus as claimed in claim 8, wherein the apparatus comprises the at least two microphones, at least one loudspeaker, and an antenna (pg 13 l. 6-9: microphone array; loudspeakers; pg 31 l. 26-33; pg 32 l. 1-9: UMTS protocol).
10. Claims 16 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Laaksonen in view of Ramo et al (2021/0211828).
Regarding claim 16 Laaksonen teaches The method as claimed in claim 4, wherein analysing the ambient part of the at least two generated audio signals comprises spatially analysing the ambient part to determine the difference parameter and further comprises:
determining at least one of a delay or phase or level difference between at least a first pair of the at least two generated audio signals (pg 13 l. 11 -25 : may define parameters such as: direction index, describing a direction of arrival of the sound at a time-frequency parameter interval; level/phase differences; direct-to-total energy ratio; diffuseness; diffuse-to-total energy ratio, describing an energy ratio of non-directional sound over surrounding directions; distance); and
determining the directional parameter based on at least one of the delay or phase or level difference [and at least one of a determined distance or orientation between microphones associated with the first pair of the at least two generated audio signals] (pg 13 l. 11 -25 : may define parameters such as: direction index, describing a direction of arrival of the sound at a time-frequency parameter interval; level/phase differences; direct-to-total energy ratio; diffuseness; diffuse-to-total energy ratio, describing an energy ratio of non-directional sound over surrounding directions; distance);
Laaksonen does not specifically teach, but Ramo teaches, at least one of a determined distance or orientation between microphones associated with the first pair of the at least two generated audio signals (0011: The characteristic associated with the specific microphone profile may comprise at least one of: a distance between at least two microphones of the microphone array; and a direction of the at least one microphone of the microphone array).
It would have been obvious to one of ordinary skill in the art before the effective filing date to incorporate the teaching of Ramo, which defines at least one parameter field associated with input multi-channel audio signals and determines at least one spatial audio parameter associated with the multi-channel audio signals (0004), for improved spatial determination and comfort noise generation.
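To show why a microphone distance is useful alongside a delay, the sketch below applies the textbook far-field relation sin(theta) = c * delay / d for a two-microphone pair. This relation is general acoustics background and is not quoted from Laaksonen or Ramo. The function name, the assumed speed of sound, and the convention that the angle is measured from broadside are assumptions of the example.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at roughly 20 °C


def direction_from_delay(delay_s, mic_distance_m):
    """Far-field arrival angle in degrees from broadside for a microphone pair."""
    if mic_distance_m <= 0.0:
        raise ValueError("microphone distance must be positive")
    ratio = SPEED_OF_SOUND_M_S * delay_s / mic_distance_m
    # Clamp so that measurement noise cannot push the ratio outside [-1, 1].
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.asin(ratio))


# Made-up example: microphones 10 cm apart, delay equal to half the maximum.
angle = direction_from_delay(0.05 / SPEED_OF_SOUND_M_S, 0.10)
```

In this example the delay is half of the largest delay the spacing allows, which corresponds to an arrival angle of about 30 degrees. The same delay with a different spacing would give a different angle, which is why the spacing has to be known.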
Regarding claim 19 Laaksonen in view of Ramo teaches The apparatus as claimed in claim 18, wherein the instructions, when executed with the at least one processor, cause the apparatus to: determine the directional parameter based on at least one of the delay or phase or level difference and at least one of a determined distance or orientation between microphones associated with the first pair of the at least two generated audio signals.
Claim 19 recites limitations similar to claim 16 and is rejected for similar rationale and reasoning.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. See 892.
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHAUN A ROBERTS whose telephone number is (571)270-7541. The examiner can normally be reached Monday-Friday 9-5 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Flanders, can be reached on 571-272-7516. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov.
For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SHAUN ROBERTS/Primary Examiner, Art Unit 2655