Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Drawings
The drawing submitted on 06/04/2026 has been considered by the examiner.
Claim Objections
Claim 11 is objected to because of the following informality: in line 4, “module” should be spelled “model”. Appropriate correction is required.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1 and 11 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claims recite “performing a classification on the digital audio signal to determine a specific classification corresponding to the digital audio signal, wherein the specific classification is one of a plurality of predetermined classifications, the plurality of predetermined classifications comprise at least two predetermined classifications for a sound situation, and the at least two predetermined classifications correspond to different resource configurations; and processing the digital audio signal based on a specific resource configuration corresponding to the specific classification, wherein the specific resource configuration is one of a plurality of predetermined resource configurations, and the plurality of predetermined resource configurations are associated with the plurality of predetermined classifications.”
The limitation of performing a classification on the digital audio signal, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, other than reciting “digital audio” (claim 1) and the “storage device storing the classification model” and “processor performing the classification model on the digital audio signal” (claim 11), nothing in the claim element precludes the step from practically being performed in the mind. For example, but for those elements, “performing” in the context of these claims encompasses a user manually plotting the audio signal over time on a piece of paper and classifying the sound within the signal into at least two predetermined sound situations and types using a table listing all of the different sound types (e.g., a speech portion, a silence portion, a noise portion, etc.). The user could further manually identify the processing resources required for each classified sound type from the table, which also lists the processing resource associated with each specific sound.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind, then it falls within the “Mental Process” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. In particular, claim 1 recites only one additional element, the word “digital” modifying the audio signal, at a high level of generality such that it amounts to no more than applying the exception using a generic computer component (i.e., processing the digital audio signal using a generic processor). Similarly, claim 11 recites the additional elements of a “storage device” storing the classification model and a “processor” performing the classification model on the digital audio signal. The claimed storage device and processor are recited at a high level of generality and perform functions (storing a classification model and executing the model on a digital signal) that are well-understood, routine, and conventional in the field of digital signal processing and machine learning. Applicant's specification at [0011] describes the storage device as follows: “In one example, the first storage device 121 may be an SRAM (Static Random-Access Memory), and the second storage device 15 may be a DRAM (Dynamic Random-Access Memory), but the present disclosure is not limited to these types.”
Implementing a classification model on a general-purpose processor, with the model stored in a non-transitory computer-readable medium, is a standard practice in the art and does not amount to significantly more than the abstract idea of classification itself. Such arrangements merely apply generic computing components to execute known algorithms, which is a conventional use of computer technology. Simply appending well-understood, routine, conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception does not provide significantly more; courts have recognized the following computer functions as well-understood, routine, and conventional when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity: storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092-93 (see MPEP § 2106.05(d)(II)). Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of processing the digital audio signal based on a specific resource configuration amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component (see MPEP 2106.05(f)) cannot provide an inventive concept (see MPEP 2106.05(b): “And if a claim fails the Alice/Mayo test (i.e., is directed to an exception at Step 2A and does not amount to significantly more than the exception in Step 2B), then the claim is ineligible even if it passes the M-or-T test. DDR Holdings, LLC v. Hotels.com, L.P., 773 F.3d 1245, 1256, 113 USPQ2d 1097, 1104 (Fed. Cir. 2014) ('[I]n Mayo, the Supreme Court emphasized that satisfying the machine-or-transformation test, by itself, is not sufficient to render a claim patent-eligible, as not all transformations or machine implementations infuse an otherwise ineligible claim with an "inventive concept."')”). The claim is not patent eligible.
Even when considered in combination, these additional elements represent mere instructions to apply an exception and insignificant extra-solution activity (See MPEP 2106.05(g)), which cannot provide an inventive concept.
With respect to claims 2 and 12, which depend on claims 1 and 11, respectively, and include all of their limitations: the recited steps of obtaining a speech confidence score (SCS) of the digital audio signal and determining the specific classification of the digital audio signal according to the SCS, as drafted, constitute a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. For example, “obtaining” in the context of these claims encompasses a user manually obtaining a table listing all of the different sound types (e.g., a speech portion, a silence portion, a noise portion, etc.), each associated with a confidence score and a processing resource, and identifying the specific classification of the digital audio signal according to the sound type and its associated confidence score from the table. Accordingly, under its broadest reasonable interpretation, the limitation falls within the “Mental Process” grouping of abstract ideas, and the claims recite an abstract idea.
With respect to claims 3 and 13, which depend on claims 2 and 12, respectively, and include all of their limitations: the claims recite that the plurality of predetermined classifications are associated with a plurality of predetermined algorithms, and that the plurality of predetermined algorithms are noise reduction algorithms. Similar to claim 1, this is a mental process in which the processing resources the user associates with each specific sound are a list of algorithms. Accordingly, the claims recite an abstract idea.
With respect to claims 4 and 14, which depend on claims 1 and 11, respectively: the claims similarly recite “obtaining a type of sound of the digital audio signal; and determining the specific classification of the digital audio signal according to the type of sound,” which can similarly be performed mentally by identifying the type of sound of the digital audio signal by mapping the sound against the table listing each specific sound. Therefore, claims 4 and 14 similarly recite the mental process grouping of abstract ideas.
With respect to claims 5 and 15, which depend on claims 4 and 14, respectively: the claims recite “wherein the plurality of predetermined classifications are associated with a plurality of predetermined algorithms, and the plurality of predetermined algorithms are sound recognition algorithms,” which, similar to claims 3 and 13, falls within the mental process grouping of abstract ideas.
With respect to claims 6 and 16, which depend on claims 1 and 11, respectively: the claims recite “wherein the plurality of resource configurations correspond to different levels of computing resources, and the at least two predetermined classifications correspond to different predetermined algorithms,” which, similar to claim 1, falls within the mental process grouping of abstract ideas.
With respect to claims 7 and 17, which depend on claims 1 and 11, respectively: the claims recite “wherein performing the classification on the digital audio signal to determine the specific classification corresponding to the digital audio signal comprises: obtaining at least one audio feature of the digital audio signal; and determining the specific classification of the digital audio signal according to the at least one audio feature,” which, similar to claim 1, falls within the mental process grouping of abstract ideas.
With respect to claims 8 and 18, which depend on claims 7 and 17, respectively: the claims recite “wherein the at least one audio feature comprises at least one of a time-domain feature, a frequency-domain feature, a rhythmic feature, and a statistical feature.” The features obtained from the user-plotted data could be based on both the time and frequency domains, and the limitation similarly falls within the mental process grouping of abstract ideas.
With respect to claims 10 and 20, which depend on claims 1 and 11, respectively: the claims recite “wherein processing the digital audio signal based on a specific resource configuration corresponding to the specific classification comprising: processing the digital audio signal based on the specific resource configuration and further based on a specific algorithm corresponding to the specific classification, wherein the specific algorithm is one of a plurality of predetermined algorithms, and the plurality of predetermined algorithms are associated with the plurality of predetermined classifications,” which is a mathematical operation and thus falls within the mathematical concepts grouping of abstract ideas.
Claims 2-8, 10-18, and 20 do not include any other additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of processing the digital audio signal based on a specific resource configuration amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component (see MPEP 2106.05(f)) cannot provide an inventive concept (see MPEP 2106.05(b): “And if a claim fails the Alice/Mayo test (i.e., is directed to an exception at Step 2A and does not amount to significantly more than the exception in Step 2B), then the claim is ineligible even if it passes the M-or-T test. DDR Holdings, LLC v. Hotels.com, L.P., 773 F.3d 1245, 1256, 113 USPQ2d 1097, 1104 (Fed. Cir. 2014) ('[I]n Mayo, the Supreme Court emphasized that satisfying the machine-or-transformation test, by itself, is not sufficient to render a claim patent-eligible, as not all transformations or machine implementations infuse an otherwise ineligible claim with an "inventive concept."')”). Claims 2-8, 10-18, and 20 are not patent eligible.
Even when considered in combination, these additional elements represent mere instructions to apply an exception and insignificant extra-solution activity (see MPEP 2106.05(g)), which cannot provide an inventive concept.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Mehrabi et al. (US 2023/0097522 A1).
Regarding Claim 1, Mehrabi teaches: An audio signal processing method for a digital audio signal, comprising ([0030] In the case of detecting or characterizing acoustic events within an environment, the NMD can be configured to continuously or intermittently detect sound from the environment. This detected sound may then be analyzed to characterize noise or other acoustic events within the environment. As described in greater detail below, a plurality of NMDs within an environment can each make respective noise determinations based on the sound as detected by the microphone(s) of each NMD. In some examples, these respective noise determinations can include determining relative noise levels. Additionally, or alternatively, the noise determinations can include classification of the noise into discrete types (e.g., traffic noise, background speech, running water, fan noise, etc.). Moreover, in some examples each of the NMDs can detect various acoustic events, such as calculating a speech detection probability based on the detected sound, detecting a door opening or closing, a person walking across a room, or any other suitable acoustic event. ): performing a classification on the digital audio signal to determine a specific classification (such as traffic, appliances (e.g., fans, sinks, refrigerators, etc.), construction, interfering speech, etc.) 
corresponding to the digital audio signal, wherein the specific classification is one of a plurality of predetermined classifications, the plurality of predetermined classifications comprise at least two predetermined classifications (such as traffic, appliances (e.g., fans, sinks, refrigerators, etc.), construction, interfering speech, etc.) for a sound situation (acoustic events) ([0030] Moreover, in some examples each of the NMDs can detect various acoustic events, such as calculating a speech detection probability based on the detected sound, detecting a door opening or closing, a person walking across a room, or any other suitable acoustic event. [0121] In operation, NMDs can be exposed to a variety of different types of noise, such as traffic, appliances (e.g., fans, sinks, refrigerators, etc.), construction, interfering speech, etc. To better analyze captured audio input in the presence of such noise, it can be useful to classify noises in the audio input. Different noise sources will produce different sounds, and those different sounds will have different associated sound metadata (e.g., frequency response, signal levels, etc.). The sound metadata associated with different noise sources can have a signature that differentiates one noise source from another. Accordingly, by identifying the different signatures, different noise sources can be classified by analyzing the sound metadata.), and the at least two predetermined classifications correspond to different resource configurations (a spatial processing algorithm, i.e.,
one or more multi-channel Wiener filters, other filters, and/or one or more beam-forming algorithms); and processing the digital audio signal based on a specific resource configuration corresponding to the specific classification, wherein the specific resource configuration is one of a plurality of predetermined resource configurations, and the plurality of predetermined resource configurations are associated with the plurality of predetermined classifications ([0123] Another tunable parameter is noise-reduction, for example modifying the extent to which the NMD processes the sound data or sound-data stream to reduce noise and/or improve the signal-to-noise ratio. The NMD may also modify an acoustic echo cancellation (AEC) parameter (e.g., by modifying operation of the AEC 564 in FIG. 5) or other parameters of the voice processor 560 or other NMD components. As yet another example, a spatial processing algorithm of the NMD may be modified. For example, the voice processing path may reduce the number of microphone channels for a less noisy environment. [0125] In addition or alternatively to those parameters listed above, in some examples the NMD can modify the spatial processing algorithm to improve performance in detecting and processing voice input in the presence of a particular class of noise (e.g., by modifying operation of the spatial processor 566 in FIG. 5). In various examples, the spatial processing algorithm can comprise one or more multi-channel Wiener filters, other filters, and/or one or more beam-forming algorithms, details of which may be found, for example, in previously reference application Ser. Nos. 15/984,073 and 16/147,710.).
Regarding Claim 2, Mehrabi teaches: The audio signal processing method as claimed in claim 1, wherein performing the classification on the digital audio signal to determine the specific classification corresponding to the digital audio signal comprises: obtaining a speech confidence score (SCS) (speech presence probability, i.e. threshold or coefficients associated with these metrics (e.g., energy within certain bands, entropy, etc.) ) of the digital audio signal; and determining the specific classification of the digital audio signal according to the SCS (See rejection of claim 1 and [0030] Moreover, in some examples each of the NMDs can detect various acoustic events, such as calculating a speech detection probability based on the detected sound, detecting a door opening or closing, a person walking across a room, or any other suitable acoustic event. [0110] As one possibility, the spatial processor 566 may monitor metrics that distinguish speech from other sounds. Such metrics can include, for example, energy within the speech band relative to background noise and entropy within the speech band—a measure of spectral structure—which is typically lower in speech than in most common background noise. [0125] In some implementations, the spatial processor 566 may be configured to determine a speech presence probability. The threshold or coefficients associated with these metrics (e.g., energy within certain bands, entropy, etc.) can be adjusted to improve performance of the NMD in detecting and processing voice input in the presence of a particular class of noise. [0190] Example 3. The system of any one of the Examples herein, wherein analyzing the respective sound data specimen to obtain the respective noise determinations comprises classifying noise in each respective sound data specimen into discrete noise types (e.g., with assigned probabilities). 
[0173] Although this example illustrates speech detection probabilities, this approach can be extended to any type of acoustic event determination, noise classification, or other such analysis. For example, the results of analysis of detected sound from multiple devices can be evaluated together or otherwise combined to determine whether a particular detected noise falls into one category (e.g., background speech) or another (e.g., fan noise). ).
Regarding Claim 3, Mehrabi teaches: The audio signal processing method as claimed in claim 2, wherein the plurality of predetermined classifications are associated with a plurality of predetermined algorithms, and the plurality of predetermined algorithms are noise reduction algorithms (See rejection of claim 1 and [0123] Another tunable parameter is noise-reduction, for example modifying the extent to which the NMD processes the sound data or sound-data stream to reduce noise and/or improve the signal-to-noise ratio. The NMD may also modify an acoustic echo cancellation (AEC) parameter (e.g., by modifying operation of the AEC 564 in FIG. 5) or other parameters of the voice processor 560 or other NMD components. As yet another example, a spatial processing algorithm of the NMD may be modified. For example, the voice processing path may reduce the number of microphone channels for a less noisy environment. [0125] In addition or alternatively to those parameters listed above, in some examples the NMD can modify the spatial processing algorithm to improve performance in detecting and processing voice input in the presence of a particular class of noise (e.g., by modifying operation of the spatial processor 566 in FIG. 5). In various examples, the spatial processing algorithm can comprise one or more multi-channel Wiener filters, other filters, and/or one or more beam-forming algorithms, details of which may be found, for example, in previously reference application Ser. Nos. 15/984,073 and 16/147,710.).
Regarding Claim 4, Mehrabi teaches: The audio signal processing method as claimed in claim 1, wherein performing the classification on the digital audio signal to determine the specific classification corresponding to the digital audio signal comprises: obtaining a type of sound of the digital audio signal; and determining the specific classification of the digital audio signal according to the type of sound(See rejection of claim 1 and [0110] As one possibility, the spatial processor 566 may monitor metrics that distinguish speech from other sounds. [0121] To better analyze captured audio input in the presence of such noise, it can be useful to classify noises in the audio input. [0159] Further, a classifier can be constructed using a neural network to identify noises in collected data from one or more NMDs. For example, the neural network can be trained on a set of known, labeled noises that are projected onto the population's eigenspace. These known, labeled noises can be processed by simulation software and can include many types of typical noises grouped into a handful of labels for classification such as “ambient,” “fan,” “sink,” “interfering speech,” etc., each of which may provide sufficient insight to tune performance parameters of an NMD, for example by modifying a noise cancellation algorithm or other audio processing algorithms.).
Regarding Claim 5, Mehrabi teaches: The audio signal processing method as claimed in claim 4, wherein the plurality of predetermined classifications are associated with a plurality of predetermined algorithms, and the plurality of predetermined algorithms are sound recognition algorithms (neural network )(See rejection of claim 1 and [0110] As one possibility, the spatial processor 566 may monitor metrics that distinguish speech from other sounds. [0121] To better analyze captured audio input in the presence of such noise, it can be useful to classify noises in the audio input. [0138] In block 816, a noise classifier can be updated based on the particular noise classification obtained in block 812. As described in more detail below, a noise classifier can include a neural network or other mathematical model configured to identify different types of noise in detected sound data or metadata. [0159] Further, a classifier can be constructed using a neural network to identify noises in collected data from one or more NMDs. For example, the neural network can be trained on a set of known, labeled noises that are projected onto the population's eigenspace. These known, labeled noises can be processed by simulation software and can include many types of typical noises grouped into a handful of labels for classification such as “ambient,” “fan,” “sink,” “interfering speech,” etc., each of which may provide sufficient insight to tune performance parameters of an NMD, for example by modifying a noise cancellation algorithm or other audio processing algorithms.).
Regarding Claim 6, Mehrabi teaches: The audio signal processing method as claimed in claim 1, wherein the plurality of resource configurations correspond to different levels of computing resources, and the at least two predetermined classifications correspond to different predetermined algorithms(See rejection of claim 1 and [0123] Another tunable parameter is noise-reduction, for example modifying the extent to which the NMD processes the sound data or sound-data stream to reduce noise and/or improve the signal-to-noise ratio. The NMD may also modify an acoustic echo cancellation (AEC) parameter (e.g., by modifying operation of the AEC 564 in FIG. 5) or other parameters of the voice processor 560 or other NMD components. As yet another example, a spatial processing algorithm of the NMD may be modified. For example, the voice processing path may reduce the number of microphone channels for a less noisy environment. [0125] In addition or alternatively to those parameters listed above, in some examples the NMD can modify the spatial processing algorithm to improve performance in detecting and processing voice input in the presence of a particular class of noise (e.g., by modifying operation of the spatial processor 566 in FIG. 5). In various examples, the spatial processing algorithm can comprise one or more multi-channel Wiener filters, other filters, and/or one or more beam-forming algorithms, details of which may be found, for example, in previously reference application Ser. Nos. 15/984,073 and 16/147,710. [0138] In block 816, a noise classifier can be updated based on the particular noise classification obtained in block 812. As described in more detail below, a noise classifier can include a neural network or other mathematical model configured to identify different types of noise in detected sound data or metadata.).
Regarding Claim 7, Mehrabi teaches: The audio signal processing method as claimed in claim 1, wherein performing the classification on the digital audio signal to determine the specific classification corresponding to the digital audio signal comprises: obtaining at least one audio feature of the digital audio signal; and determining the specific classification of the digital audio signal according to the at least one audio feature(See rejection of claim 1 and [0132] Analyzing the sound metadata can include comparing one or more features of the sound metadata with known noise reference values or a sample population data with known noise. For example, any features of the sound metadata such as signal levels, frequency response spectra, etc. can be compared with noise reference values or values collected and averaged over a sample population. In some examples, analyzing the sound metadata includes projecting the frequency response spectrum onto an eigenspace corresponding to aggregated frequency response spectra from a population of NMDs (as described in more detail below with respect to FIGS. 10-13). In at least some examples, projecting the frequency response spectrum onto an eigenspace can be performed as a pre-processing step to facilitate downstream classification. In various examples, any number of different techniques for classification of noise using the sound metadata can be used, for example machine learning using decision trees, or Bayesian classifiers, neural networks, or any other classification techniques. [0138] In block 816, a noise classifier can be updated based on the particular noise classification obtained in block 812. As described in more detail below, a noise classifier can include a neural network or other mathematical model configured to identify different types of noise in detected sound data or metadata. [0146] In block 913, the remote computing device 106c analyzes the sound metadata to classify the noise. 
In some examples, analyzing the sound metadata includes comparing one or more features of the sound metadata with noise reference values or sample population values. For example, any feature of the sound metadata (such as frequency response data, signal levels, etc.) can be compared with known noise reference values or averaged values collected from a sample population, as described in more detail below with respect to FIGS. 10-13.).
Regarding Claim 8, Mehrabi teaches: The audio signal processing method as claimed in claim 7, wherein the at least one audio feature comprises at least one of a time-domain feature, a frequency-domain feature, a rhythmic feature, and a statistical feature (See rejection of claim 7 and [0116] The sound metadata can include, for example: (1) frequency response data for individual microphones of the array, (2) an echo return loss enhancement measure (i.e., a measure of the effectiveness of the acoustic echo canceller (AEC) for each microphone), (3) a voice direction measure; (4) arbitration statistics (e.g., signal and noise estimates for the spatial processing streams associated with different microphones); and/or (5) speech spectral data (i.e., frequency response evaluated on processed audio output after acoustic echo cancellation and spatial processing have been performed). Other sound metadata may also be used to identify and/or classify noise in the detected-sound data S.sub.D. [0132] Analyzing the sound metadata can include comparing one or more features of the sound metadata with known noise reference values or a sample population data with known noise. For example, any features of the sound metadata such as signal levels, frequency response spectra, etc. can be compared with noise reference values or values collected and averaged over a sample population. In some examples, analyzing the sound metadata includes projecting the frequency response spectrum onto an eigenspace corresponding to aggregated frequency response spectra from a population of NMDs (as described in more detail below with respect to FIGS. 10-13). In at least some examples, projecting the frequency response spectrum onto an eigenspace can be performed as a pre-processing step to facilitate downstream classification.
In various examples, any number of different techniques for classification of noise using the sound metadata can be used, for example machine learning using decision trees, or Bayesian classifiers, neural networks, or any other classification techniques.).
Regarding Claim 9, Mehrabi teaches: The audio signal processing method as claimed in claim 1, wherein the specific resource configuration comprises an operating voltage, an operating frequency, clock resource, an operating state of an SRAM, an operating state of a DRAM, and/or an operating state of a co-processor (See rejection of claim 1 and [0169] In some examples, the one or more of the NMDs can modify an audio output based on the acoustic determinations (e.g., detection of noise, speech, etc.) within the environment. For example, if NMD 503i detects type B noise (e.g., background speech) at level 8, this indicates a high noise level and, in response, the surrounding NMDs 503f and 503i may modify their audio output in a manner that masks or suppresses the detected background speech for adjacent areas in the space. Additionally, or alternatively, if high speech levels are detected, the acoustic output can be modified so as to enhance speech for the listeners, such as by lowering the volume level of audio within speech frequencies. [0171] In still other examples, the particular NMDs 503 may modify the volume, equalization parameters, can adjust or switch the select audio output, or make any other suitable modification to the audio output that results in masking or suppressing the detected noise within the environment. In some examples, the modified audio output can itself vary in real-time or near real-time based on continued noise determinations. For example, as the detected noise determinations vary (e.g., the detected noise level decreases), the modified audio output may also be varied (e.g., the added audio over a particular frequency band can be reduced or can cease altogether).).
Regarding Claim 10, Mehrabi teaches: The audio signal processing method as claimed in claim 1, wherein processing the digital audio signal based on a specific resource configuration corresponding to the specific classification comprising: processing the digital audio signal based on the specific resource configuration and further based on a specific algorithm corresponding to the specific classification, wherein the specific algorithm is one of a plurality of predetermined algorithms, and the plurality of predetermined algorithms are associated with the plurality of predetermined classifications (See rejection of claim 1 and [0110] As one possibility, the spatial processor 566 may monitor metrics that distinguish speech from other sounds. [0121] To better analyze captured audio input in the presence of such noise, it can be useful to classify noises in the audio input. [0123] Another tunable parameter is noise-reduction, for example modifying the extent to which the NMD processes the sound data or sound-data stream to reduce noise and/or improve the signal-to-noise ratio. The NMD may also modify an acoustic echo cancellation (AEC) parameter (e.g., by modifying operation of the AEC 564 in FIG. 5) or other parameters of the voice processor 560 or other NMD components. As yet another example, a spatial processing algorithm of the NMD may be modified. For example, the voice processing path may reduce the number of microphone channels for a less noisy environment. [0125] In addition or alternatively to those parameters listed above, in some examples the NMD can modify the spatial processing algorithm to improve performance in detecting and processing voice input in the presence of a particular class of noise (e.g., by modifying operation of the spatial processor 566 in FIG. 5). 
In various examples, the spatial processing algorithm can comprise one or more multi-channel Wiener filters, other filters, and/or one or more beam-forming algorithms, details of which may be found, for example, in previously reference application Ser. Nos. 15/984,073 and 16/147,710. [0159] Further, a classifier can be constructed using a neural network to identify noises in collected data from one or more NMDs. For example, the neural network can be trained on a set of known, labeled noises that are projected onto the population's eigenspace. These known, labeled noises can be processed by simulation software and can include many types of typical noises grouped into a handful of labels for classification such as “ambient,” “fan,” “sink,” “interfering speech,” etc., each of which may provide sufficient insight to tune performance parameters of an NMD, for example by modifying a noise cancellation algorithm or other audio processing algorithms.).
Regarding Claim 11, Mehrabi teaches: An audio signal processing device comprising: a storage device (remote computing device 106c can collect sound metadata data from one or more NMDs) configured to store a classification model (a noise classifier can include a neural network or other mathematical model) ([0131] In block 812, the method 800 involves analyzing the sound metadata to classify noise in the detected sound. This analysis can be performed either locally by the NMD or remotely by one or more remote computing devices. [0138] In block 816, a noise classifier can be updated based on the particular noise classification obtained in block 812. As described in more detail below, a noise classifier can include a neural network or other mathematical model configured to identify different types of noise in detected sound data or metadata. Such a noise classifier can be improved with increased available data for training and evaluation. Accordingly, noise data may be obtained from a large number of NMDs, with each new noise classification or other noise data being used to update or revise the noise classifier. [0145] From block 909, the sound metadata can be transmitted from the NMD 503 to the remote computing device 106c for cloud collection in block 911. For example, the remote computing device 106c can collect sound metadata data from one or more NMDs.
In some examples, the remote computing device 106c can collect sound metadata from a large population of NMDs, and such population metadata can be used to classify noise, derive averages, identify outliers, and guide modification of NMD performance parameters to improve operation of the NMD 503 in the presence of various classes of noise.); and a processor configured to load the classification model from the storage device and perform the classification module on a digital audio signal to determine a specific classification corresponding to the digital audio signal ([0146] In block 913, the remote computing device 106c analyzes the sound metadata to classify the noise. In some examples, analyzing the sound metadata includes comparing one or more features of the sound metadata with noise reference values or sample population values.), wherein the specific classification is one of a plurality of predetermined classifications, the plurality of predetermined classifications comprise at least two predetermined classifications for a sound situation, and the at least two predetermined classifications correspond to different resource configurations, wherein the processor is further configured to process the digital audio signal based on a specific resource configuration corresponding to the specific classification, and wherein the specific resource configuration is one of a plurality of predetermined resource configurations, and the plurality of predetermined resource configurations are associated with the plurality of predetermined classifications (See rejection of claim 1).
Regarding Claim 12, Mehrabi teaches: The audio signal processing device as claimed in claim 11, wherein the processor is further configured to obtain a speech confidence score (SCS) of the digital audio signal and to determine the specific classification of the digital audio signal according to the SCS (See rejection of claim 2).
Regarding Claim 13, Mehrabi teaches: The audio signal processing device as claimed in claim 12, wherein the plurality of predetermined classifications are associated with a plurality of predetermined algorithms, and the plurality of determined algorithms are noise reduction algorithms (See rejection of claim 3).
Regarding Claim 14, Mehrabi teaches: The audio signal processing device as claimed in claim 11, wherein the processor is configured to obtain a type of sound of the digital audio signal and to determine the specific classification of the digital audio signal according to the type of sound (See rejection of claim 4).
Regarding Claim 15, Mehrabi teaches: The audio signal processing device as claimed in claim 14, wherein the plurality of predetermined classifications are associated with a plurality of predetermined algorithms, and the plurality of determined algorithms are sound recognition algorithms (See rejection of claim 5).
Regarding Claim 16, Mehrabi teaches: The audio signal processing device as claimed in claim 11, wherein the plurality of resource configurations correspond to different levels of computing resources, and the at least two predetermined classifications correspond to different predetermined algorithms (See rejection of claim 6).
Regarding Claim 17, Mehrabi teaches: The audio signal processing device as claimed in claim 11, wherein the processor is configured to obtain at least one audio feature of the digital audio signal and determine the specific classification of the digital audio signal according to the at least one audio feature (See rejection of claim 7).
Regarding Claim 18, Mehrabi teaches: The audio signal processing device as claimed in claim 17, wherein the at least one audio feature comprises at least one of a time-domain feature, a frequency-domain feature, a rhythmic feature, and a statistical feature (See rejection of claim 8).
Regarding Claim 19, Mehrabi teaches: The audio signal processing device as claimed in claim 11, wherein the specific resource configuration comprises an operating voltage, an operating frequency, clock resource, an operating state of an SRAM, an operating state of a DRAM, and/or an operating state of a co-processor (See rejection of claim 9).
Regarding Claim 20, Mehrabi teaches: The audio signal processing device as claimed in claim 11, wherein the processor is configured to process the digital audio signal based on the specific resource configuration and further based on a specific algorithm corresponding to the specific classification, and wherein the specific algorithm is one of a plurality of predetermined algorithms, and the plurality of predetermined algorithms are associated with the plurality of predetermined classifications (See rejection of claim 10).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. The prior art of record Visser et al. (US 2023/0036986 A1) teach: PROCESSING OF AUDIO SIGNALS FROM MULTIPLE MICROPHONES ([0086] The audio event processing unit 154 may access a database (not shown) that includes models for different audio events, such as a car horn, a train horn, a pedestrian talking, etc. In response to the sound characteristics matching (or substantially matching) a particular model, the audio event processing unit 154 can generate audio event information 145 indicating that the sound 182 represents an audio event associated with the particular model. In some implementations, the audio event processing unit 154 includes one or more classifiers configured to determine a class of an audio event in a similar manner as described for the audio event processing unit 134. As compared to the audio event processing unit 134, however, the audio event processing unit 154 may perform more complex operations, may support a much larger set of models or audio classes than the audio event processing unit 134, and may generate a more accurate determination (or classification) of an audio event than the audio event processing unit 134. [0124] The system 400 of FIG. 4 enables detected sound events and corresponding direction-of-arrivals to be analyzed to improve a hearing sensation. Based on the context information 496, the system 400 can determine which sound is of particular interest to a user. For example, if the user is crossing a street, the system 400 can determine that the sound 182 of the vehicle is of more importance than the sound 186 of people talking. As a result, the system 400 can focus on the sound 182 of importance and suppress other sounds.).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOHAMMAD K ISLAM whose telephone number is (571)270-5878. The examiner can normally be reached Monday-Friday, EST (IFP).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Paras Shah, can be reached at 571-270-1650. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MOHAMMAD K ISLAM/Primary Examiner, Art Unit 2653