Notice of Pre-AIA or AIA Status
1. The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
2. This Office Action is sent in response to Applicant’s Communication received on 01/22/2026 for application number 17/196,960.
Response to Amendments
3. The Amendment filed 01/22/2026 has been entered. Claims 2, 3, 5, 8, 9, 16, 18, 20, 23, and 24 have been amended. Claims 1, 4, 6, 7, 10-15, 17, 19, 21, and 29 have been canceled. Claims 30-39 have been added. Claims 2, 3, 5, 8, 9, 16, 18, 20, 22-28, and 30-39 remain pending in the application.
Claim Objections
4. Claims 37-39 are objected to because of the following informalities:
Claims 37-39 recite “The integrated circuit of claim 33” where “The integrated circuit of claim 36” was apparently intended.
Response to Arguments
5. Applicant’s arguments with respect to the claims have been considered but are moot in view of the new grounds of rejection. See the rejections below for details.
Claim Rejections – 35 USC § 103
6. The following is a quotation of pre-AIA 35 U.S.C. 103(a) which forms the basis for all obviousness rejections set forth in this Office action:
(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are such that the subject matter as a whole would have been obvious at the time the invention was made to a person having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the manner in which the invention was made.
7. Claims 3, 20, 22-26, and 33-34 are rejected under 35 U.S.C. 103 as being unpatentable over Fu (U.S. Patent No. US 5538915 A) in view of Germain et al. (U.S. Patent Application Pub. No. US 20190043516 A1).
Claim 20: Fu teaches a device, comprising:
an integrated circuit (i.e. a neural network integrated circuit chip; col. 3, lines 36-40), the integrated circuit implementing a neural network (i.e. a neural network integrated circuit chip; col. 3, lines 36-40),
wherein: the integrated circuit includes a plurality of operational amplifiers (i.e. the neurons are all made up of operational amplifiers; col. 1, lines 47-50) and a plurality of resistors (i.e. a customizable neural network is provided in which one or more resistors form each synapse; col. 3, lines 47-50);
each operational amplifier represents a respective analog neuron of the equivalent analog network (i.e. The neuron 114 is essentially an operational amplifier; col. 4, lines 48-50);
each resistor: (i) represents a respective connection between a respective pair of analog neurons (i.e. The synaptic interconnection weight between one of the array input pairs 104 and one of the neurons 114 is defined by all eight resistors which are connected between the pair 104 and the eight inputs 108 to the neuron 114; col. 7, lines 10-15), (ii) represents a respective fixed resistance value (i.e. a production version of the network may be mass produced using fixed resistors in place; col. 1, lines 55-59), and (iii) defines a respective dedicated signal pathway connecting the respective pair of analog neurons (i.e. a plurality of neuron circuits each coupled to receive as inputs several of the synaptic array output lines. The outputs of the neuron circuits may further be connected back to the input isolation buffers by severable links for feedback purposes; col. 3, lines 60-65); and
the integrated circuit is configured to receive an input signal (i.e. a plurality of input neurons (which may simply be isolation buffers) receive input excitation signals; col. 1, lines 19-21).
Fu does not explicitly teach a circuit for voice clarity, trained for voice clarity enhancement using a dataset of noisy voice signals and corresponding clean voice signals, enhance the voice clarity of the input signal by removing at least a portion of unwanted noise from the input signal, thereby generating an enhanced voice signal, and to provide the enhanced voice signal.
However, Germain teaches a voice-transmission device (i.e. FIG. 6 illustrates an example platform 600, configured in accordance with certain embodiments of the present disclosure, to perform speech denoising. In some embodiments, platform 600 may be hosted on, or otherwise be incorporated into a speech enabled device; para. [0043]), comprising: an integrated circuit (i.e. Examples of hardware elements may include processors, microprocessors, circuits, circuit elements (for example, transistors, resistors, capacitors, inductors, and so forth), integrated circuits, ASICs; para. [0054]) for voice clarity (i.e. Techniques to remove this noise, sometimes referred to as denoising or as enhancement, are used to improve the audio signal quality so that the underlying speech can be recognized and understood, whether by a human listener or by subsequent processing systems; para. [0002]), trained for voice clarity enhancement using a dataset of noisy voice signals and corresponding clean voice signals (i.e. a speech denoising neural network is trained based on a combination of noisy training speech 150 and associated clean training speech 155. This operation generates a trained speech denoising neural network 180; para. [0017]), enhance the voice clarity of the input signal by removing at least a portion of unwanted noise from the input signal (i.e. Techniques are provided for speech denoising, which is a process for the reduction or elimination of additive background noise from a signal containing speech. Denoising is an important operation in many audio processing systems, as it improves the audio signal quality such that the underlying speech can be recognized and understood, either by a human listener or by subsequent automated speech recognition systems; para. [0011]), thereby generating an enhanced voice signal, and to provide the enhanced voice signal (i.e. the denoising neural network comprises 16 convolutional layers. The first (topmost) layer, which receives the degraded (noisy) input signal 160, and the final layer, which produces the enhanced output signal 190; para. [0034, 0051]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Fu to include the feature of Germain. One would have been motivated to make this modification because applying the known speech denoising neural network of Germain to the analog neural network IC architecture of Fu would have been a predictable use of Fu’s configurable hardware to implement a known neural network in dedicated circuitry.
Claim 22: Fu and Germain teach the voice-transmission device of claim 20. Fu does not explicitly teach wherein the voice transmission device is integrated into a cell phone.
However, Germain further teaches wherein the voice transmission device is integrated into a cell phone (i.e. FIG. 6 illustrates an example platform 600, configured in accordance with certain embodiments of the present disclosure, to perform speech denoising. In some embodiments, platform 600 may be hosted on, or otherwise be incorporated into a speech enabled device, (for example, a smartphone, smart-speaker, smart-tablet, personal assistant, smart home management system); para. [0043]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Fu to include the feature of Germain. One would have been motivated to make this modification because applying the known speech denoising neural network of Germain to the analog neural network IC architecture of Fu would have been a predictable use of Fu’s configurable hardware to implement a known neural network in dedicated circuitry.
Claim 23: Fu and Germain teach the voice-transmission device of claim 22. Fu does not explicitly teach wherein: the cell phone includes a microphone configured to provide the input signal to the integrated circuit; and the integrated circuit is configured to enhance voice clarity of the input signal and output an enhanced voice signal based on the input signal.
However, Germain further teaches wherein: the cell phone includes a microphone configured to provide the input signal to the integrated circuit (i.e. platform 600 may comprise any combination of a processor 620, a memory 630, a trained speech denoising neural network 180, a network interface 640, an input/output (I/O) system 650, a user interface 660, microphone(s) 610; para. [0044]); and the integrated circuit is configured to enhance voice clarity of the input signal (i.e. The trained speech denoising neural network 180 is employed to process noisy operational speech 160 to generate denoised speech 190; para. [0019]) and output an enhanced voice signal based on the input signal (i.e. the denoising neural network comprises 16 convolutional layers. The first (topmost) layer, which receives the degraded (noisy) input signal 160, and the final layer, which produces the enhanced output signal 190; para. [0034]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Fu to include the feature of Germain. One would have been motivated to make this modification because applying the known speech denoising neural network of Germain to the analog neural network IC architecture of Fu would have been a predictable use of Fu’s configurable hardware to implement a known neural network in dedicated circuitry.
Claim 24: Fu and Germain teach the voice-transmission device of claim 22. Fu does not explicitly teach wherein: the integrated circuit is configured to enhance voice clarity of a signal received by the cell phone prior to providing input to the speaker of the cell phone so that the voice enhanced signal output from the integrated circuit is provided as input to a speaker of the cell phone.
However, Germain further teaches wherein: the integrated circuit is configured to enhance voice clarity of a signal received by the cell phone prior to providing input to the speaker of the cell phone so that the voice enhanced signal output from the integrated circuit is provided as input to a speaker of the cell phone (i.e. Techniques are provided for speech denoising, which is a process for the reduction or elimination of additive background noise from a signal containing speech. Denoising is an important operation in many audio processing systems, as it improves the audio signal quality such that the underlying speech can be recognized and understood, either by a human listener; para. [0011, 0049, 0051]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Fu to include the feature of Germain. One would have been motivated to make this modification because applying the known speech denoising neural network of Germain to the analog neural network IC architecture of Fu would have been a predictable use of Fu’s configurable hardware to implement a known neural network in dedicated circuitry.
Claim 25: Fu and Germain teach the voice-transmission device of claim 20. Fu does not explicitly teach wherein: the integrated circuit is coupled to one or more other noise cancelling devices; and the integrated circuit is configured to: receive signals that include voice-containing signals and noise signals; and enhance voice-containing signals and suppress noise signals.
However, Germain further teaches wherein: the integrated circuit is coupled to one or more other noise cancelling devices; and the integrated circuit is configured to: receive signals that include voice-containing signals and noise signals; and enhance voice-containing signals and suppress noise signals (i.e. Techniques are provided for speech denoising, which is a process for the reduction or elimination of additive background noise from a signal containing speech. Denoising is an important operation in many audio processing systems, as it improves the audio signal quality such that the underlying speech can be recognized and understood, either by a human listener or by subsequent automated speech recognition systems; para. [0011, 0014, 0044]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Fu to include the feature of Germain. One would have been motivated to make this modification because applying the known speech denoising neural network of Germain to the analog neural network IC architecture of Fu would have been a predictable use of Fu’s configurable hardware to implement a known neural network in dedicated circuitry.
Claim 26: Fu and Germain teach the voice-transmission device of claim 20. Fu does not explicitly teach wherein the integrated circuit is coupled to one or more noise reduction software programs executing on the voice-transmission device.
However, Germain further teaches wherein the integrated circuit is coupled to one or more noise reduction software programs executing on the voice-transmission device (i.e. the functionalities disclosed herein can be incorporated into other voice-enabled devices and speech-based software applications, such as, for example, automobile control/navigation, smart-home management, entertainment, personal assistant, and robotic applications; para. [0013, 0054, 0056, 0060]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Fu to include the feature of Germain. One would have been motivated to make this modification because applying the known speech denoising neural network of Germain to the analog neural network IC architecture of Fu would have been a predictable use of Fu’s configurable hardware to implement a known neural network in dedicated circuitry.
Claim 33: Fu and Germain teach the voice-transmission device of claim 22. Fu does not explicitly teach one or more microphones configured to generate an input signal that includes voice and unwanted noise; one or more speakers configured to receive audio signals and output sound based on the audio signals, wherein: the integrated circuit is configured to: receive the input signal from the one or more microphones; enhance the voice clarity of the input signal by removing at least a portion of the unwanted noise from the input signal, thereby generating an enhanced voice signal; and output the enhanced voice signal to the one or more speakers.
However, Germain further teaches one or more microphones (i.e. platform 600 may comprise any combination of a processor 620, a memory 630, a trained speech denoising neural network 180, a network interface 640, an input/output (I/O) system 650, a user interface 660, microphone(s) 610; para. [0044]) configured to generate an input signal that includes voice and unwanted noise (i.e. the trained speech denoising neural network may be applied to noisy operational speech signals to generate denoised speech signals; para. [0041]); one or more speakers configured to receive audio signals and output sound based on the audio signals (i.e. platform 600 may comprise any combination of a processor 620, a memory 630, a trained speech denoising neural network 180, a network interface 640, an input/output (I/O) system 650, a user interface 660, microphone(s) 610; para. [0044]), wherein: the integrated circuit is configured to: receive the input signal from the one or more microphones; enhance the voice clarity of the input signal by removing at least a portion of the unwanted noise from the input signal, thereby generating an enhanced voice signal; and output the enhanced voice signal to the one or more speakers (i.e. Techniques are provided for speech denoising, which is a process for the reduction or elimination of additive background noise from a signal containing speech. Denoising is an important operation in many audio processing systems, as it improves the audio signal quality such that the underlying speech can be recognized and understood, either by a human listener or by subsequent automated speech recognition systems; para. [0011, 0014, 0044]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Fu to include the feature of Germain. One would have been motivated to make this modification because applying the known speech denoising neural network of Germain to the analog neural network IC architecture of Fu would have been a predictable use of Fu’s configurable hardware to implement a known neural network in dedicated circuitry.
Claim 34: Fu and Germain teach the voice-transmission device of claim 20. Fu does not explicitly teach wherein: the voice-transmission device is configured to receive a wireless signal; and the input signal corresponds to a wireless signal.
However, Germain further teaches wherein: the voice-transmission device is configured to receive a wireless signal (i.e. In some embodiments, platform 600 may comprise any combination of a processor 620, a memory 630, a trained speech denoising neural network 180, a network interface 640, an input/output (I/O) system 650, a user interface 660, microphone(s) 610, and a storage system 670. As can be further seen, a bus and/or interconnect 692 is also provided to allow for communication between the various components listed above and/or other components not shown. Platform 600 can be coupled to a network 694 through network interface 640 to allow for communications with other computing devices, platforms, devices to be controlled, or other resources; para. [0014, 0043, 0044, 0053]); and the input signal corresponds to a wireless signal (i.e. Techniques are provided for speech denoising, which is a process for the reduction or elimination of additive background noise from a signal containing speech. Denoising is an important operation in many audio processing systems, as it improves the audio signal quality such that the underlying speech can be recognized and understood, either by a human listener or by subsequent automated speech recognition systems; para. [0011, 0014, 0044, 0053]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Fu to include the feature of Germain. One would have been motivated to make this modification because applying the known speech denoising neural network of Germain to the analog neural network IC architecture of Fu would have been a predictable use of Fu’s configurable hardware to implement a known neural network in dedicated circuitry.
Claim 3: Fu and Germain teach the integrated circuit of claim 36. Fu does not explicitly teach wherein the neural network has a topology that includes one or more of: a convolutional layer, a max-pooling layer, and a densely connected layer.
However, Germain further teaches wherein the neural network has a topology that includes one or more of: a convolutional layer, a max-pooling layer, and a densely connected layer (i.e. the audio classifier neural network 170 is a convolutional neural network comprising multiple convolutional layers; para. [0015]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Fu to include the feature of Germain. One would have been motivated to make this modification because applying the known speech denoising neural network of Germain to the analog neural network IC architecture of Fu would have been a predictable use of Fu’s configurable hardware to implement a known neural network in dedicated circuitry.
8. Claims 27 and 28 are rejected under 35 U.S.C. 103 as being unpatentable over Fu in view of Germain and further in view of Song (U.S. Patent Application Pub. No. US 20180268808 A1).
Claim 27: Fu and Germain teach the voice-transmission device of claim 20. Fu does not explicitly teach wherein the integrated circuit is configured to perform voice clarity enhancement according to one or more of: a number of voices detected in a signal received at the voice-transmission device, an estimated proximity of a voice source to a receiver of the signal, an estimated relative distance of voice sources to the receiver of the signal, and a volume of a voice detected in the signal.
However, Germain further teaches wherein the integrated circuit is configured to perform voice clarity enhancement (i.e. Techniques are provided for speech denoising, which is a process for the reduction or elimination of additive background noise from a signal containing speech. Denoising is an important operation in many audio processing systems, as it improves the audio signal quality such that the underlying speech can be recognized and understood, either by a human listener; para. [0011, 0049, 0051]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Fu to include the feature of Germain. One would have been motivated to make this modification because applying the known speech denoising neural network of Germain to the analog neural network IC architecture of Fu would have been a predictable use of Fu’s configurable hardware to implement a known neural network in dedicated circuitry.
However, Song teaches wherein the integrated circuit is configured to perform voice clarity enhancement (i.e. The electronic apparatus 100 may process a plurality of voice signals using the changed pre-processing method and generate the enhanced voice signal at step S870; para. [0125]) according to one or more of: a number of voices detected in a signal received at the voice-transmission device, an estimated proximity of a voice source to a receiver of the signal, an estimated relative distance of voice sources to the receiver of the signal, and a volume of a voice detected in the signal (i.e. Referring to FIG. 8, the electronic apparatus 100 may receive sound sources from different positions and generate a plurality of voice signals at step S810. For example, the electronic apparatus 100 may generate multichannel voice signals through a micro-array defined by a plurality of microphones. The electronic apparatus 100 may determine the direction where a sound source is uttered and the distance from the uttered sound source based on the difference in time when sound sources are input to the plurality of microphones; para. [0121]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Fu and Germain to include the feature of Song. One would have been motivated to make this modification because it provides for processing the plurality of voice signals using the changed pre-processing method and generating enhanced voice signals.
Claim 28: Fu and Germain teach the voice-transmission device of claim 20. Fu does not explicitly teach wherein the integrated circuit is configured to enhance voice clarity for signals originating from more than one source.
However, Song teaches wherein the integrated circuit is configured to enhance voice clarity for signals originating from more than one source (i.e. Referring to FIG. 8, the electronic apparatus 100 may receive sound sources from different positions and generate a plurality of voice signals at step S810. For example, the electronic apparatus 100 may generate multichannel voice signals through a micro-array defined by a plurality of microphones. The electronic apparatus 100 may determine the direction where a sound source is uttered and the distance from the uttered sound source based on the difference in time when sound sources are input to the plurality of microphones; para. [0121]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Fu and Germain to include the feature of Song. One would have been motivated to make this modification because it provides for processing the plurality of voice signals using the changed pre-processing method and generating enhanced voice signals.
9. Claims 30 and 32 are rejected under 35 U.S.C. 103 as being unpatentable over Fu in view of Germain and further in view of Mo et al. (U.S. Patent Application Pub. No. US 20090273691 A1).
Claim 30: Fu and Germain teach the voice-transmission device of claim 20. Fu further teaches the integrated circuit requires a smaller die area for implementation of the trained neural network (i.e. a customizable neural network is provided in which one or more resistors form each synapse. All the resistors in the synaptic array are identical, thus simplifying the processing issues. Doped, amorphous silicon is used as the resistor material, to create extremely high resistances occupying very small spaces; col. 3, lines 47-53).
Fu does not explicitly teach wherein: the voice-transmission device is a portable device; and the integrated circuit requires a smaller die area for implementation compared to a software implementation of the trained neural network, enabling the integrated circuit to be integrated into the portable voice-transmission device for voice clarity.
However, Germain further teaches wherein: the voice-transmission device is a portable device (i.e. The disclosed techniques can be implemented on a broad range of platforms including smartphones, smart-speakers, laptops, tablets, video conferencing systems, hearing aids, gaming systems, smart home control systems, and robotic systems; para. [0014]), enabling the integrated circuit to be integrated into the portable voice-transmission device for voice clarity (i.e. The trained speech denoising neural network may then be employed to process noisy operational speech signals to generate denoised speech signals; para. [0013]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Fu to include the feature of Germain. One would have been motivated to make this modification because applying the known speech denoising neural network of Germain to the analog neural network IC architecture of Fu would have been a predictable use of Fu’s configurable hardware to implement a known neural network in dedicated circuitry.
However, Mo teaches the integrated circuit requires a smaller die area for implementation compared to a software implementation (i.e. To address the above limitation of the digital row noise correction, a method and apparatus for row noise correction and hot pixel filtering in the analog domain (i.e., prior to digital conversion) is provided. Furthermore, compared to its digital domain counterpart, analog row noise correction has the advantage of smaller die size, greater accuracy and faster readout speed; para. [0008]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Fu and Germain to include the feature of Mo. One would have been motivated to make this modification because it provides the advantage of smaller die size, greater accuracy and faster readout speed.
Claim 32: Fu and Germain teach the voice-transmission device of claim 20. Fu does not explicitly teach wherein: the voice-transmission device is a communication device configured for real-time communication; and the integrated circuit has a smaller latency compared to a software implementation of the trained neural network, enabling the integrated circuit to be integrated into the voice-transmission device for voice clarity in real-time communication.
However, Germain further teaches wherein: the voice-transmission device is a communication device configured for real-time communication (i.e. Processor 620 may be configured to execute an Operating System (OS) 680 which may comprise any suitable operating system, such as Google Android (Google Inc., Mountain View, Calif.), Microsoft Windows (Microsoft Corp., Redmond, Wash.), Apple OS X (Apple Inc., Cupertino, Calif.), Linux, or a real-time operating system (RTOS). As will be appreciated in light of this disclosure, the techniques provided herein can be implemented without regard to the particular operating system provided in conjunction with platform 600, and therefore may also be implemented using any suitable existing or subsequently-developed platform; para. [0047]); and a software implementation of the trained neural network (i.e. the techniques described herein may provide an improved method for speech denoising with greater efficiency, compared to existing techniques that require complex statistical signal processing, computationally expensive spectrogram transforms, or the use of expert knowledge for manual tuning of the loss functions. The disclosed techniques can be implemented on a broad range of platforms including smartphones, smart-speakers, laptops, tablets, video conferencing systems, hearing aids, gaming systems, smart home control systems, and robotic systems. These techniques may further be implemented in hardware or software or a combination thereof; para. [0014]), enabling the integrated circuit to be integrated into the voice-transmission device for voice clarity in real-time communication (i.e. The trained speech denoising neural network may then be employed to process noisy operational speech signals to generate denoised speech signals; para. [0013]).
However, Mo teaches the integrated circuit has a smaller latency compared to a software implementation (i.e. To address the above limitation of the digital row noise correction, a method and apparatus for row noise correction and hot pixel filtering in the analog domain (i.e., prior to digital conversion) is provided. Furthermore, compared to its digital domain counterpart, analog row noise correction has the advantage of smaller die size, greater accuracy and faster readout speed; para. [0008]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Fu and Germain to include the feature of Mo. One would have been motivated to make this modification because it provides the advantage of smaller die size, greater accuracy and faster readout speed.
10. Claim 31 is rejected under 35 U.S.C. 103 as being unpatentable over Fu in view of Germain and further in view of Binas et al. (U.S. Patent Application Pub. No. US 20190050720 A1).
Claim 31: Fu and Germain teach the voice-transmission device of claim 20. Fu further teaches the integrated circuit has a lower power consumption (i.e. For minimum power dissipation and maximum linearity; col. 6, lines 45-55).
Fu does not explicitly teach wherein: the voice-transmission device is a battery powered device; and the integrated circuit has a lower power consumption compared to a software implementation of the trained neural network, enabling the integrated circuit to be integrated into the battery-powered voice-transmission device for voice clarity.
However, Germain further teaches wherein: the voice-transmission device is a battery powered device (i.e. The disclosed techniques can be implemented on a broad range of platforms including smartphones, smart-speakers, laptops, tablets, video conferencing systems, hearing aids, gaming systems, smart home control systems, and robotic systems; para. [0014]); and a software implementation of the trained neural network (i.e. These techniques may further be implemented in hardware or software or a combination thereof; para. [0014]), enabling the integrated circuit to be integrated into the battery-powered voice-transmission device for voice clarity (i.e. The trained speech denoising neural network may then be employed to process noisy operational speech signals to generate denoised speech signals; para. [0013]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Fu to include the feature of Germain. One would have been motivated to make this modification because applying the known speech denoising neural network of Germain to the analog neural network IC architecture of Fu would have been a predictable use of Fu’s configurable hardware to implement a known neural network in dedicated circuitry.
However, Binas teaches the integrated circuit has a lower power consumption compared to a software implementation of the trained neural network (i.e. The proposed analogue electronic neural network, when programmed properly, may achieve state-of-the-art performance while dissipating significantly less power than most efficient digital electronic neural networks. The very low power consumption can be achieved by running at least some components of the network in their sub-threshold (weak inversion) region; para. [0009]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Fu and Germain to include the feature of Binas. One would have been motivated to make this modification because applying a low-power hardware approach to the known voice enhancement network makes the device more suitable for portable operation.
11. Claim 35 is rejected under 35 U.S.C. 103 as being unpatentable over Fu in view of Germain and further in view of Oshima et al. (U.S. Patent Application Pub. No. US 20190392298 A1).
Claim 35: Fu and Germain teach the voice-transmission device of claim 20. Fu further teaches wherein: the neural network has a topology (i.e. The neural network integrated circuit chip may include a plurality of input isolation buffers for driving the input lines in the synaptic array, and a plurality of neuron circuits each coupled to receive as inputs several of the synaptic array output lines. The outputs of the neuron circuits may further be connected back to the input isolation buffers by severable links for feedback purposes. By severing appropriate ones of these feedback conductors, any of a large number of neural network architectures can be defined during the same laser cutting step in which the synaptic weights are programmed; col. 3, lines 57-67); the integrated circuit includes an analog network (i.e. the neurons are all made up of operational amplifiers; col. 1, lines 47-50); the analog network includes the plurality of analog components and corresponds to a first portion of the neural network topology (i.e. the neurons are all made up of operational amplifiers; col. 1, lines 47-50).
Fu does not explicitly teach a digital network; the digital network includes a plurality of digital components and corresponds to one or more output layers of the neural network topology; and one or more layers of the analog network are connected to one or more layers of the digital network so that an output from the one or more layers of the analog network are provided as input to the one or more layers of the digital network.
However, Oshima teaches wherein: the neural network has a topology (i.e. the analog voltage output from the D/A converter 11D is a first output of the first layer. The first output of the first layer, which is an analog voltage, is input to the analog-to-digital multiplier 12D of a second layer. Further, although not illustrated, a second output of the first layer generated with the same configuration is input to an analog-digital multiplier 12E. Further, although not illustrated as well, a third output of the first layer generated with the same configuration is input to an analog-to-digital multiplier 12F; para. [0054, 0055]); the integrated circuit includes an analog network and a digital network (i.e. an A/D converter that converts a result obtained by adding the multiplication result as an analog signal into a digital signal, a digital activation function circuit that performs digital processing corresponding to an activation function on the result obtained by adding the multiplication result as the digital signal output from the A/D converter, and a second D/A converter that converts a digital output signal of the digital activation function circuit into an analog voltage; para. [0012, 0053]); the analog network includes the plurality of analog components (i.e. A neural network circuit according to one aspect of the present invention, includes a plurality of D/A converters that converts a digital input signal into an analog input voltage, a plurality of analog-to-digital multipliers each connected to the D/A converters and that outputs a predetermined multiplication result obtained by multiplying the analog input voltage by a weighting factor which is a digital signal, and an analog activation function circuit that performs analog processing corresponding to an activation function on the result obtained by adding the multiplication result output from a plurality of the analog-to-digital multipliers; para. [0011, 0012, 0050-0053]) and corresponds to a first portion of the neural network topology; the digital network includes a plurality of digital components and corresponds to one or more output layers of the neural network topology (i.e. That is, the analog voltage output from the D/A converter 11D is a first output of the first layer. The first output of the first layer, which is an analog voltage, is input to the analog-to-digital multiplier 12D of a second layer. Further, although not illustrated, a second output of the first layer generated with the same configuration is input to an analog-digital multiplier 12E. Further, although not illustrated as well, a third output of the first layer generated with the same configuration is input to an analog-to-digital multiplier 12F; para. [0054-0056]); and one or more layers of the analog network are connected to one or more layers of the digital network so that an output from the one or more layers of the analog network are provided as input to the one or more layers of the digital network (i.e. the other ends of the switch 13A, the switch 13B, and the switch 13C are connected to an A/D converter 21. The A/D converter 21 converts the first product-sum voltage into a digital value. An output of the A/D converter 21 is connected to a digital activation function circuit 22. The digital activation function circuit 22 is a digital circuit that performs an operation of an activation function required as a neural network, such as a step function, a sigmoid function, or ReLU. The digital activation function circuit 22 causes a digital value output from the A/D converter 21 to be subjected to processing corresponding to the necessary activation function and outputs it as a digital value. An output of the digital activation function circuit 22 is connected to a D/A converter 11D. The D/A converter 11D converts the output of the digital activation function circuit 22 into an analog voltage. The first product-sum voltage is formed on a parasitic capacitance formed between a wiring from the switch 13A, the switch 13B, and the switch 13C to the A/D converter 21 and a ground, a power supply wiring, and the like, that is, on the wiring capacitance 16; para. [0053-0056]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Fu and Germain to include the feature of Oshima. One would have been motivated to make this modification because it provides a neural network circuit incorporating power-efficient analog circuit operations.
12. Claim 2 is rejected under 35 U.S.C. 103 as being unpatentable over Fu in view of Germain and further in view of Chang et al. (U.S. Patent Application Pub. No. US 20200105287 A1).
Claim 2: Fu and Germain teach the integrated circuit of claim 36. Fu further teaches the neural network has a topology (i.e. The neural network integrated circuit chip may include a plurality of input isolation buffers for driving the input lines in the synaptic array, and a plurality of neuron circuits each coupled to receive as inputs several of the synaptic array output lines. The outputs of the neuron circuits may further be connected back to the input isolation buffers by severable links for feedback purposes. By severing appropriate ones of these feedback conductors, any of a large number of neural network architectures can be defined during the same laser cutting step in which the synaptic weights are programmed; col. 3, lines 57-67); and the plurality of analog components includes one or more first electronic analog components (i.e. the neurons are all made up of operational amplifiers; col. 1, lines 47-50).
Fu does not explicitly teach a Fourier transformation layer configured to convert an input signal into input features and an inverse Fourier transformation layer configured to generate an output signal that has improved voice clarity compared to the input signal; and to modify signals in accordance with the Fourier transformation layer and to modify signals in accordance with the inverse Fourier transformation layer.
However, Chang teaches a Fourier transformation layer configured to convert an input signal into input features (i.e. The spectrum extraction unit of the training unit may transform a signal in a time domain into a signal in a frequency domain by performing a short-time Fourier transform (STFT) on the microphone input signal including noise and an echo and the far-end speech signal, and may extract the log power spectrum (LPS) of the transformed signal in the frequency domain as a feature vector in the training stage; para. [0112]) and an inverse Fourier transformation layer configured to generate an output signal that has improved voice clarity compared to the input signal; and to modify signals in accordance with the Fourier transformation layer and to modify signals in accordance with the inverse Fourier transformation layer (i.e. The voice signal reconstruction unit 830 may obtain the estimated log power spectrum (LPS) of the near-end speech signal by multiplying the LPS of the microphone input signal including noise and an echo by the estimated integrated and eliminated gain of nose and an echo, and may obtain the waveform of the final near-end speech signal from which noise and an echo have been finally eliminated by performing an inverse short-time Fourier transform (ISTFT) on the LPS of the near-end speech signal along with the phase of a signal including noise. That is, the final voice signal from which noise and an echo have been integrated and eliminated can be obtained; para. [0121]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Fu and Germain to include the feature of Chang. One would have been motivated to make this modification because it enhances the capabilities of the neural network by incorporating Fourier transformation layers, enabling it to analyze and process input signals more effectively in the frequency domain, thereby improving voice clarity and potentially enhancing performance in various related tasks.
13. Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Fu in view of Germain and further in view of Shrivastava (U.S. Patent Application Pub. No. US 20200311535 A1).
Claim 5: Fu and Germain teach the integrated circuit of claim 36. Fu further teaches the neural network topology and the integrated circuit includes a multi-layer network of analog neurons (i.e. The neural network integrated circuit chip may include a plurality of input isolation buffers for driving the input lines in the synaptic array, and a plurality of neuron circuits each coupled to receive as inputs several of the synaptic array output lines. The outputs of the neuron circuits may further be connected back to the input isolation buffers by severable links for feedback purposes. By severing appropriate ones of these feedback conductors, any of a large number of neural network architectures can be defined during the same laser cutting step in which the synaptic weights are programmed; col. 3, lines 57-67).
Fu does not explicitly teach a max-pooling layer, for the max-pooling layer, that have maximum input counts.
However, Shrivastava teaches max-pooling layer, for the max-pooling layer, that have maximum input counts (i.e. The convolution layer and A-ReLU layer is followed by a Max-Pooling layer which is also be implemented in analog. These stages are followed by another convolution, A-ReLU, Max-Pooling layer. The output from final Max-Pooling layer is used in two ways. Inside the ASIC, it is connected to fully connected (FC) layer of 10 outputs to reduce the memory foot-print. The output of final Max-pooling layer is connected to two FC layers as shown in FIG. 2. In the following sections, we will provide the details of proposed analog-ASIC shown in FIG. 2. The hardware implementation is more general that can accommodate other simpler CNN architectures as well; para. [0041]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Fu and Germain to include the feature of Shrivastava. One would have been motivated to make this modification because a max-pooling layer often leads to improved performance, faster computation, and enhanced robustness, making it a widely used component in many neural network models.
14. Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Fu in view of Germain, Shrivastava, and further in view of Amant et al. (General-Purpose Code Acceleration with Limited-Precision Analog Computation, Proceedings of the 41st International Symposium on Computer Architecture, 2014).
Claim 8: Fu, Germain, and Shrivastava teach the integrated circuit of claim 5. Fu does not explicitly teach a four-input schematic comprising four single neuron models (SNMs) arranged in three layers; each SNM is a schematic model with analog components representing a specific type of math neuron in schematic form; and an SNM of the last layer has a maximum of four inputs.
However, Amant teaches the multi-layer network of analog neurons includes a four-input schematic comprising four single neuron models (SNMs) arranged in three layers; each SNM is a schematic model with analog components representing a specific type of math neuron in schematic form; and an SNM of the last layer has a maximum of four inputs (i.e. This section describes how analog circuits can perform the computation of neurons in multi-layer perceptrons, which are widely used neural networks. We also discuss, at a high level, how limitations of the analog circuits manifest in the computation. We explain how these restrictions are exposed to the compilation framework. The next section presents a concrete design for the analog neural accelerator. As Figure 2a illustrates, each neuron in a multi-layer perceptron takes in a set of inputs (xi) and performs a weighted sum of those input values. The weights are the result of training the neural network on; pages 3-4).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Fu, Germain, and Shrivastava to include the feature of Amant. One would have been motivated to make this modification because it balances considerations of simplicity, resource efficiency, compatibility, performance, and noise reduction in the analog hardware implementation.
15. Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Fu in view of Germain, Shrivastava, Amant, and further in view of Labreuche et al. (U.S. Patent Application Pub. No. US 20230206036 A1).
Claim 9: Fu, Germain, Shrivastava, and Amant teach the integrated circuit of claim 8. Fu does not explicitly teach each node of the calculation tree is selected from the group consisting of: a two-input schematic comprising two SNMs arranged in two layers, where an SNM of the last layer has a maximum of two inputs; a three-input schematic comprising three SNMs arranged in three layers, where an SNM of the last layer has a maximum of three inputs; and a four-input schematic comprising four SNMs arranged in three layers, where an SNM of the last layer has a maximum of four inputs.
Shrivastava further teaches each node of the calculation tree (i.e. The convolution layer and A-ReLU layer is followed by a Max-Pooling layer which is also be implemented in analog. These stages are followed by another convolution, A-ReLU, Max-Pooling layer. The output from final Max-Pooling layer is used in two ways. Inside the ASIC, it is connected to fully connected (FC) layer of 10 outputs to reduce the memory foot-print. The output of final Max-pooling layer is connected to two FC layers as shown in FIG. 2. In the following sections, we will provide the details of proposed analog-ASIC shown in FIG. 2. The hardware implementation is more general that can accommodate other simpler CNN architectures as well; para. [0041]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Fu, Germain, and Amant to include the feature of Shrivastava. One would have been motivated to make this modification because a max-pooling layer often leads to improved performance, faster computation, and enhanced robustness, making it a widely used component in many neural network models.
Amant further teaches a two-input schematic comprising two SNMs arranged in two layers, where an SNM of the last layer has a maximum of two inputs; a three-input schematic comprising three SNMs arranged in three layers, where an SNM of the last layer has a maximum of three inputs; and a four-input schematic comprising four SNMs arranged in three layers, where an SNM of the last layer has a maximum of four inputs (i.e. This section describes how analog circuits can perform the computation of neurons in multi-layer perceptrons, which are widely used neural networks. We also discuss, at a high level, how limitations of the analog circuits manifest in the computation. We explain how these restrictions are exposed to the compilation framework. The next section presents a concrete design for the analog neural accelerator. As Figure 2a illustrates, each neuron in a multi-layer perceptron takes in a set of inputs (xi) and performs a weighted sum of those input values. The weights are the result of training the neural network on; pages 3-4).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Fu, Germain, and Shrivastava to include the feature of Amant. One would have been motivated to make this modification because it balances considerations of simplicity, resource efficiency, compatibility, performance, and noise reduction in the analog hardware implementation.
However, Labreuche teaches transforming the max-pooling layer into a calculation tree in which each node of the calculation tree is selected from the group consisting of: a two-input schematic comprising two SNMs arranged in two layers, where an SNM of the last layer has a maximum of two inputs; a three-input schematic comprising three SNMs arranged in three layers, where an SNM of the last layer has a maximum of three inputs; and a four-input schematic comprising four SNMs arranged in three layers, where an SNM of the last layer has a maximum of four inputs (i.e. The neurons of the hidden layer each perform one function amongst: identity (if the neural has only one input, the neural returns the input thereof unchanged), min-pooling (the neural has two inputs and returns the value of the smallest of the inputs thereof), or max-pooling (the neural has two inputs and returns the value of the largest of the inputs thereof.) A linear regression on the outputs of all such neurons makes it possible to learn the weights, w.sub.i, w.sub.ij,Min and w.sub.ij,Max; para. [0139]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Fu, Germain, Shrivastava, and Amant to include the feature of Labreuche. One would have been motivated to make this modification because it provides flexibility in designing the network architecture.
16. Claims 18 and 36-39 are similar in scope to Claims 20, 30-32, and 35 and are rejected under a similar rationale.
Allowable Subject Matter
Claim 16 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure.
Garimella et al. (Pub. No. US 9886948 B1), FIG. 1 depicts a speech processing system 100 configured to perform speech recognition on an audio signal using a neural network 114. The neural network 114 may take input from multiple (e.g., two or more) feature streams, and may process the input using max pooling and restricted connectivity.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.
It is noted that any citation to specific pages, columns, lines, or figures in the prior art references and any interpretation of the references should not be considered to be limiting in any way. A reference is relevant for all it contains and may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art. In re Heck, 699 F.2d 1331, 1332-33, 216 U.S.P.Q. 1038, 1039 (Fed. Cir. 1983) (quoting In re Lemelson, 397 F.2d 1006, 1009, 158 U.S.P.Q. 275, 277 (C.C.P.A. 1968)).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to TAN TRAN whose telephone number is (303)297-4266. The examiner can normally be reached Monday - Thursday, 8:00 am - 5:00 pm MT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Matt Ell, can be reached at 571-270-3264. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/TAN H TRAN/Primary Examiner, Art Unit 2141