Last updated: April 19, 2026
Application No. 18/248,805
GENERAL MEDIA NEURAL NETWORK PREDICTOR AND A GENERATIVE MODEL INCLUDING SUCH A PREDICTOR

Non-Final OA §101§102§103§112
Filed: Apr 12, 2023
Examiner: BAKER, EZRA JAMES
Art Unit: 2126
Tech Center: 2100 (Computer Architecture & Software)
Assignee: Dolby International AB
OA Round: 1 (Non-Final)
This examiner grants 50% of cases after interview (+77.8% interview lift). A telephonic interview to clarify the technical implementation could significantly improve the outcome. Based on 14 resolved cases, 2023–2026.
Examiner Intelligence

Career Allow Rate: grants 50% of resolved cases (7 granted / 14 resolved), -5.0% vs TC avg
Typical timeline: 4y 3m avg prosecution; 33 currently pending
Statute-Specific Performance

§101: 31.8% (-8.2% vs TC avg)
§103: 35.9% (-4.1% vs TC avg)
§102: 7.9% (-32.1% vs TC avg)
§112: 21.8% (-18.2% vs TC avg)
Based on career data from 14 resolved cases
Office Action

DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims
	The present application is being examined under the claims filed 04/11/2023.
	Claims 16-31 are pending.

Information Disclosure Statement
The information disclosure statement filed 12/06/2023 fails to comply with the provisions of 37 CFR 1.97, 1.98 and MPEP § 609 because foreign patent document EP422958A1 did not contain the specification nor any explanation of relevance in English.  It has been placed in the application file, but the information referred to therein has not been considered as to the merits.  Applicant is advised that the date of any re-submission of any item of information contained in this information disclosure statement or the submission of any missing element(s) will be the date of submission for purposes of determining compliance with the requirements based on the time of filing the statement, including all certification requirements for statements under 37 CFR 1.97(e).  See MPEP § 609.05(a).
The information disclosure statement filed 07/09/2025 fails to comply with the provisions of 37 CFR 1.97, 1.98 and MPEP § 609 because foreign patent document IN202027017474 A is missing.  It has been placed in the application file, but the information referred to therein has not been considered as to the merits.  Applicant is advised that the date of any re-submission of any item of information contained in this information disclosure statement or the submission of any missing element(s) will be the date of submission for purposes of determining compliance with the requirements based on the time of filing the statement, including all certification requirements for statements under 37 CFR 1.97(e).  See MPEP § 609.05(a).


Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 16-31 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention.

Regarding Claim 16
The term “adjacent” in claim 16 is a relative term which renders the claim indefinite. The term “adjacent” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. The term “adjacent” indicates discrete values or a range with a clear ordering relative to a particular point. However, neither the claim nor the specification provides a standard range or set of discrete values to ascertain what is meant by “adjacent”.
	Claim 16 is further rejected under 35 U.S.C. 112(b) because it recites “by the frequency predicting portion previously predicted”. It is unclear what this phrase modifies and how it relates to the surrounding limitation, rendering the claim indefinite.

Regarding Claim 17
The term “adjacent” in claim 17 is a relative term which renders the claim indefinite. The term “adjacent” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. The term “adjacent” indicates discrete values or a range with a clear ordering relative to a particular point. However, neither the claim nor the specification provides a standard range or set of discrete values to ascertain what is meant by “adjacent”.

Regarding Claim 20
The term “neighboring frequency bands” in claim 20 is a relative term which renders the claim indefinite. The term “neighboring” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. The term “neighboring” indicates discrete values or a range with a clear ordering relative to a particular point. However, neither the claim nor the specification provides a standard range or set of discrete values to ascertain what is meant by “neighboring”.

Regarding Claim 22
The term “lower” in claim 22 is a relative term which renders the claim indefinite. The term “lower” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. It is unclear which frequency is the standard for comparison (lower than what frequency band?).

Regarding Claim 27
The term “lower” in claim 27 is a relative term which renders the claim indefinite. The term “lower” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. It is unclear which frequency is the standard for comparison (lower than what frequency band?).

Regarding Dependent Claims
Claims
17-31 are dependent upon claim 16
21 and 26-31 are dependent upon claim 20
23 is dependent upon claim 22
and are therefore similarly rejected for including the deficiencies of claims 16, 20, and 22 respectively.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.



Claims 16-31 are rejected under 35 U.S.C. 101 for containing an abstract idea without significantly more.

Regarding Claim 16:
	Step 1 – Is the claim to a process, machine, manufacture, or composition of matter?
	Yes, the claim is to a machine.
	Step 2A – Prong 1 – Does the claim recite an abstract idea, law of nature, or natural phenomenon?
	Yes, the claim recites the abstract ideas of:
a time predicting portion including […] predict a first set of output variables representing a time predicted frequency band of a current time frame given coefficients of one or several previous time frames — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). The limitation is directed to a mental process because it amounts to formulating an opinion about a time of a frequency band based on given data.
and a frequency predicting portion including […] predict a second set of output variables representing a frequency predicted frequency band given coefficients of one or several adjacent lower and, by the frequency predicting portion previously predicted, frequency bands in said current time frame — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). The limitation is directed to a mental process because it amounts to formulating an opinion of a frequency based on given data.

	Step 2A – Prong 2 – Does the claim recite additional elements that integrate the judicial exception into a practical application?
	No, the claim does not recite additional elements that integrate the judicial exception into a practical application. The additional elements:
A computer implemented neural network system for predicting frequency coefficients of a media signal, the neural network system comprising: […] at least one neural network trained to — This limitation is directed to merely applying an abstract idea using a generic computer with a neural network as a tool (see MPEP 2106.05(f)(2), 2106.04(d)).
an output stage configured to provide a set of frequency coefficients representing a specific frequency band of said current time frame, based on said first and second set of output variables, said specific frequency band being at least one of the time predicted and frequency predicted frequency band, and wherein a) said first set of output variables, predicted by the time predicting portion, is used as input variables to the frequency predicting portion, or b) said second set of output variables, predicted by the frequency prediction portion, is used as input variables to the time predicting portion — This limitation is directed to mere data gathering and outputting which has been recognized by the courts (as per Ultramercial, 772 F.3d at 715, 112 USPQ2d at 1754) as insignificant extra-solution activity (see MPEP 2106.05(g)).

	Step 2B – Does the claim recite additional elements that amount to significantly more than the abstract idea itself?
	No, the claim does not recite additional elements which amount to significantly more than the abstract idea itself. The additional elements as identified in step 2A prong 2:
A computer implemented neural network system for predicting frequency coefficients of a media signal, the neural network system comprising: […] at least one neural network trained to — Using a generic computer and neural network as a tool (see MPEP 2106.05(f)(2), 2106.05(d)) cannot amount to significantly more than the judicial exception itself.
an output stage configured to provide a set of frequency coefficients representing a specific frequency band of said current time frame, based on said first and second set of output variables, said specific frequency band being at least one of the time predicted and frequency predicted frequency band, and wherein a) said first set of output variables, predicted by the time predicting portion, is used as input variables to the frequency predicting portion, or b) said second set of output variables, predicted by the frequency prediction portion, is used as input variables to the time predicting portion — This limitation is recited at a high level of generality and amounts to mere data gathering of transmitting and receiving data over a network, which is well-understood, routine, and conventional activity (see MPEP 2106.05(d) II.), which cannot amount to significantly more than the judicial exception.
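For readers tracing the claim 16 limitations quoted above, the data flow of alternative a) can be sketched in a few lines. This is only an illustrative sketch and not the applicant's implementation: the trained neural networks are replaced by toy arithmetic, and the names `time_predict`, `freq_predict`, and `predict_frame` are hypothetical. It assumes bands are ordered from lowest to highest frequency and that only the single adjacent lower band is used.

```python
from typing import List

Band = List[float]   # coefficients (or variables) of one frequency band
Frame = List[Band]   # bands ordered from lowest to highest frequency


def time_predict(prev_frames: List[Frame], band: int) -> Band:
    """Toy stand-in for the time predicting portion (assumption):
    average the same band over the previous time frames."""
    history = [frame[band] for frame in prev_frames]
    return [sum(vals) / len(vals) for vals in zip(*history)]


def freq_predict(time_vars: Band, lower_bands: List[Band]) -> Band:
    """Toy stand-in for the frequency predicting portion, alternative a):
    combine the time prediction with the adjacent lower band that was
    already predicted in the current time frame."""
    if not lower_bands:
        return list(time_vars)  # lowest band: nothing below it yet
    adjacent = lower_bands[-1]
    return [0.5 * t + 0.5 * a for t, a in zip(time_vars, adjacent)]


def predict_frame(prev_frames: List[Frame]) -> Frame:
    """Toy output stage: emit the current frame band by band."""
    n_bands = len(prev_frames[0])
    current: Frame = []
    for b in range(n_bands):
        first = time_predict(prev_frames, b)   # first set of output variables
        second = freq_predict(first, current)  # second set of output variables
        current.append(second)
    return current
```

In this sketch each band of the current frame depends both on earlier frames and on lower bands already produced in the same frame, which is the ordering the "adjacent lower" and "previously predicted" language appears to describe.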


Regarding Claim 17
Claim 17 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 16 which included an abstract idea (see rejection for claim 16). The claim recites the additional limitations:
Step 2A Prong 2:
wherein a) said first set of output variables, predicted by the time predicting portion, is used as input variables to the frequency predicting portion, and said time predicted frequency band is adjacent to said frequency predicted frequency band in said current time frame — This limitation is directed to merely limiting a judicial exception to a particular field of use (see MPEP 2106.05(h)) as it merely limits the field of the particular data operated on by the predictions.
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A prong 2. 
Step 2B:
The additional elements as identified in step 2A prong 2:
wherein a) said first set of output variables, predicted by the time predicting portion, is used as input variables to the frequency predicting portion, and said time predicted frequency band is adjacent to said frequency predicted frequency band in said current time frame — Merely limiting a judicial exception to a particular field of use (see MPEP 2106.05(h)) cannot amount to significantly more than the judicial exception.
Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.


Regarding Claim 18
Claim 18 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 16 which included an abstract idea (see rejection for claim 16). The claim recites the additional limitations:
Step 2A Prong 2:
wherein b) said second set of output variables, predicted by the frequency prediction portion, is used as input variables to the time predicting portion, and said time predicted frequency band and said frequency predicted frequency band are a same frequency band in a previous time frame and current time frame respectively — This limitation is directed to merely limiting a judicial exception to a particular field of use (see MPEP 2106.05(h)) as it merely limits the field of the particular data operated on by the predictions.
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A prong 2. 
Step 2B:
The additional elements as identified in step 2A prong 2:
wherein b) said second set of output variables, predicted by the frequency prediction portion, is used as input variables to the time predicting portion, and said time predicted frequency band and said frequency predicted frequency band are a same frequency band in a previous time frame and current time frame respectively — Merely limiting a judicial exception to a particular field of use (see MPEP 2106.05(h)) cannot amount to significantly more than the judicial exception.
Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.


Regarding Claim 19
Claim 19 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 16 which included an abstract idea (see rejection for claim 16). The claim recites the additional limitations:
Step 2A Prong 2:
wherein said first set of output variables, predicted by the time predicting portion, are used as input variables to the frequency predicting portion — This limitation is directed to merely limiting a judicial exception to a particular field of use (see MPEP 2106.05(h)) as it merely limits the field of the particular data operated on by the predictions.
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A prong 2. 
Step 2B:
The additional elements as identified in step 2A prong 2:
wherein said first set of output variables, predicted by the time predicting portion, are used as input variables to the frequency predicting portion — Merely limiting a judicial exception to a particular field of use (see MPEP 2106.05(h)) cannot amount to significantly more than the judicial exception.
Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.


Regarding Claim 20
Claim 20 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 16 which included an abstract idea (see rejection for claim 16). The claim recites the additional limitations:
Step 2A Prong 1:
predict an intermediate set of output variables representing the current time frame, given a first set of input variables representing a preceding time frame of the media signal — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). The limitation is directed to a mental process because it amounts to formulating an opinion about a time using given data.
predict said first set of output variables, wherein variables in the intermediate set are formed by mixing variables in said intermediate set representing said time predicted frequency band and a plurality of neighboring frequency bands — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). The limitation is directed to a mental process because it amounts to evaluating data using known data processing.
Step 2A Prong 2:
wherein the time predicting portion includes: a time predicting recurrent neural network comprising a plurality of neural network layers, said time predicting recurrent neural network being trained to — This limitation is directed to merely limiting a judicial exception to a particular field of use (see MPEP 2106.05(h)) as it merely limits the field of the neural network configuration.
and a band mixing neural network trained to — This limitation is directed to merely limiting a judicial exception to a particular field of use (see MPEP 2106.05(h)) as it merely limits the field of the neural network configuration.

Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A prong 2. 
Step 2B:
The additional elements as identified in step 2A prong 2:
wherein the time predicting portion includes: a time predicting recurrent neural network comprising a plurality of neural network layers, said time predicting recurrent neural network being trained to — Merely limiting a judicial exception to a particular field of use (see MPEP 2106.05(h)) cannot amount to significantly more than the judicial exception.
and a band mixing neural network trained to — Merely limiting a judicial exception to a particular field of use (see MPEP 2106.05(h)) cannot amount to significantly more than the judicial exception.
Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.
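The band mixing recited in claim 20 can be pictured as a weighted combination of a band with its neighbors. The sketch below is a hypothetical illustration only: the claim recites a trained band mixing neural network, whereas this uses a fixed, odd-length weight window, and the function name `mix_bands` and the choice to skip neighbors that fall outside the band range are assumptions.

```python
from typing import List


def mix_bands(intermediate: List[List[float]], band: int,
              weights: List[float]) -> List[float]:
    """Mix the variables of `band` with those of its neighboring bands.

    `weights` is an odd-length window centered on `band` (assumption);
    neighbors that fall outside the available bands are skipped.
    """
    half = len(weights) // 2
    n_vars = len(intermediate[band])
    mixed = [0.0] * n_vars
    for offset, w in zip(range(-half, half + 1), weights):
        idx = band + offset
        if 0 <= idx < len(intermediate):
            for i in range(n_vars):
                mixed[i] += w * intermediate[idx][i]
    return mixed
```

How many neighbors count as "neighboring" is exactly what the § 112(b) rejection of claim 20 questions; here it is simply the width of the hypothetical weight window.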


Regarding Claim 21
Claim 21 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 20 which included an abstract idea (see rejection for claim 20). The claim recites the additional limitations:
Step 2A Prong 1:
predict said first set of input variables given frequency coefficients of a preceding time frame of said media signal — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). The limitation is directed to a mental process because it amounts to formulating an opinion of what input variables should be based on given data.
Step 2A Prong 2:
wherein the time predicting portion further includes: an input stage comprising a neural network trained to — This limitation is directed to merely applying an abstract idea using a generic computer with a neural network as a tool (see MPEP 2106.05(f)(2), 2106.04(d)).
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A prong 2. 
Step 2B:
The additional elements as identified in step 2A prong 2:
wherein the time predicting portion further includes: an input stage comprising a neural network trained to — Using a generic computer as a tool (see MPEP 2106.05(f)(2), 2106.05(d)) cannot amount to significantly more than the judicial exception itself.
Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.


Regarding Claim 22
Claim 22 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 19 which included an abstract idea (see rejection for claim 19). The claim recites the additional limitations:
Step 2A Prong 1:
predict said second set of output variables, given a sum of said first set of output variables and a second set of input variables representing lower frequency bands of the current time frame — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). The limitation is directed to a mental process because it amounts to formulating an opinion on what the output variables should be based on given data.
Step 2A Prong 2:
wherein the frequency predicting portion includes: a frequency predicting recurrent neural network comprising a plurality of neural network layers, said frequency predicting neural network being trained to — This limitation is directed to merely applying an abstract idea using a generic computer with a neural network as a tool (see MPEP 2106.05(f)(2), 2106.04(d)).
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A prong 2. 
Step 2B:
The additional elements as identified in step 2A prong 2:
wherein the frequency predicting portion includes: a frequency predicting recurrent neural network comprising a plurality of neural network layers, said frequency predicting neural network being trained to — Using a generic computer as a tool (see MPEP 2106.05(f)(2), 2106.05(d)) cannot amount to significantly more than the judicial exception itself.
Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.

Regarding Claim 23
Claim 23 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 22 which included an abstract idea (see rejection for claim 22). The claim recites the additional limitations:
Step 2A Prong 2:
wherein the frequency predicting portion further includes: one or several output layers trained to — This limitation is directed to merely applying an abstract idea using a generic computer as a tool (see MPEP 2106.05(f)(2), 2106.04(d)).
provide said set of frequency coefficients based on said second set of output variables — This limitation is directed to mere data gathering and outputting which has been recognized by the courts (as per Ultramercial, 772 F.3d at 715, 112 USPQ2d at 1754) as insignificant extra-solution activity (see MPEP 2106.05(g)).
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A prong 2. 
Step 2B:
The additional elements as identified in step 2A prong 2:
wherein the frequency predicting portion further includes: one or several output layers trained to — Using a generic computer as a tool (see MPEP 2106.05(f)(2), 2106.05(d)) cannot amount to significantly more than the judicial exception itself.
provide said set of frequency coefficients based on said second set of output variables — This limitation is recited at a high level of generality and amounts to mere data gathering of transmitting and receiving data over a network, which is well-understood, routine, and conventional activity (see MPEP 2106.05(d) II.), which cannot amount to significantly more than the judicial exception.
Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.


Regarding Claim 24
Claim 24 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 16 which included an abstract idea (see rejection for claim 16). The claim recites the additional limitations:
Step 2A Prong 1:
wherein each frequency coefficient is represented by a set of distribution parameters, wherein said set of distribution parameters are configured to parametrize a probability distribution of the coefficient, wherein said specific frequency band of said current time frame is obtained by sampling the probability distribution of each frequency coefficient — This limitation is directed to the abstract idea of a mathematical process, and mathematical calculations in particular (MPEP 2106.04(a)(2) I. C.). The claim describes the mathematical operations of parametrizing and sampling a probability distribution.
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A prong 2. Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.
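The parametrize-and-sample operation that the examiner characterizes as a mathematical calculation can be sketched as follows. This is a minimal sketch under stated assumptions: claim 24 does not name a distribution, so a Gaussian with (mean, standard deviation) parameters is assumed here purely for illustration, and `sample_band` is a hypothetical name.

```python
import random
from typing import List, Tuple


def sample_band(params: List[Tuple[float, float]],
                rng: random.Random) -> List[float]:
    """Draw one value per frequency coefficient.

    Each entry of `params` is an assumed (mean, std_dev) pair that
    parametrizes that coefficient's probability distribution.
    """
    return [rng.gauss(mu, sigma) for mu, sigma in params]


# Usage: three coefficients of one band, seeded for repeatability.
band = sample_band([(0.0, 1.0), (0.5, 0.1), (-0.2, 0.3)], random.Random(0))
```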

Regarding Claim 25
Claim 25 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 16 which included an abstract idea (see rejection for claim 16). The claim recites the additional limitations:
Step 2A Prong 2:
wherein the frequency coefficients correspond to bins of a time-to-frequency transform of the media signal, or the frequency coefficients correspond to samples of a filterbank representation of the media signal — This limitation is directed to merely limiting a judicial exception to a particular field of use (see MPEP 2106.05(h)) as it merely limits the field of the frequency coefficients to filterbank representations or bins of a time-to-frequency transform.
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A prong 2. 
Step 2B:
The additional elements as identified in step 2A prong 2:
wherein the frequency coefficients correspond to bins of a time-to-frequency transform of the media signal, or the frequency coefficients correspond to samples of a filterbank representation of the media signal — Merely limiting a judicial exception to a particular field of use (see MPEP 2106.05(h)) cannot amount to significantly more than the judicial exception.
Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.


Regarding Claim 26
Claim 26 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 20 which included an abstract idea (see rejection for claim 20). The claim recites the additional limitations:
Step 2A Prong 1:
predict a set of conditioning variables given conditioning information describing the target media signal, the conditioning information comprising quantized frequency coefficients describing the target media signal — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). The limitation is directed to a mental process because it amounts to formulating an opinion on what conditioning variables should be based on given data.
combine said first set of input variables with at least a subset of said set of conditioning variables — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). The limitation is directed to a mental process because it amounts to performing a judgement of how to arrange input variables and conditioning variables together.

Step 2A Prong 2:
A generative model for generating a target media signal, comprising: a neural network system according to claim 20 — This limitation is directed to merely applying an abstract idea using a generic computer with a neural network as a tool (see MPEP 2106.05(f)(2), 2106.04(d)).
a conditioning neural network trained to — This limitation is directed to merely applying an abstract idea using a generic computer with a neural network as a tool (see MPEP 2106.05(f)(2), 2106.04(d)).
said time predicting recurrent neural network being configured to — This limitation is directed to merely applying an abstract idea using a generic computer with a neural network as a tool (see MPEP 2106.05(f)(2), 2106.04(d)).
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A prong 2. 
Step 2B:
The additional elements as identified in step 2A prong 2:
A generative model for generating a target media signal, comprising: a neural network system according to claim 20 — Using a generic computer as a tool (see MPEP 2106.05(f)(2), 2106.05(d)) cannot amount to significantly more than the judicial exception itself.
a conditioning neural network trained to — Using a generic computer as a tool (see MPEP 2106.05(f)(2), 2106.05(d)) cannot amount to significantly more than the judicial exception itself.
said time predicting recurrent neural network being configured to — Using a generic computer as a tool (see MPEP 2106.05(f)(2), 2106.05(d)) cannot amount to significantly more than the judicial exception itself.
Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.


Regarding Claim 27
Claim 27 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 26 which included an abstract idea (see rejection for claim 26). The claim recites the additional limitations:
Step 2A Prong 1:
predict said second set of output variables, given a sum of said first set of output variables and a second set of input variables representing lower frequency bands of the current time frame — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). The limitation is directed to a mental process because it amounts to formulating an opinion on what the input and output variables should be based on given data.
combine said sum with at least a subset of said set of conditioning variables — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). The limitation is directed to a mental process because it amounts to performing a judgement of how to arrange a sum and conditioning variables together.

Step 2A Prong 2:
wherein said first set of output variables, predicted by the time predicting portion, are used as input variables to the frequency predicting portion — This limitation is directed to merely limiting a judicial exception to a particular field of use (see MPEP 2106.05(h)) as it merely limits the field of the particular data operated on by the predictions.
wherein the neural network system includes a frequency predicting recurrent neural network comprising a plurality of neural network layers — This limitation is directed to merely limiting a judicial exception to a particular field of use (see MPEP 2106.05(h)) as it merely limits the field of the architecture of the neural network system.
said frequency predicting neural network being trained to — This limitation is directed to merely applying an abstract idea using a generic computer with generic machine learning training as a tool (see MPEP 2106.05(f)(2), 2106.04(d)).
and wherein said frequency predicting recurrent neural network is configured to — This limitation is directed to merely applying an abstract idea using a generic computer with a neural network as a tool (see MPEP 2106.05(f)(2), 2106.04(d)).
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A prong 2. 
Step 2B:
The additional elements as identified in step 2A prong 2:
wherein said first set of output variables, predicted by the time predicting portion, are used as input variables to the frequency predicting portion — Merely limiting a judicial exception to a particular field of use (see MPEP 2106.05(h)) cannot amount to significantly more than the judicial exception.
wherein the neural network system includes a frequency predicting recurrent neural network comprising a plurality of neural network layers — Merely limiting a judicial exception to a particular field of use (see MPEP 2106.05(h)) cannot amount to significantly more than the judicial exception.
said frequency predicting neural network being trained to — Using a generic computer as a tool (see MPEP 2106.05(f)(2), 2106.05(d)) cannot amount to significantly more than the judicial exception itself.
and wherein said frequency predicting recurrent neural network is configured to — Using a generic computer as a tool (see MPEP 2106.05(f)(2), 2106.05(d)) cannot amount to significantly more than the judicial exception itself.
Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.

Regarding Claim 28
Claim 28 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 26 which included an abstract idea (see rejection for claim 26). The claim recites the additional limitations:
Step 2A Prong 2:
wherein the conditioning information includes at least one of a set of distorted frequency coefficients, a set of perceptual model coefficients, and a spectral envelope — This limitation is directed to merely limiting a judicial exception to a particular field of use (see MPEP 2106.05(h)) as it merely limits the field of the conditioning information.
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A prong 2. 
Step 2B:
The additional elements as identified in step 2A prong 2:
wherein the conditioning information includes at least one of a set of distorted frequency coefficients, a set of perceptual model coefficients, and a spectral envelope — Merely limiting a judicial exception to a particular field of use (see MPEP 2106.05(h)) cannot amount to significantly more than the judicial exception.
Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.


Regarding Claim 29
Claim 29 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 26 which included an abstract idea (see rejection for claim 26). The claim recites the additional limitations:
Step 2A Prong 1:
predict a set of frequency coefficients representing this frequency band — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). The limitation is directed to a mental process because it amounts to formulating an opinion on what conditioning variables should be based on given data.
Step 2A Prong 2:
A method for obtaining an enhanced media signal using a generative model according to claim 26, comprising the steps of: a) providing conditioning information to the conditioning neural network — This limitation is directed to mere data gathering and outputting which has been recognized by the courts (as per Ultramercial, 772 F.3d at 715, 112 USPQ2d at 1754) as insignificant extra-solution activity (see MPEP 2106.05(g)).
b) for each frequency band of a current time frame, using said frequency predicting recurrent neural network to — This limitation is directed to merely applying an abstract idea using a generic computer with a neural network as a tool (see MPEP 2106.05(f)(2), 2106.04(d)).
and providing said set of frequency coefficients to the frequency predicting recurrent neural network as said second set of input variables — This limitation is directed to mere data gathering and outputting which has been recognized by the courts (as per Ultramercial, 772 F.3d at 715, 112 USPQ2d at 1754) as insignificant extra-solution activity (see MPEP 2106.05(g)).
c) providing the predicted sets of frequency coefficients representing all frequency bands of the current frame to the time predicting RNN as said first set of input variables — This limitation is directed to mere data gathering and outputting which has been recognized by the courts (as per Ultramercial, 772 F.3d at 715, 112 USPQ2d at 1754) as insignificant extra-solution activity (see MPEP 2106.05(g)).
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A prong 2. 
Step 2B:
The additional elements as identified in step 2A prong 2:
A method for obtaining an enhanced media signal using a generative model according to claim 26, comprising the steps of: a) providing conditioning information to the conditioning neural network — This limitation is recited at a high level of generality and amounts to mere data gathering of transmitting and receiving data over a network, which is well-understood, routine, and conventional activity (see MPEP 2106.05(d) II.), which cannot amount to significantly more than the judicial exception.
b) for each frequency band of a current time frame, using said frequency predicting recurrent neural network to — Using a generic computer as a tool (see MPEP 2106.05(f)(2), 2106.05(d)) cannot amount to significantly more than the judicial exception itself.
and providing said set of frequency coefficients to the frequency predicting recurrent neural network as said second set of input variables — This limitation is recited at a high level of generality and amounts to mere data gathering of transmitting and receiving data over a network, which is well-understood, routine, and conventional activity (see MPEP 2106.05(d) II.), which cannot amount to significantly more than the judicial exception.
c) providing the predicted sets of frequency coefficients representing all frequency bands of the current frame to the time predicting RNN as said first set of input variables — This limitation is recited at a high level of generality and amounts to mere data gathering of transmitting and receiving data over a network, which is well-understood, routine, and conventional activity (see MPEP 2106.05(d) II.), which cannot amount to significantly more than the judicial exception.
Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.

Regarding Claim 30
Claim 30 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 26 which included an abstract idea (see rejection for claim 26). The claim recites the additional limitations:
Step 2A Prong 2:
A decoder comprising a generative model according to claim 26 — This limitation is directed to merely applying an abstract idea using a generic computer as a tool (see MPEP 2106.05(f)(2), 2106.04(d)).
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A prong 2. 
Step 2B:
The additional elements as identified in step 2A prong 2:
A decoder comprising a generative model according to claim 26 — Using a generic computer as a tool (see MPEP 2106.05(f)(2), 2106.05(d)) cannot amount to significantly more than the judicial exception itself.
Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.

Regarding Claim 31
Claim 31 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 26 which included an abstract idea (see rejection for claim 26). The claim recites the additional limitations:
Step 2A Prong 2:
A computer program product comprising computer readable program code portions which, when executed by a computer, implement a generative model according to claim 26 — This limitation is directed to merely applying an abstract idea using a generic computer as a tool (see MPEP 2106.05(f)(2), 2106.04(d)).
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A prong 2. 
Step 2B:
The additional elements as identified in step 2A prong 2:
A computer program product comprising computer readable program code portions which, when executed by a computer, implement a generative model according to claim 26 — Using a generic computer as a tool (see MPEP 2106.05(f)(2), 2106.05(d)) cannot amount to significantly more than the judicial exception itself.
Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.



Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 16-17, 19, 22, and 24 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Vasquez et al. “MelNet: A Generative Model for Audio in the Frequency Domain”.

Regarding Claim 16
Vasquez teaches:
A computer implemented neural network system for predicting frequency coefficients of a media signal, the neural network system comprising: 
(page 1 column 1 paragraph 2) “We introduce a generative model for audio which captures longer-range dependencies than existing end-to-end models. We primarily achieve this by modelling 2D time-frequency representations such as spectrograms rather than 1D time-domain waveforms (Figure 1).”

a time predicting portion including at least one neural network trained to predict a first set of output variables representing a time predicted frequency band of a current time frame given coefficients of one or several previous time frames
(page 3 column 2 section 4.1 paragraph 1) “The time-delayed stack utilizes multiple layers of multidimensional RNNs to extract features[*Examiner notes: first set of output variables] from x<i,∗, the two-dimensional region consisting of all frames preceding xij.”

and a frequency predicting portion including at least one neural network trained to predict a second set of output variables representing a frequency predicted frequency band given coefficients of one or several adjacent lower and, by the frequency predicting portion previously predicted, frequency bands in said current time frame
(page 4 column 1 paragraph 4) “The frequency-delayed stack is a one-dimensional RNN which runs forward along the frequency axis. Much like existing one-dimensional autoregressive models (language models, waveform models, etc.), the frequency-delayed stack operates on a one-dimensional sequence (a single frame) and estimates the distribution for each element conditioned on all preceding elements. The primary difference is that it is also conditioned on the outputs of the time-delayed stack, allowing it to use the full two-dimensional context x<ij.”; [*Examiner notes: The output of the frequency-delayed stack is mapped to the second set of output variables]

an output stage configured to provide a set of frequency coefficients representing a specific frequency band of said current time frame, based on said first and second set of output variables, said specific frequency band being at least one of the time predicted and frequency predicted frequency band, 
(page 8 column 1 section 7.1.3) “To show that MelNet can model audio modalities other than speech, we apply the model to the task of unconditional music generation. We utilize the MAESTRO dataset [19], which consists of over 172 hours of solo piano performances. The samples demonstrate that MelNet learns musical structures such as melody and harmony. Furthermore, generated samples often maintain consistent tempo and contain interesting variation in volume, timbre, and rhythm.”

and wherein a) said first set of output variables, predicted by the time predicting portion, is used as input variables to the frequency predicting portion, or b) said second set of output variables, predicted by the frequency prediction portion, is used as input variables to the time predicting portion.
(page 4 column 1 paragraph 4) “The frequency-delayed stack is a one-dimensional RNN which runs forward along the frequency axis. Much like existing one-dimensional autoregressive models (language models, waveform models, etc.), the frequency-delayed stack operates on a one-dimensional sequence (a single frame) and estimates the distribution for each element conditioned on all preceding elements. The primary difference is that it is also conditioned on the outputs of the time-delayed stack, allowing it to use the full two-dimensional context x<ij.”
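
The data flow relied on in the cited passage can be illustrated with a minimal sketch. It is a toy under stated assumptions, not the reference's implementation: `time_features` and `frequency_pass` are hypothetical names, and simple averages stand in for the RNN layers. It shows only that a per-band feature computed from preceding frames feeds a pass that runs upward along the frequency axis of the current frame.

```python
from typing import List

Frame = List[float]  # one time frame: one coefficient per frequency band


def time_features(previous_frames: List[Frame]) -> Frame:
    """Toy stand-in for the time-delayed stack: per-band mean over preceding frames."""
    if not previous_frames:
        raise ValueError("need at least one preceding frame")
    n_bands = len(previous_frames[0])
    return [
        sum(frame[j] for frame in previous_frames) / len(previous_frames)
        for j in range(n_bands)
    ]


def frequency_pass(time_feats: Frame) -> Frame:
    """Toy stand-in for the frequency-delayed stack.

    Band j is predicted from the time feature for band j and from the
    lower bands already predicted in the current frame.
    """
    predicted: Frame = []
    for j, feat in enumerate(time_feats):
        # Mean of the already-predicted lower bands (0.0 for the lowest band).
        lower_context = sum(predicted) / j if j > 0 else 0.0
        predicted.append(0.5 * feat + 0.5 * lower_context)
    return predicted


# Hypothetical data: two preceding frames with three bands each.
history = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
feats = time_features(history)
current = frequency_pass(feats)
```

The 0.5 weights are arbitrary; in a trained system they would be replaced by learned recurrent layers.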

Regarding Claim 17
Vasquez teaches:
The neural network system according to claim 16
(see rejection of claim 16)

wherein a) said first set of output variables, predicted by the time predicting portion, is used as input variables to the frequency predicting portion, and said time predicted frequency band is adjacent to said frequency predicted frequency band in said current time frame.
(page 4 column 1 paragraph 4) “The frequency-delayed stack is a one-dimensional RNN which runs forward along the frequency axis. Much like existing one-dimensional autoregressive models (language models, waveform models, etc.), the frequency-delayed stack operates on a one-dimensional sequence (a single frame) and estimates the distribution for each element conditioned on all preceding elements. The primary difference is that it is also conditioned on the outputs of the time-delayed stack, allowing it to use the full two-dimensional context x<ij[*Examiner notes: includes adjacent].”

Regarding Claim 19
Vasquez teaches:
The neural network system according to claim 16,
(see rejection of claim 16)

wherein said first set of output variables, predicted by the time predicting portion, are used as input variables to the frequency predicting portion.
(page 4 column 1 paragraph 4) “The frequency-delayed stack is a one-dimensional RNN which runs forward along the frequency axis. Much like existing one-dimensional autoregressive models (language models, waveform models, etc.), the frequency-delayed stack operates on a one-dimensional sequence (a single frame) and estimates the distribution for each element conditioned on all preceding elements. The primary difference is that it is also conditioned on the outputs of the time-delayed stack, allowing it to use the full two-dimensional context x<ij.”


Regarding Claim 22
Vasquez teaches:
The neural network system according to claim 19, 
(see rejection of claim 19)

wherein the frequency predicting portion includes: a frequency predicting recurrent neural network comprising a plurality of neural network layers, said frequency predicting neural network being trained to predict said second set of output variables, given a sum of said first set of output variables and a second set of input variables representing lower frequency bands of the current time frame.
(page 4 figure 3 caption) “The outputs of these functions are projected (by the matrices Wtl and Wfl) and summed with the layer inputs to form residual blocks.”

    [Image: media_image1.png]
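
The residual structure quoted from the Figure 3 caption (layer output projected, then summed with the layer input) can be sketched as follows. This is an illustration only: `matvec` and `residual_block` are hypothetical helper names, the layer function is left as an arbitrary callable, and the weights are made-up numbers.

```python
from typing import Callable, List

Vector = List[float]
Matrix = List[List[float]]


def matvec(w: Matrix, v: Vector) -> Vector:
    """Multiply matrix w (rows x cols) by vector v (length cols)."""
    return [sum(w_rc * v_c for w_rc, v_c in zip(row, v)) for row in w]


def residual_block(layer_fn: Callable[[Vector], Vector], w: Matrix, x: Vector) -> Vector:
    """Project the layer function's output with w, then add the layer input x."""
    projected = matvec(w, layer_fn(x))
    if len(projected) != len(x):
        raise ValueError("projection must map back to the input width")
    return [p + x_k for p, x_k in zip(projected, x)]


def double(v: Vector) -> Vector:
    """Hypothetical layer function used only for this example."""
    return [2.0 * a for a in v]


identity = [[1.0, 0.0], [0.0, 1.0]]  # made-up projection matrix
out = residual_block(double, identity, [1.0, -1.0])
```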




Regarding Claim 24
Vasquez teaches:
The neural network system according to claim 16
(see rejection of claim 16)

wherein each frequency coefficient is represented by a set of distribution parameters, wherein said set of distribution parameters are configured to parametrize a probability distribution of the coefficient, wherein said specific frequency band of said current time frame is obtained by sampling the probability distribution of each frequency coefficient.
(page 2 column 2 section 3) “We use an autoregressive model which factorizes the joint distribution over a spectrogram x as a product of conditional distributions”; (page 6 column 2 section 6.2) “To sample from the multiscale model we iteratively sample a value for xg conditioned on x<g using the learned distributions defined by the estimated network parameters”
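
The factorization named in the first quotation can be written out as below. The index convention (i for the time frame, j for the frequency bin, and x_{<ij} for all elements preceding x_{ij}) follows the passages quoted in this action; the parameter symbol θ is an assumption added for illustration.

```latex
p(x; \theta) = \prod_{i} \prod_{j} p\left(x_{ij} \mid x_{<ij}; \theta\right)
```

Under this form, sampling proceeds element by element: each x_{ij} is drawn from its conditional distribution given the elements already drawn.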


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 18 and 23 are rejected under 35 U.S.C. 103 as being unpatentable over Vasquez in view of Adavanne.

Regarding Claim 18
Vasquez teaches:
The neural network system according to claim 16
(see rejection of claim 16)

Vasquez further teaches:
and said time predicted frequency band and said frequency predicted frequency band are a same frequency band in a previous time frame and current time frame respectively
(page 2 figure 1) “Spectrogram and waveform representations of the same four-second audio signal. The waveform spans nearly 100,000 timesteps whereas the temporal axis of the spectrogram spans roughly 400.”; Figure 1

    [Image: media_image2.png (Vasquez, Figure 1)]


Vasquez does not explicitly teach:
wherein said second set of output variables, predicted by the frequency prediction portion, is used as input variables to the time predicting portion, 

However, Adavanne teaches:
wherein said second set of output variables, predicted by the frequency prediction portion, is used as input variables to the time predicting portion, 
(page 1730 column 2 paragraph 1) “The feature maps from the individual CNNs are merged using an elementwise multiplication operation and fed to bi-directional gated recurrent unit (GRU) layers followed by fully-connected time distributed dense layers.”

	Vasquez, Adavanne, and the instant application are analogous because they are all directed to neural networks.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the present invention to modify the time and frequency predictors of Vasquez by switching the order of the RNNs so that the output of the frequency prediction neural network is used as an input to the time prediction neural network because (Adavanne page 1732 column 2 section V) “A stacked convolutional and bidirectional recurrent neural network architecture (CBRNN) was proposed for bird audio detection task. Two kinds of features and their combination were studied and the best result on test data was achieved using the log mel-band energy feature. The proposed novel domain adaptation was shown to consistently perform better than having no adaptation. The data augmentation method studied was not helpful and gave comparable results as without augmentation. The proposed method achieved an area under curve measure of 88.1 % on the unseen evaluation data, and 95.5% on the development data.”
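
The merge operation quoted from Adavanne (elementwise multiplication of feature maps before the recurrent layers) can be sketched as follows. This is a minimal sketch under stated assumptions: `merge_elementwise` is a hypothetical name, the feature maps are assumed to have identical shapes, and the recurrent layers that consume the result are not modeled.

```python
from typing import List

FeatureMap = List[List[float]]  # indexed as [time step][feature]


def merge_elementwise(a: FeatureMap, b: FeatureMap) -> FeatureMap:
    """Multiply two equally shaped feature maps entry by entry."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("feature maps must have identical shapes")
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Made-up feature maps: two time steps, two features each.
merged = merge_elementwise([[1.0, 2.0], [3.0, 4.0]], [[0.5, 0.5], [2.0, 0.0]])
```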

Regarding Claim 23
Vasquez teaches:
The neural network system according to claim 22
(see rejection of claim 22)

Vasquez does not explicitly teach:
wherein the frequency predicting portion further includes: one or several output layers trained to provide said set of frequency coefficients based on said second set of output variables

However, Adavanne teaches:
wherein the frequency predicting portion further includes: one or several output layers trained to provide said set of frequency coefficients based on said second set of output variables
(page 1730 column 1 paragraph 1) “This stacked neural network is built by stacking layers of CNN, RNN and FC followed by a single node output layer producing outputs in the range of [0, 1].” 
	Vasquez, Adavanne, and the instant application are analogous because they are all directed to neural networks.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the present invention to modify the time and frequency predictors of Vasquez with the output layer of Adavanne because (Adavanne page 1732 column 2 section V) “A stacked convolutional and bidirectional recurrent neural network architecture (CBRNN) was proposed for bird audio detection task. Two kinds of features and their combination were studied and the best result on test data was achieved using the log mel-band energy feature. The proposed novel domain adaptation was shown to consistently perform better than having no adaptation. The data augmentation method studied was not helpful and gave comparable results as without augmentation. The proposed method achieved an area under curve measure of 88.1 % on the unseen evaluation data, and 95.5% on the development data.”


Claims 20-21, 25-28, and 30 are rejected under 35 U.S.C. 103 as being unpatentable over Vasquez in view of Martinez Ramirez.


Regarding Claim 20
Vasquez teaches:
The neural network system according to claim 16
(see rejection of claim 16)

wherein the time predicting portion includes: a time predicting recurrent neural network comprising a plurality of neural network layers, said time predicting recurrent neural network being trained to predict an intermediate set of output variables representing the current time frame, given a first set of input variables representing a preceding time frame of the media signal, 
(page 3 column 2 section 4.1) “The time-delayed stack utilizes multiple layers of multidimensional RNNs to extract features from x<i,∗, the two dimensional region consisting of all frames preceding xij .”


Vasquez does not explicitly teach:
and a band mixing neural network trained to predict said first set of output variables, wherein variables in the intermediate set are formed by mixing variables in said intermediate set representing said time predicted frequency band and a plurality of neighboring frequency bands.

However, Martinez Ramirez teaches:
and a band mixing neural network trained to predict said first set of output variables, wherein variables in the intermediate set are formed by mixing variables in said intermediate set representing said time predicted frequency band and a plurality of neighboring frequency bands.
(page 302 column 1 last paragraph) “Possible applications for this architecture are within the fields of automatic mixing and audio effect modeling. For example, style-learning of a specific sound engineer could be explored, where the model is trained with several tracks equalized by the engineer and finds a generalization from the engineer’s EQ practices.”

Vasquez, Martinez Ramirez, and the instant application are analogous because they are all directed to neural networks.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the present invention to modify the time and frequency predictors of Vasquez with the band mixing neural network of Martinez Ramirez because (Martinez Ramirez page 302 column 2 last paragraph) “Finally, it is worth noting the immense benefit that generative music could obtain from deep learning architectures for intelligent music production. Our implementation could be used in the field of deep neural networks applied to generative music and automatic mixing production systems.”

Regarding Claim 21
Vasquez in view of Martinez Ramirez teaches:
The neural network system according to claim 20
(see rejection of claim 20)

And Vasquez further teaches:
wherein the time predicting portion further includes: an input stage comprising a neural network trained to predict said first set of input variables given frequency coefficients of a preceding time frame of said media signal.
(page 3 section 4.1 paragraph 1) “The time-delayed stack utilizes multiple layers of multidimensional RNNs to extract features from x<i,∗, the two-dimensional region consisting of all frames preceding xij . Each multidimensional RNN is composed of three one-dimensional RNNs: one which runs forwards along the frequency axis, one which runs backwards along the frequency axis, and one which runs forwards along the time axis. Each RNN runs along each slice of a given axis, as shown in Figure 2.”; Figure 3

    [Image: media_image3.png (Vasquez, Figure 3)]


Regarding Claim 25
Vasquez teaches:
The neural network system according to claim 16
(see rejection of claim 16)

Vasquez does not explicitly teach:
wherein the frequency coefficients correspond to bins of a time-to-frequency transform of the media signal, or the frequency coefficients correspond to samples of a filterbank representation of the media signal.

However, Martinez Ramirez teaches:
wherein the frequency coefficients correspond to bins of a time-to-frequency transform of the media signal, or the frequency coefficients correspond to samples of a filterbank representation of the media signal.
(page 296 column 2) “Given an arbitrary EQ configuration, our task is to train a deep neural network to learn the specific transformation. In this way, an optimal filter bank decomposition and its latent representation are learned from the input data, and these are transformed and decoded to obtain an audio signal that matches the target”

Vasquez, Martinez Ramirez, and the instant application are analogous because they are all directed to neural networks.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the present invention to modify the time and frequency predictors of Vasquez with the filter bank representation of Martinez Ramirez because (Martinez Ramirez page 302 column 2 last paragraph) “Finally, it is worth noting the immense benefit that generative music could obtain from deep learning architectures for intelligent music production. Our implementation could be used in the field of deep neural networks applied to generative music and automatic mixing production systems.”


Regarding Claim 26
Vasquez in view of Martinez Ramirez teaches:
A generative model for generating a target media signal, comprising: a neural network system according to claim 20
(see rejection of claim 20)

Vasquez further teaches:
and a conditioning neural network trained to predict a set of conditioning variables given conditioning information describing the target media signal, the conditioning information comprising quantized frequency coefficients describing the target media signal, 
(page 4 column 2 section 4.4) “To incorporate conditioning information into the model, conditioning features z are simply projected onto the input layer along with the inputs x, altering Equations 7 and 9:”; Equations 13 and 14; [*Examiner notes: The broadest reasonable interpretation of a neural network includes calculations involving matrix multiplications. Equations 13 and 14 provide conditioning operations involving matrix multiplications which could be interpreted as a conditioning neural network]

said time predicting recurrent neural network being configured to combine said first set of input variables with at least a subset of said set of conditioning variables
(page 3 column 2 last paragraph) “To ensure the output htij [l] is only a function of frames which lie in the context x<ij , the inputs to the time-delayed stack are shifted backwards one step in time:”; Equations 6 and 7; [*Examiner notes: Equations 6 and 7 show how the conditioning information is incorporated into the model]
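
The backward shift described in the quoted sentence can be sketched as below. This is a minimal sketch, not a reproduction of the reference's Equations 6 and 7, which are not set out in this action: `shift_back_one_frame` is a hypothetical name, and zero-padding of the first position is an assumption.

```python
from typing import List

Frame = List[float]


def shift_back_one_frame(frames: List[Frame]) -> List[Frame]:
    """Return inputs where position i holds frame i-1 (zeros at i = 0).

    A layer reading position i of the result never sees frame i itself,
    so its output at i can depend only on earlier frames.
    """
    if not frames:
        return []
    n_bands = len(frames[0])
    zero_frame = [0.0] * n_bands  # assumed padding for the first position
    return [zero_frame] + [list(f) for f in frames[:-1]]


# Made-up spectrogram: three frames, two bands each.
spectrogram = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
shifted = shift_back_one_frame(spectrogram)
```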


Regarding Claim 27
Vasquez in view of Martinez Ramirez teaches:
The generative model according to claim 26
(see rejection of claim 26)

Vasquez further teaches:
wherein said first set of output variables, predicted by the time predicting portion, are used as input variables to the frequency predicting portion, 
(page 4 column 1 paragraph 4) “The frequency-delayed stack is a one-dimensional RNN which runs forward along the frequency axis. Much like existing one-dimensional autoregressive models (language models, waveform models, etc.), the frequency-delayed stack operates on a one-dimensional sequence (a single frame) and estimates the distribution for each element conditioned on all preceding elements. The primary difference is that it is also conditioned on the outputs of the time-delayed stack, allowing it to use the full two-dimensional context x<ij.”

wherein the neural network system includes a frequency predicting recurrent neural network comprising a plurality of neural network layers, 
(page 4 column 1 section 4.2) “The frequency-delayed stack is a one-dimensional RNN which runs forward along the frequency axis.”

said frequency predicting neural network being trained to predict said second set of output variables, given a sum of said first set of output variables and a second set of input variables representing lower frequency bands of the current time frame, 
(page 4 column 1 first paragraph) “The primary difference is that it is also conditioned on the outputs of the time-delayed stack, allowing it to use the full two-dimensional context x<ij .” Figure 2

    [Image: media_image4.png (Vasquez, Figure 2)]


and wherein said frequency predicting recurrent neural network is configured to combine said sum with at least a subset of said set of conditioning variables.
(page 4 column 1 paragraph 4) “The frequency-delayed stack is a one-dimensional RNN which runs forward along the frequency axis. Much like existing one-dimensional autoregressive models (language models, waveform models, etc.), the frequency-delayed stack operates on a one-dimensional sequence (a single frame) and estimates the distribution for each element conditioned on all preceding elements. The primary difference is that it is also conditioned on the outputs of the time-delayed stack, allowing it to use the full two-dimensional context x<ij.”; [*Examiner notes: the sum and the conditioning variables are both inputs to the recurrent neural network. Thus they are combined during RNN processing.]

Regarding Claim 28
Vasquez in view of Martinez Ramirez teaches:
The generative model according to claim 26
(see rejection of claim 26)

Vasquez further teaches:
wherein the conditioning information includes at least one of a set of distorted frequency coefficients, a set of perceptual model coefficients, and a spectral envelope.
(page ) “Reshaping, upsampling, and broadcasting can be used as necessary to ensure the conditioning features have the same time and frequency shape as the input spectrogram, e.g. a one-hot vector representation for speaker ID would first be broadcast along both the time and frequency axes.”; [*Examiner notes: conditioning by reshaping, upsampling, and broadcasting amounts to distorting the frequency coefficients.]
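
As a minimal sketch of the broadcasting described in the quoted passage, the following Python copies a one-hot speaker vector to every (time frame, frequency bin) position so that it matches the time and frequency shape of a spectrogram. The helper names, the speaker count, and the grid dimensions are hypothetical and chosen only for illustration.

```python
def one_hot(index, size):
    """Return a one-hot list of length `size` with a 1.0 at `index`."""
    if not 0 <= index < size:
        raise ValueError("index out of range")
    return [1.0 if i == index else 0.0 for i in range(size)]


def broadcast_conditioning(feature, num_frames, num_bins):
    """Repeat `feature` along the time and frequency axes.

    The result is indexed as grid[frame][bin] and holds an independent
    copy of the feature vector at every position.
    """
    return [[list(feature) for _ in range(num_bins)] for _ in range(num_frames)]


speaker = one_hot(2, 4)  # hypothetical speaker 2 of 4
grid = broadcast_conditioning(speaker, num_frames=3, num_bins=5)
```

Every cell of `grid` holds the same vector, so the sketch only replicates the conditioning feature across both axes.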

Regarding Claim 30
Vasquez in view of Martinez Ramirez teaches:
comprising a generative model according to claim 26.
(see rejection of claim 26)

Vasquez further teaches:
A decoder 
(page 14 column 1 last paragraph) “All models have 8-layer autoregressive decoders”


Claims 29 and 31 are rejected under 35 U.S.C. 103 as being unpatentable over Vasquez in view of Martinez Ramirez, and further in view of Adavanne.

Regarding Claim 29
Vasquez in view of Martinez Ramirez teaches:
A method for obtaining an enhanced media signal using a generative model according to claim 26, comprising the steps of: 
(see rejection of claim 26)

Vasquez further teaches:
a) providing conditioning information to the conditioning neural network
(page 4 column 2 section 4.4) “To incorporate conditioning information into the model, conditioning features z are simply projected onto the input layer along with the inputs x”


b) for each frequency band of a current time frame, using said frequency predicting recurrent neural network to predict a set of frequency coefficients representing this frequency band, and providing said set of frequency coefficients to the frequency predicting recurrent neural network as said second set of input variables,
(page 4 column 1 section 4.2) “The frequency-delayed stack is a one-dimensional RNN which runs forward along the frequency axis. Much like existing one-dimensional autoregressive models (language models, waveform models, etc.), the frequency-delayed stack operates on a one-dimensional sequence (a single frame) and estimates the distribution for each element conditioned on all preceding elements […] At each layer, the frequency-delayed stack takes two inputs: the previous-layer outputs of the frequency-delayed stack, hf[l − 1], and the current-layer outputs of the time-delayed stack ht[l].”

Vasquez in view of Martinez Ramirez does not explicitly teach:
c) providing the predicted sets of frequency coefficients representing all frequency bands of the current frame to the time predicting RNN as said first set of input variables.

However, Adavanne teaches:
c) providing the predicted sets of frequency coefficients representing all frequency bands of the current frame to the time predicting RNN as said first set of input variables.
(page 1730 column 2 paragraph 1) “The feature maps from the individual CNNs are merged using an elementwise multiplication operation and fed to bi-directional gated recurrent unit (GRU) layers followed by fully-connected time distributed dense layers.”

Vasquez, Martinez Ramirez, Adavanne, and the instant application are analogous because they are all directed to neural networks.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the present invention to modify the time and frequency predictors of Vasquez in view of Martinez Ramirez by switching the order of the RNNs so that the output of the frequency prediction neural network is used as an input to the time prediction neural network because (Adavanne page 1732 column 2 section V) “A stacked convolutional and bidirectional recurrent neural network architecture (CBRNN) was proposed for bird audio detection task. Two kinds of features and their combination were studied and the best result on test data was achieved using the log mel-band energy feature. The proposed novel domain adaptation was shown to consistently perform better than having no adaptation. The data augmentation method studied was not helpful and gave comparable results as without augmentation. The proposed method achieved an area under curve measure of 88.1 % on the unseen evaluation data, and 95.5% on the development data.”
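
The merge operation in the Adavanne passage quoted for step c) can be illustrated with a short sketch: two feature maps of identical shape are multiplied element by element, and the result is what would then be passed on to the recurrent layers. The function name and the sample values are hypothetical, and the recurrent layers themselves are omitted.

```python
def merge_elementwise(map_a, map_b):
    """Multiply two feature maps (lists of rows) element by element."""
    same_rows = len(map_a) == len(map_b)
    if not same_rows or any(len(ra) != len(rb) for ra, rb in zip(map_a, map_b)):
        raise ValueError("feature maps must have identical shapes")
    return [[a * b for a, b in zip(ra, rb)] for ra, rb in zip(map_a, map_b)]


# Two hypothetical 2x2 feature maps, indexed as [time step][feature].
merged = merge_elementwise([[1.0, 2.0], [3.0, 4.0]],
                           [[0.5, 0.0], [2.0, 1.0]])
```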

Regarding Claim 31
Vasquez in view of Martinez Ramirez teaches:
a generative model according to claim 26
(see rejection of claim 26)

Vasquez in view of Martinez Ramirez does not explicitly teach:
A computer program product comprising computer readable program code portions which, when executed by a computer, implement

However, Adavanne teaches:
A computer program product comprising computer readable program code portions which, when executed by a computer, implement
(page 1732 column 2 last paragraph) “Part of the computations leading to these results were performed on a TITAN-X GPU donated by NVIDIA.”

Vasquez, Martinez Ramirez, Adavanne, and the instant application are analogous because they are all directed to neural networks.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the present invention to modify the time and frequency predictors of Vasquez in view of Martinez Ramirez by implementing it on the computer-readable medium of Adavanne because (Adavanne page 1732 column 2 last paragraph) “Part of the computations leading to these results were performed on a TITAN-X GPU donated by NVIDIA.” That is, the NVIDIA chip can be used to perform computations related to neural network methods.


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Deng et al. “Exploiting time-frequency patterns with LSTM-RNNs for low-bitrate audio restoration” for teaching use of a time prediction neural network and a frequency prediction neural network to make predictions (see fig. 1)
Watkins et al. “Effects of spectral contrast on perceptual compensation for spectral-envelope distortion” for teaching on spectral envelope and distortions

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Ezra J Baker whose telephone number is (703)756-1087. The examiner can normally be reached Monday - Friday 10:00 am - 8:00 pm ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, David Yi, can be reached at (571) 270-7519. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/E.J.B./Examiner, Art Unit 2126
/VAN C MANG/Primary Examiner, Art Unit 2126
Prosecution Timeline

Apr 12, 2023
Application Filed
Jan 09, 2026
Non-Final Rejection — §101, §102, §103 (current)
Precedent Cases

Applications granted by this same examiner with similar technology

17/455,252
Patent 12585964
EXHAUSTIVE LEARNING TECHNIQUES FOR MACHINE LEARNING ALGORITHMS
2y 5m to grant Granted Mar 24, 2026
17/475,901
Patent 12579477
FEATURE SELECTION USING FEEDBACK-ASSISTED OPTIMIZATION MODELS
2y 5m to grant Granted Mar 17, 2026
17/460,373
Patent 12505379
COMPUTER-READABLE RECORDING MEDIUM STORING MACHINE LEARNING PROGRAM, MACHINE LEARNING METHOD, AND INFORMATION PROCESSING DEVICE OF IMPROVING PERFORMANCE OF LEARNING SKIP IN TRAINING MACHINE LEARNING MODEL
2y 5m to grant Granted Dec 23, 2025
17/347,374
Patent 12373674
CODING OF AN EVENT IN AN ANALOG DATA FLOW WITH A FIRST EVENT DETECTION SPIKE AND A SECOND DELAYED SPIKE
2y 5m to grant Granted Jul 29, 2025
Study what changed to get past this examiner. Based on 4 most recent grants.
Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 50%
With Interview (+77.8%): 99%
Median Time to Grant: 4y 3m
PTA Risk: Low
Based on 14 resolved cases by this examiner. Grant probability derived from career allow rate.