Prosecution Insights
Last updated: April 19, 2026
Application No. 17/884,205

TRAINING A NEURAL NETWORK MODEL ACROSS MULTIPLE DOMAINS

Final Rejection — §101, §102, §103
Filed: Aug 09, 2022
Examiner: HOANG, AMY P
Art Unit: 2143
Tech Center: 2100 — Computer Architecture & Software
Assignee: The Bank of New York Mellon
OA Round: 2 (Final)
Grant Probability: 70% (Favorable)
OA Rounds: 3-4
To Grant: 3y 3m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 70% (163 granted / 232 resolved; +15.3% vs TC avg; above average)
Interview Lift: +64.2% (strong; allow rate among resolved cases with vs. without an interview)
Typical Timeline: 3y 3m average prosecution; 31 applications currently pending
Career History: 263 total applications across all art units
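
The headline figures reduce to simple ratios, which makes them easy to sanity-check. A worked sketch in Python; the with/without-interview split is inferred from the displayed lift rather than reported by the dashboard, so treat those values as assumptions:

    # Career allow rate: granted / resolved.
    granted, resolved = 163, 232
    allow_rate = granted / resolved
    print(f"{allow_rate:.1%}")    # 70.3%, displayed as 70%

    # Interview lift read as a relative increase in allow rate:
    # with_rate = without_rate * (1 + lift). Taking the displayed 99%
    # with-interview probability at face value, the implied
    # without-interview rate would be:
    with_rate, lift = 0.99, 0.642
    without_rate = with_rate / (1 + lift)
    print(f"{without_rate:.1%}")  # ~60.3% (inferred, not reported)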

Statute-Specific Performance

§101: 15.9% (-24.1% vs TC avg)
§102: 17.0% (-23.0% vs TC avg)
§103: 46.0% (+6.0% vs TC avg)
§112: 13.4% (-26.6% vs TC avg)
Deltas are relative to an estimated Tech Center average • Based on career data from 232 resolved cases

Office Action

Rejections: §101, §102, §103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment

The Amendment filed on 11/17/2025 has been entered. Claims 1-20 remain pending in the application.

Response to Arguments

Applicant's arguments filed 11/17/2025 have been fully considered. Each of applicant's remarks is set forth below, followed by examiner's response.

(1) Regarding the 35 U.S.C. § 101 rejection, in the Remarks, Applicant argues that (a) under Step 2A, Prong 1, the claims are not directed to a mere mental process, but rather to a technical solution that could not practically be performed as a mental process. For example, the claims are directed to appending time series data in different domains for a neural network to train a single machine-learning model to learn across the different domains. The foregoing could not be practically performed in the human mind as a mental process.

As to point (1)(a), Examiner respectfully disagrees. The claim recites "append the first plurality of sequences and the second plurality of sequences to generate an appended input data relating to the first domain and the second domain". The meaning of "the first plurality of sequences and the second plurality of sequences" is recited in the claim as sequences of time series of data. The claim does not put any limits on how the first plurality of sequences and the second plurality of sequences are appended, and the specification supports the plain meaning of "append" as encompassing appending the sequences in the pre-processed data together to form an appended set of sequences (see [0052]). Thus, the step of "append the first plurality of sequences and the second plurality of sequences to generate an appended input data relating to the first domain and the second domain" may be practically performed in the human mind, appending sequences of data together using observation, evaluation, judgment, and opinion. The claim recites limitations that fall within the mental process grouping of abstract ideas. See MPEP 2106.04(a)(2), subsection III.

(b) Under Step 2A, Prong 2, even if the claims recite a mental process, they recite an integration of that process into a practical application that improves upon conventional machine-learning training techniques. In particular, unlike standard training workflows that require separate models for each domain or rely on serial recurrent networks, the claimed invention integrates cross-domain sequence appending to train a single neural network model capable of generalizing across multiple independent datasets (at least the independent claims).

As to point (1)(b), Examiner respectfully disagrees. This part of the eligibility analysis evaluates whether the claim as a whole integrates the recited judicial exception into a practical application of the exception or whether the claim is "directed to" the judicial exception. This evaluation is performed by (1) identifying whether there are any additional elements recited in the claim beyond the judicial exception, and (2) evaluating those additional elements individually and in combination to determine whether the claim as a whole integrates the exception into a practical application. See MPEP 2106.04(d).
The claim recites the following additional elements: "A system comprising: a memory that stores a plurality of time series of data each relating to a respective domain; a processor programmed to" is recited at a high level of generality, i.e., as a generic computer performing generic computer functions; "access first training data relating to a first domain, the first training data having a first time series of data comprising first sequential data values that vary over time", "access second training data relating to a second domain, the second training data having a second time series of data comprising second sequential data values that vary independently from the first sequential data values over time" and "provide the appended input data to a neural network" are mere data gathering and output recited at a high level of generality, and thus are insignificant extra-solution activity, see MPEP 2106.05(g); and "a neural network to train a single machine-learning model trained to make predictions in the first domain and/or the second domain" provides nothing more than mere instructions to implement an abstract idea on a generic computer. See MPEP 2106.05(f).

MPEP 2106.05(f) provides the following considerations for determining whether a claim simply recites a judicial exception with the words "apply it" (or an equivalent), such as mere instructions to implement an abstract idea on a computer: (1) whether the claim recites only the idea of a solution or outcome, i.e., the claim fails to recite details of how a solution to a problem is accomplished; (2) whether the claim invokes computers or other machinery merely as a tool to perform an existing process; and (3) the particularity or generality of the application of the judicial exception. Training a single machine-learning model to make predictions in the first domain and/or the second domain using the appended input data generally applies the abstract idea without placing any limits on how the single model is trained to make predictions. Even when viewed in combination, these additional elements do not integrate the recited judicial exception into a practical application (Step 2A, Prong Two: NO), and the claim is directed to the judicial exception (Step 2A: YES).

(c) Under Step 2B, even if the claims are viewed as directed to a mental process under Step 2A (Prongs 1 and 2), they recite a combination of features that provides an inventive concept significantly more than just a mental process. For instance, the claims recite: append the first plurality of sequences and the second plurality of sequences to generate an appended input data relating to the first domain and the second domain; and provide the appended input data to a neural network to train a single machine-learning model trained to make predictions in the first domain and/or the second domain. At least the foregoing combination introduces an unconventional way of training a single machine-learning model across multiple domains. The foregoing is not a mere instruction to "apply a neural network," but a concrete implementation that changes how the neural network receives and learns from data. The appended, multi-domain data architecture improves the representational capacity and generalization performance of the system, obviating the need to train different models for different domains. These features represent a combination of inventive features that is significantly more than just a mental process.
For at least this additional reason under Step 2B, the claims recite eligible subject matter.

As to point (1)(c), this part of the eligibility analysis evaluates whether the claim as a whole amounts to significantly more than the recited exception, i.e., whether any additional element, or combination of additional elements, adds an inventive concept to the claim. See MPEP 2106.05. As explained with respect to Step 2A, Prong Two, these additional elements were found to be insignificant extra-solution activity because they were determined to be insignificant limitations directed to necessary data gathering and outputting. As discussed in Step 2A, Prong Two above, the recitations of "access first training data relating to a first domain, the first training data having a first time series of data comprising first sequential data values that vary over time", "access second training data relating to a second domain, the second training data having a second time series of data comprising second sequential data values that vary independently from the first sequential data values over time" and "provide the appended input data to a neural network" are recited at a high level of generality. These elements amount to receiving or transmitting data over a network and are well-understood, routine, conventional activity. See MPEP 2106.05(d), subsection II. As discussed in Step 2A, Prong Two above, the recitation of "a system comprising: a memory that stores a plurality of time series of data each relating to a respective domain; a processor programmed to" amounts to no more than mere instructions to apply the exception using a generic computer component. Even when considered in combination, these additional elements represent mere instructions to implement an abstract idea or other exception on a computer and insignificant extra-solution activity, which do not provide an inventive concept (Step 2B: NO).

(2) Regarding rejections made under 35 U.S.C. 102 for claim 1, Applicant alleges He does not teach "append the first plurality of sequences and the second plurality of sequences to generate an appended input data relating to the first domain and the second domain". He, in the relied-upon portions, merely describes encoding data from a higher-dimensional space into a lower-dimensional space to reduce the number of features or dimensions used for model training. This does not disclose appending multiple time series of data across different domains to form an appended input dataset used to train a single machine-learning model capable of making predictions across those domains as claimed.

As to point (2), Examiner respectfully disagrees. Examiner notes that the claims place no limitations on how to append the first plurality of sequences and the second plurality of sequences to generate an appended input data relating to the first domain and the second domain. According to the Specification as filed, [0052] discloses that the computer system 110 may append the sequences in the pre-processed data 712 together to form an appended set of sequences. In the illustrated example, the input data 714 will include 500 sequences appended together. In this manner, the single LSTM model 730 may be trained from sequence data derived from market data of all tickers in the raw data 710.
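
To ground the dispute in something concrete: as the specification describes it ([0052]), the appending step amounts to windowing each domain's time series into sequences and concatenating the windows into one training set. A minimal sketch in Python, assuming NumPy; the window length and data are illustrative, not taken from the application:

    import numpy as np

    rng = np.random.default_rng(0)

    def make_sequences(series, window):
        # Slide a fixed-length window over one domain's time series.
        # Consecutive windows overlap, so each sequence shares data
        # with the next one (cf. dependent claims 7 and 15).
        return np.stack([series[i:i + window]
                         for i in range(len(series) - window + 1)])

    # Two domains whose values vary independently over time (toy data).
    domain_a = np.cumsum(rng.normal(size=300))  # e.g., ticker A prices
    domain_b = np.cumsum(rng.normal(size=220))  # e.g., ticker B prices

    seqs_a = make_sequences(domain_a, window=20)  # first plurality of sequences
    seqs_b = make_sequences(domain_b, window=20)  # second plurality of sequences

    # The "append" step: one input set spanning both domains, used to train
    # a single model instead of one model per domain. How many sequences each
    # domain yields follows from its series length (cf. claims 8 and 16).
    appended_input = np.concatenate([seqs_a, seqs_b], axis=0)
    print(appended_input.shape)  # (482, 20)
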
According to MPEP 2111, the examiner is obliged to give claim terms or phrases their broadest reasonable interpretation as understood by one of ordinary skill in the art unless applicant has provided some indication of the definition of the claimed terms or phrases. He depicts a process for generating a machine learning model and/or a prediction based on encoded time series data in Figs. 5A-5C. The system may receive a training dataset of a plurality of data instances, wherein each data instance includes a time series of data points. The system may perform an encoding operation on a training dataset and/or an input dataset to provide an encoded dataset having a lower dimension space than a dimension space of the training dataset and/or input dataset ([0107]-[0112]). Examiner notes that Figs. 5B-5C depict the system receiving the Encoded Dataset, which comprises all the encoded data instances, Encoded Data Instance 1 … Encoded Data Instance m (i.e., appending), to train and generate one or more prediction models based on the encoded dataset. Thus, He is considered to teach appending multiple time series of data across different domains to form an appended input dataset used to train a single machine-learning model capable of making predictions across those domains, as claimed in claim 1. Similar arguments have been presented for claims 10 and 18 and thus, Applicant's arguments are not persuasive for the same reasons.

(3) Regarding rejections made under 35 U.S.C. 103 for claims 3 and 12, Applicant alleges the combination of Santos and He fails to render the claimed invention obvious because both references address normalization in fundamentally different contexts. Santos operates on spatially structured raster data, performing normalization across pixel bands to support spatial classification, while He addresses time-series data normalization within a single temporal domain to improve model stability. In contrast, claims 3 and 12 are respectively directed to a system and method that normalize multi-domain time-series data, where each domain represents an independent dataset exhibiting different temporal and statistical behaviors. The claimed mixture-model normalization unifies these domain-specific distributions into a common representational space, enabling a single model to generalize across independent domains. Neither Santos nor He, alone or in combination, teaches or suggests normalization for the purpose of cross-domain temporal learning. For at least this additional reason, claims 3 and 12 are allowable over the references relied upon in the Office Action.

As to point (3), Examiner respectfully disagrees. Examiner notes that the claims place no limitations on how the cited mixture model is generated and how a plurality of clusters of normal distributions together approximates the input data. According to MPEP 2111, the examiner is obliged to give claim terms or phrases their broadest reasonable interpretation as understood by one of ordinary skill in the art unless applicant has provided some indication of the definition of the claimed terms or phrases. SANTOS teaches a normalization process by first normalizing input data, referred to as observations, in a summarization phase by finding the mean and standard deviation of the values in each band. The numerical values of these observations are then modified by first subtracting, from each band, the corresponding mean, and then dividing by the corresponding standard deviation.
This modification step is an application of a feature scaling technique that standardizes the range of independent variables or features of data to normalize data by selecting for zero mean and unit variance. Once the modification step has been performed in the summarization phase, a number of cluster descriptions are selected, and the normalized observations are used to generate a set of clusters of this selected number that describe the approximate distribution of normalized observations relative to each other. Thus, SANTOS is considered to teach the limitations of claims 3 and 12.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Step 1: Claims 1-9 are directed to a system, claims 10-17 are directed to a method and claims 18-20 are directed to a medium. Therefore, the claims are eligible under Step 1 as being directed to a machine, a process and a manufacture, respectively.

Step 2A, Prong 1: Independent claims 1, 10 and 18 recite:

generate a first plurality of sequences from the first training data - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of evaluating data and generating data based on judgment, i.e., observing, evaluating and judging, that is practically capable of being performed in the human mind with the assistance of pen and paper;

generate a second plurality of sequences from the second training data - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of evaluating data and generating data based on judgment, i.e., observing, evaluating and judging, that is practically capable of being performed in the human mind with the assistance of pen and paper;

append the first plurality of sequences and the second plurality of sequences to generate an appended input data relating to the first domain and the second domain - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of evaluating data and generating data based on judgment, i.e., observing, evaluating and judging, that is practically capable of being performed in the human mind with the assistance of pen and paper.

Dependent claims 2 and 11 recite:

normalize the input data - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of evaluating data and generating data based on judgment, i.e., observing, evaluating and judging, that is practically capable of being performed in the human mind with the assistance of pen and paper.

Dependent claims 3 and 12 recite:

generate a mixture model comprising a plurality of clusters of normal distributions that together approximate the input data - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of evaluating data and generating data based on judgment, i.e., observing, evaluating and judging, that is practically capable of being performed in the human mind with the assistance of pen and paper.
Dependent claim 4 recites:

for each data value in the input data: identify a corresponding cluster from among the plurality of clusters; determine a normalization value based on the corresponding cluster; and normalize the data value based on the normalization value - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of evaluating data and generating data based on judgment, i.e., observing, evaluating and judging, that is practically capable of being performed in the human mind with the assistance of pen and paper.

Step 2A, Prong 2: This judicial exception is not integrated into a practical application because the claims recite the following additional elements:

Independent claims 1, 10 and 18:

A system comprising: a memory that stores a plurality of time series of data each relating to a respective domain; a processor programmed to - These limitations amount to components of a general purpose computer that applies a judicial exception by use of conventional computer functions (see MPEP § 2106.05(b)).

access first training data relating to a first domain, the first training data having a first time series of data comprising first sequential data values that vary over time - This step is recited at a high level of generality and amounts to mere data gathering, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).

access second training data relating to a second domain, the second training data having a second time series of data comprising second sequential data values that vary independently from the first sequential data values over time - This step is recited at a high level of generality and amounts to mere data gathering, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).

provide the appended input data to a neural network to train a single machine-learning model trained to make predictions in the first domain and/or the second domain - This step is recited at a high level of generality and amounts to no more than a recitation of the words "apply it" (or an equivalent), i.e., mere instructions to implement an abstract idea or other exception on a computer (see MPEP § 2106.05(f)).

Dependent claims 2 and 11:

wherein the processor is further programmed to: - These limitations amount to components of a general purpose computer that applies a judicial exception by use of conventional computer functions (see MPEP § 2106.05(b)).

receive input data relating to the first domain or the second domain, the input data comprising an input time series of data - This "receiving" step is recited at a high level of generality and amounts to mere data gathering, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).

provide the normalized input data to the single machine-learning model; and generate a prediction based on the input data using the single machine-learning model - This step is recited at a high level of generality and amounts to no more than a recitation of the words "apply it" (or an equivalent), i.e., mere instructions to implement an abstract idea or other exception on a computer (see MPEP § 2106.05(f)).
Dependent claims 3 and 12:

wherein to normalize the input data, the processor is further programmed to: - These limitations amount to components of a general purpose computer that applies a judicial exception by use of conventional computer functions (see MPEP § 2106.05(b)).

Dependent claim 4:

wherein the processor is further programmed to: - These limitations amount to components of a general purpose computer that applies a judicial exception by use of conventional computer functions (see MPEP § 2106.05(b)).

Dependent claims 5 and 13:

wherein the neural network is part of a parallel neural network architecture comprising a plurality of neural networks - This step is recited at a high level of generality and amounts to generally linking the use of a judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)).

and wherein the processor is further programmed to: - These limitations amount to components of a general purpose computer that applies a judicial exception by use of conventional computer functions (see MPEP § 2106.05(b)).

provide the appended input data to an input layer of a first neural network of the parallel network architecture - This step is recited at a high level of generality and amounts to mere data transmission, which is well known and a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).

Dependent claims 6 and 14:

wherein each sequence from among the first plurality of sequences comprises a respective subset of the first time series of data - This step is recited at a high level of generality and amounts to selecting a particular data source or type of data to be manipulated, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).

Dependent claims 7 and 15:

wherein each sequence from among the first plurality of sequences has in common at least some of the first time series of data with a next sequence in the first plurality of sequences - This step is recited at a high level of generality and amounts to selecting a particular data source or type of data to be manipulated, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).

Dependent claims 8 and 16:

wherein a number of the plurality of sequences that are generated is based on a size of the first time series of data - This step is recited at a high level of generality and amounts to selecting a particular data source or type of data to be manipulated, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).

Dependent claims 9 and 20:

wherein the single machine-learning model comprises a single Long-term Short-term Memory (LSTM) model - This step is recited at a high level of generality and amounts to selecting a particular data source or type of data to be manipulated, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).

Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claims are thus directed to the abstract idea.

Step 2B: The claims do not include additional elements that amount to significantly more than the judicial exception.
The additional elements are the following:

Independent claims 1, 10 and 18:

A system comprising: a memory that stores a plurality of time series of data each relating to a respective domain; a processor programmed to - These limitations amount to components of a general purpose computer that applies a judicial exception by use of conventional computer functions (see MPEP § 2106.05(b)).

access first training data relating to a first domain, the first training data having a first time series of data comprising first sequential data values that vary over time - This step is recited at a high level of generality and amounts to mere data gathering, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).

access second training data relating to a second domain, the second training data having a second time series of data comprising second sequential data values that vary independently from the first sequential data values over time - This step is recited at a high level of generality and amounts to mere data gathering, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).

provide the appended input data to a neural network to train a single machine-learning model trained to make predictions in the first domain and/or the second domain - This step is recited at a high level of generality and amounts to no more than a recitation of the words "apply it" (or an equivalent), i.e., mere instructions to implement an abstract idea or other exception on a computer (see MPEP § 2106.05(f)).

Dependent claims 2 and 11:

wherein the processor is further programmed to: - These limitations amount to components of a general purpose computer that applies a judicial exception by use of conventional computer functions (see MPEP § 2106.05(b)).

receive input data relating to the first domain or the second domain, the input data comprising an input time series of data - This "receiving" step is recited at a high level of generality and amounts to mere data gathering, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).

provide the normalized input data to the single machine-learning model; and generate a prediction based on the input data using the single machine-learning model - This step is recited at a high level of generality and amounts to no more than a recitation of the words "apply it" (or an equivalent), i.e., mere instructions to implement an abstract idea or other exception on a computer (see MPEP § 2106.05(f)).

Dependent claims 3 and 12:

wherein to normalize the input data, the processor is further programmed to: - These limitations amount to components of a general purpose computer that applies a judicial exception by use of conventional computer functions (see MPEP § 2106.05(b)).

Dependent claim 4:

wherein the processor is further programmed to: - These limitations amount to components of a general purpose computer that applies a judicial exception by use of conventional computer functions (see MPEP § 2106.05(b)).

Dependent claims 5 and 13:

wherein the neural network is part of a parallel neural network architecture comprising a plurality of neural networks - This step is recited at a high level of generality and amounts to generally linking the use of a judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)).
and wherein the processor is further programmed to: - These limitations amount to components of a general purpose computer that applies a judicial exception by use of conventional computer functions (see MPEP § 2106.05(b)).

provide the appended input data to an input layer of a first neural network of the parallel network architecture - This step is recited at a high level of generality and amounts to mere data transmission, which is well known and a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).

Dependent claims 6 and 14:

wherein each sequence from among the first plurality of sequences comprises a respective subset of the first time series of data - This step is recited at a high level of generality and amounts to selecting a particular data source or type of data to be manipulated, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).

Dependent claims 7 and 15:

wherein each sequence from among the first plurality of sequences has in common at least some of the first time series of data with a next sequence in the first plurality of sequences - This step is recited at a high level of generality and amounts to selecting a particular data source or type of data to be manipulated, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).

Dependent claims 8 and 16:

wherein a number of the plurality of sequences that are generated is based on a size of the first time series of data - This step is recited at a high level of generality and amounts to selecting a particular data source or type of data to be manipulated, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).

Dependent claims 9 and 20:

wherein the single machine-learning model comprises a single Long-term Short-term Memory (LSTM) model - This step is recited at a high level of generality and amounts to selecting a particular data source or type of data to be manipulated, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).

Accordingly, these additional elements do not amount to significantly more than the judicial exception. As such, the claims are ineligible.

Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-2, 10-11 and 18-19 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by He et al. (hereinafter He), US 20230267352 A1.

Regarding independent claim 1, He teaches a system (Fig. 1, 102; [0058] Model reduction system 102 may include one or more computing devices configured to communicate with user device 104, and/or data source 106 via communication network 108; [0064] Referring now to FIG. 2, FIG. 2 is a diagram of example components of device 200. Device 200 may correspond to model reduction system 102 (e.g., one or more devices of model reduction system 102)) comprising: a memory that stores a plurality of time series of data each relating to a respective domain
(Fig. 2, 206, 208; [0071] Memory 206 and/or storage component 208 may include data storage or one or more data structures (e.g., a database and/or the like). Device 200 may be capable of receiving information from, storing information in, communicating information to, or searching information stored in the data storage or one or more data structures in memory 206 and/or storage component 208. For example, the information may include input data, output data, transaction data, account data, or any combination thereof; [0074] In some non-limiting embodiments or aspects, model reduction system 102 may receive a training dataset of a plurality of data instances, wherein each data instance comprises a time series of data points; [0076] In some non-limiting embodiments or aspects, each data instance of the plurality of data instances of the training dataset may represent an institution (e.g., issuer, bank, merchant, and/or the like) (i.e., domain));

a processor programmed to (Fig. 2, 204; [0065] processor 204 may be implemented in hardware, software, or a combination of hardware and software. For example, processor 204 may include a processor (e.g., a central processing unit (CPU), a graphics processing unit (GPU), an accelerated processing unit (APU), etc.), a microprocessor, a digital signal processor (DSP), and/or any processing component (e.g., a field-programmable gate array (FPGA), an application-specific integrated circuit (ASIC), etc.) that can be programmed to perform a function):

access first training data relating to a first domain, the first training data having a first time series of data comprising first sequential data values that vary over time ([0074] As shown in FIG. 3, at step 302, process 300 may include receiving a training dataset. In some non-limiting embodiments or aspects, model reduction system 102 may receive a training dataset during the training phase. In some non-limiting embodiments or aspects, model reduction system 102 may receive a training dataset during the testing phase. In some non-limiting embodiments or aspects, model reduction system 102 may receive a training dataset of a plurality of data instances, wherein each data instance comprises a time series of data points; [0076] In some non-limiting embodiments or aspects, each data instance of the plurality of data instances of the training dataset may represent an institution (e.g., issuer, bank, merchant, and/or the like) (i.e., domain); [0077] In some non-limiting embodiments or aspects, each data instance may be stored and compiled into a training dataset for future training; FIG. 5A, Data Instance 1; [0108] As shown by reference number 505 in FIG. 5A, model reduction system 102 may receive a training dataset. In some non-limiting embodiments or aspects, model reduction system 102 may receive a training dataset of a plurality of data instances. In some non-limiting embodiments or aspects, a data instance may correspond to a data source (e.g., issuer, bank, merchant, and/or the like). In some non-limiting embodiments or aspects, each data instance of the plurality of data instances may include a time series of data vectors and/or data points);

access second training data relating to a second domain (FIG. 5A, Data Instance 2), the second training data having a second time series of data comprising second sequential data values that vary independently from the first sequential data values over time
([0108] As shown by reference number 505 in FIG. 5A, model reduction system 102 may receive a training dataset. In some non-limiting embodiments or aspects, model reduction system 102 may receive a training dataset of a plurality of data instances. In some non-limiting embodiments or aspects, a data instance may correspond to a data source (e.g., issuer, bank, merchant, and/or the like). In some non-limiting embodiments or aspects, each data instance of the plurality of data instances may include a time series of data vectors and/or data points);

generate a first plurality of sequences from the first training data (FIG. 5B, Encoded Data Instance 1; [0109] As shown by reference number 510 in FIG. 5B, model reduction system 102 may perform an encoding operation. In some non-limiting embodiments or aspects, model reduction system 102 may perform an encoding operation on a training dataset and/or an input dataset to provide an encoded dataset having a lower dimension space than a dimension space of the training dataset and/or input dataset);

generate a second plurality of sequences from the second training data (FIG. 5B, Encoded Data Instance 2; [0109] As shown by reference number 510 in FIG. 5B, model reduction system 102 may perform an encoding operation. In some non-limiting embodiments or aspects, model reduction system 102 may perform an encoding operation on a training dataset and/or an input dataset to provide an encoded dataset having a lower dimension space than a dimension space of the training dataset and/or input dataset);

append the first plurality of sequences and the second plurality of sequences to generate an appended input data relating to the first domain and the second domain (FIG. 5C, Encoded Dataset/Encoded Data Instances); and

provide the appended input data to a neural network to train a single machine-learning model trained to make predictions in the first domain and/or the second domain (FIG. 5C; [0112] As shown by reference number 515 in FIG. 5C, model reduction system 102 may generate a prediction model. … In some non-limiting embodiments or aspects, model reduction system 102 may generate a prediction model using the entire encoded dataset as training data).

Regarding dependent claim 2, He teaches all the limitations as set forth in the rejection of claim 1, which is incorporated. He further teaches wherein the processor is further programmed to: receive input data relating to the first domain or the second domain, the input data comprising an input time series of data ([0097] As shown in FIG. 4, at step 402, process 400 may include receiving an input. In some non-limiting embodiments or aspects, model reduction system 102 may receive a time series dataset of a plurality of data instances as input); normalize the input data ([0101] As shown in FIG. 4, at step 404, process 400 may include performing an encoding operation); provide the normalized input data to the single machine-learning model ([0103] As shown in FIG. 4, at step 406, process 400 may include providing encoded input to a prediction model); and generate a prediction based on the input data using the single machine-learning model ([0104] As shown in FIG. 4, at step 408, process 400 may include determining an output).

Regarding independent claim 10, it is a method claim corresponding to the system of claim 1. Therefore, it is rejected for the same reasons as claim 1 above.

Regarding dependent claim 11, it is a method claim corresponding to the system of claim 2. Therefore, it is rejected for the same reasons as claim 2 above.
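
To make the examiner's reading of He concrete: the cited Figs. 5A-5C describe encoding each data instance into a lower-dimensional space and then training one prediction model on the combined encoded dataset. A minimal sketch of that pipeline, assuming scikit-learn, with PCA standing in for He's unspecified encoding operation and all shapes illustrative:

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.linear_model import Ridge

    rng = np.random.default_rng(0)

    # One "data instance" per institution (issuer, bank, merchant, ...),
    # each a time series of 64-dimensional data vectors (toy shapes).
    instances = [rng.normal(size=(50, 64)) for _ in range(3)]
    targets = [rng.normal(size=50) for _ in range(3)]

    # Encoding operation (He [0109]): map each instance into a lower
    # dimension space. PCA stands in for He's unspecified encoder.
    encoder = PCA(n_components=8).fit(np.vstack(instances))
    encoded = [encoder.transform(x) for x in instances]

    # The step the examiner equates with "appending": the encoded dataset
    # comprising all encoded data instances trains one prediction model.
    X, y = np.vstack(encoded), np.concatenate(targets)
    model = Ridge().fit(X, y)
    print(model.predict(X[:2]))
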
Regarding independent claim 18, it is a medium claim corresponding to the system of claim 1. Therefore, it is rejected for the same reasons as claim 1 above. He further teaches a non-transitory computer readable medium storing instructions that, when executed by a processor, cause the processor to perform operations ([0069] Device 200 may perform one or more processes described herein. Device 200 may perform these processes based on processor 204 executing software instructions stored by a computer-readable medium, such as memory 206 and/or storage component 208. A computer-readable medium (e.g., a non-transitory computer-readable medium) is defined herein as a non-transitory memory device. A non-transitory memory device includes memory space located inside of a single physical storage device or memory space spread across multiple physical storage devices).

Regarding dependent claim 19, it is a medium claim corresponding to the system of claim 2. Therefore, it is rejected for the same reasons as claim 2 above.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 3-4 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over He as applied in claims 1 and 10, in view of SANTOS et al. (hereinafter SANTOS), US 20180189326 A1.

Regarding dependent claim 3, He teaches all the limitations as set forth in the rejection of claim 2, which is incorporated. He does not explicitly disclose wherein to normalize the input data, the processor is further programmed to: generate a mixture model comprising a plurality of clusters of normal distributions that together approximate the input data. However, in the same field of endeavor, SANTOS teaches wherein to normalize the input data, the processor is further programmed to: generate a mixture model comprising a plurality of clusters of normal distributions that together approximate the input data ([0014] a computing environment 130 that includes one or more processors 132 and a plurality of software and hardware components 132. The one or more processors and plurality of software and hardware components 132 are configured to execute program instructions or routines to perform the components and data processing functions described herein, and embodied within the plurality of data processing modules 120 configured to carry out such functions; [0015] FIG. 1 is a system diagram for such a classification framework 100. The classification framework 100 ingests, retrieves, requests, or otherwise obtains input data 110 in the form of multi-band raster data 112 ... This input data 110 is taken into the classification framework 100 by a data collection component 122 in the plurality of data processing components 120, and may be retrieved from one or more database locations, or acquired directly from third party or proprietary sources.
Regardless, the classification framework 100 is initialized by intake of the multi-band raster data 112; [0016] This initial collection of data is referred to as observations; [0017] The observations are first normalized in this summarization phase by finding the mean and standard deviation of the values in each band. The numerical values of these observations are then modified by first subtracting, from each band, the corresponding mean, and then dividing by the corresponding standard deviation. This modification step is an application of a feature scaling technique that standardizes the range of independent variables or features of data to normalize data by selecting for zero mean and unit variance; [0019] Once the modification step has been performed in the summarization phase, a number of cluster descriptions are selected, and the normalized observations are used to generate a set of clusters of this selected number that describe the approximate distribution of normalized observations relative to each other. The set of clusters may be generated using a Gaussian mixture model 146).

It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of generating a set of clusters using a Gaussian mixture model as suggested in SANTOS into He's system because both of these systems are addressing classification techniques for input data. This modification would have been motivated by the desire to save cluster storage space, resulting in a speed improvement (SANTOS, [0007]-[0008]).

Regarding dependent claim 4, the combination of He and SANTOS teaches all the limitations as set forth in the rejection of claim 3, which is incorporated. SANTOS further teaches wherein the processor is further programmed to: for each data value in the input data: identify a corresponding cluster from among the plurality of clusters ([0022] The retrieval phase in the processing of multi-band raster data 112 of the present invention begins by reading one or more sets of cluster descriptions stored in the summarization phase, and synthesizing a collection of observations of these clusters that can then be ordered, partitioned, and used to find the quantile breaks); determine a normalization value based on the corresponding cluster ([0022] id.); and normalize the data value based on the normalization value ([0025] The observations are then de-normalized, by first multiplying them by the previously-calculated standard deviation, and then summing them by the previously-calculated means. The resulting collection of transformed and de-normalized observations are referred to as the synthesized observations).

Regarding dependent claim 12, it is a method claim corresponding to the system of claim 3. Therefore, it is rejected for the same reasons as claim 3 above.
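
For orientation, the mixture-model normalization disputed in claims 3, 4 and 12 can be sketched directly: fit a mixture of normal distributions that together approximate the input data, identify each value's corresponding cluster, and normalize the value against that cluster's parameters. A minimal sketch assuming scikit-learn's GaussianMixture; the cluster count and data are illustrative, not taken from the claims or the references:

    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(1)

    # Input data spanning two domains with very different scales (toy data).
    values = np.concatenate([rng.normal(5, 1, 400), rng.normal(120, 15, 400)])
    X = values.reshape(-1, 1)

    # Claims 3/12: a mixture model whose clusters of normal distributions
    # together approximate the input data.
    gmm = GaussianMixture(n_components=2, random_state=0).fit(X)

    # Claim 4: for each data value, identify the corresponding cluster,
    # determine a normalization value from it, and normalize the value.
    cluster = gmm.predict(X)                    # corresponding cluster
    means = gmm.means_.ravel()[cluster]         # per-cluster normalization
    stds = np.sqrt(gmm.covariances_.ravel()[cluster])
    normalized = (values - means) / stds
    print(normalized.mean(), normalized.std())  # roughly 0 and 1
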
Claims 5-9, 13-17 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over He as applied in claims 1, 10 and 18, in view of WATT et al. (hereinafter WATT), US 20210256378 A1.

Regarding dependent claim 5, He teaches all the limitations as set forth in the rejection of claim 2, which is incorporated. He does not explicitly disclose wherein the neural network is part of a parallel neural network architecture comprising a plurality of neural networks, and wherein the processor is further programmed to: provide the appended input data to an input layer of a first neural network of the parallel network architecture.

However, in the same field of endeavor, WATT teaches wherein the neural network is part of a parallel neural network architecture comprising a plurality of neural networks ([0091] FIG. 1B is a more detailed block schematic diagram 100B of the example system, illustrating different models operating in concert with one another, according to some embodiments. FIG. 1B continues to FIG. 1C shown in block schematic diagram 100C; Fig. 4; [0166] As shown in neural network architecture 400, in example embodiments, the plurality of neural network elements comprise at least one LSTM and at least one dense neural network architecture 404. In example embodiments, the dense neural network architecture 404 may be configured to receive the output of an element 402. In the embodiment shown, dense neural network architectures can be arranged in series or in parallel, with the dense neural network architecture 408 arranged in series with the dense neural network architecture 410, for example), and wherein the processor is further programmed to: provide the appended input data to an input layer of a first neural network of the parallel network architecture ([0096] FIG. 1B illustrates the input components of the system, and for this example, climate data 152 and weather data 154 are utilized along with a target data distribution, historical transaction data 156; [0116] Continuing to FIG. 1C and diagram 100C, the data and the de-trended data are provided into a set of models for training. While in this example, three different models are shown, other approaches are possible. These models can be iteratively trained using a loss function (e.g., minimizing losses) such that interconnections between data object representations are refined over a period of time; [0117] Accordingly, the trained models can include a plurality of models, and in an embodiment, three different models (Models A 166, B 168, and C 170) are proposed that are adapted and trained differently. Model A 166 can be trained using the raw time-series data, while Models B 168 and C 170 can be trained using the de-trended data 164, with Model B adapted for a first environmental condition (e.g., weather/daily), and Model C 170 adapted for a second environmental condition (e.g., climate/monthly). The models can include neural network architectures, and in some embodiments, parallel decision tree architectures are utilized for establishing the models as data model architectures using interconnected computational objects; [0121] The three models 166, 168, and 170 can be separately instantiated and their results can be aggregated to generate the output data set; [0168] For example, in the embodiment shown, dense neural network architecture 404 is connected to the LSTM element 402, such that an LSTM element's ability to retain information from previous inputs when processing successive inputs is utilized prior to passing information to a dense neural network element. In example embodiments, the neural network architecture 400 may be configured with an initial placement of the plurality of neural network elements such that the LSTM element receives the transaction data initially).

It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of different models operating in concert with one another as suggested in WATT into He's system because both of these systems are addressing machine learning data architecture. This modification would have been motivated by the desire to provide more accurate predictions (WATT, [0014]).
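
A rough sketch of the kind of parallel arrangement WATT is cited for: the appended input enters the input layer of a first (LSTM) network while a dense network runs alongside it, with the branch outputs merged. Written against PyTorch; all layer sizes are illustrative, and WATT's actual architecture 400 is considerably more elaborate:

    import torch
    import torch.nn as nn

    class ParallelNet(nn.Module):
        """An LSTM branch fed the appended sequence data, with a dense
        branch running in parallel; branch outputs are merged for a
        single prediction head."""
        def __init__(self, window=20, hidden=32):
            super().__init__()
            self.lstm = nn.LSTM(input_size=1, hidden_size=hidden,
                                batch_first=True)
            self.dense = nn.Sequential(nn.Linear(window, hidden), nn.ReLU())
            self.head = nn.Linear(2 * hidden, 1)

        def forward(self, x):                  # x: (batch, window)
            _, (h, _) = self.lstm(x.unsqueeze(-1))
            lstm_out = h.squeeze(0)            # first network's output
            dense_out = self.dense(x)          # parallel second network
            return self.head(torch.cat([lstm_out, dense_out], dim=-1))

    batch = torch.randn(8, 20)                 # appended multi-domain windows
    print(ParallelNet()(batch).shape)          # torch.Size([8, 1])
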
Regarding dependent claim 6, He teaches all the limitations as set forth in the rejection of claim 1, which is incorporated. He does not explicitly disclose wherein each sequence from among the first plurality of sequences comprises a respective subset of the first time series of data. However, in the same field of endeavor, WATT teaches wherein each sequence from among the first plurality of sequences comprises a respective subset of the first time series of data ([0165] Neural network architecture 400 is shown comprising a subset of the first data set passing through various neural network elements, including an LSTM element 402. In example embodiments, the first data set is subdivided into subsets based on a desired time series. For example, in the shown embodiment, the subset of first data 406 comprises entries representing 8 days. The first data set can be subdivided into any combination of entries for any desired time series). It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of subdividing a data set into subsets based on a desired time series as suggested in WATT into He's system because both of these systems are addressing machine learning data architecture. This modification would have been motivated by the desire to provide more accurate predictions (WATT, [0014]).

Regarding dependent claim 7, the combination of He and WATT teaches all the limitations as set forth in the rejection of claim 6, which is incorporated. WATT further teaches wherein each sequence from among the first plurality of sequences has in common at least some of the first time series of data with a next sequence in the first plurality of sequences ([0165] the first data set can be subdivided based on features. For example, in the embodiment shown in FIG. 4, the weather features are separated from the location features (e.g. geo-features 412), business type features 414, and transaction data features 416. The first data set can be subdivided into any combination of features for any desired time series. Generally, the first data set can be divided into any combination of features and entries).

Regarding dependent claim 8, the combination of He and WATT teaches all the limitations as set forth in the rejection of claim 6, which is incorporated. WATT further teaches wherein a number of the plurality of sequences that are generated is based on a size of the first time series of data ([0165] the first data set is subdivided into subsets based on a desired time series).

Regarding dependent claim 9, He teaches all the limitations as set forth in the rejection of claim 1, which is incorporated. He does not explicitly disclose wherein the single machine-learning model comprises a single Long-term Short-term Memory (LSTM) model.
However, in the same field of endeavor, WATT teaches wherein the single machine-learning model comprises a single Long-term Short-term Memory (LSTM) model (Fig. 4; [0165] Neural network architecture 400 is shown comprising a subset of the first data set passing through various neural network elements, including an LSTM element 402; [0166] As shown in neural network architecture 400, in example embodiments, the plurality of neural network elements comprise at least one LSTM and at least one dense neural network architecture 404). It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching that the plurality of neural network elements comprises at least one Long Short Term Memory (LSTM) element and at least one dense neural network architecture, as suggested in WATT, into He's system because both of these systems are addressing machine learning data architecture. This modification would have been motivated by the desire to provide more accurate predictions (WATT, [0014]).

Regarding dependent claim 13, it is a method claim corresponding to the system of claim 5. Therefore, it is rejected for the same reasons as claim 5 above.

Regarding dependent claim 14, it is a method claim corresponding to the system of claim 6. Therefore, it is rejected for the same reasons as claim 6 above.

Regarding dependent claim 15, it is a method claim corresponding to the system of claim 7. Therefore, it is rejected for the same reasons as claim 7 above.

Regarding dependent claim 16, it is a method claim corresponding to the system of claim 8. Therefore, it is rejected for the same reasons as claim 8 above.

Regarding dependent claim 17, it is a method claim corresponding to the system of claim 9. Therefore, it is rejected for the same reasons as claim 9 above.

Regarding dependent claim 20, it is a medium claim corresponding to the system of claim 9. Therefore, it is rejected for the same reasons as claim 9 above.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Applicant is required under 37 C.F.R. § 1.111(c) to consider these references fully when responding to this action. KAMNEVA et al. (US 20210193255 A1) discloses utilizing multi-sample batch controls for high throughput copy number calling in a small number of fixed regions where copy number changes are expected.

It is noted that any citation to specific pages, columns, lines, or figures in the prior art references and any interpretation of the references should not be considered to be limiting in any way. A reference is relevant for all it contains and may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art. In re Heck, 699 F.2d 1331, 1332-33, 216 U.S.P.Q. 1038, 1039 (Fed. Cir. 1983) (quoting In re Lemelson, 397 F.2d 1006, 1009, 158 U.S.P.Q. 275, 277 (C.C.P.A. 1968)).

THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.
In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to AMY P HOANG whose telephone number is (469) 295-9134. The examiner can normally be reached M-Th 8:30 AM-5:00 PM.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, JENNIFER WELCH, can be reached at 571-272-7212. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/AMY P HOANG/
Examiner, Art Unit 2143

/JENNIFER N WELCH/
Supervisory Patent Examiner, Art Unit 2143

Prosecution Timeline

Aug 09, 2022
Application Filed
Aug 18, 2025
Non-Final Rejection — §101, §102, §103
Nov 17, 2025
Response Filed
Jan 24, 2026
Final Rejection — §101, §102, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602596
APPARATUS AND METHOD FOR VALIDATING DATASET BASED ON FEATURE COVERAGE
Granted Apr 14, 2026 (2y 5m to grant)
Patent 12572263
ACCESS CARD WITH CONFIGURABLE RULES
Granted Mar 10, 2026 (2y 5m to grant)
Patent 12536432
PRE-TRAINING METHOD OF NEURAL NETWORK MODEL, ELECTRONIC DEVICE AND MEDIUM
Granted Jan 27, 2026 (2y 5m to grant)
Patent 12475669
METHOD AND APPARATUS WITH NEURAL NETWORK OPERATION FOR DATA NORMALIZATION
Granted Nov 18, 2025 (2y 5m to grant)
Patent 12461595
SYSTEM AND METHOD FOR EMBEDDED COGNITIVE STATE METRIC SYSTEM
Granted Nov 04, 2025 (2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 70%
With Interview: 99% (+64.2%)
Median Time to Grant: 3y 3m
PTA Risk: Moderate
Based on 232 resolved cases by this examiner. Grant probability derived from career allow rate.
