Prosecution Insights
Last updated: April 19, 2026
Application No. 18/096,243

SEGMENTED MACHINE LEARNING-BASED MODELING WITH PERIOD-OVER-PERIOD ANALYSIS

Final Rejection: §101, §103
Filed: Jan 12, 2023
Examiner: MULLINAX, CLINT LEE
Art Unit: 2123
Tech Center: 2100 (Computer Architecture & Software)
Assignee: The Bank Of New York Mellon
OA Round: 2 (Final)
Grant Probability: 48% (Moderate)
Predicted OA Rounds: 3-4
Estimated Time to Grant: 4y 4m
Grant Probability with Interview: 86%

Examiner Intelligence

Career Allow Rate: 48% of resolved cases (59 granted / 123 resolved; -7.0% vs TC avg)
Interview Lift: strong, +38.3%, comparing resolved cases with vs. without an interview
Typical Timeline: 4y 4m average prosecution; 25 applications currently pending
Career History: 148 total applications across all art units

Statute-Specific Performance

§101: 22.8% (-17.2% vs TC avg)
§103: 53.6% (+13.6% vs TC avg)
§102: 6.3% (-33.7% vs TC avg)
§112: 13.1% (-26.9% vs TC avg)

Tech Center averages are estimates. Based on career data from 123 resolved cases.

Office Action

§101 §103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims

This action is responsive to the application filed on 02/02/2026. Claims 1-14, 16, and 18-22 are pending. Claims 1, 5, 9, 13, and 19 have been amended. Claims 15 and 17 have been canceled. Claims 21-22 have been added.

Response to Arguments

Applicant’s arguments with respect to the claim objections of claims 5, 13, and 19 have been fully considered and are persuasive. Therefore, the objections set forth in the previous office action have been withdrawn.

Applicant’s arguments with respect to the rejection of claims 1-20 under 35 U.S.C. 101 have been considered but are not persuasive. The applicant argues that the model training and labeling overcome the §101 rejection in light of the specification, since the claims overcome “short comings of prior machine learning systems and is not a routine or conventional application of generic computing components” and are “technical improvements to the operation of machine learning systems themselves”. The examiner respectfully disagrees. The recited “base machine learning model” and “segmented ML models” operations and training are claimed at a high level and do not integrate the judicial exceptions into a practical application: the steps are mere “black-box” operations without further detail of the inner workings/architecture of the algorithms or of how the training specifically affects the algorithms so as to output the desired predictions. The rejections are therefore maintained as generally linking the use of the judicial exception to a particular technological environment or field of use. See the 35 U.S.C. 101 section for the full, updated analysis of the claim limitations necessitated by applicant’s amendments.

Applicant’s arguments with respect to the rejection of claims 1 and 9 under 35 U.S.C. 103 have been considered but are not persuasive.
The applicant argues that no reference teaches the claim limitations of claims 1 and 9, pointing to “execute a plurality of segmented ML models, each segmented ML model being trained to predict entity behavior based on a respective segment from among the plurality of segments; generate a plurality of segmented classes, each segmented class from among the plurality of segmented classes being an output of a corresponding segmented ML model from among the plurality of segmented ML models”, since “Divina’s ensemble uses an algorithm based on random sampling and time series, not segment-based as claimed”. Due to the broadness of the claim language, the examiner respectfully disagrees. Divina is maintained as teaching the limitations as required by the claim language. As claimed, there is no distinction among the “segment[ed]” data, and thus Divina, sections 3.1, 3.3, and Figs. 4-5, teaches a stacking ensemble including a Random Forest of decision trees, “where each tree is trained separately on a independent randomly selected training set (segmented). It follows that each tree depends on the values of an input dataset sampled independently, with the same distribution for all trees”. Sections 3.1, 3.3, and Figs. 4-5 further teach “For classification, each tree in the RF casts a unit vote for the most popular class at input”. See the 35 U.S.C. 103 section for the full mapping of claim limitations.

Applicant’s arguments with respect to the rejection of claim 18 under 35 U.S.C. 103 have been considered but are not persuasive. The applicant argues that no reference teaches the claim limitations of claim 18, since the claim is “directed to training” and the rejections are therefore improper. The examiner respectfully disagrees. Claims 1, 9, and 18 are all directed to training over the argued limitations and are deemed analogous.
The base models can be interpreted as trained on the dataset of the gathered times, and the merged GBM model can be interpreted as trained on the base models’ predicted time-set outputs; the rejection is thus maintained as reading on the claim based on the broadness of the claim language. See the 35 U.S.C. 103 section for the full mapping of claim limitations.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-14, 16, and 18-21 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Claims 1, 9, and 18 are respectively drawn to a system, a method, and a non-transitory computer readable medium; hence each falls under one of the four categories of statutory subject matter (Step 1). Nonetheless, the claims are directed to a judicially recognized exception, an abstract idea, without significantly more. Claims 1, 9, and 18 recite the following, or analogous, limitations: “access a plurality of features derived from entity data relating to an entity, the entity being associated with a plurality of segments; execute a base…model using the plurality of features…; generate a base classification as an output of the executed base…model; execute a plurality of segmented…models…; generate a plurality of segmented classes, each segmented class from among the plurality of segmented classes being an output of a corresponding segmented…model from among the plurality of segmented…models; provide the base class and the plurality of segmented classes as input to a merged model…; and generate a behavior classification as an output of the merged model, the behavior classification representing a prediction of the entity behavior based on outputs of the base…model and the plurality of segmented…models”.
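Read as a data flow, the recited limitations describe a two-level ensemble: a base model spanning all segments, per-segment models, and a merged model that combines their outputs by weight. A minimal illustrative sketch in Python follows; the toy models, feature names, weights, and threshold are all hypothetical stand-ins, not the applicant’s implementation or any reference’s code:

```python
# Illustrative only: toy callables stand in for the claimed trained ML models.

def base_model(features):
    # Hypothetical base model trained across all segments.
    return 0.6 if features["activity"] > 10 else 0.2

segment_models = {
    # One hypothetical model per segment the entity is associated with.
    "retail": lambda f: 0.7 if f["activity"] > 5 else 0.1,
    "wealth": lambda f: 0.4,
}

def merged_model(base_p, segment_ps, weights):
    # Merged model trained on weights for the base and segmented models.
    parts = [weights["base"] * base_p]
    parts += [weights[s] * p for s, p in segment_ps.items()]
    return sum(parts)

def classify(features, segments, weights, threshold=0.5):
    base_p = base_model(features)                       # base classification
    seg_ps = {s: segment_models[s](features) for s in segments}
    overall = merged_model(base_p, seg_ps, weights)     # merged output
    return "behavior_A" if overall >= threshold else "behavior_B"

weights = {"base": 0.5, "retail": 0.3, "wealth": 0.2}
label = classify({"activity": 12}, ["retail", "wealth"], weights)
# overall = 0.5*0.6 + 0.3*0.7 + 0.2*0.4 = 0.59
```

The sketch only shows the claimed data flow (features in, base class plus segmented classes into a merged model, behavior classification out); it deliberately says nothing about model internals, which is the gap the examiner identifies in the §101 analysis.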
These limitations, under their broadest reasonable interpretation, can be evaluated in a human mind, except for the recitation of generic computer components (Step 2A). Other than reciting “a processor”, “non-transitory computer readable medium”, “machine learning (ML)”, “ML”, “the base ML model being trained to predict entity behavior across the plurality of segments”, “each segmented ML model being trained to the predict entity behavior based on a respective segment from among the plurality of segments”, and “model that was trained based on weights for each of the base ML model and the plurality of segmented ML models” to perform the exceptions, nothing in the claims precludes the steps from practically being performed in the human mind. For example, a human expert can, mentally or with the aid of pen and paper:

- access a plurality of features derived from entity data relating to an entity, the entity being associated with a plurality of segments (e.g., by thinking of/writing out remembered data samples of power supply values over time periods);
- execute a base…model using the plurality of features… (e.g., by thinking of/writing out a calculation using the remembered samples as input);
- generate a base classification as an output of the executed base…model (e.g., by thinking of/writing out the calculation output);
- execute a plurality of segmented…models… (e.g., by thinking of/writing out three other calculations);
- generate a plurality of segmented classes, each segmented class from among the plurality of segmented classes being an output of a corresponding segmented…model from among the plurality of segmented…models (e.g., by thinking of/writing out the three other calculations outputting unique categories);
- provide the base class and the plurality of segmented classes as input to a merged model… (e.g., by thinking of/writing out inputting the calculation’s and the three other calculations’ outputs to a fourth calculation);
- generate a behavior classification as an output of the merged model, the behavior classification representing a prediction of the entity behavior based on outputs of the base…model and the plurality of segmented…models (e.g., by thinking of/writing out the fourth calculation output being the power supply values for a future time period).

Thus, the claims recite a mental process (Step 2A, Prong 1).

Claims 1, 9, and 18 include additional elements: “a processor”, “non-transitory computer readable medium”, “machine learning (ML)”, “ML”, “the base ML model being trained to predict entity behavior across the plurality of segments”, “each segmented ML model being trained to the predict entity behavior based on a respective segment from among the plurality of segments”, and “model that was trained based on weights for each of the base ML model and the plurality of segmented ML models”. However, these elements are recited at a high level of generality. The claimed “the base ML model being trained to predict entity behavior across the plurality of segments”, “each segmented ML model being trained to the predict entity behavior based on a respective segment from among the plurality of segments”, and “model that was trained based on weights for each of the base ML model and the plurality of segmented ML models” amount to adding the words “apply it” (or an equivalent) to the judicial exception; the claimed “a processor” and “non-transitory computer readable medium” amount to merely using a computer as a tool to perform an abstract idea (see MPEP 2106.05(f)); and the claimed “machine learning (ML)” and “ML”
amount to generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP 2106.05(h)). Hence, each of the additional limitations, alone or in combination, is no more than adding the words “apply it” (or an equivalent) to the judicial exception, merely using a computer as a tool to perform an abstract idea, or generally linking the use of the judicial exception to a particular technological environment or field of use, and does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea (Step 2A, Prong 2).

The additional elements in the claims do not amount to significantly more than an abstract idea. Furthermore, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the additional elements of using “a processor”, “non-transitory computer readable medium”, “machine learning (ML)”, “ML”, “the base ML model being trained to predict entity behavior across the plurality of segments”, “each segmented ML model being trained to the predict entity behavior based on a respective segment from among the plurality of segments”, and “model that was trained based on weights for each of the base ML model and the plurality of segmented ML models” to perform the steps of the independent claims amount to no more than adding the words “apply it” (or an equivalent) to the judicial exception, merely using a computer as a tool to perform an abstract idea, or generally linking the use of the judicial exception to a particular technological environment or field of use, and these cannot provide an inventive concept (Step 2B). As such, claims 1, 9, and 18 are not patent eligible.
Dependent claims 2-8, 10-14, 16, and 19-21 are also ineligible for the same reasons given with respect to claims 1, 9, and 18. The dependent claims describe additional mental processes that a human can likewise perform mentally or with the aid of pen and paper:

- identify one or more segments associated with the entity; and select corresponding ones of the plurality of segmented…models to execute for the entity based on the identified one or more segments (claims 2 and 10) (e.g., by thinking of/writing out time periods associated with the power supply values, and designating one of the three other calculations to process the time period of values);
- wherein the different period-over-period changes comprises a first period and a second period longer than the first period (claims 4 and 12) (e.g., by thinking of/writing out that the time periods are different sizes);
- wherein the entity behavior is labeled…based on a first definition that specifies activity of the entity that defines the entity behavior over a first time period and a second definition that specifies activity of the entity that defines the entity behavior over a second time period greater than the first time period (claims 5, 13, and 19) (e.g., by thinking of/writing out that power supply values are categorized based on the values in a first time period and a second time period that is after the first time period);
- wherein the entity behavior is labeled for training only when both the first definition and the second definition are satisfied (claims 6 and 14) (e.g., by thinking of/writing out that categorized time-period power supply values are used for updating the calculations);
- wherein the merged model is based on a weight applied to each of: the base…model and the plurality of segmented…models (claim 7) (e.g., by thinking of/writing out that the fourth calculation’s parameters are determined based on a shared parameter of the calculation and the three other calculations);
- provide outputs of the base…model, the plurality of segmented…models and the merged…model to a training subsystem… (claims 8 and 16) (e.g., by thinking of/writing out that all calculation outputs are input to a parameter tuning function);
- wherein the entity behavior comprises an attrition of an activity of the entity (claim 20) (e.g., by thinking of/writing out that the power supply values are decreasing);
- generate a first probability as an output of the base…model and a plurality of probabilities as outputs of corresponding ones of the plurality of segmented…models; multiply the first probability by a weight associated with the base…model and multiply each probability, from among the plurality of probabilities, by a corresponding weight associated with the segmented…models to generate a plurality of weighted probabilities; combine the plurality of weighted probabilities to compute an overall probability; and generate the behavior classification by comparing the overall probability to a threshold value (claim 21) (e.g., by thinking of/writing out a weighted summation computation of all the calculation outputs and a comparison to a predetermined value).

Again, the dependent claims continue to cover performance of the limitations in the mind, as inherited from the independent claims (Step 2A, Prong 1).
The recitation of dependent claims 3 and 11, “wherein the plurality of segmented ML models are each trained based on different period-over-period changes in the entity data over time that define a trend”, the recitation of dependent claims 5, 13, and 19, “for training the base ML model and the plurality of segmented ML models”, and the recitation of dependent claims 8 and 16, “to retrain one or more of the models”, amount to adding the words “apply it” (or an equivalent) to the judicial exception. The recitation of “the processor” in dependent claims 2, 8, 10, and 16 amounts to mere instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (see MPEP 2106.05(f)). The recitation of “ML” in dependent claims 2, 7-8, 10, 16, and 21 amounts to generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP 2106.05(h)). These do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea (Step 2A, Prong 2).

The additional elements in the dependent claims do not amount to significantly more than an abstract idea. As discussed above with respect to the integration of the abstract idea into a practical application, the additional elements performing the steps of the dependent claims amount to adding the words “apply it” (or an equivalent) to the judicial exception, mere instructions to implement an abstract idea on a computer, merely using a computer as a tool to perform an abstract idea, or generally linking the use of the judicial exception to a particular technological environment or field of use, and these cannot provide an inventive concept (Step 2B).
As such, the additional elements of dependent claims 2-8, 10-14, 16, and 19-21, alone or in combination, do not amount to significantly more than an abstract idea, do not provide any inventive concept, and do not impose a meaningful limit that integrates the judicial exceptions into a practical application; therefore, the dependent claims are not patent eligible.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:

1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors. In considering patentability of the claims, the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.
Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1-5, 7, 9-13, 15, and 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over Divina et al (“Stacking Ensemble Learning for Short-Term Electricity Consumption Forecasting”, 2018), hereinafter Divina, in view of Kumar et al (“Optimized Stacking Ensemble Learning Model for Breast Cancer Detection and Classification Using Machine Learning”, 2022), hereinafter Kumar.

Regarding analogous claims 1, 9, and 18, Divina teaches a system for identifying activity classes of entities using machine learning, comprising a processor programmed to perform the recited steps; a method for identifying activity classes of entities using machine learning; and a non-transitory computer readable medium storing instructions that, when executed by a processor, program the processor to perform the recited steps (sections 3.2-3.3 and 5 teach machine learning stacking ensemble training of multiple machine learning algorithms with computational costs on “computing infrastructures”, known for using one or more processors of a computer system communicatively coupled to one or more memories to perform the embodiments of the disclosure): access a plurality of features derived from entity data relating to an entity, the entity being associated with a plurality of segments (sections 3.1 and 3.3 teach the “dataset used in this work records the general electricity consumption in Spain (expressed in MW) over a period of 9 years and 6 months, with a 10 min period between each measurement”, and pre-processing data to extract vectors of data attributes); execute a base machine learning (ML) model using the plurality of features, the base ML model being trained to predict entity
behavior across the plurality of segments (sections 3.1, 3.3, and Figs. 4-5 teach a learned ANN, where “[t]here is only one node at the output layers, which provides the final results of the network, being it a class label or a numeric value”); generate a base classification as an output of the executed base ML model (sections 3.1, 3.3, and Figs. 4-5 teach a learned ANN outputting a prediction, where “[t]here is only one node at the output layers, which provides the final results of the network, being it a class label or a numeric value”); execute a plurality of segmented ML models, each segmented ML model being trained to predict entity behavior based on a respective segment from among the plurality of segments (sections 3.1, 3.3, and Figs. 4-5 teach a stacking ensemble including a Random Forest of decision trees, “where each tree is trained separately on a independent randomly selected training set (segmented). It follows that each tree depends on the values of an input dataset sampled independently, with the same distribution for all trees”); generate a plurality of segmented classes, each segmented class from among the plurality of segmented classes being an output of a corresponding segmented ML model from among the plurality of segmented ML models (sections 3.1, 3.3, and Figs. 4-5 teach “For classification, each tree in the RF casts a unit vote for the most popular class at input”); provide the base class and the plurality of segmented classes as input to a merged model that was trained based on weights for each of the base ML model and the plurality of segmented ML models (section 3.3 and Figs. 4-5 teach “We can see that the training set is used in order to obtain the predictions of the base level, consisting of RF, NN and EVTree (provide the base class and the plurality of segmented classes). The so obtained predictions are then used by (input) the top layer (GBM) (merged model) in order to produce the final predictions for each problem”, wherein the GBM is trained based on the received model data); and generate a behavior classification as an output of the merged model, the behavior classification representing a prediction of the entity behavior based on outputs of the base ML model and the plurality of segmented ML models (sections 1, 3.3, and Figs. 4-5 teach the GBM outputting “Final Predictions” for “energy consumption”).

Divina at least implies the preamble limitations (a system comprising a processor programmed to perform the recited steps; a method; a non-transitory computer readable medium storing instructions that program a processor) and the limitation “provide the base class and the plurality of segmented classes as input to a merged model that was trained based on weights for each of the base ML model and the plurality of segmented ML models” (see mappings above). In any event, Kumar teaches these limitations (sections 3.3.4, 3.3.8-3.3.10, 4, and Fig. 7 teach that stacked base learners can be decision trees, trained on partitioned data subsets, inputting predictions to the meta-classifier executed on GPUs, known to be executed in a computer system communicatively coupled to one or more memories).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to implement Kumar’s teachings of meta-classifier inputs taken directly from different base learners, including decision trees, executed on GPU hardware, into Divina’s teaching of energy-consumption stacked ensemble models trained on portions of data, in order to increase prediction accuracy relative to comparative prediction model methods (Kumar, section 7).

Regarding claims 2 and 10, the combination of Divina and Kumar teaches all the claim limitations of claims 1 and 9 above, and further teaches wherein the processor is further programmed to: identify one or more segments associated with the entity; and select corresponding ones of the plurality of segmented ML models to execute for the entity based on the identified one or more segments (Divina, sections 3.1, 3.3, and Figs. 4-5 teach the “dataset used in this work records the general electricity consumption in Spain (expressed in MW) over a period of 9 years and 6 months, with a 10 min period between each measurement” (identify one or more segments associated with the entity); and a Random Forest of decision trees, “where each tree is trained separately on a independent randomly selected (alternatively, identify one or more segments associated with the entity) training set (select corresponding ones of the plurality of segmented ML models to execute for the entity based on the identified one or more segments). It follows that each tree depends on the values of an input dataset sampled independently, with the same distribution for all trees”).
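The architecture the examiner maps from Divina is a two-level stacking ensemble: base learners each predict the target, and a top-layer meta-learner is trained on their predictions. A stdlib-only Python sketch of that structure follows; the naive forecasting rules stand in for Divina’s actual RF/NN/EVTree base learners and GBM top layer, which is an assumption for illustration only:

```python
# Toy stacking: naive forecasting rules act as base "learners"; the meta
# learner fits one weight per base learner by least squares over their
# predictions (standing in for the GBM top layer in a real stack).

def last_value(history):           # naive base learner 1
    return history[-1]

def mean_of_window(history, w=3):  # naive base learner 2
    return sum(history[-w:]) / w

def fit_meta_weights(histories, targets):
    # Solve the 2x2 least-squares normal equations by hand (no numpy):
    # minimize sum((w1*p1 + w2*p2 - target)^2) over the training pairs.
    preds = [(last_value(h), mean_of_window(h)) for h in histories]
    a11 = sum(p[0] * p[0] for p in preds)
    a12 = sum(p[0] * p[1] for p in preds)
    a22 = sum(p[1] * p[1] for p in preds)
    b1 = sum(p[0] * t for p, t in zip(preds, targets))
    b2 = sum(p[1] * t for p, t in zip(preds, targets))
    det = a11 * a22 - a12 * a12
    w1 = (b1 * a22 - b2 * a12) / det
    w2 = (a11 * b2 - a12 * b1) / det
    return w1, w2

def stacked_forecast(history, weights):
    # Top layer: combine base-learner predictions with the fitted weights.
    w1, w2 = weights
    return w1 * last_value(history) + w2 * mean_of_window(history)
```

The key structural point, matching the quoted passage, is that the meta-learner never sees raw data: it is trained only on the base learners’ predictions.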
Regarding claims 3 and 11, the combination of Divina and Kumar teaches all the claim limitations of claims 1 and 9 above, and further teaches wherein the plurality of segmented ML models are each trained based on different period-over-period changes in the entity data over time that define a trend (Divina, sections 1, 2, and 3.1 teach “We use a fixed prediction horizon of four hours, while we vary the historical window size, i.e., the amount of historical data used in order to make the predictions” of “time-series” data with presented “trends” used to train the ensemble models, including RF trees (wherein the plurality of segmented ML models are each trained)).

Regarding claims 4 and 12, the combination of Divina and Kumar teaches all the claim limitations of claims 3 and 11 above, and further teaches wherein the different period-over-period changes comprise a first period and a second period longer than the first period (Divina, sections 1, 2, and 3.1 teach “We use a fixed prediction horizon of four hours, while we vary the historical window size, i.e., the amount of historical data used in order to make the predictions” of “time-series” data, including historical windows where “w has been set to the values 24, 48, 72, 96, 120, 144 and 168”).
Regarding claims 5, 13, and 19, the combination of Divina and Kumar teaches all the claim limitations of claims 1, 9, and 18 above, and further teaches wherein the entity behavior is labeled for training the base ML model and the plurality of segmented ML models based on a first definition that specifies activity of the entity that defines the entity behavior over a first time period and a second definition that specifies activity of the entity that defines the entity behavior over a second time period greater than the first time period (Divina, sections 1, 2, and 3.1 teach “We use a fixed prediction horizon of four hours, while we vary the historical window size, i.e., the amount of historical data used in order to make the predictions” of “time-series” data to train the models, the models including supervised learning models trained over the historical windows, where “w has been set to the values 24, 48, 72, 96, 120, 144 and 168” (first/second period…greater than the first time period) of power consumption fluctuation values in Spain (activity of the entity that defines the entity behavior)).

Regarding claim 7, the combination of Divina and Kumar teaches all the claim limitations of claim 1 above, and further teaches wherein the merged model is based on a weight applied to each of: the base ML model and the plurality of segmented ML models (Divina, section 3.3 teaches the GBM being trained based on base model values).

Regarding claim 20, the combination of Divina and Kumar teaches all the claim limitations of claim 19 above, and further teaches wherein the entity behavior comprises an attrition of an activity of the entity (Divina, sections 2, 3.1, 4, and Fig. 11 teach datasets of power consumption fluctuation values in Spain that are shown to decrease (attrition of an activity of the entity)).
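The two-definition labeling recited in claims 5, 13, and 19 (and the both-definitions condition of claims 6 and 14, addressed later in this action) can be sketched in a few lines of Python. The activity values, thresholds, and window sizes below are hypothetical, chosen only to make the structure concrete:

```python
# Illustrative sketch: an entity's behavior is labeled for training only
# when both a short-period definition and a longer-period definition are
# satisfied. All numbers here are hypothetical.

def satisfies(activity, period, threshold):
    # A "definition" here: average activity over the last `period`
    # observations falls below `threshold` (e.g., suggesting attrition).
    window = activity[-period:]
    return sum(window) / len(window) < threshold

def label_for_training(activity, short_period=3, long_period=6,
                       threshold=10.0):
    first = satisfies(activity, short_period, threshold)   # first definition
    second = satisfies(activity, long_period, threshold)   # second, longer period
    if first and second:
        return "attrition"   # labeled only when both definitions are satisfied
    return None              # otherwise not labeled for training
```

Requiring both definitions filters out entities whose activity dips only briefly, which is the distinction the claims draw between the short and the longer time period.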
Regarding claim 21, the combination of Divina and Kumar teaches all the claim limitations of claim 1 above, and further teaches wherein the processor is further programmed to: execute the plurality of segmented ML models corresponding to the plurality of segments associated with the entity in parallel during a prediction cycle (Divina, sections 3.1, 3.3, and Figs. 4-5 teach a stacking ensemble including a Random Forest of decision trees, “where each tree is trained separately on a independent randomly selected training set (segmented). It follows that each tree depends on the values of an input dataset sampled independently, with the same distribution for all trees”, with the trees outputting their predictions together to create a single output); and provide outputs of the base ML model and the plurality of segmented ML models generated during the same prediction cycle to the merged model to generate the behavior classification (Divina, sections 1, 3.3, and Figs. 4-5 teach “We can see that the training set is used in order to obtain the predictions of the base level, consisting of RF, NN and EVTree (provide the base class and the plurality of segmented classes). The so obtained predictions are then used by (input) the top layer (GBM) (merged model) in order to produce the final predictions for each problem”, wherein the GBM is trained based on the received model data in parallel).

Claims 6, 8, 14, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Divina et al (“Stacking Ensemble Learning for Short-Term Electricity Consumption Forecasting”, 2018), hereinafter Divina, in view of Kumar et al (“Optimized Stacking Ensemble Learning Model for Breast Cancer Detection and Classification Using Machine Learning”, 2022), hereinafter Kumar, and further in view of Ding et al (US Pub 20210073686), hereinafter Ding.
Regarding claims 6 and 14, the combination of Divina and Kumar teaches all the claim limitations of claims 5 and 13 above; however, the combination does not explicitly teach wherein the entity behavior is labeled for training only when both the first definition and the second definition are satisfied. Ding teaches wherein the entity behavior is labeled for training only when both the first definition and the second definition are satisfied (paragraphs 0022-0023 teach grouping data based on corresponding criteria and then labeling the data before training). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify Divina’s teaching of energy-consumption stacked ensemble models trained on portions of data, as modified by Kumar’s teachings of meta-classifier inputs directly from different base learners, including decision trees, executed on GPU hardware, to include data grouping criteria and labeling before model training as taught by Ding, in order to achieve increased prediction accuracy from properly labeled data and reduced training times (Ding, paragraphs 0002 and 0022-0023).

Regarding claims 8 and 16, the combination of Divina and Kumar teaches all the claim limitations of claims 1 and 9 above; however, the combination does not explicitly teach wherein the processor is further programmed to: provide outputs of the base ML model, the plurality of segmented ML models and the merged ML model to a training subsystem to retrain one or more of the models. Ding teaches wherein the processor is further programmed to: provide outputs of the base ML model, the plurality of segmented ML models and the merged ML model to a training subsystem to retrain one or more of the models (paragraphs 0038-0046, 0055, and Figs. 3-4 teach a “processor” training the first group of models, then the second group, and then iterating back through training if deemed necessary).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify Divina’s teaching of energy consumption stacked ensemble models trained on portions of data, as modified by Kumar’s teachings of meta-classifier inputs directly from different base learners, including decision trees, executed on GPU hardware, to include data grouping criteria and labeling before model training via a processor as taught by Ding in order to achieve increased prediction accuracy from properly labeled data and reduced training times (Ding, paragraphs 0002 and 0022-0023).

Claim 21 is rejected under 35 U.S.C. 103 as being unpatentable over Divina et al (“Stacking Ensemble Learning for Short-Term Electricity Consumption Forecasting”, 2018), hereinafter Divina, in view of Kumar et al (“Optimized Stacking Ensemble Learning Model for Breast Cancer Detection and Classification Using Machine Learning”, 2022), hereinafter Kumar, further in view of Yaqub et al (“Weighted Voting in 3D Random Forest Segmentation”, 2010), hereinafter Yaqub.

Regarding claim 21, the combination of Divina and Kumar teaches all the claim limitations of claim 1 above, and further teaches wherein to generate the behavior classification as an output of the merged model, the processor is programmed to: generate a first probability as an output of the base ML model and a plurality of probabilities as outputs of corresponding ones of the plurality of segmented ML models (Divina, sections 3.1, 3.3, and Figs.
4-5 teach a learned ANN (base ML model) outputting a prediction, where “[t]here is only one node at the output layers, which provides the final results of the network, being it a class label or a numeric value”; and further “[f]or classification, each tree in the RF (segmented ML models) casts a unit vote for the most popular class at input”, each tree outputting “predictions”); multiply the first probability by a weight associated with the base ML model and multiply each probability, from among the plurality of probabilities, by a corresponding weight associated with the segmented ML models to generate a plurality of weighted probabilities (Divina, sections 3.2-3.3 and Figs. 4-5 teach multiplying an “fj” function (associated weight) with the model outputs to combine the respective predictions of the base models and output a final prediction); combine the plurality of weighted probabilities to compute an overall probability (Divina, sections 3.2-3.3 and Figs. 4-5 teach multiplying an “fj” function with the model outputs to combine the respective predictions of the base models and output a final prediction); and generate the behavior classification by comparing the overall probability to a threshold value (section 4 and Fig. 11 teach the determined ensemble output as kWh over a subset of readings that is compared to real values (threshold) to determine a model output’s performance (behavior classification)). However, while Divina teaches decision tree outputs being multiplied by a factor and the GBM tree outputs being a weighted sum, the combination does not explicitly teach probability/probabilities as claimed.
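The weighted-combination mechanism recited in claim 21 (scale each model's probability by its weight, sum into an overall probability, compare to a threshold) can be sketched in a few lines. The weights, probabilities, and threshold below are made-up illustrative values, not figures from the application, Divina, or Yaqub.

```python
# Illustrative sketch of the claimed weighted-probability classification.
def classify(base_prob, base_weight, segment_probs, segment_weights, threshold):
    # Weight the base model's probability and each segmented model's probability.
    weighted = [base_prob * base_weight]
    weighted += [p * w for p, w in zip(segment_probs, segment_weights)]
    # Combine the weighted probabilities into an overall probability and
    # threshold it to produce the behavior classification.
    overall = sum(weighted)
    return overall >= threshold

classify(0.9, 0.4, [0.7, 0.2], [0.3, 0.3], 0.5)  # True (overall is about 0.63)
```

This mirrors the examiner's mapping: Divina supplies the weighted combination of model outputs, while Yaqub is relied on for the outputs being probabilities.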
Yaqub teaches probability/probabilities (section 2.4 teaches decision trees in a random forest each outputting “probabilities” and calculating a “weighted sum of trees probabilities”). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify Divina’s teaching of energy consumption stacked ensemble models trained on portions of data, as modified by Kumar’s teachings of meta-classifier inputs directly from different base learners, including decision trees, executed on GPU hardware, to include ensemble models outputting probabilities that are combined via weighted sum as taught by Yaqub in order to increase model prediction accuracy and reduce training/testing time (Yaqub, section 2.4).

Conclusion

THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to CLINT MULLINAX, whose telephone number is 571-272-3241. The examiner can normally be reached Mon - Fri 8:00-4:30 PT. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool.
To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alexey Shmatov, can be reached at 571-270-3428. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/C.M./
Examiner, Art Unit 2123

/ALEXEY SHMATOV/
Supervisory Patent Examiner, Art Unit 2123

Prosecution Timeline

Jan 12, 2023
Application Filed
Sep 29, 2025
Non-Final Rejection — §101, §103
Jan 30, 2026
Applicant Interview (Telephonic)
Jan 30, 2026
Examiner Interview Summary
Feb 02, 2026
Response Filed
Mar 26, 2026
Final Rejection — §101, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12561620
Machine Learning-Based URL Categorization System With Noise Elimination
2y 5m to grant (Granted Feb 24, 2026)
Patent 12554962
CONFIGURABLE PROCESSOR ELEMENT ARRAYS FOR IMPLEMENTING CONVOLUTIONAL NEURAL NETWORKS
2y 5m to grant (Granted Feb 17, 2026)
Patent 12547887
SYSTEM FOR DETECTING ELECTRIC SIGNALS
2y 5m to grant (Granted Feb 10, 2026)
Patent 12518169
SYSTEMS AND METHODS FOR SAMPLE GENERATION FOR IDENTIFYING MANUFACTURING DEFECTS
2y 5m to grant (Granted Jan 06, 2026)
Patent 12493771
DEEP LEARNING MODEL FOR ENERGY FORECASTING
2y 5m to grant (Granted Dec 09, 2025)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

3-4
Expected OA Rounds
48%
Grant Probability
86%
With Interview (+38.3%)
4y 4m
Median Time to Grant
Moderate
PTA Risk
Based on 123 resolved cases by this examiner. Grant probability derived from career allow rate.
