Prosecution Insights
Last updated: April 18, 2026
Application No. 17/232,099

Scalable Modeling for Large Collections of Time Series

Final Rejection §103
Filed: Apr 15, 2021
Examiner: WU, NICHOLAS S
Art Unit: 2148
Tech Center: 2100 — Computer Architecture & Software
Assignee: International Business Machines Corporation
OA Round: 4 (Final)
Grant Probability: 47% (Moderate)
OA Rounds: 5-6
To Grant: 3y 9m
With Interview: 90%

Examiner Intelligence

Career Allow Rate: 47% (18 granted / 38 resolved; -7.6% vs TC avg)
Interview Lift: +43.1% (strong), measured across resolved cases with interview
Typical Timeline: 3y 9m avg prosecution; 44 applications currently pending
Career History: 82 total applications across all art units
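The headline percentages above follow directly from the raw counts; a quick arithmetic check (assuming the 18-of-38 figure is exact):

```python
# Reproduce the examiner's headline allow-rate figures from the raw counts.
granted, resolved = 18, 38

allow_rate = granted / resolved * 100
print(f"Career allow rate: {allow_rate:.1f}%")   # 47.4%, displayed as 47%

# The "-7.6% vs TC avg" delta implies a Tech Center average of about 55%.
tc_average = allow_rate + 7.6
print(f"Implied TC average: {tc_average:.1f}%")  # 55.0%
```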

Statute-Specific Performance

§101: 26.7% (-13.3% vs TC avg)
§103: 52.6% (+12.6% vs TC avg)
§102: 3.1% (-36.9% vs TC avg)
§112: 17.4% (-22.6% vs TC avg)

Tech Center averages are estimates • Based on career data from 38 resolved cases

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first-inventor-to-file provisions of the AIA.

Response to Arguments

Applicant's arguments filed 10/16/2025 have been fully considered but they are not fully persuasive.

Regarding the 101 rejections, applicant's arguments and amendments to the independent claims are persuasive and overcome the previous 101 rejections. Specifically, applicant's amended limitations "determining a hardware computation capability of a computing platform by evaluating a hardware capability and a present computational load of the computing platform; selecting a partition level from the different partition levels, based on the determined hardware computation capability and a modeling accuracy associated with a group focused training of forecasting models; defining one or more modeling tasks based on the selected partition level, each modeling task of the one or more modeling tasks comprising a respective group of time series of the plurality of groups of time series; and executing the group focused training of the forecasting models in parallel on the computing platform, wherein each modeling task of the one or more modeling tasks corresponds to the executing of the group focused training of a respective forecasting model of the forecasting models using all the time series in the respective group of time series" provide a technical improvement because forecasting models are executed in parallel at compatible time-series levels to improve load distribution for large amounts of time-series data.

See pp. 13-16 of "Remarks": "The claimed invention is directed to an improved system and method for scalable time series forecasting using a large volume of time series data while facilitating reduction in computational overhead, improving computational time and accuracy of model training, and enhancing scalability of the forecasting system.

For instance, the amended independent claim 1 recites 'partitioning the time series data to generate a plurality of groups of time series ... the generating of the plurality of groups of time series comprises clustering the time series data into a hierarchy of partitions of related one or more time series of the plurality of time series, the hierarchy having different partition levels ... determining a hardware computation capability of a computing platform by evaluating a hardware capability and a present computational load of the computing platform ... selecting a partition level from the different partition levels, based on the determined hardware computation capability and a modeling accuracy associated with a group focused training of forecasting models ... executing the group focused training of the forecasting models in parallel on the computing platform ... each modeling task of the one or more modeling tasks corresponds to the executing of the group focused training of a respective forecasting model of the forecasting models using all the time series in the respective group of time series.'

The Applicant's Specification describes, for example, '[a]ccordingly, traditional computing systems cannot efficiently accommodate (if at all) the training and use of models based on a large volume of time series data. Furthermore, using the entire available time series data to fit a model may involve an overall large and complex model, further exacerbating scalability ... computing platforms may not have sufficient computational resources to perform the calculations and/or it may take too long to receive forecast results ... [t]he industry struggles to scale forecasting to the large number of time series that may be available and typically sacrifices accuracy of forecasting models ... the teachings herein make the forecasting for large numbers of time series and large data ... both scalable and effective (i.e., computationally feasible on a given computing platform ... improving accuracy thereof by automatically determining an appropriate partition level of time series to perform cross-series modeling in parallel, where each partition forms a forecasting task that can be run in parallel ... [b]y virtue of distributing the computational load represented by the groups of time series data, the processing time is reduced while the accuracy is also potentially improved by enabling focused models per group ... efficiency engine 103 is configured to automatically partition time series data to create tasks to run in parallel on one or more computing devices. In one aspect, modeling across multiple series (e.g., vs. training a single model per series) provides an improvement both in terms of scalability and performance for machine-learning (ML) and deep learning (DL) based modelling ... [i]n one embodiment, the efficiency engine 103 performs a test by performing partial modeling for a subset (e.g., one or more groups) from each level from the set of candidate levels (in parallel) to test accuracy and computation time for each. In this way, the computational capability is determined. Upon determining the computational capability of the computing device performing the processing of the time series data, a partitioning level is selected that can accommodate the processing of the time series data in a predetermined time period and a predetermined threshold accuracy ... that a level 2 partition (which includes group 1 (e.g., 215) and group 2 (e.g., 209) as 2 different groups in the partition to be modeled separately and simultaneously) provides the better accuracy and efficiency - and this can be based on simply testing a subset of groups at the level (e.g., level 2 being tested) initially - such as group 1 (e.g., 215) only, for a subset of modeling configurations, and comparing to similar tests at other levels.' See ¶¶ [0004], [0026], [0027], [0030], [0035], [0036], and [0045] of the Specification as originally filed (emphasis added).
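The partial-modeling test described in the specification paragraphs quoted above (trial runs on a sample of groups at each candidate level, keeping a level that meets both a time budget and an accuracy floor) can be sketched as a selection loop. Everything here (function names, thresholds, the `trial` callback) is a hypothetical illustration, not the applicant's implementation:

```python
import time

def select_partition_level(levels, trial, max_seconds, min_accuracy):
    """Pick a partition level by partial modeling.

    `levels` maps a level id to its list of time-series groups; `trial`
    stands in for fitting a model on a small sample of groups and
    returning its accuracy.  Among levels that satisfy both the time
    budget and the accuracy floor, the most accurate one wins.
    """
    best = None
    for level, groups in levels.items():
        sample = groups[:1]                    # test only a subset of groups
        start = time.perf_counter()
        accuracy = trial(sample)               # partial modeling run
        elapsed = time.perf_counter() - start
        if elapsed <= max_seconds and accuracy >= min_accuracy:
            if best is None or accuracy > best[1]:
                best = (level, accuracy)
    return best[0] if best else None
```

If no level satisfies both budgets, the sketch returns `None`; a real system would presumably fall back to the coarsest partition.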
The claimed method addresses the problems associated with forecasting large volumes of time series data, particularly the inefficiencies and inaccuracies present in traditional modeling methods that cannot scale effectively, by utilizing a hierarchical partitioning mechanism that dynamically organizes time series data into related groups based on computational capabilities, allowing for parallel execution of modeling tasks, thereby efficiently managing the scaling of time series data. Further, the claimed invention addresses the problems associated with computationally intensive training of time series data by determining the computational feasibility of each partition level based on historical performance data and screening out the partition levels that are likely to exceed resource limits or lead to inefficiencies. Furthermore, the group-focused training of forecasting models by organizing multiple time series data into related groups allows the forecasting models to learn from shared characteristics and group-specific patterns within these groups, resulting in better generalization of forecasting models across similar datasets, thereby enhancing the accuracy of forecasting models." (applicant emphasis added).

Applicant's amendments and corresponding arguments that the claimed invention provides a technical improvement to the field of time-series forecasting are persuasive. Therefore, the 101 rejections are withdrawn.

Regarding the 103 rejections, applicant's arguments filed with respect to the prior art rejections have been fully considered, but they are moot. Applicant has amended the claims to recite new combinations of limitations, and applicant's arguments are directed at the amendment. Please see below for the new grounds of rejection, necessitated by amendment.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 6-12, and 15-20 are rejected under 35 U.S.C. 103 as being unpatentable over Leonard et al., US Pre-Grant Publication 2015/0052173 A1 ("Leonard"), in view of Panda, US Pre-Grant Publication 2019/0050763 A1 ("Panda"), and further in view of Amiri et al., US Pre-Grant Publication 2021/0303969 A1 ("Amiri").

Regarding claim 1 and analogous claims 11 and 20, Leonard discloses:

A computing device, comprising: a processor; a network interface coupled to the processor to enable communication over a network; a storage device coupled to the processor; and an engine stored in the storage device, wherein an execution of the engine by the processor configures the computing device to perform acts comprising: (Leonard, ¶25, "FIG. 1 illustrates a computing device 100 configured to use the techniques described in this disclosure to efficiently assemble and store time series in a manner prescribed by a hierarchical schema. The computing device 100 is configured to operate within a grid-computing system that includes multiple computing devices configured similarly or identically to the computing device 100 shown in FIG. 1 [a network interface coupled to the processor to enable communication over a network;]."; Leonard, ¶27, "As depicted in FIG. 1, a computing device 100 includes a processor 102 [A computing device, comprising: a processor;], random access memory 104, and memory 106 [a storage device coupled to the processor;]. Memory 106 may be used to store software 108 executed by the processor 102 [and an engine stored in the storage device,]."; and Leonard, ¶28, "The software 108 can be analytical, statistical, scientific, or business analysis software, or any other software with functionality for assembling time series from unstructured data entries and storing the time series in RAM 104 as prescribed by a hierarchical schema. The software 108 may also provide functionality for performing repeated time series forecasting based on any of the time series stored in RAM 104. When executed, the software 108 causes the processor 102 to access a hierarchy schema 115 [wherein an execution of the engine by the processor configures the computing device to perform acts comprising:].").

receiving time series data comprising a plurality of time series; (Leonard, ¶21, "This disclosure describes a grid-computing system for time series data warehousing [receiving time series data comprising a plurality of time series;], forecasting and forecast analysis that includes multiple grid-computing devices.").

partitioning the time series data to generate a plurality of groups of time series, wherein each group of time series of the plurality of groups of time series comprises respective one or more time series of the plurality of time series, the generating of the plurality of groups of time series comprises clustering the time series data into a hierarchy of partitions of related one or more time series of the plurality of time series, the hierarchy having different partition levels, and the related one or more time series correspond to the respective one or more time series of the plurality of time series; (Leonard, ¶42, "In FIG. 3, the depicted schema defines four hierarchical levels, each of which is associated with the storage of time series characterized by a level of granularity or specificity particular to the level [partitioning the time series data to generate a plurality of groups of time series, wherein each group of time series of the plurality of groups of time series comprises respective one or more time series of the plurality of time series,]. The four levels of the hierarchy are represented by the boxes 202, 204, 206 and 208 [the generating of the plurality of groups of time series comprises clustering the time series data into a hierarchy of partitions of related one or more time series of the plurality of time series, the hierarchy having different partition levels,]. The schema calls for eight time series (224-238) at the lowest level (leaf level) of the hierarchy to be assembled such that each of these time series will provide information that is more specific than all other time series in the hierarchy. The lowest level of the hierarchy is represented by the box at 208, which describes the context of the time series 224-238 associated with that level [and the related one or more time series correspond to the respective one or more time series of the plurality of time series;].").

determining a hardware computation capability of a computing platform by evaluating a hardware capability and a present computational load of the computing platform; (Leonard, ¶23, "the schema specifies parent-child relationships between related time series at adjacent hierarchy levels. Thus, the schema itself may be conceptualized as a tree-structured framework that establishes processing assignments, data relationships and storage locations. The grid-computing devices use the schema as a guide for assembling time series and sharing time series information with other grid-computing devices in the grid-computing system"; the schema is interpreted as determining the hardware computational capability of the computing platform, as it guides the processing assignments/selections for the different computing nodes (i.e., determining a hardware computation capability of a computing platform by evaluating a hardware capability); and Leonard, ¶41, "hierarchical schema can be the blueprint for assembling, storing and using a time series data hierarchy in a distributed computing system that incorporates load-sharing to parallelize some of the processing involved in generating a time series data hierarchy [and a present computational load of the computing platform;].").

selecting a partition level from the different partition levels, based on the determined hardware computation capability… (Leonard, ¶23, "the schema specifies parent-child relationships between related time series at adjacent hierarchy levels. Thus, the schema itself may be conceptualized as a tree-structured framework that establishes processing assignments, data relationships and storage locations. The grid-computing devices use the schema as a guide for assembling time series and sharing time series information with other grid-computing devices in the grid-computing system"; the schema is interpreted as selecting a partition level based on a hardware capability because the schema assigns nodes to different hierarchy levels based on processing assignments (i.e., selecting a partition level from the different partition levels, based on the determined hardware computation capability…).").
defining one or more modeling tasks based on the selected partition level, each modeling task of the one or more modeling tasks comprising a respective group of time series of the plurality of groups of time series; (Leonard, ¶38, "Each grid-computing device 100G then uses the hierarchical schema to guide operations that involve assembling a subset of the leaf-level time series specified by the schema, with the assembling being based on the information in its delimited portion of the data set [defining one or more modeling tasks based on the selected partition level,]."; and Leonard, ¶22, "any grid-computing device in the grid-computing system may be used to forecast future observations of any individual time series that it stores [each modeling task of the one or more modeling tasks comprising a respective group of time series of the plurality of groups of time series;]. As a result of the distributed storage framework and because the time series data are stored in volatile memory locations, such as RAM, the data can be quickly accessed and processed, thereby decreasing time delays entailed by generating numerous forecasts.").

Leonard does not explicitly teach: …and a modeling accuracy associated with a group focused training of forecasting models; and executing the group focused training of the forecasting models in parallel on the computing platform, wherein each modeling task of the one or more modeling tasks corresponds to the executing of the group focused training of a respective forecasting model of the forecasting models using all the time series in the respective group of time series.

Panda teaches and a modeling accuracy associated with a group focused training of forecasting models; (Panda, ¶29, "At step 410, the one or more processors 102 in conjunction with the model fitting module 108 are configured to determine the best fit models for the plurality of time series placed at successive higher cluster heights of the branch [and a modeling accuracy associated with a group focused training of forecasting models;]. This is performed by iterating steps of determining of the best fit model for the first time series and the second time series at every cluster height of the branch.").

Leonard and Panda are both in the same field of endeavor (i.e., time-series forecasting). It would have been obvious for a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Leonard and Panda to teach the above limitation(s). The motivation for doing so is that finding the best fit model for different time-series levels improves the robustness of forecasting accuracy across different time-series data (cf. Panda, ¶4, "a good balance needs to be sought between identifying best fit model for each time series and identifying common best fit models for plurality of series so as to achieve good time efficiency during model fitting along with good forecast accuracy.").

Leonard in view of Panda does not explicitly teach: and executing the group focused training of the forecasting models in parallel on the computing platform, wherein each modeling task of the one or more modeling tasks corresponds to the executing of the group focused training of a respective forecasting model of the forecasting models using all the time series in the respective group of time series.
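The parallel, group-focused execution recited in the limitation just quoted (one forecasting model per group, trained on all series in that group, with the per-group tasks run concurrently) can be sketched with the standard library. The model itself is a placeholder; a real system would fit a pooled statistical or deep-learning model per group:

```python
from concurrent.futures import ThreadPoolExecutor

def train_group_model(group):
    """Placeholder for group-focused training: one forecasting model is
    fit using *all* time series in the group, not one model per series."""
    return {"series": len(group), "model": f"pooled-model-over-{len(group)}"}

def train_all_groups(groups, workers=4):
    # One modeling task per group at the selected partition level; tasks
    # run in parallel (processes would suit CPU-bound training better).
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(train_group_model, groups))
```

`Executor.map` preserves input order, so the results line up with the groups of the chosen partition level.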
Amiri teaches and executing the group focused training of the forecasting models in parallel on the computing platform, wherein each modeling task of the one or more modeling tasks corresponds to the executing of the group focused training of a respective forecasting model of the forecasting models using all the time series in the respective group of time series. (Amiri, ¶91, "The output-mixer architecture also has multiple parallel DNN forecasters 76, 86, allowing for parallel implementation during training and prediction [and executing the group focused training of the forecasting models in parallel on the computing platform,]. Also, the output-mixer architecture has more degrees of freedom with which to increase the capacity. For instance, the capacity of each DNN forecaster 76, 86 can be tailored to the forecast of its input time-series"; tailoring the capacity for its respective time-series data is interpreted as using all of the time-series data (i.e., wherein each modeling task of the one or more modeling tasks corresponds to the executing of the group focused training of a respective forecasting model of the forecasting models using all the time series in the respective group of time series); "Also, the capacity of the whole network can be easily made K higher than the input-mixer architecture as it has K DNN forecasters 76, 86, as compared to L DNN forecasters 58 in the embodiment of the input-mixer architecture of FIG. 2B.").

Leonard in view of Panda, and Amiri, are in the same field of endeavor (i.e., time-series forecasting). It would have been obvious for a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Leonard in view of Panda with Amiri to teach the above limitation(s). The motivation for doing so is that using multiple parallel forecasters improves prediction accuracy (cf. Amiri, ¶27, "However, the capacity of the DNN of the present disclosure can be increased by enabling it to devise a 'separate' time waveform for each forecasted data point instead of providing a 'common' time waveform for all the data points as is done in conventional systems. The forecasting models of the present disclosure are able to improve prediction performance, even on a dataset having complicated or only partially available periodic patterns.").

Regarding claim 2 and analogous claim 12, Leonard in view of Panda and Amiri teaches the computing device of claim 1. Leonard further teaches wherein each partition level of the different partition levels includes respective one or more groups of time series of the plurality of groups of time series. (Leonard, ¶42, "In FIG. 3, the depicted schema defines four hierarchical levels, each of which is associated with the storage of time series characterized by a level of granularity or specificity particular to the level [includes respective one or more groups of time series of the plurality of groups of time series.]. The four levels of the hierarchy are represented by the boxes 202, 204, 206 and 208 [wherein each partition level of the different partition levels].").

Regarding claim 6 and analogous claim 15, Leonard in view of Panda and Amiri teaches the computing device of claim 1. Panda further teaches wherein the selection of the partitioning level is based on a highest time efficiency, among a plurality of time efficiencies associated with the group focused training of the forecasting models, for a predetermined accuracy associated with the group focused training of the forecasting models. (Panda, ¶¶43-49, "Overall time taken for model fitting in traditional way is as below: The total time to fit the model to all time series=No of TS * average time to fit one model*No of models=100*0.8*5 =400 min [0045] Overall time taken for model fitting in our approach For each cluster, x % of TSs require model fitting. For one cluster, for first TS (TS at lowest level or lowest cluster height) all models are evaluated. Time taken=0.8*5=4 mins As per assumption, 40% of the TSs require model evaluation as the error may be above the ET threshold and the ED is above ED threshold"; model fitting is interpreted as the selection of the partition level because the model's error dictates whether the model, at a current level, is the best fit for the time series data. The best fit model is interpreted as the highest time efficiency at a predetermined accuracy because it is the model selected that satisfies the error threshold requirements with the current time series data (i.e., wherein the selection of the partitioning level is based on a highest time efficiency, among a plurality of time efficiencies associated with the group focused training of the forecasting models, for a predetermined accuracy associated with the group focused training of the forecasting models.). "So time required for model fitting=0.4*20*4=32 min and the evaluated models are fitted to the rest of the TS. Thus for 5 clusters or 5 branches total time required=5*32=160 mins Saving in the time compared to traditional approach=(400−160)/400=60%"). It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Panda with the teachings of Leonard and Amiri for the same reasons disclosed in claim 1.

Regarding claim 7 and analogous claim 16, Leonard in view of Panda and Amiri teaches the computing device of claim 1. Panda further teaches wherein the selection of the partitioning level is based on a highest accuracy, among a plurality of accuracies associated with the group focused training of the forecasting models, for a predetermined time efficiency associated with the group focused training of the forecasting models. (Panda, ¶¶43-49, "Overall time taken for model fitting in traditional way is as below: The total time to fit the model to all time series=No of TS * average time to fit one model*No of models=100*0.8*5 =400 min Overall time taken for model fitting in our approach For each cluster, x % of TSs require model fitting. For one cluster, for first TS (TS at lowest level or lowest cluster height) all models are evaluated. Time taken=0.8*5=4 mins As per assumption, 40% of the TSs require model evaluation as the error may be above the ET threshold and the ED is above ED threshold"; model fitting is interpreted as the selection of the partition level because the model's error dictates whether the model, at a current level, is the best fit for the time series data (i.e., wherein the selection of the partitioning level is based on a highest accuracy, among a plurality of accuracies associated with the group focused training of the forecasting models,). "So time required for model fitting=0.4*20*4=32 min [for a predetermined time efficiency associated with the group focused training of the forecasting models.] and the evaluated models are fitted to the rest of the TS. Thus for 5 clusters or 5 branches total time required=5*32=160 mins Saving in the time compared to traditional approach=(400−160)/400=60%"). It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Panda with the teachings of Leonard and Amiri for the same reasons disclosed in claim 1.

Regarding claim 8 and analogous claim 17, Leonard in view of Panda and Amiri teaches the computing device of claim 1. Amiri further teaches wherein, for each modeling task of the one or more modeling tasks, a cross-time-series modeling is performed, at the selected partition level, in parallel. (Amiri, ¶¶90-91, "The embodiments of the DNN routines 70, 80 of FIGS. 3-4 are DNN multi-variate forecasters"; multi-variate forecasters are interpreted as cross-time-series modeling (i.e., wherein, for each modeling task of the one or more modeling tasks, a cross-time-series modeling is performed,) "with an output mixer (e.g., DNN mixer 72, 82). The DNN routines 70, 80 with the output-mixer architecture may have some advantages over the input-mixer architecture of the embodiments of FIGS. 2A-2B. For example, the entirety of historical time-series information is included in the time-series forecast, which may result in better performance of the output-mixer architecture, especially if the input time-series are only weakly correlated. [0091] The output-mixer architecture also has multiple parallel DNN forecasters 76, 86, allowing for parallel implementation during training and prediction [is at the selected partition level, in parallel.]. Also, the output-mixer architecture has more degrees of freedom with which to increase the capacity. For instance, the capacity of each DNN forecaster 76, 86 can be tailored to the forecast of its input time-series. Also, the capacity of the whole network can be easily made K higher than the input-mixer architecture as it has K DNN forecasters 76, 86, as compared to L DNN forecasters 58 in the embodiment of the input-mixer architecture of FIG. 2B."). It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Amiri with the teachings of Leonard and Panda for the same reasons disclosed in claim 1.

Regarding claim 9 and analogous claim 18, Leonard in view of Panda and Amiri teaches the computing device of claim 1. Leonard further teaches wherein the clustering of the time series data is performed by a domain-based and/or a semantic model-based clustering. (Leonard, ¶66, "The grid-computing system described herein can partition a multi-dimensional data set using a technique that will be described as group-by partitioning. Group-by partitioning involves performing preliminary sorting to identify group-by subsets of the data set. A group-by subset can refer to, for example, a group of multi-dimensional entries in which the entries hold the same data with respect to a first variable dimension, as well as the same data with respect to a second variable dimension [wherein the clustering of the time series data is performed by a domain-based and/or a semantic model-based clustering.].").

Regarding claim 10 and analogous claim 19, Leonard in view of Panda and Amiri teaches the computing device of claim 1. Leonard further teaches wherein: the computing platform comprises a plurality of computing nodes, and the determination of the hardware computation capability of the computing platform is performed separately for each computing node of the plurality of computing nodes. (Leonard, ¶23, "the schema specifies parent-child relationships between related time series at adjacent hierarchy levels. Thus, the schema itself may be conceptualized as a tree-structured framework that establishes processing assignments, data relationships and storage locations. The grid-computing devices [wherein: the computing platform comprises a plurality of computing nodes,] use the schema as a guide for assembling time series and sharing time series information with other grid-computing devices in the grid-computing system [and the determination of the hardware computation capability of the computing platform]."; and Leonard, ¶62, "the data can be prepared by being partitioned such that each grid-computing device stores and then works on an exclusive portion of the data that need not be stored or processed by any other device in the system [is performed separately for each computing node of the plurality of computing nodes.].").
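The time-saving arithmetic in the Panda passages quoted for claims 6 and 7 above reproduces exactly:

```python
# Traditional approach: fit every candidate model to every series.
n_series, minutes_per_fit, n_models = 100, 0.8, 5
traditional = n_series * minutes_per_fit * n_models   # 400 min

# Panda's clustered approach: evaluate all models only for the first
# series of each cluster, then refit for the ~40% of series whose error
# exceeds the threshold (20 series per cluster, 5 clusters).
first_series = minutes_per_fit * n_models             # 4 min
per_cluster = 0.4 * 20 * first_series                 # 32 min
clustered = 5 * per_cluster                           # 160 min

saving = (traditional - clustered) / traditional
print(f"{saving:.0%}")  # prints 60%
```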
Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Leonard, et al., US Pre-Grant Publication 2015/0052173A1 (“Leonard”) in view of Panda, US Pre-Grant Publication 2019/0050763A1 (“Panda”) and further in view of Amiri, et al., US Pre-Grant Publication 2021/0303969A1 (“Amiri”) and Rath, et al., US Pre-Grant Publication 2020/0004449A1 (“Rath”). Regarding claim 3, Leonard in view of Panda and Amiri teaches the computing device of claim 2. However, the combination does not explicitly teach wherein each group of time series of the respective one or more groups of time series includes a same number of time series of the plurality of time series. Rath further teaches wherein each group of time series of the respective one or more groups of time series includes a same number of time series of the plurality of time series. (Rath, see Figure 3, As seen in Figure 3, each level has at least one partition of the time series with a partition having at least one time series data point which is interpreted as wherein each group of time series having a same number of time series (i.e. wherein each group of time series of the respective one or more groups of time series includes a same number of time series of the plurality of time series.)). Leonard, in view of Panda and Amiri, and Rath are both in the same field of endeavor (i.e. time-series clustering). It would have been obvious for a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Leonard, in view of Panda and Amiri, and Rath to teach the above limitation(s). The motivation for doing is that clustering groups of data to have similar amounts of data improves querying filtering (cf. 
Rath, ⁋14, “Among other benefits, the clustered storage of the data at the physical storage resources can reduce an amount of data that needs to be filtered by many types of queries, thereby improving the performance of any applications or processes that rely on querying the data.”).

Claims 4 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Leonard, et al., US Pre-Grant Publication 2015/0052173A1 (“Leonard”) in view of Panda, US Pre-Grant Publication 2019/0050763A1 (“Panda”) and further in view of Amiri, et al., US Pre-Grant Publication 2021/0303969A1 (“Amiri”) and McGrath, et al., US Pre-Grant Publication 2020/0296155A1 (“McGrath”).

Regarding claim 4 and analogous claim 13, Leonard in view of Panda and Amiri teaches the computing device of claim 1. However, the combination does not explicitly teach wherein the determination of the hardware computation capability comprises receiving the hardware computation capability from a reference database. McGrath teaches wherein the determination of the hardware computation capability comprises receiving the hardware computation capability from a reference database. (McGrath, ⁋228, “Referring still to FIG. 20, the identification at 2018 may happen through use by the orchestrator of resource landscape data from resource landscape database 2020 [wherein the determination of the hardware computation capability comprises receiving the hardware computation capability from a reference database.] regarding edge nodes within the edge computing system of the orchestrator. The resource landscape data at 2020 is to include at least information regarding relevant compute resources of edge nodes that would allow identification of the edge nodes as compliant with workload compute resource requirements specified in a VIFD.”). Leonard, in view of Panda and Amiri, and McGrath are in the same field of endeavor (i.e., distributed computing).
It would have been obvious for a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Leonard, in view of Panda and Amiri, and McGrath to teach the above limitation(s). The motivation for doing so is that accessing a database of node capabilities improves the selection of the applicable node for the corresponding workload (cf. McGrath, ⁋227, “Based on the workload compute resource requirements determined from parsing relevant portions of the VIFD 2012, the orchestrator may then identify at 2018 a set of candidate target edge nodes in the available infrastructure of the edge computing system that are compliant with the workload compute resource requirements and networking requirements specified in within VIFD 2012.”).

Claims 5 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Leonard, et al., US Pre-Grant Publication 2015/0052173A1 (“Leonard”) in view of Panda, US Pre-Grant Publication 2019/0050763A1 (“Panda”) and further in view of Amiri, et al., US Pre-Grant Publication 2021/0303969A1 (“Amiri”) and Ciarlini, et al., US Patent 10,339,235 B1 (“Ciarlini”).

Regarding claim 5 and analogous claim 14, Leonard in view of Panda and Amiri teaches the computing device of claim 1. However, the combination does not explicitly teach wherein the determination of the hardware computation capability comprises performing an initial approximation by performing partial modeling of the plurality of groups of time series at the different partition levels on the computing platform. Ciarlini teaches wherein the determination of the hardware computation capability comprises performing an initial approximation by performing partial modeling of the plurality of groups of time series at the different partition levels on the computing platform. (Ciarlini, col.
6-7, “When better performance is desired, more working compute nodes can be used [wherein the determination of the hardware computation capability] to execute the first learning stage 220. In this way, groups 220 can be smaller but redundancy of the inclusion of time series 210 in different groups 220 can improve the accuracy [of the plurality of groups of time series]. If it is necessary to improve performance even further, due to the number of time series 210, the procedure can be generalized by creating multiple hierarchical learning stages [at the different partition levels on the computing platform.], as would be apparent to a person of ordinary skill in the art. In this case, there is a plurality of hierarchical learning levels to generate the final model and in each intermediate level of the hierarchy, intermediate compute nodes execute both the roles of master compute node for compute nodes of the lower hierarchical level and working compute nodes for the compute nodes of the upper hierarchical level. Intermediate compute nodes receive selected variables and scores from lower-level compute nodes and perform the following steps: rank the variables; select a pre-defined number of variables based on their scores to be considered as input for the generation of an intermediate linear model using an Orthogonal Matching Pursuit algorithm; assign a score to each variable of the intermediate model; and provide such variables and their corresponding scores to the upper level in the hierarchy [comprises performing an initial approximation by performing partial modeling].”).

Leonard, in view of Panda and Amiri, and Ciarlini are in the same field of endeavor (i.e., time-series processing). It would have been obvious for a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Leonard, in view of Panda and Amiri, and Ciarlini to teach the above limitation(s).
The motivation for doing so is that partial modeling at different levels improves the overall system performance as the amount of time-series data scales (cf. Ciarlini, col. 6 lines 52-59, “When better performance is desired, more working compute nodes can be used to execute the first learning stage 220. In this way, groups 220 can be smaller but redundancy of the inclusion of time series 210 in different groups 220 can improve the accuracy. If it is necessary to improve performance even further, due to the number of time series 210, the procedure can be generalized by creating multiple hierarchical learning stages”).

Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to NICHOLAS S WU whose telephone number is (571)270-0939. The examiner can normally be reached Monday - Friday 8:00 am - 4:00 pm EST. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool.
To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michelle Bechtold, can be reached at 571-431-0762. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/N.S.W./Examiner, Art Unit 2148
/MICHELLE T BECHTOLD/Supervisory Patent Examiner, Art Unit 2148
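Editorial note on the reply-period arithmetic above (an illustration only, not legal advice): with a final action mailed January 13, 2026, the three-month shortened period and the six-month statutory maximum fall roughly as sketched below. The month-addition helper is a simplification; actual USPTO deadlines also roll dates that land on weekends or federal holidays to the next business day.

```python
import datetime

def add_months(d, months):
    # Simplified calendar-month arithmetic; ignores weekend/holiday
    # rollover and end-of-month edge cases.
    month = d.month - 1 + months
    year = d.year + month // 12
    month = month % 12 + 1
    return datetime.date(year, month, min(d.day, 28))

mailed = datetime.date(2026, 1, 13)   # Final Rejection mailing date
shortened = add_months(mailed, 3)     # shortened statutory period
absolute = add_months(mailed, 6)      # statutory maximum for reply
print(shortened, absolute)  # → 2026-04-13 2026-07-13
```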

Prosecution Timeline

Apr 15, 2021
Application Filed
May 20, 2024
Non-Final Rejection — §103
Sep 03, 2024
Response Filed
Nov 01, 2024
Final Rejection — §103
Dec 31, 2024
Response after Non-Final Action
Feb 05, 2025
Request for Continued Examination
Feb 09, 2025
Response after Non-Final Action
Jul 08, 2025
Non-Final Rejection — §103
Oct 16, 2025
Response Filed
Jan 13, 2026
Final Rejection — §103
Mar 18, 2026
Interview Requested
Mar 24, 2026
Examiner Interview Summary
Mar 26, 2026
Response after Non-Final Action

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12488244
APPARATUS AND METHOD FOR DATA GENERATION FOR USER ENGAGEMENT
2y 5m to grant Granted Dec 02, 2025
Patent 12423576
METHOD AND APPARATUS FOR UPDATING PARAMETER OF MULTI-TASK MODEL, AND STORAGE MEDIUM
2y 5m to grant Granted Sep 23, 2025
Patent 12361280
METHOD AND DEVICE FOR TRAINING A MACHINE LEARNING ROUTINE FOR CONTROLLING A TECHNICAL SYSTEM
2y 5m to grant Granted Jul 15, 2025
Patent 12354017
ALIGNING KNOWLEDGE GRAPHS USING SUBGRAPH TYPING
2y 5m to grant Granted Jul 08, 2025
Patent 12333425
HYBRID GRAPH NEURAL NETWORK
2y 5m to grant Granted Jun 17, 2025
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

5-6
Expected OA Rounds
47%
Grant Probability
90%
With Interview (+43.1%)
3y 9m
Median Time to Grant
High
PTA Risk
Based on 38 resolved cases by this examiner. Grant probability derived from career allow rate.
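The headline figures above appear to follow simple ratios from the examiner's career data. A minimal sketch, assuming the with-interview figure is the career allow rate plus the stated interview lift (an assumption about this dashboard's method, not a documented formula):

```python
granted, resolved = 18, 38            # examiner's career grants / resolved cases
base = granted / resolved             # career allow rate
print(round(base * 100))              # → 47

interview_lift = 0.431                # +43.1% lift shown above
with_interview = base + interview_lift
print(round(with_interview * 100))    # → 90
```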
