Prosecution Insights
Last updated: April 19, 2026
Application No. 17/850,597

SYSTEM AND METHOD FOR REDUCTION OF DATA TRANSMISSION BY OPTIMIZATION OF INFERENCE ACCURACY THRESHOLDS

Status: Final Rejection (§103)
Filed: Jun 27, 2022
Examiner: HU, SELINA ELISA
Art Unit: 2193
Tech Center: 2100 — Computer Architecture & Software
Assignee: DELL PRODUCTS, L.P.
OA Round: 2 (Final)
Grant Probability: 67% (Favorable)
OA Rounds: 3-4
To Grant: 3y 3m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 67% (2 granted / 3 resolved; +11.7% vs TC avg), above average
Interview Lift: +100.0% (resolved cases with interview)
Avg Prosecution: 3y 3m (typical timeline)
Total Applications: 35 across all art units (32 currently pending)

Statute-Specific Performance

§101: 24.4% (-15.6% vs TC avg)
§103: 53.5% (+13.5% vs TC avg)
§102: 12.0% (-28.0% vs TC avg)
§112: 10.1% (-29.9% vs TC avg)

Black line = Tech Center average estimate. Based on career data from 3 resolved cases.

Office Action (§103)
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first-inventor-to-file provisions of the AIA. This office action is in response to applicant’s amendment filed on 02/03/2026. Claims 1-20 are pending and examined.

Response to Arguments

Applicant's arguments filed 02/03/2026 with respect to 35 U.S.C. 103 have been fully considered, but they are not persuasive. Applicant argued that “neither Elkabetz nor Lee teaches or suggests any involvement of "simulation of an operation with different levels of error" by the downstream consumer in the obtaining of acceptable error level. Both Elkabetz and Lee are silent about a) "providing the plurality of synthetic data sets to a downstream consumer," b) "simulation of an operation with different levels of error," and c) "receiving, from the downstream consumer, a message that summarizes the operation." Therefore, Elkabetz and Lee fail to teach or suggest the limitations at issue.” Examiner respectfully disagrees; see the 35 U.S.C. 103 rejections below for a detailed analysis. Examiner interprets Elkabetz’s collected and processed data, which can be generated, being stored and used by cadence and tile layer structures of forecasting components as providing the plurality of synthetic data sets to a downstream consumer of the aggregated data. Additionally, Elkabetz’s forecasting components using comparison tests for forecast data in each tile being based on factors which include the confidence in a particular data source correlates to providing the plurality of synthetic data sets to a downstream consumer of the aggregated data for simulation of an operation with different levels of error. 
As discussed in the previous office action, Elkabetz’s desired or required accuracy of forecasting results and confidence in a data source from which the collected data tile layer originated correlates to obtaining an acceptable error level for a downstream consumer of the aggregated data based on the data set. While Elkabetz does not explicitly teach that the acceptable error level is based on the message as recited in the amended claims, data sets are a popular component of messages as evidenced by Lee’s end device requesting the server for an inference and providing collected status and sensing data to the server. Lastly, Lee’s server setting priorities of an inference request received from an end device based on the received status information, which includes the purpose of the device and kind of data, is interpreted as receiving, from the downstream consumer, a message that summarizes the operation. Therefore, it would have been obvious to one of ordinary skill in the art to which said subject matter pertains before the effective filing date of the claimed invention to combine Elkabetz with Lee because new machine learning models can be created by training modules to be specifically adapted to the current status of the end device based on received status information and training datasets. These created machine learning models can be renewed at caching modules prior to being sent to the end devices. Additionally, priorities of inference requests can be determined based on received status information to determine whether a request requires urgent processing of data.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 
103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1-7 and 12-20 are rejected under 35 U.S.C. 103 as being unpatentable over Elkabetz et al. (International Publication No. WO 2019/126707 A1), hereinafter “Elkabetz,” in view of Lee et al. (U.S. Patent Application Publication No. 2020/0219015 A1), hereinafter “Lee,” and Sankar et al. (U.S. Patent Application Publication No. 2012/0185441 A1), hereinafter “Sankar.”

With regards to Claim 1, Elkabetz teaches: A method for aggregating data in a data aggregator of a distributed environment using data collected by a data collector of the distributed environment, the data collector being remote to the data aggregator (Paragraphs 174 and 276, “According to the described technology, each exemplary server may be implemented as an individual computer system, a collection of computer systems, a collection of processors, or the like, either tightly or loosely clustered, a set of interacting computer systems that are not clustered, or other arrangement as deemed appropriate by those with skill in the art. Computer systems can be implemented using virtual or physical deployments, or by using a combination of these means. 
In some implementations, the servers may be physically located together, or they may be distributed in remote locations, such as in shared hosting facilities or in virtualized facilities (e.g.“the cloud”)… Referring again to Figure 2, the first server (including a data processing component) is an information collection and normalization component (310) that interacts with external data sources (340-348) and collects relevant information from these data sources for use by the precipitation modeling and forecasting system (300). The Information Collection and Normalization Server (310) provides asynchronous data collection processes involving communicating with the data sources, pre-processing the collected data, and making the collected and processed data available for the forecasting and modelling components of the system.” The computer systems or servers being distributed in remote locations correlates to a distributed environment. The first server including an information collection and normalization component collecting relevant information from external data sources correlates to a data aggregator of a distributed environment aggregating data. 
The data sources being external from the first server correlates to the data collected by a data collector being remote to the data aggregator), the method comprising: obtaining, by the data aggregator, a plurality of synthetic data sets (Paragraphs 187 and 276, “The data that is generated and stored includes collected data, data derived or calculated from the collected data, forecast data generated during the forecast cycles of the cadence instance, and information that is further generated or derived from the forecast data… Referring again to Figure 2, the first server (including a data processing component) is an information collection and normalization component (310) that interacts with external data sources (340-348) and collects relevant information from these data sources for use by the precipitation modeling and forecasting system (300). The Information Collection and Normalization Server (310) provides asynchronous data collection processes involving communicating with the data sources, pre-processing the collected data, and making the collected and processed data available for the forecasting and modelling components of the system. In some embodiments, additional processing to produce generated data from the collected and processed data is performed on the server.” The first server interacting with external data sources and collecting relevant information from the data sources correlates to a data aggregator obtaining a plurality of data sets. 
The additional processing done on collected and processed data to produce, derive, or calculate generated data correlates to a plurality of synthetic data sets); providing the plurality of synthetic data sets to a downstream consumer of the aggregated data for simulation of an operation with different levels of error (Paragraphs 187, 279, and 472, “The data that is generated and stored includes collected data, data derived or calculated from the collected data, forecast data generated during the forecast cycles of the cadence instance, and information that is further generated or derived from the forecast data… The information collection and normalization server stores the collected data and processed collected data into one or more databases tagged in ways that automatically associate the stored data with the cadence and tile layer structures used by the forecasting components… If the source forecast data passes the comparison testing based one or more criteria, for example if source forecast tile layer differs from the collected data tile layer by less than a threshold amount, then the source forecast tile layer is verified and may be copied directly to the target forecast tile layer. Thresholds used in comparison tests are based on one or more factors including the desired or required accuracy of forecasting results and confidence in a data source. For example, a ground station precipitation measurement may have higher accuracy or fidelity than a radar-based precipitation estimate; therefore tiles of a ground station precipitation tile layer have a higher confidence score than tiles of a corresponding radar-based precipitation layer.” The collected and processed data, which can be generated, being stored and used by cadence and tile layer structures of forecasting components correlates to providing the plurality of synthetic data sets to a downstream consumer of the aggregated data. 
The forecasting components using comparison tests for forecast data in each tile being based on factors which include the confidence in a particular data source correlates to providing the plurality of synthetic data sets to a downstream consumer of the aggregated data for simulation of an operation with different levels of error); obtaining an acceptable error level for the downstream consumer of the aggregated data based on the data set (Paragraphs 471-472, “The source forecast data is verified by using one or more forecast comparison tests that compare the source forecast data to the selected collected data tile layer and to determine the significance of any differences between the compared tile layers. If the source forecast data passes the comparison testing based one or more criteria, for example if source forecast tile layer differs from the collected data tile layer by less than a threshold amount, then the source forecast tile layer is verified and may be copied directly to the target forecast tile layer. Thresholds used in comparison tests are based on one or more factors including the desired or required accuracy of forecasting results and confidence in a data source. 
For example, a ground station precipitation measurement may have higher accuracy or fidelity than a radar-based precipitation estimate; therefore tiles of a ground station precipitation tile layer have a higher confidence score than tiles of a corresponding radar-based precipitation layer.” The desired or required accuracy of forecasting results and confidence in a data source from which the collected data tile layer originated correlates to obtaining an acceptable error level for a downstream consumer of the aggregated data based on the data set); utilizing the acceptable error level as a threshold for inference accuracy, the threshold being associated with the downstream consumer (Paragraph 472, “If the source forecast data passes the comparison testing based one or more criteria, for example if source forecast tile layer differs from the collected data tile layer by less than a threshold amount, then the source forecast tile layer is verified and may be copied directly to the target forecast tile layer. Thresholds used in comparison tests are based on one or more factors including the desired or required accuracy of forecasting results and confidence in a data source. For example, a ground station precipitation measurement may have higher accuracy or fidelity than a radar-based precipitation estimate; therefore tiles of a ground station precipitation tile layer have a higher confidence score than tiles of a corresponding radar-based precipitation layer.” The threshold used in comparison tests for forecast data being based on one or more factors which include the desired or required accuracy of forecasting results and confidence in a data source correlates to utilizing the acceptable error level as a threshold for inference accuracy. 
The desired or required accuracy of forecasting results used in the threshold correlates to the threshold being associated with the downstream consumer); obtaining an inference model based on the threshold (Paragraphs 448 and 521, “The ML model validation module (679) retrieves a trained ML model from the system database (320), retrieves evaluation data (i.e. testing and validation data) from the ML training data store, and performs testing and validation operations using the trained model and the retrieved testing and validation data. In some exemplary embodiments, the ML validation module generates a quality metric, e.g., a model accuracy or performance metric such as variance, mean standard error, receiver operating characteristic (ROC) curve, or precision-recall (PR) curve, associated with the trained ML model. For example, the ML model validation model generates the quality metric by executing the model and comparing predictions generated by the model to observed outcomes… In some embodiments, the ML training module retrains a trained ML model if the system determines that an associated quality metric has deteriorated below a threshold amount… Based upon its configuration settings, the ML model execution module retrieves ML model input data, for example processed collected and generated collected weather parameter data and historical forecast data, from one or more systems database(s). The ML model execution module executes the trained ML model using the input data to generate ML model output data which is stored in the system database (320). 
The ML model output data can include weather parameter estimates, predictions, and forecasts, depending on the ML model that is executed, and is saved to a tile layer of a cadence instance tile stack.” The ML model validation module retrieving a trained ML model from the system database and performing validation operations which generate a quality metric including model accuracy, and retraining the model if the quality metric is below a threshold amount, correlates to obtaining a model based on the threshold. The trained ML model generating predictions based on processed collected and generated collected weather data correlates to an inference model); and reconstructing the data collected by the data collector using the reduced data size representation of the data and an inference generated by the inference model to obtain the aggregated data, the reconstructed data being different from the data collected by the data collector by less than the acceptable error level (Paragraphs 466, 470 and 472, “Once a determination is made by the cadence manager (905) to perform a partial calculation and update within a cadence instance, several steps occur. First the portions of the tile layers to be copied to the cadence instance along with their corresponding prior tile layers selected from one or more of prior collected data tile layers, processed collected data tile layers, forecast tile layers, forecast post-processing tile layers, and weather product tile layers. 
The identified prior tile layers are propagated by copying to the current cadence instance… At step (50130), the cadence manager runs a tile layer comparison program (910) to perform comparison testing between the selected source tile layers in order to determine whether a source tile layer can be copied or whether significant differences are present between the selected tile layers that require tile level updating, and to identify those tiles and tile layers that will be propagated by the tile propagation program (930) … If the source forecast data passes the comparison testing based one or more criteria, for example if source forecast tile layer differs from the collected data tile layer by less than a threshold amount, then the source forecast tile layer is verified and may be copied directly to the target forecast tile layer. Thresholds used in comparison tests are based on one or more factors including the desired or required accuracy of forecasting results and confidence in a data source.” The selected source portions of tile layers being copied to the cadence instance for partial calculation correlates to reconstructing data collected using the reduced size representation. The corresponding prior tile layers including forecast tile layers also being copied to the cadence correlates to reconstructing data collected using inferences generated by an inference model. The source forecast data being compared to the collected data tile to ensure the source forecast data differs from the collected data tile by less than a threshold amount correlates to the reconstructed data being different from the data collected by the data collector by less than the acceptable error level. Elkabetz does not explicitly teach that the acceptable error level is based on the message. 
However, data sets are a popular component of messages as evidenced by Lee (Paragraphs 28-30, “In the case where the inference fails, the first end device 11_1 may request the server 130 to infer and may receive an inference result or a new machine learning model from the server 130… Each of the plurality of end devices 11_1 to 11_n may transmit the collected status information to the server 130. The gateway device 120 may provide the sensing data or the status information received from each of the plurality of end devices 11_1 to 11_n to the server 130.” The end device requesting the server for an inference and providing collected status and sensing data to the server correlates to a message containing data sets). Elkabetz does not explicitly teach: receiving, from the downstream consumer, a message that summarizes the operation; distributing the inference model to the data collector; obtaining a reduced data size representation of the data collected by the data collector; However, Lee teaches: receiving, from the downstream consumer, a message that summarizes the operation (Paragraphs 28-30 and 32 “In the case where the inference fails, the first end device 11_1 may request the server 130 to infer and may receive an inference result or a new machine learning model from the server 130… Each of the plurality of end devices 11_1 to 11_n may collect status information. The status information may include information about a kind or a purpose of an end device, information about an internal status of the end device such as a resource or a power, information about an ambient status of the end device such as a position or a temperature, or information about a kind of collected sensing data (target data). The status information may be an index indicating a characteristic and a current situation of each of the plurality of end devices 11_1 to 11_n. Each of the plurality of end devices 11_1 to 11_n may transmit the collected status information to the server 130. 
The gateway device 120 may provide the sensing data or the status information received from each of the plurality of end devices 11_1 to 11_n to the server 130… The server 130 may set priorities of the plurality of end devices 11_1 to 11_n based on status information. The server 130 may analyze status information and may set a high priority to an end device that requires urgent processing of data. The server 130 may schedule an inference request based on a priority.” The server setting priorities of an inference request received from an end device based on the received status information, which includes the purpose of the device and kind of data, correlates to receiving, from the downstream consumer, a message that summarizes the operation); distributing the inference model to the data collector (Fig. 8 and 11, paragraphs 58 and 116-117, “In operation S120, the end device 210 may collect sensing data (or target data). The sensing data may be collected through a sensor included in the end device 210… FIG. 11 is a block diagram of a distributed inference system associated with operation S150 of FIG. 3. Referring to FIG. 11, a distributed inference system 200_4 may include an end device 210_4 and a server 230_4. The end device 210_4 includes a first inference engine 212_4, a receiver 213_4, and a machine learning model manager 214_4. The server 230_4 includes a second inference engine 234_4 and a transmitter 237_4. FIG. 11 shows a data transfer relationship between some components of FIG. 2 performing operation S150. As described with reference to FIG. 8, the transmitter 237_4 may provide the machine learning model MD to the receiver 213_4.” The end device collecting sensing data correlates to the data collector. 
The distributed inference system including a server providing the machine learning model to the receiver of the end device correlates to distributing the inference model to the data collector); Additionally, Sankar teaches: obtaining a reduced data size representation of the data collected by the data collector (Paragraphs 41-42, “In accordance with the various embodiments, the diagnostic solution provides a data collection mechanism that collects and uploads data without causing noticeable overhead while providing the required data accuracy to detect problems. In accordance with an embodiment, instead of collecting and uploading the raw data at a single preconfigured frequency, the data collection mechanism described herein uses two different frequencies, namely collection frequency (CF) and aggregation frequency (AF). In accordance with an embodiment, the CF is a shorter time interval (typically 3-5 seconds but configurable) and is used to sample and collect data related to the component aspects. The AF is a larger time interval (typically 5 minutes but configurable) during which the aggregator will work of the raw data that was collected. The aggregator processes the raw data to compute diagnostic indicators such as efficiency of each collected sample, the number of violations and the average efficiency during the aggregation interval… For example, assuming that each aggregation results in about 50 B of data and an aggregation interval of 5 min the new size of the data that is collected works out to be 50 B*2aspects*50 EJBs*12=60 KB/hr for all aspects of all EJBs. Thus, by applying the above solution the amount of data generated is reduced from 3.6 MB/hr to 60 KB/hr while maintaining the data accuracy required for problem detection.” The aggregator receiving uploaded raw data from the data collection mechanism correlates to the data aggregator. 
The use of the aggregation frequency for data collection reducing the amount of data generated per time interval correlates to obtaining a reduced data size representation of the data collected by the data collector). Therefore, it would have been obvious to one of ordinary skill in the art to which said subject matter pertains before the effective filing date of the claimed invention to combine Elkabetz with receiving, from the downstream consumer, a message that summarizes the operation and distributing the inference model to the data collector as taught by Lee because new machine learning models can be created by training modules to be specifically adapted to the current status of the end device based on received status information and training datasets. These created machine learning models can be renewed at caching modules prior to being sent to the end devices. Additionally, priorities of inference requests can be determined based on received status information to determine whether a request requires urgent processing of data (Lee: paragraphs 32 and 114-115). Additionally, it would have been obvious to one of ordinary skill in the art to which said subject matter pertains before the effective filing date of the claimed invention to combine Elkabetz with obtaining a reduced data size representation of the data collected by the data collector as taught by Sankar because reducing the amount of data generated while maintaining the data accuracy required for problem detection can reduce the memory footprint used by the data collector. The aggregation memory also would only need to hold raw data for one aggregation cycle, and the memory can be reused in the next aggregation cycle (Sankar: paragraph 42).

With regards to Claims 12 and 17, the method of Claim 1 performs the same steps as the manufacture and machine of Claims 12 and 17 respectively, and Claims 12 and 17 are therefore rejected using the same rationale set forth above in the rejection of Claim 1. 
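For readers following the Claim 1 limitation chain, the claimed flow (obtain a reduced data size representation, reconstruct it with an inference, and compare the result against the acceptable error level) can be sketched in a few lines. This is a hypothetical illustration only: the function names, the windowed-mean "reduction" (which loosely mirrors Sankar's collection-frequency vs. aggregation-frequency scheme), and the 5% error level are illustrative assumptions, not taken from the application or the cited references.

```python
import statistics

# Hypothetical sketch; every name and number here is an illustrative
# assumption, not drawn from the application, Elkabetz, Lee, or Sankar.
ACCEPTABLE_ERROR = 0.05  # acceptable error level obtained for the downstream consumer

def reduce_data(raw, window):
    """Reduced data size representation: one mean per window of raw samples."""
    return [statistics.mean(raw[i:i + window]) for i in range(0, len(raw), window)]

def reconstruct(reduced, window):
    """Stand-in 'inference': repeat each aggregate to approximate the raw series."""
    return [value for value in reduced for _ in range(window)]

def within_acceptable_error(raw, rebuilt, limit=ACCEPTABLE_ERROR):
    """Check that reconstructed data differs from the collected data by less
    than the acceptable error level (mean absolute relative error)."""
    error = statistics.mean(abs(a - b) / abs(a) for a, b in zip(raw, rebuilt))
    return error <= limit

raw = [10.0, 10.2, 9.9, 10.1, 20.0, 19.8, 20.1, 20.2]
small = reduce_data(raw, window=4)       # 8 samples reduced to 2 aggregates
rebuilt = reconstruct(small, window=4)   # aggregator-side reconstruction
```

On this toy data the two-value representation reconstructs the series within the assumed error level, which is the shape of the check the rejection maps onto Elkabetz's tile-comparison thresholds.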
With regards to Claim 2, Elkabetz in view of Lee and Sankar teach the method of Claim 1 above. Elkabetz further teaches: wherein the plurality of synthetic data sets comprises: a first synthetic data set being treated as hypothetic data as collected by the data collector (Paragraphs 187 and 444, “The data that is generated and stored includes collected data, data derived or calculated from the collected data, forecast data generated during the forecast cycles of the cadence instance, and information that is further generated or derived from the forecast data… The machine learning (ML) training data preparation module (677) retrieves weather parameter data, for example data comprising features or predictors and corresponding outputs, from a system database (320) and processes the weather parameter data to generate machine learning model training, validation, and testing data formatted as a data frame suitable for processing into one or more ML models.” The ML training data preparation module retrieving data from the system database, which includes data derived or calculated from collected data, to use as training data correlates to a first synthetic data set treated as hypothetic data as collected by the data collector); and a second synthetic data set, based on the first synthetic data set, and reflecting a representation of the hypothetic data as reconstructed by the data aggregator and through which a level of error is introduced by the reconstruction (Paragraph 444, “The machine learning (ML) training data preparation module (677) retrieves weather parameter data, for example data comprising features or predictors and corresponding outputs, from a system database (320) and processes the weather parameter data to generate machine learning model training, validation, and testing data formatted as a data frame suitable for processing into one or more ML models. 
The ML training data preparation module is configured to generate training data useful for initial training of a machine learning model and training data useful for retraining or updating a previously trained machine learning model… Processing of the retrieved data can include cleaning the data to remove outliers, interpolating or otherwise filling in missing data points, and removing erroneous or otherwise unneeded data and formatting the data in a date frame.” The ML training data preparation module retrieving data from the system database which includes data derived or calculated from collected data correlates to a first synthetic data set. The ML training data preparation module then processing the retrieved data to generate training data correlates to a second synthetic data set reflecting a representation of the hypothetic data as reconstructed by the data aggregator. The processing including removing outliers, interpolating or filling in missing data points, and removing erroneous data correlates to a level of error introduced by reconstruction). With regards to Claims 13 and 18, the method of Claim 2 performs the same steps as the manufacture and machine of Claims 13 and 18 respectively, and Claims 13 and 18 are therefore rejected using the same rationale set forth above in the rejection of Claim 2. With regards to Claim 3, Elkabetz in view of Lee and Sankar teach the method of Claim 2 above. Lee further teaches: wherein the obtaining the acceptable error level for the downstream consumer of the aggregated data using the plurality of synthetic data sets comprises: identifying first operation of the downstream consumer based on the first synthetic data set (Paragraphs 36, 43, 97 and 106, “The first inference engine 212 may perform inference on target data for the purpose of performing a work corresponding to the purpose of the end device 210. 
The server 230_3 may accumulate the training dataset TD for creating a new machine learning model based on an inference result of the first inference engine 212_3, in the training database 233_3. The server 230_3 may in advance train and create a machine learning model, based on the training dataset TD and the status information DS… The device manager 232 may decide a grade of the end device 210 based on the status information. Here, the grade may be an index for classifying a status of an end device that is required to the end device 210. For example, in the case where a result of analyzing the status information indicates that an inference request is very urgent (e.g., is directly related to the life of the user), the device manager 232 may grant the highest grade to the end device 210… In operation S410, the first inference engine 212_3 may perform inference on target data, based on a machine learning model stored in the end device 210_3.” The training dataset based on an inference result of the first inference engine correlates to the first synthetic dataset. The first inference engine performing inference on target data based on the machine learning model that was trained with the training dataset correlates to a first operation. The end device being associated with an inference request and status from a user correlates to a first operation of the downstream consumer); identifying second operation of the downstream consumer based on the synthetic data set (Paragraphs 43, 45, 107, “The device manager 232 may decide a grade of the end device 210 based on the status information. Here, the grade may be an index for classifying a status of an end device that is required to the end device 210. 
For example, in the case where a result of analyzing the status information indicates that an inference request is very urgent (e.g., is directly related to the life of the user), the device manager 232 may grant the highest grade to the end device 210… The device manager 232 may set a priority of an end device, based on a grade of each of a plurality of end devices. In the case where the inference request is input to the server 230, based on the set priority, the inference request may be scheduled. Accordingly, data corresponding to an end device of an urgent state may be processed first of all… In operation S430, the first inference result IR1, the target data, and the inference request IREQ are provided to the second inference engine 234_3. In operation S440, the second inference engine 234_3 performs inference on the target data in response to the inference request IREQ. The second inference engine 234_3 may perform inference on the target data, based on a machine learning model stored in the server 230_3.” The first inference result, target data, and inference request used by the second inference engine to perform inference correlates to the synthetic data set. The second inference engine performing inference in response to the inference request correlates to a second operation. Each of a plurality of end devices being associated with a grade and user sending an inference request correlates to a second operation of the downstream consumer); identifying a difference between the first operation and the second operation (Paragraph 110, “In operation S510, the server 230_3 may calculate the accuracy AC of the first inference result IR1 by the end device 210_3. The accuracy AC of the first inference result IR1 may be calculated at the second inference engine 234_3. 
For example, the accuracy AC may be calculated based on a result of comparing the first inference result IR1 and the second inference result IR2, but the disclosure is not limited thereto.” The accuracy being calculated based on a result of comparing the first inference result and the second inference result correlates to identifying a difference between the first and second operation); and making a determination regarding whether the difference indicates that the downstream consumer is impacted by the level of error to an unacceptable degree (Paragraphs 110-111, “In operation S510, the server 230_3 may calculate the accuracy AC of the first inference result IR1 by the end device 210_3. The accuracy AC of the first inference result IR1 may be calculated at the second inference engine 234_3. For example, the accuracy AC may be calculated based on a result of comparing the first inference result IR1 and the second inference result IR2, but the disclosure is not limited thereto… In operation S520, the server 230_3 may determine whether the calculated accuracy AC satisfies a reference range. The reference range may be a range of the reference accuracy required for a normal operation of the end device 210_3.” The accuracy being calculated based on a result of comparing the first inference result and the second inference result correlates to the difference between the first and second operation. The server determining whether the calculated accuracy satisfies a reference range which is required for the accuracy of normal operation of a particular end device correlates to making a determination regarding whether the difference indicates that the downstream consumer is impacted by the level of error to an unacceptable degree). Lee does not explicitly teach that the synthetic data set is a second synthetic data set. However, second synthetic data sets are a popular method of generating data sets as evidenced by Elkabetz above (paragraph 444).
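The mechanism of Lee's operations S510–S520 as characterized above — comparing the device-side inference result against the server-side result and testing the resulting accuracy against a reference range — can be sketched roughly as follows. This is an illustrative sketch only, not code from Lee; the function names and the matching-fraction metric are assumptions.

```python
# Illustrative sketch of Lee's S510-S520 flow (names and metric are
# assumed, not taken from the reference).

def accuracy_of(first_result, second_result):
    """Fraction of positions where the device-side result (IR1)
    matches the server-side result (IR2)."""
    matches = sum(a == b for a, b in zip(first_result, second_result))
    return matches / len(first_result)

def satisfies_reference_range(accuracy, lower, upper=1.0):
    """True when the calculated accuracy AC falls in the reference
    range required for normal operation of the end device."""
    return lower <= accuracy <= upper

ir1 = [1, 0, 1, 1, 0, 1, 0, 1]   # first inference result (end device)
ir2 = [1, 0, 1, 0, 0, 1, 0, 1]   # second inference result (server)
ac = accuracy_of(ir1, ir2)       # 7/8 = 0.875
renew_model = not satisfies_reference_range(ac, lower=0.9)  # -> S530
```

Under this reading, a failed reference-range check is what triggers Lee's operation S530 (model renewal).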
Therefore, it would have been obvious to one of ordinary skill in the art to which said subject matter pertains before the effective filing date of the claimed invention to combine Elkabetz with wherein the obtaining the acceptable error level for the downstream consumer of the aggregated data using the plurality of synthetic data sets comprises: identifying first operation of the downstream consumer based on the first synthetic data set; identifying second operation of the downstream consumer based on the second synthetic data set; identifying a difference between the first operation and the second operation; and making a determination regarding whether the difference indicates that the downstream consumer is impacted by the level of error to an unacceptable degree as taught by Lee because reference ranges can be used to determine whether the calculated accuracy for an inference result falls within a required accuracy for normal operation of an end device. In the case where the reference range is not satisfied, additional operations may be executed to create a new machine learning model adapted to the current status of the end device or renewing the current machine learning model (Lee: paragraphs 114-115). With regards to Claims 14 and 19, the method of Claim 3 performs the same steps as the manufacture and machine of Claims 14 and 19 respectively, and Claims 14 and 19 are therefore rejected using the same rationale set forth above in the rejection of Claim 3. With regards to Claim 4, Elkabetz in view of Lee and Sankar teach the method of Claim 3 above. 
Lee further teaches: wherein the obtaining the acceptable error level for the downstream consumer of the aggregated data using the plurality of synthetic data sets further comprises: in an instance where the determination indicates that the downstream consumer is impacted by the level of error to the unacceptable degree (Paragraphs 110-111, “In operation S510, the server 230_3 may calculate the accuracy AC of the first inference result IR1 by the end device 210_3. The accuracy AC of the first inference result IR1 may be calculated at the second inference engine 234_3. For example, the accuracy AC may be calculated based on a result of comparing the first inference result IR1 and the second inference result IR2, but the disclosure is not limited thereto… In operation S520, the server 230_3 may determine whether the calculated accuracy AC satisfies a reference range. The reference range may be a range of the reference accuracy required for a normal operation of the end device 210_3… When the reference range is not satisfied, operation S530 is performed.” The server determining whether the calculated accuracy satisfies a reference range which is required for the accuracy of normal operation of a particular end device correlates to making a determination that the difference indicates that the downstream consumer is impacted by the level of error to an unacceptable degree): repeatedly identifying a difference between: operation of the downstream consumer for other synthetic data sets that include progressively decreasing levels of error (Paragraphs 101-103, “In the case where the accuracy AC does not satisfy a reference range corresponding to the end device 210_3, the device manager 232_3 may command the training module 235_3 to create a new machine learning model. The training module 235_3 may create a new machine learning model MD in response to the command.
The training module 235_3 may train and create the machine learning model MD, based on the status information DS and the training dataset TD. As described above, the training module 235_3 may select or weight the target data with reference to a hit rate or etc. included in the training dataset TD… Also, the training module 235_3 may create the machine learning model MD in further consideration of the calculated accuracy AC.” The accuracy not satisfying the reference range correlates to an instance where the determination indicates that the downstream consumer is impacted by the level of error to the unacceptable degree. The new machine learning model being trained using training datasets correlates to operations for other synthetic datasets. The process of training a new model, which includes at least one training cycle in consideration of the calculated accuracy, correlates to operation of the downstream consumer for other synthetic data sets that include progressively decreasing levels of error), and the operation of the downstream consumer for the first synthetic data set, until the repeatedly identified difference indicates that the level of error is within an acceptable degree (Paragraphs 97 and 104, “The server 230_3 may in advance train and create a machine learning model, based on the training dataset TD and the status information DS. Also, the server 230_3 may decide renewal of a machine learning model stored in the end device 210_3 based on the inference result of the first inference engine 212_3… The new machine learning model MD may be renewed at the caching module 236_3. The caching module 236_3 may manage both the new machine learning model MD and a previous machine learning model having the highest accuracy or performance.” The previous machine learning model performing inferences to determine its accuracy correlates to the operation of the downstream consumer for the first synthetic data set.
The caching module determining if the new machine learning model or the previous learning model has the highest accuracy or performance correlates to identifying a difference between the operation of the downstream consumer for other synthetic datasets and the first synthetic dataset indicating the level of error is within an acceptable degree. The server training a new machine learning model in advance and deciding on a renewal of a machine learning model based on each inference result generated correlates to repeatedly identifying a difference between the operation of the downstream consumer for other synthetic datasets and the first synthetic dataset). Therefore, it would have been obvious to one of ordinary skill in the art to which said subject matter pertains before the effective filing date of the claimed invention to combine Elkabetz with wherein the obtaining the acceptable error level for the downstream consumer of the aggregated data using the plurality of synthetic data sets further comprises: in an instance where the determination indicates that the downstream consumer is impacted by the level of error to the unacceptable degree: repeatedly identifying a difference between: operation of the downstream consumer for other synthetic data sets that include progressively decreasing levels of error, and the operation of the downstream consumer for the first synthetic data set, until the repeatedly identified difference indicates that the level of error is within an acceptable degree as taught by Lee because reference ranges can be used to determine whether the calculated accuracy for an inference result falls within a required accuracy for normal operation of an end device. In the case where the reference range is not satisfied, additional operations may be executed to create a new machine learning model adapted to the current status of the end device or renewing the current machine learning model (Lee: paragraphs 114-115).
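The iteration recited in claim 4 — testing the downstream consumer against synthetic data sets with progressively decreasing levels of error until the impact becomes acceptable — can be sketched as follows. All names, the percentage representation of error, and the stand-in consumer operation are hypothetical, introduced only to illustrate the claimed loop.

```python
# Minimal sketch of the claimed iteration (hypothetical names): inject
# progressively decreasing levels of error into synthetic data sets
# until the downstream consumer's operation differs acceptably from
# its operation on the first (error-free) synthetic data set.

def consumer_operation(error_pct):
    """Stand-in for the downstream consumer's operation: a score that
    degrades as the injected error grows (illustrative only)."""
    return 100 - error_pct

def find_acceptable_error(tolerance=10, start_pct=50, step_pct=5):
    baseline = consumer_operation(0)  # operation on the first synthetic set
    for error_pct in range(start_pct, -1, -step_pct):
        # difference between the first operation and this operation
        difference = abs(baseline - consumer_operation(error_pct))
        if difference <= tolerance:   # impact now within acceptable degree
            return error_pct
    return 0

acceptable = find_acceptable_error()  # highest error level still tolerated
```

The returned value corresponds to claim 5's use of the error level of the first synthetic set that passes the check as the acceptable error level.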
With regards to Claims 15 and 20, the method of Claim 4 performs the same steps as the manufacture and machine of Claims 15 and 20 respectively, and Claims 15 and 20 are therefore rejected using the same rationale set forth above in the rejection of Claim 4. With regards to Claim 5, Elkabetz in view of Lee and Sankar teach the method of Claim 4 above. Lee further teaches: wherein the obtaining the acceptable error level for the downstream consumer of the aggregated data using the plurality of synthetic data sets further comprises: using the level of error in the other synthetic data set of the other synthetic data sets for which the identified difference indicated that the level of error is within the acceptable degree as the acceptable error level (Paragraphs 102-104, “The training module 235_3 may create a new machine learning model MD in response to the command. The training module 235_3 may train and create the machine learning model MD, based on the status information DS and the training dataset TD. As described above, the training module 235_3 may select or weight the target data with reference to a hit rate or etc. included in the training dataset TD… Also, the training module 235_3 may create the machine learning model MD in further consideration of the calculated accuracy AC... The caching module 236_3 may manage both the new machine learning model MD and a previous machine learning model having the highest accuracy or performance. The new machine learning model MD may be provided to the receiver 213_3 of the end device 210_3 through the transmitter 237_3.” The new machine learning model being trained using training datasets which are derived from target data correlates to the other synthetic dataset of the other synthetic data sets. 
The process of training and using a new model in consideration of the calculated accuracy and if the model has a higher accuracy or performance correlates to using the level of error in the other synthetic dataset of the other synthetic data sets where the identified difference indicated the level of error is within the acceptable degree). Therefore, it would have been obvious to one of ordinary skill in the art to which said subject matter pertains before the effective filing date of the claimed invention to combine Elkabetz with wherein the obtaining the acceptable error level for the downstream consumer of the aggregated data using the plurality of synthetic data sets further comprises: using the level of error in the other synthetic data set of the other synthetic data sets for which the identified difference indicated that the level of error is within the acceptable degree as the acceptable error level as taught by Lee because reference ranges can be used to determine whether the calculated accuracy for an inference result falls within a required accuracy for normal operation of an end device. In the case where the reference range is not satisfied, additional operations may be executed to create a new machine learning model adapted to the current status of the end device or renewing the current machine learning model (Lee: paragraphs 114-115). With regards to Claim 16, the method of Claim 5 performs the same steps as the manufacture of Claim 16, and Claim 16 is therefore rejected using the same rationale set forth above in the rejection of Claim 5. With regards to Claim 6, Elkabetz in view of Lee and Sankar teach the method of Claim 3 above. 
Lee further teaches: wherein the obtaining the acceptable error level for the downstream consumer of the aggregated data using the plurality of synthetic data sets further comprises: in an instance where the determination indicates that the downstream consumer is not impacted by the level of error to the unacceptable degree (Paragraphs 110-111, “In operation S510, the server 230_3 may calculate the accuracy AC of the first inference result IR1 by the end device 210_3. The accuracy AC of the first inference result IR1 may be calculated at the second inference engine 234_3. For example, the accuracy AC may be calculated based on a result of comparing the first inference result IR1 and the second inference result IR2, but the disclosure is not limited thereto… In operation S520, the server 230_3 may determine whether the calculated accuracy AC satisfies a reference range. The reference range may be a range of the reference accuracy required for a normal operation of the end device 210_3… In the case where the reference range is satisfied, because it is unnecessary to change a machine learning model currently stored in the end device 210_3, an operation of renewing the machine learning model may not be performed.” The server determining whether the calculated accuracy satisfies a reference range which is required for the accuracy of normal operation of a particular end device correlates to making a determination that the difference indicates that the downstream consumer is not impacted by the level of error to an unacceptable degree): Elkabetz further teaches: repeatedly identifying a difference between: operation of the downstream consumer for other synthetic data sets that include progressively increasing levels of error, and the operation of the downstream consumer for the first synthetic data set (Paragraph 448, “For example, the ML model validation model generates the quality metric by executing the model and comparing predictions generated by the model to observed
outcomes… In some exemplary embodiments, the ML model validation module periodically tests trained ML models using training data derived from processed collected or generated collected data and recalculates quality metrics associated with the trained ML models. In some embodiments, the ML training module retrains a trained ML model if the system determines that an associated quality metric has deteriorated below a threshold amount. In some embodiments, trained ML models are retrained on a periodic schedule.” The ML model validation module periodically testing each of the trained ML models using training data derived from processed collected or generated data by executing the model correlates to operations of the downstream consumer for other synthetic datasets and the first synthetic dataset. The models being retrained if the associated quality metric has deteriorated below a threshold amount or periodically retrained includes at least one retraining cycle and therefore correlates to repeatedly identifying a difference between operations for other synthetic datasets including progressively increasing levels of error), until the repeatedly identified difference indicates that the level of error reaches the unacceptable degree (Paragraph 448, “In some embodiments, the ML training module retrains a trained ML model if the system determines that an associated quality metric has deteriorated below a threshold amount. In some embodiments, trained ML models are retrained on a periodic schedule.” Retraining a model if the system determines the associated quality metric has deteriorated below a threshold amount would include detecting that the associated quality metric has fallen below a threshold amount and therefore correlates to the repeatedly identified difference indicating the level of error reaches the unacceptable degree). 
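The Elkabetz mechanism cited above — periodically recomputing a quality metric for a trained model and retraining once the metric deteriorates below a threshold — can be sketched as follows. This is an illustrative sketch, not Elkabetz's code; the accuracy-style metric and all names are assumptions.

```python
# Illustrative sketch (not Elkabetz's code) of periodic revalidation:
# recompute a quality metric for a trained ML model and flag it for
# retraining once the metric deteriorates below a threshold amount.

def quality_metric(predictions, observed):
    """Accuracy-style metric: fraction of predictions that match the
    observed outcomes."""
    hits = sum(p == o for p, o in zip(predictions, observed))
    return hits / len(observed)

def needs_retraining(metric, threshold):
    """True once the quality metric has deteriorated below threshold."""
    return metric < threshold

preds = [1, 1, 0, 1]   # predictions from the trained model
obs = [1, 0, 0, 0]     # observed outcomes
metric = quality_metric(preds, obs)                 # 0.5
retrain = needs_retraining(metric, threshold=0.8)   # deteriorated
```

In the examiner's mapping, each such revalidation cycle is one of the "repeated" comparisons, and the threshold crossing is the point at which the level of error reaches the unacceptable degree.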
Therefore, it would have been obvious to one of ordinary skill in the art to which said subject matter pertains before the effective filing date of the claimed invention to combine Elkabetz with wherein the obtaining the acceptable error level for the downstream consumer of the aggregated data using the plurality of synthetic data sets further comprises: in an instance where the determination indicates that the downstream consumer is not impacted by the level of error to the unacceptable degree as taught by Lee because reference ranges can be used to determine whether the calculated accuracy for an inference result falls within a required accuracy for normal operation of an end device. In the case where the reference range is not satisfied, additional operations may be executed to create a new machine learning model adapted to the current status of the end device or renewing the current machine learning model (Lee: paragraphs 114-115). With regards to Claim 7, Elkabetz in view of Lee and Sankar teach the method of Claim 6 above. Elkabetz further teaches: wherein the obtaining the acceptable error level for the downstream consumer of the aggregated data using the plurality of synthetic data sets further comprises: using the level of error in the last other synthetic data set of the other synthetic data sets for which the identified difference did not indicate that the level of error reached the unacceptable degree as the acceptable error level (Paragraph 448, “For example, the ML model validation model generates the quality metric by executing the model and comparing predictions generated by the model to observed outcomes. The ML model validation module stores model quality metrics in the system database (320), associated with the trained ML model, and the system may not include a ML model validation store. 
In some exemplary embodiments, the ML model validation module periodically tests trained ML models using training data derived from processed collected or generated collected data and recalculates quality metrics associated with the trained ML models. In some embodiments, the ML training module retrains a trained ML model if the system determines that an associated quality metric has deteriorated below a threshold amount. In some embodiments, trained ML models are retrained on a periodic schedule.” The system generating, recalculating and storing model quality metric data associated with the trained ML model for each execution correlates to the level of error in the last other synthetic data set of the other synthetic data sets. If the system determines the associated quality metric has not deteriorated below a threshold amount, then the associated quality metric is above a threshold amount and therefore correlates to the identified difference not indicating the level of error reaches the unacceptable degree. The trained ML model then being used with its associated model quality metrics correlates to using the level of error in the last other synthetic data set as the acceptable error level). Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Elkabetz in view of Lee, Sankar and Bhide et al. (U.S. Patent No. US 10305758 B1), hereinafter “Bhide.” With regards to Claim 8, Elkabetz in view of Lee and Sankar teach the method of Claim 1 above. Elkabetz in view of Lee and Sankar does not explicitly teach the limitations below. However, Bhide teaches: obtaining an indication from the downstream consumer regarding an adjustment in the acceptable error level (Col. 165, lines 28-37, “As described above, the anomaly point(s) 34701 that are displayed along KPI value graph 34700 are identified based on the sensitivity setting provided by the user (via sensitivity setting control 34695).
Accordingly, as the user drags the slider (that is, sensitivity setting control 34695) towards the left, thereby lowering the sensitivity setting (that is, the error threshold by which error values are to be determined to be anomalies with respect to their deviation from historical error values for the KPI), relatively more anomalies are likely to be identified.” The user adjusting the sensitivity setting in the sensitivity setting control to lower the error threshold correlates to obtaining an indication from the downstream consumer regarding an adjustment in the acceptable error level); and modifying the threshold based on the indication (Col. 165, lines 28-37 and 43-47, “As described above, the anomaly point(s) 34701 that are displayed along KPI value graph 34700 are identified based on the sensitivity setting provided by the user (via sensitivity setting control 34695). Accordingly, as the user drags the slider (that is, sensitivity setting control 34695) towards the left, thereby lowering the sensitivity setting (that is, the error threshold by which error values are to be determined to be anomalies with respect to their deviation from historical error values for the KPI), relatively more anomalies are likely to be identified… In doing so, the user can actively adjust the sensitivity setting via sensitivity setting control 34695 and be presented with immediate visual feedback regarding anomalies that are identified based on the provided sensitivity setting.” The user getting visual feedback with updated anomalies identified based on the updated sensitivity setting correlates to modifying the threshold based on the indication). 
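The slider-style mechanism Bhide is cited for — a user-supplied sensitivity setting acting as the error threshold, so that lowering it causes more points to be flagged as anomalies — can be sketched as follows. The function name and data are illustrative assumptions, not taken from Bhide.

```python
# Hypothetical sketch of a sensitivity-slider threshold: the user-set
# sensitivity acts as the error threshold, so lowering it flags more
# points as anomalies (names and values are illustrative).

def find_anomalies(error_values, sensitivity):
    """Indices whose error value exceeds the sensitivity threshold."""
    return [i for i, err in enumerate(error_values) if err > sensitivity]

errors = [0.2, 0.9, 0.4, 1.5, 0.1]    # deviations from historical values
find_anomalies(errors, sensitivity=1.0)   # one anomaly flagged
find_anomalies(errors, sensitivity=0.3)   # slider moved left: three flagged
```

Recomputing the anomaly list on each slider change corresponds to the "immediate visual feedback" Bhide describes: the same error values, reinterpreted under the modified threshold.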
Therefore, it would have been obvious to one of ordinary skill in the art to which said subject matter pertains before the effective filing date of the claimed invention to combine Elkabetz with obtaining an indication from the downstream consumer regarding an adjustment in the acceptable error level; and modifying the threshold based on the indication as taught by Bhide because updating sensitivity settings allows a user to adjust which error values are anomalies and receive immediate feedback through search preview windows reflecting updated error values (Bhide: Col. 164, lines 49-56). Claims 9-10 are rejected under 35 U.S.C. 103 as being unpatentable over Elkabetz in view of Lee, Sankar and Sarkar et al. (U.S. Patent Application Publication No. US 2023/0034011 A1), hereinafter “Sarkar.” With regards to Claim 9, Elkabetz in view of Lee and Sankar teach the method of Claim 1 above. Elkabetz further teaches: wherein obtaining the inference model based on the threshold comprises: selecting one of a plurality of potential inference models that: has an inference error level that falls within the threshold (Paragraph 448, “The ML model validation module (679) retrieves a trained ML model from the system database (320), retrieves evaluation data (i.e. testing and validation data) from the ML training data store, and performs testing and validation operations using the trained model and the retrieved testing and validation data. In some exemplary embodiments, the ML validation module generates a quality metric, e.g., a model accuracy or performance metric such as variance, mean standard error, receiver operating characteristic (ROC) curve, or precision-recall (PR) curve, associated with the trained ML model.
For example, the ML model validation model generates the quality metric by executing the model and comparing predictions generated by the model to observed outcomes… In some embodiments, the ML training module retrains a trained ML model if the system determines that an associated quality metric has deteriorated below a threshold amount.” The ML model validation module retrieving one of the multiple trained ML models from the system database correlates to selecting one of a plurality of potential inference models. The ML model validation module performing validation operations which generate a quality metric including model accuracy, and retraining the model if the quality metric is below a threshold amount, correlates to selecting a model with an inference error level falling within the threshold. The trained ML model generating predictions based on processed collected and generated collected weather data correlates to an inference model); and using the selected one of the plurality of potential inference models as the inference model (Paragraphs 448 and 521, “The ML model validation module (679) retrieves a trained ML model from the system database (320), retrieves evaluation data (i.e. testing and validation data) from the ML training data store, and performs testing and validation operations using the trained model and the retrieved testing and validation data… Based upon its configuration settings, the ML model execution module retrieves ML model input data, for example processed collected and generated collected weather parameter data and historical forecast data, from one or more systems database(s). The ML model execution module executes the trained ML model using the input data to generate ML model output data which is stored in the system database (320). 
The ML model output data can include weather parameter estimates, predictions, and forecasts, depending on the ML model that is executed, and is saved to a tile layer of a cadence instance tile stack.” The trained ML model generating predictions based on processed collected and generated collected weather data correlates to using the selected model as the inference model). Elkabetz does not explicitly teach: wherein obtaining the inference model based on the threshold comprises: selecting one of a plurality of potential inference models that meets a computing resources consumption goal. However, Sarkar teaches: wherein obtaining the inference model based on the threshold comprises: selecting one of a plurality of potential inference models that meets a computing resources consumption goal (Paragraphs 38 and 69, “In certain examples, the selection of ML models is based on available service provider resources and a number of queries received. For example, in instances where the number of queries received greater than a threshold number and the utilization of service provider resources is high, then ML models having low resource consumption level may be selected… The term “machine learning models” or “ML models” refers to one or more methods, algorithms, statistical models, mathematical models, or computer systems trained to identify patterns and correlations, predict answers through inferences and probability, for a given input query.” The ML models being selected based on their resource consumption level correlates to selecting one of a plurality of potential models that meet a computing resources consumption goal.
The ML models including models that identify patterns and correlations and predict answers through inferences and probability correlate to an inference model). Therefore, it would have been obvious to one of ordinary skill in the art to which said subject matter pertains before the effective filing date of the claimed invention to combine Elkabetz with wherein obtaining the inference model based on the threshold comprises: selecting one of a plurality of potential inference models that meets a computing resources consumption goal as taught by Sarkar because the distribution of ML models can be dynamically changed based on an increase or decrease in the number of queries and utilization of service provider resources (Sarkar: paragraph 38). With regards to Claim 10, Elkabetz in view of Lee, Sankar and Sarkar teach the method of Claim 9 above. Sarkar further teaches: wherein the computing resource consumption goal is to minimize a quantity of computing resources consumed (Paragraphs 38, 51 and 69, “In certain examples, the selection of ML models is based on available service provider resources and a number of queries received. For example, in instances where the number of queries received greater than a threshold number and the utilization of service provider resources is high, then ML models having low resource consumption level may be selected… In instances where the utilization of available service provider resources is greater than a threshold and/or the number of queries received from the plurality of clients 104-N is greater than a threshold number, the ML models may be dynamically selected (or reselected) in the stages. Specifically, the ML models in the stages are dynamically selected for execution by the CPU and/or GPU resources based on a comparison of service provider resource specification, which indicates resource type and resource consumption level, and available service provider resources.
For example, in response to an increase in utilization of CPU and GPU resources, the selection or execution of already selected high resource-consumption level models, such as BERT 214, BERT Large 222, NER 234, or clustering 232, may be avoided or stopped. Instead, moderate or low resource consumption level models, such as TFIDF or lucene search in document filter stage 202, BERT medium 224 or domain-specific models 220 in answer extraction model, and pipeline analytics 230 from the post-processing model 206, may be selected for processing the queries. Similarly, in response to the number of queries received increases beyond a threshold number, the ML models may be reselected dynamically… The term “machine learning models” or “ML models” refers to one or more methods, algorithms, statistical models, mathematical models, or computer systems trained to identify patterns and correlations, predict answers through inferences and probability, for a given input query.” The ML models being selected based on their resource consumption level, with lower resource consumption level models being selected over higher resource consumption level models, correlates to a computing resources consumption goal to minimize a quantity of computing resources consumed. The ML models including models that identify patterns and correlations and predict answers through inferences and probability correlate to an inference model). Sarkar does not explicitly teach that the computing resources consumption goal is for reconstructing the data collected by the data collector. 
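The load-aware selection Sarkar is cited for — preferring low resource-consumption models when utilization or query volume exceeds a threshold — can be sketched as follows. The model names, consumption scores, and thresholds are hypothetical stand-ins, not values from Sarkar.

```python
# Illustrative sketch (names and thresholds are hypothetical) of
# load-aware model selection: under high utilization or query volume,
# minimize the quantity of computing resources consumed.

MODELS = [
    {"name": "bert_large", "consumption": 3, "accuracy": 0.95},
    {"name": "bert_medium", "consumption": 2, "accuracy": 0.92},
    {"name": "tfidf", "consumption": 1, "accuracy": 0.85},
]

def select_model(utilization, query_count, util_limit=0.8, query_limit=1000):
    if utilization > util_limit or query_count > query_limit:
        # Overloaded: pick the lowest resource-consumption model.
        return min(MODELS, key=lambda m: m["consumption"])
    # Otherwise pick the most accurate available model.
    return max(MODELS, key=lambda m: m["accuracy"])

select_model(0.9, 500)    # high utilization -> low-consumption model
select_model(0.5, 100)    # normal load -> most accurate model
```

Reevaluating this selection as utilization and query counts change corresponds to the dynamic reselection Sarkar describes in paragraphs 38 and 51.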
However, minimizing resource usage for reconstructing data is a popular method of resource management for reconstructing data from a data collector as evidenced by Elkabetz (Paragraphs 461 and 463, “If differences are detected that require a partial or complete calculation of forecasts, the cadence manager determines which method of calculation and/or propagation is most appropriate to minimize the required resource usage to process the current cadence instance based upon a variety of input parameters, such as the amount of change in newly collected data and the volatility of the forecast mechanisms… If the cadence manager selects the option to copy and update a prior forecast tile layer, the cadence manager often still has to run any missing processing programs in order to complete a new cadence instance. Even though the missing processing programs and forecast cycles have to be run, the compute and time savings of these “shortcut” approaches significantly reduces the amount of computing cycles requires to produce the next forecast cycle, and substantially reduces the amount of time required as well. In some cases, the time savings may exceed 75, 80, 85, 90, 95, or even 98%, resulting in corresponding forecast calculation times (assuming a 10 minute forecast cycle) of 2.5 min, 2.0 min, 1.5 min, 1.0 min, 30 sec, or 15 sec respectively. Similar percentage savings may be achieved on the forecast calculation compute requirements. 6.8.1.1.1 Tile layer generation by complete (re)calculation approach.” The cadence manager minimizing the required resource usage to process the current cadence instance by copying and updating prior forecast tile layers correlates to minimizing resource usage for reconstructing data).
Therefore, it would have been obvious to one of ordinary skill in the art to which said subject matter pertains before the effective filing date of the claimed invention to combine Elkabetz with wherein the computing resource consumption goal is to minimize a quantity of computing resources consumed as taught by Sarkar because the distribution of ML models can be dynamically changed based on an increase or decrease in the number of queries and utilization of service provider resources. Certain thresholds can also be used to dynamically select different ML models based on utilization of available service provider resources (Sarkar: paragraphs 38 and 51).

Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Elkabetz in view of Lee, Sankar and Neti et al. (U.S. Patent No. US 20190304673 A1), hereinafter “Neti.”

With regards to Claim 11, Elkabetz in view of Lee and Sankar teach the method of Claim 1 above. Elkabetz in view of Lee and Sankar does not explicitly teach: wherein distributing the inference model establishes a twin inference model at the data collector and the data aggregator, the inference model that generates the inference is part of the twin inference model, the inference model that generates the inference is hosted by the data aggregator, and the reduced data size representation of the data collected by the data collector is obtained using the twin inference model.

However, Neti teaches: wherein distributing the inference model establishes a twin inference model at the data collector and the data aggregator (Paragraphs 33-34, 36, and 38, “The industrial asset 202 of the IoT architecture 200 is communicatively coupled to a cloud 206 via a connectivity interface 204. The industrial asset 202 in general includes a plurality of industrial systems 201 and may include a fleet of machines… The enterprise system 226 is configured to process data generated by the plurality of industrial systems 201 and transmit the processed data to the cloud 206. 
The communication infrastructure 228 is configured to establish data transfer between the plurality of industrial systems 201 and the cloud 206… The cloud 206 further includes a plurality of digital twins 218, where each of the digital twins 218 corresponds to a particular industrial system 201 of the industrial asset 202. The plurality of digital twins 218 integrated with the data infrastructure and utilized by the aPaaS 220. The cloud 206 further includes hardware and software based interfaces 230 to provide access to data and services that enable operational control of the one or more of the plurality of industrial systems 201, build and/or store digital twins, such as digital twins 218, design and/or manage analytical solutions, and manage data required for providing cloud services … FIG. 3 illustrates an architecture 300 of a digital asset or digital twin 301 corresponding to an industrial asset (not shown in FIG. 3) in accordance with aspects of the present specification. As described herein, the digital twin 301 includes executing computer code that provides for instantiation of one or more underlying models that are bound to a particular physical asset or group of assets. Various functions of the digital twin 301 may be provided by certain included algorithms, functions, and libraries executed by a computer processor, including code for instantiating the models, binding the models to a particular asset and attendant sensor data feeds from the asset so that the models receive the data feeds from the physical assets, executing the algorithms against the input data, storing the output of the models, and identifying relevant events and outcomes identified by the models.” The industrial asset including a plurality of industrial systems which generate data correlates to the data collector. The cloud receiving processed generated data from the industrial systems correlates to the data aggregator. 
The digital twin instantiating one or more underlying models for a physical asset correlate to establishing a twin inference model at the data collector. The cloud including hardware and software to build digital twins which are included in the cloud and correspond to industrial system correlates to establishing a twin inference model at the data aggregator), the inference model that generates the inference is part of the twin inference model (Fig. 3, paragraphs 38-39, “In one embodiment, the cloud 206 provides services in the form of a Digital Twin-as-a-Service (DTaaS) model for simulation and prediction of industrial processes using the digital twins. In such a scenario, various simulations models corresponding to assets, systems and processes are provided in a cloud library hosted by the cloud 206... As described herein, the digital twin 301 includes executing computer code that provides for instantiation of one or more underlying models that are bound to a particular physical asset or group of assets. Various functions of the digital twin 301 may be provided by certain included algorithms, functions, and libraries executed by a computer processor, including code for instantiating the models, binding the models to a particular asset and attendant sensor data feeds from the asset so that the models receive the data feeds from the physical assets, executing the algorithms against the input data, storing the output of the models, and identifying relevant events and outcomes identified by the models.” The digital twin instantiating underlying models bound to a particular physical asset and executing algorithms against input data to receive relevant events and identified outcomes from the models correlates to the inference model generating the inference being part of the twin inference model), the inference model that generates the inference is hosted by the data aggregator (Fig. 
3, paragraphs 38-39, “In one embodiment, the cloud 206 provides services in the form of a Digital Twin-as-a-Service (DTaaS) model for simulation and prediction of industrial processes using the digital twins. In such a scenario, various simulations models corresponding to assets, systems and processes are provided in a cloud library hosted by the cloud 206... As described herein, the digital twin 301 includes executing computer code that provides for instantiation of one or more underlying models that are bound to a particular physical asset or group of assets. Various functions of the digital twin 301 may be provided by certain included algorithms, functions, and libraries executed by a computer processor, including code for instantiating the models, binding the models to a particular asset and attendant sensor data feeds from the asset so that the models receive the data feeds from the physical assets, executing the algorithms against the input data, storing the output of the models, and identifying relevant events and outcomes identified by the models.” The digital twin which is hosted by the cloud and instantiating underlying models bound to a particular physical asset and executing algorithms against input data to receive relevant events and identified outcomes from the models correlates to the inference model generating the inference being hosted by the data aggregator) and the data collected by the data collector is obtained using the twin inference model (Paragraphs 24, 31 and 34, “A digital twin may provide data that may be obtained from, for example, inspecting a physical product… Further, the digital twin 104 is communicatively coupled to the industrial asset 102. 
By way of example, the digital twin 104 may be configured to directly or indirectly receive data pertaining to sensors and data acquisition units coupled to the industrial asset 102… The enterprise system 226 is configured to process data generated by the plurality of industrial systems 201 and transmit the processed data to the cloud 206. The communication infrastructure 228 is configured to establish data transfer between the plurality of industrial systems 201 and the cloud 206.” The digital twin directly or indirectly receiving data pertaining to sensors and data acquisition units which are then transmitted to the cloud correlates to the data collected by the data collector is obtained using the twin inference model).

Neti does not explicitly teach that the data collected by the data collector is the reduced data size representation of the data collected. However, obtaining reduced data size representations of data is a popular method of data collection as evidenced by Sankar above (paragraphs 41-42).

Therefore, it would have been obvious to one of ordinary skill in the art to which said subject matter pertains before the effective filing date of the claimed invention to combine Elkabetz with wherein distributing the inference model establishes a twin inference model at the data collector and the data aggregator, the inference model that generates the inference is part of the twin inference model, the inference model that generates the inference is hosted by the data aggregator, and the data collected by the data collector is obtained using the twin inference model as taught by Neti because digital twins can be configured to provide analytics, health prediction and performance assessments of industrial assets. They may also provide digital equivalents configured to analyze operations of the industrial asset and further include algorithms and subroutines capable of identifying or predicting anomalies (Neti: paragraph 28). 
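The digital-twin arrangement the rejection maps from Neti can be sketched minimally as a model instantiated and bound to a particular asset, run against that asset's sensor feed, with outputs stored. The class, method names, and the trivial threshold "model" below are hypothetical illustrations, not from Neti.

```python
# Minimal sketch, assuming a simplified shape of the digital twin Neti
# describes: instantiate an underlying model, bind it to a physical asset's
# sensor data feed, execute it against input data, and store the outputs.
from typing import Callable, List

class DigitalTwin:
    def __init__(self, asset_id: str, model: Callable[[float], str]):
        self.asset_id = asset_id      # twin is bound to a particular asset
        self.model = model            # instantiated underlying (inference) model
        self.outputs: List[str] = []  # stored model outputs

    def ingest(self, reading: float) -> str:
        """Run the bound model against one sensor reading and store the result."""
        result = self.model(reading)
        self.outputs.append(result)
        return result

# A trivial threshold model standing in for the twin's anomaly analytics.
twin = DigitalTwin("turbine-01", lambda r: "anomaly" if r > 100.0 else "normal")
```

In the rejection's mapping, one such twin would run at the data collector and a counterpart in the cloud (the data aggregator), with the cloud-hosted copy generating the inferences.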
Prior Art Made of Record

The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure. Yang et al. (U.S. Patent No. US 20210406220 A1), teaching a method of labeling data based on a provided labeling accuracy requirement. Process monitoring parameters matching the data are determined and weighted with coefficients having a corresponding size to obtain a comprehensive accuracy according to dependent and causal relationships. If the comprehensive accuracy of the labeled data satisfies the labeling accuracy requirement, the labeled data is output from the system.

Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SELINA HU whose telephone number is (571)272-5428. The examiner can normally be reached Monday-Friday 8:30-5:30. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. 
To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chat Do, can be reached at (571) 272-3721. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/SELINA ELISA HU/
Examiner, Art Unit 2193

/Chat C Do/
Supervisory Patent Examiner, Art Unit 2193

Prosecution Timeline

Jun 27, 2022
Application Filed
Nov 04, 2025
Non-Final Rejection — §103
Feb 03, 2026
Response Filed
Feb 24, 2026
Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12585485
Warm migrations for virtual machines in a cloud computing environment
2y 5m to grant Granted Mar 24, 2026
Patent 12563114
CONTENT INITIALIZATION METHOD, ELECTRONIC DEVICE AND STORAGE MEDIUM
2y 5m to grant Granted Feb 24, 2026
Study what changed to get past this examiner. Based on 2 most recent grants.


Prosecution Projections

3-4
Expected OA Rounds
67%
Grant Probability
99%
With Interview (+100.0%)
3y 3m
Median Time to Grant
Moderate
PTA Risk
Based on 3 resolved cases by this examiner. Grant probability derived from career allow rate.
