DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is responsive to the Request for Continued Examination filed on 2/26/2026, which refers to the After-Final Amendment filed on 1/16/2026. Claims 1-6 and 8-20 are pending in the case. Claim 7 has been cancelled. Claims 1 and 19-20 are independent claims.
Response to Arguments
Applicant’s amendments regarding the objections are persuasive. These objections are respectfully withdrawn.
Applicant’s amendments regarding the 35 U.S.C. § 101 rejections are persuasive. These rejections are respectfully withdrawn.
Applicant’s prior art arguments have been considered but are moot because the new grounds of rejection presented below do not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the arguments.
Claim Rejections - 35 U.S.C. § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. §§ 102 and 103 (or as subject to pre-AIA 35 U.S.C. §§ 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. § 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 C.F.R. § 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. § 102(b)(2)(C) for any potential 35 U.S.C. § 102(a)(2) prior art against the later invention.
Claims 1 and 19-20 are rejected under 35 U.S.C. § 103 as being unpatentable over Ferreira et al. (US 2021/0232968 A1, hereinafter Ferreira) in view of Tarango et al. (US 2021/0191726 A1, hereinafter Tarango).
As to independent claim 1, Ferreira teaches a non-transitory computer-readable medium configured to store computer logic having instructions for enabling a processing system (“Embodiment 11. A non-transitory storage medium having stored therein instructions that are executable by one or more hardware processors to perform the operations…,” paragraph 0069; “The embodiments disclosed herein may include the use of a special purpose or general-purpose computer including various computer hardware or software modules, as discussed in greater detail below. A computer may include a processor and computer storage media carrying instructions that, when executed by the processor and/or caused to be executed by the processor, perform any one or more of the methods disclosed herein…,” paragraph 0071; “instructions” correspond to “computer logic having instructions”) to:
collect, in a temporary database, raw telemetry data from a plurality of network elements of a network environment, the raw telemetry data being collected as time-series datasets (“Embodiments of the invention relate to learning an intelligent compression/decompression model for a specific purpose such that high compression rates are achieved and low error rates are maintained. Embodiments of the invention are discussed in the context of a task for predicting read and write response times based on historical telemetry data. The telemetry data thus reflects actual read and write response times in one example,” paragraph 0015; “More specifically, the process of compressing/decompressing telemetry data using an auto-encoder 100 as described in FIG. 1 operates by obtaining the input data 102 (e.g., the telemetry data). Next, the input data 102 is passed through a compressor (e.g., an encoder) fθ generating Z. Z is an example of the compressed data 106. The decompressor 108 (decoder) gθ can be used to decompress the compressed data 106 to generate the decompressed or reconstructed data 110,” paragraph 0021; “…data nodes or sources…may be operating as together (e.g. a system of appliances or virtual machines that are performing data protection operations),” paragraph 0029; “This invention leverages the availability of real telemetry data coming from different data nodes…,” paragraph 0036; since the telemetry data reflects actual read and write response times, it corresponds to “raw telemetry data being collected as time-series datasets”; “The input data came from the Sizer reporter database, a tool that allows field engineers to upload performance files from a customer site, in order to generate performance reports. The data are composed of 1 million 68-dimensional observations. Here, the efficiency of training a response time predictor with data previously compressed and decompressed by using the disclosed framework,” paragraph 0039; “A method, comprising receiving input data at a compressor from a plurality of data sources, wherein the compressor is configured to compress the data for a purpose associated with a predictor, generating compressed data by the compressor, transmitting the compressed data to a central node,” paragraph 0059; “obtaining the input data” corresponds to “collect raw telemetry data”; since the data nodes can be “operating as together”, “data nodes” correspond to “a network environment”; since input data is initially stored in a “compressor” and then transmitted to the central node, “compressor” corresponds to “a temporary database”; since the input data comes from “performance reports”, this corresponds to “a plurality of network elements”); and
compress the time-series datasets from the temporary database by deploying the time-series datasets as a Deep Neural Network (DNN) in the network environment itself (“More specifically, the process of compressing/decompressing telemetry data using an auto-encoder 100 as described in FIG. 1 operates by obtaining the input data 102 (e.g., the telemetry data). Next, the input data 102 is passed through a compressor (e.g., an encoder) fθ generating Z. Z is an example of the compressed data 106. The decompressor 108 (decoder) gθ can be used to decompress the compressed data 106 to generate the decompressed or reconstructed data 110,” paragraph 0021; “FIG. 1 thus illustrates an autoregressor 120 that includes the autoencoder 100. The autoregressor 120 jointly learns how to compress and predict (regress). In this example, the autoregressor 120 is a deep neural network that uses the result,” paragraph 0023; “FIG. 2 illustrates an example of a distributed autoencoder or autoregressor,” paragraph 0029; “A method, comprising receiving input data at a compressor from a plurality of data sources, wherein the compressor is configured to compress the data for a purpose associated with a predictor, generating compressed data by the compressor, transmitting the compressed data to a central node,” paragraph 0059; “obtaining the input data”; “input data 102 is passed through a compressor” corresponds to “compress the time-series datasets”; since the auto-encoder includes a compressor and the auto-encoder is part of an autoregressor which “is a deep neural network”, “autoregressor” corresponds to “Deep Neural Network (DNN)”; since the “autoregressor” is “distributed”, the autoregressor belongs “in the network environment itself”; since the compressor both temporarily stores and compresses the input data, this corresponds to “compress the time-series datasets from the temporary database”);
wherein the time-series datasets are configured to be substantially reconstructed from the DNN using predictive functionality of the DNN (“More specifically, the process of compressing/decompressing telemetry data using an auto-encoder 100 as described in FIG. 1 operates by obtaining the input data 102 (e.g., the telemetry data). Next, the input data 102 is passed through a compressor (e.g., an encoder) fθ generating Z. Z is an example of the compressed data 106. The decompressor 108 (decoder) gθ can be used to decompress the compressed data 106 to generate the decompressed or reconstructed data 110,” paragraph 0021; “FIG. 1 thus illustrates an autoregressor 120 that includes the autoencoder 100. The autoregressor 120 jointly learns how to compress and predict (regress). In this example, the autoregressor 120 is a deep neural network that uses the result,” paragraph 0023; since the autoencoder includes a decoder and the autoencoder is part of an autoregressor, “autoregressor” corresponds to “DNN”; “decompress the compressed data 106 to generate the…reconstructed data” corresponds to “time-series datasets are configured to be substantially reconstructed”; “The autoregressor 120 jointly learns how to…predict (regress)” corresponds to “using predictive functionality of the DNN”).
Ferreira does not appear to expressly teach a medium wherein deploying the time-series datasets as the DNN in the network environment itself includes applying the DNN to a host server, the host server configured to allow a query request of the DNN for data retrieval from the DNN at the host server.
Tarango teaches a medium wherein deploying the time-series datasets as the DNN (“Example machine learning techniques that device classification process 248 can employ may include, but are not limited to, … multi-layer perceptron (MLP) artificial neural networks (ANNs) (e.g., for non-linear models), replicating reservoir networks (e.g., for non-linear models, typically for time series), random forest classification, or the like,” paragraph 0037, emphases added) in the network environment itself includes applying the DNN to a host server (“Also as shown in FIG. 4 is a device classification service 408 that may be hosted on one or more of networking devices 406,” paragraph 0046), the host server configured to allow a query request of the DNN for data retrieval from the DNN at the host server (“In general, device classification service 408 is configured to take as input telemetry data 410 captured by networking device 406 regarding network traffic associated with endpoint device 402 and, based on the captured telemetry, identify the device type 412 of endpoint device 402,” paragraph 0046).
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the DNN of Ferreira to comprise the in-network deployment of Tarango. (1) The Examiner finds that the prior art included each claim element listed above, although not necessarily in a single prior art reference, with the only difference between the claimed invention and the prior art being the lack of actual combination of the elements in a single prior art reference. (2) The Examiner finds that one of ordinary skill in the art could have combined the elements as claimed by known development methods, and that in combination, each element merely performs the same function as it does separately. (3) The Examiner finds that one of ordinary skill in the art would have recognized that the results of the combination were predictable, namely deploying the DNN in-network (“Also as shown in FIG. 4 is a device classification service 408 that may be hosted on one or more of networking devices 406,” Tarango paragraph 0046). Therefore, the rationale to support a conclusion that the claim would have been obvious is that combining prior art elements according to known methods would have yielded predictable results to one of ordinary skill in the art. See MPEP § 2143(I)(A).
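By way of illustration only, and not as a characterization of the actual implementation of Ferreira or of the claimed invention, the encoder/decoder relationship quoted above (input data passed through a compressor fθ to produce Z, and a decompressor gθ producing reconstructed data) can be sketched as a very small linear auto-encoder. Every name, data value, and hyperparameter below is a hypothetical assumption chosen for the sketch; a deep neural network would use many more layers and non-linearities.

```python
# Illustrative sketch only: a tiny linear auto-encoder in plain Python.
# All names, sample values, and hyperparameters are hypothetical assumptions.

def encode(w, x):
    """Compressor f_theta: map a 2-value sample to a single latent value z."""
    return w[0] * x[0] + w[1] * x[1]


def decode(v, z):
    """Decompressor g_theta: rebuild an approximate 2-value sample from z."""
    return (v[0] * z, v[1] * z)


def mean_squared_error(w, v, data):
    """Average squared reconstruction error over all samples."""
    total = 0.0
    for x in data:
        x_hat = decode(v, encode(w, x))
        total += (x_hat[0] - x[0]) ** 2 + (x_hat[1] - x[1]) ** 2
    return total / len(data)


def train(data, steps=2000, lr=0.05):
    """Fit encoder weights w and decoder weights v by full-batch gradient descent."""
    w = [0.5, 0.5]
    v = [0.5, 0.5]
    n = len(data)
    for _ in range(steps):
        grad_w = [0.0, 0.0]
        grad_v = [0.0, 0.0]
        for x in data:
            z = encode(w, x)
            x_hat = decode(v, z)
            residual = (x_hat[0] - x[0], x_hat[1] - x[1])
            # Residual propagated back through the decoder to the latent value.
            back = v[0] * residual[0] + v[1] * residual[1]
            for i in range(2):
                grad_v[i] += 2 * residual[i] * z / n
                grad_w[i] += 2 * back * x[i] / n
        for i in range(2):
            w[i] -= lr * grad_w[i]
            v[i] -= lr * grad_v[i]
    return w, v


# Hypothetical telemetry: (read time, write time) pairs, with the write time
# assumed to be exactly twice the read time so one latent value suffices.
telemetry = [(s, 2 * s) for s in (0.1, 0.4, 0.7, 1.0, 0.6, 0.2)]

w, v = train(telemetry)
compressed = [encode(w, x) for x in telemetry]       # one value per sample
reconstructed = [decode(v, z) for z in compressed]   # two values per sample
```

In this sketch each two-value sample is stored as one latent value, and the decoder recovers an approximation of the original pair from that value alone.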
As to independent claim 19, Ferreira teaches a system comprising:
a processing device and a memory device configured to store computer logic having instructions that, when executed, enable the processing device (“Embodiment 11. A non-transitory storage medium having stored therein instructions that are executable by one or more hardware processors to perform the operations…,” paragraph 0069; “The embodiments disclosed herein may include the use of a special purpose or general-purpose computer including various computer hardware or software modules, as discussed in greater detail below. A computer may include a processor and computer storage media carrying instructions that, when executed by the processor and/or caused to be executed by the processor, perform any one or more of the methods disclosed herein…,” paragraph 0071; “instructions” correspond to “computer logic having instructions”) to
collect, in a temporary database, raw telemetry data from a plurality of network elements of a network environment, the raw telemetry data being collected as time-series datasets (“Embodiments of the invention relate to learning an intelligent compression/decompression model for a specific purpose such that high compression rates are achieved and low error rates are maintained. Embodiments of the invention are discussed in the context of a task for predicting read and write response times based on historical telemetry data. The telemetry data thus reflects actual read and write response times in one example,” paragraph 0015; “More specifically, the process of compressing/decompressing telemetry data using an auto-encoder 100 as described in FIG. 1 operates by obtaining the input data 102 (e.g., the telemetry data). Next, the input data 102 is passed through a compressor (e.g., an encoder) fθ generating Z. Z is an example of the compressed data 106. The decompressor 108 (decoder) gθ can be used to decompress the compressed data 106 to generate the decompressed or reconstructed data 110,” paragraph 0021; “…data nodes or sources…may be operating as together (e.g. a system of appliances or virtual machines that are performing data protection operations),” paragraph 0029; “This invention leverages the availability of real telemetry data coming from different data nodes…,” paragraph 0036; since the telemetry data reflects actual read and write response times, it corresponds to “raw telemetry data being collected as time-series datasets”; “The input data came from the Sizer reporter database, a tool that allows field engineers to upload performance files from a customer site, in order to generate performance reports. The data are composed of 1 million 68-dimensional observations. Here, the efficiency of training a response time predictor with data previously compressed and decompressed by using the disclosed framework,” paragraph 0039; “A method, comprising receiving input data at a compressor from a plurality of data sources, wherein the compressor is configured to compress the data for a purpose associated with a predictor, generating compressed data by the compressor, transmitting the compressed data to a central node,” paragraph 0059; “obtaining the input data” corresponds to “collect raw telemetry data”; since the data nodes can be “operating as together”, “data nodes” correspond to “a network environment”; since input data is initially stored in a “compressor” and then transmitted to the central node, “compressor” corresponds to “a temporary database”; since the input data comes from “performance reports”, this corresponds to “a plurality of network elements”), and
compress the time-series datasets from the temporary database by deploying the time-series datasets as a Deep Neural Network (DNN) in the network environment itself (“More specifically, the process of compressing/decompressing telemetry data using an auto-encoder 100 as described in FIG. 1 operates by obtaining the input data 102 (e.g., the telemetry data). Next, the input data 102 is passed through a compressor (e.g., an encoder) fθ generating Z. Z is an example of the compressed data 106. The decompressor 108 (decoder) gθ can be used to decompress the compressed data 106 to generate the decompressed or reconstructed data 110,” paragraph 0021; “FIG. 1 thus illustrates an autoregressor 120 that includes the autoencoder 100. The autoregressor 120 jointly learns how to compress and predict (regress). In this example, the autoregressor 120 is a deep neural network that uses the result,” paragraph 0023; “FIG. 2 illustrates an example of a distributed autoencoder or autoregressor,” paragraph 0029; “A method, comprising receiving input data at a compressor from a plurality of data sources, wherein the compressor is configured to compress the data for a purpose associated with a predictor, generating compressed data by the compressor, transmitting the compressed data to a central node,” paragraph 0059; “obtaining the input data”; “input data 102 is passed through a compressor” corresponds to “compress the time-series datasets”; since the auto-encoder includes a compressor and the auto-encoder is part of an autoregressor which “is a deep neural network”, “autoregressor” corresponds to “Deep Neural Network (DNN)”; since the “autoregressor” is “distributed”, the autoregressor belongs “in the network environment itself”; since the compressor both temporarily stores and compresses the input data, this corresponds to “compress the time-series datasets from the temporary database”),
wherein the time-series datasets are configured to be substantially reconstructed from the DNN using predictive functionality of the DNN (“More specifically, the process of compressing/decompressing telemetry data using an auto-encoder 100 as described in FIG. 1 operates by obtaining the input data 102 (e.g., the telemetry data). Next, the input data 102 is passed through a compressor (e.g., an encoder) fθ generating Z. Z is an example of the compressed data 106. The decompressor 108 (decoder) gθ can be used to decompress the compressed data 106 to generate the decompressed or reconstructed data 110,” paragraph 0021; “FIG. 1 thus illustrates an autoregressor 120 that includes the autoencoder 100. The autoregressor 120 jointly learns how to compress and predict (regress). In this example, the autoregressor 120 is a deep neural network that uses the result,” paragraph 0023; since the autoencoder includes a decoder and the autoencoder is part of an autoregressor, “autoregressor” corresponds to “DNN”; “decompress the compressed data 106 to generate the…reconstructed data” corresponds to “time-series datasets are configured to be substantially reconstructed”; “The autoregressor 120 jointly learns how to…predict (regress)” corresponds to “using predictive functionality of the DNN”).
Ferreira does not appear to expressly teach a system wherein deploying the time-series datasets as the DNN in the network environment itself includes applying the DNN to a host server, the host server configured to allow a query request of the DNN for data retrieval from the DNN at the host server.
Tarango teaches a system wherein deploying the time-series datasets as the DNN (“Example machine learning techniques that device classification process 248 can employ may include, but are not limited to, … multi-layer perceptron (MLP) artificial neural networks (ANNs) (e.g., for non-linear models), replicating reservoir networks (e.g., for non-linear models, typically for time series), random forest classification, or the like,” paragraph 0037, emphases added) in the network environment itself includes applying the DNN to a host server (“Also as shown in FIG. 4 is a device classification service 408 that may be hosted on one or more of networking devices 406,” paragraph 0046), the host server configured to allow a query request of the DNN for data retrieval from the DNN at the host server (“In general, device classification service 408 is configured to take as input telemetry data 410 captured by networking device 406 regarding network traffic associated with endpoint device 402 and, based on the captured telemetry, identify the device type 412 of endpoint device 402,” paragraph 0046).
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the DNN of Ferreira to comprise the in-network deployment of Tarango. (1) The Examiner finds that the prior art included each claim element listed above, although not necessarily in a single prior art reference, with the only difference between the claimed invention and the prior art being the lack of actual combination of the elements in a single prior art reference. (2) The Examiner finds that one of ordinary skill in the art could have combined the elements as claimed by known development methods, and that in combination, each element merely performs the same function as it does separately. (3) The Examiner finds that one of ordinary skill in the art would have recognized that the results of the combination were predictable, namely deploying the DNN in-network (“Also as shown in FIG. 4 is a device classification service 408 that may be hosted on one or more of networking devices 406,” Tarango paragraph 0046). Therefore, the rationale to support a conclusion that the claim would have been obvious is that combining prior art elements according to known methods would have yielded predictable results to one of ordinary skill in the art. See MPEP § 2143(I)(A).
As to independent claim 20, Ferreira teaches a method comprising the steps of:
collecting, in a temporary database, raw telemetry data from a plurality of network elements of a network environment, the raw telemetry data being collected as time-series datasets (“Embodiments of the invention relate to learning an intelligent compression/decompression model for a specific purpose such that high compression rates are achieved and low error rates are maintained. Embodiments of the invention are discussed in the context of a task for predicting read and write response times based on historical telemetry data. The telemetry data thus reflects actual read and write response times in one example,” paragraph 0015; “More specifically, the process of compressing/decompressing telemetry data using an auto-encoder 100 as described in FIG. 1 operates by obtaining the input data 102 (e.g., the telemetry data). Next, the input data 102 is passed through a compressor (e.g., an encoder) fθ generating Z. Z is an example of the compressed data 106. The decompressor 108 (decoder) gθ can be used to decompress the compressed data 106 to generate the decompressed or reconstructed data 110,” paragraph 0021; “…data nodes or sources…may be operating as together (e.g. a system of appliances or virtual machines that are performing data protection operations),” paragraph 0029; “This invention leverages the availability of real telemetry data coming from different data nodes…,” paragraph 0036; since the telemetry data reflects actual read and write response times, it corresponds to “raw telemetry data being collected as time-series datasets”; “The input data came from the Sizer reporter database, a tool that allows field engineers to upload performance files from a customer site, in order to generate performance reports. The data are composed of 1 million 68-dimensional observations. Here, the efficiency of training a response time predictor with data previously compressed and decompressed by using the disclosed framework,” paragraph 0039; “A method, comprising receiving input data at a compressor from a plurality of data sources, wherein the compressor is configured to compress the data for a purpose associated with a predictor, generating compressed data by the compressor, transmitting the compressed data to a central node,” paragraph 0059; “obtaining the input data” corresponds to “collect raw telemetry data”; since the data nodes can be “operating as together”, “data nodes” correspond to “a network environment”; since input data is initially stored in a “compressor” and then transmitted to the central node, “compressor” corresponds to “a temporary database”; since the input data comes from “performance reports”, this corresponds to “a plurality of network elements”); and
compressing the time-series datasets from the temporary database by deploying the time-series datasets as a Deep Neural Network (DNN) in the network environment itself (“More specifically, the process of compressing/decompressing telemetry data using an auto-encoder 100 as described in FIG. 1 operates by obtaining the input data 102 (e.g., the telemetry data). Next, the input data 102 is passed through a compressor (e.g., an encoder) fθ generating Z. Z is an example of the compressed data 106. The decompressor 108 (decoder) gθ can be used to decompress the compressed data 106 to generate the decompressed or reconstructed data 110,” paragraph 0021; “FIG. 1 thus illustrates an autoregressor 120 that includes the autoencoder 100. The autoregressor 120 jointly learns how to compress and predict (regress). In this example, the autoregressor 120 is a deep neural network that uses the result,” paragraph 0023; “FIG. 2 illustrates an example of a distributed autoencoder or autoregressor,” paragraph 0029; “A method, comprising receiving input data at a compressor from a plurality of data sources, wherein the compressor is configured to compress the data for a purpose associated with a predictor, generating compressed data by the compressor, transmitting the compressed data to a central node,” paragraph 0059; “obtaining the input data”; “input data 102 is passed through a compressor” corresponds to “compress the time-series datasets”; since the auto-encoder includes a compressor and the auto-encoder is part of an autoregressor which “is a deep neural network”, “autoregressor” corresponds to “Deep Neural Network (DNN)”; since the “autoregressor” is “distributed”, the autoregressor belongs “in the network environment itself”; since the compressor both temporarily stores and compresses the input data, this corresponds to “compress the time-series datasets from the temporary database”);
wherein the time-series datasets are configured to be substantially reconstructed from the DNN using predictive functionality of the DNN (“More specifically, the process of compressing/decompressing telemetry data using an auto-encoder 100 as described in FIG. 1 operates by obtaining the input data 102 (e.g., the telemetry data). Next, the input data 102 is passed through a compressor (e.g., an encoder) fθ generating Z. Z is an example of the compressed data 106. The decompressor 108 (decoder) gθ can be used to decompress the compressed data 106 to generate the decompressed or reconstructed data 110,” paragraph 0021; “FIG. 1 thus illustrates an autoregressor 120 that includes the autoencoder 100. The autoregressor 120 jointly learns how to compress and predict (regress). In this example, the autoregressor 120 is a deep neural network that uses the result,” paragraph 0023; since the autoencoder includes a decoder and the autoencoder is part of an autoregressor, “autoregressor” corresponds to “DNN”; “decompress the compressed data 106 to generate the…reconstructed data” corresponds to “time-series datasets are configured to be substantially reconstructed”; “The autoregressor 120 jointly learns how to…predict (regress)” corresponds to “using predictive functionality of the DNN”).
Ferreira does not appear to expressly teach a method wherein deploying the time-series datasets as the DNN in the network environment itself includes applying the DNN to a host server, the host server configured to allow a query request of the DNN for data retrieval from the DNN at the host server.
Tarango teaches a method wherein deploying the time-series datasets as the DNN (“Example machine learning techniques that device classification process 248 can employ may include, but are not limited to, … multi-layer perceptron (MLP) artificial neural networks (ANNs) (e.g., for non-linear models), replicating reservoir networks (e.g., for non-linear models, typically for time series), random forest classification, or the like,” paragraph 0037, emphases added) in the network environment itself includes applying the DNN to a host server (“Also as shown in FIG. 4 is a device classification service 408 that may be hosted on one or more of networking devices 406,” paragraph 0046), the host server configured to allow a query request of the DNN for data retrieval from the DNN at the host server (“In general, device classification service 408 is configured to take as input telemetry data 410 captured by networking device 406 regarding network traffic associated with endpoint device 402 and, based on the captured telemetry, identify the device type 412 of endpoint device 402,” paragraph 0046).
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the DNN of Ferreira to comprise the in-network deployment of Tarango. (1) The Examiner finds that the prior art included each claim element listed above, although not necessarily in a single prior art reference, with the only difference between the claimed invention and the prior art being the lack of actual combination of the elements in a single prior art reference. (2) The Examiner finds that one of ordinary skill in the art could have combined the elements as claimed by known development methods, and that in combination, each element merely performs the same function as it does separately. (3) The Examiner finds that one of ordinary skill in the art would have recognized that the results of the combination were predictable, namely deploying the DNN in-network (“Also as shown in FIG. 4 is a device classification service 408 that may be hosted on one or more of networking devices 406,” Tarango paragraph 0046). Therefore, the rationale to support a conclusion that the claim would have been obvious is that combining prior art elements according to known methods would have yielded predictable results to one of ordinary skill in the art. See MPEP § 2143(I)(A).
Claim 2 is rejected under 35 U.S.C. § 103 as being unpatentable over Ferreira in view of Tarango and Kolar et al. (US 2021/0160148 A1, hereinafter Kolar).
As to dependent claim 2, the rejection of claim 1 is incorporated.
Ferreira/Tarango does not appear to expressly teach a medium wherein the raw telemetry data is network data collected from a communications network, and wherein the raw telemetry data includes information related to one or more of packet count, latency, jitter, Signal-to-Noise Ratio (SNR), SNR estimates, state of polarization, Channel Quality Indicator (CQI) reports, and alarm states.
Kolar teaches a medium wherein the raw telemetry data is network data collected from a communications network, and wherein the raw telemetry data includes information related to one or more of packet count, latency, jitter, Signal-to-Noise Ratio (SNR), SNR estimates, state of polarization, Channel Quality Indicator (CQI) reports, and alarm states (“To predict a rare event in a network, the problem can be formulated as a binary classification problem where the output is to predict whether the rare event will occur in time t…In general, the telemetry data used to make the prediction can be divided into three classes…,” paragraph 0073; “Synchronous timeseries: These features are sampled at each timestep t such as CPU/memory usage, networking metrics such as loss and latency. Additional features aggregating instantaneous metrics over time windows can also be included in the analysis, in further embodiments,” paragraph 0075; “In some embodiments, the service may receive an indication from a networking device in the SD-WAN that the event has occurred. In further embodiments, the service may receive telemetry data from a networking device in the SD-WAN and determine, based on the telemetry data received,” paragraph 0128; “the telemetry data from a networking device” corresponds to “the raw telemetry data is network data collected from a communications network”; “features…such as…latency” corresponds to “information related to one or more of…latency”).
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the telemetry of Ferreira/Tarango to comprise the data of Kolar. (1) The Examiner finds that the prior art included each claim element listed above, although not necessarily in a single prior art reference, with the only difference between the claimed invention and the prior art being the lack of actual combination of the elements in a single prior art reference. (2) The Examiner finds that one of ordinary skill in the art could have combined the elements as claimed by known development methods, and that in combination, each element merely performs the same function as it does separately. (3) The Examiner finds that one of ordinary skill in the art would have recognized that the results of the combination were predictable, namely collecting latency telemetry (“Synchronous timeseries: These features are sampled at each timestep t such as CPU/memory usage, networking metrics such as loss and latency. Additional features aggregating instantaneous metrics over time windows can also be included in the analysis, in further embodiments,” Kolar paragraph 0075). Therefore, the rationale to support a conclusion that the claim would have been obvious is combining prior art elements according to known methods to yield predictable results to one of ordinary skill in the art. See MPEP § 2143(I)(A).
Claims 3-4 are rejected under 35 U.S.C. § 103 as being unpatentable over Ferreira in view of Tarango and Tinawi (“Machine Learning for Time Series Anomaly Detection”).
As to dependent claim 3, the rejection of claim 1 is incorporated. Ferreira/Tarango further teaches a medium wherein the instructions further enable the processing system to compress the time-series datasets by: training the DNN to adjust weights until a desired compression ratio or precision is achieved (“The loss function L 114 is used to adjust the weights of the auto-encoder 100,” Ferreira paragraph 0026; “Thus, the autoregressor 120, in addition to managing the compression/decompression…may incorporate a loss function 114. By incorporating the loss function 114 into the weights of the compressor 104…the compression is performed with a specific task or purpose in mind,” Ferreira paragraph 0027; “…. retraining the…compressor when sufficient compressed data is received from the plurality of data sources or when a prediction error rate exceeds a threshold,” Ferreira paragraph 0065; “compressor” corresponds to “compress the time-series datasets”; “adjust the weights of the auto-encoder” and “retraining the…compressor” correspond to “training the DNN to adjust weights” since the compressor is part of the auto-encoder; since retraining of the compressor occurs as long as the prediction error rate exceeds a threshold, no retraining occurs if the prediction error rate does not exceed that threshold [e.g., the error rate is at or below the threshold], therefore this reads onto “training…until a desired…precision is achieved”).
Ferreira/Tarango does not appear to expressly teach a medium wherein the instructions further enable the processing system to compress the time-series datasets by:
dividing the raw telemetry data into equal-sized chunks of time; and
feeding indices as inputs to the DNN and obtaining the equal-sized chunks of time as outputs from the DNN.
Tinawi teaches a medium wherein the instructions further enable the processing system to compress the time-series datasets by:
dividing the raw telemetry data into equal-sized chunks of time (“[Figure: media_image1.png] … Rolling window sequences is a method that is commonly used to prepare time series data for building models for forecasting. 𝑂𝑢𝑡_𝑋 acts as the input to the forecasting problem, giving the model context for the previous values,” Section 4.2.2; “X” corresponds to “the raw telemetry data”; “window_size” corresponds to “equal-sized chunks of time”; “out_X” corresponds to “dividing the raw telemetry data into equal-sized chunks of time”); and
feeding indices as inputs to the DNN and obtaining the equal-sized chunks of time as outputs from the DNN (“[Figure: media_image1.png] … Rolling window sequences is a method that is commonly used to prepare time series data for building models for forecasting. 𝑂𝑢𝑡_𝑋 acts as the input to the forecasting problem, giving the model context for the previous values. The model then uses the values in the sequence to predict the next 𝑡𝑎𝑟𝑔𝑒𝑡_𝑠𝑖𝑧𝑒 values,” Section 4.2.2; “In our setup, we implemented around 10 models, and we have selected 4 models to carry out our analysis. The models are: multilayer perceptron, stacked LSTM model, LSTM encoder-decoder, and linear autoregressive model,” Section 4.3; since “X_index” is associated with each out_X value, it corresponds to “feeding indices as inputs to the DNN”; “out_y…based on target_size” corresponds to “obtaining the equal-sized chunks of time as outputs from the DNN”; “LSTM encoder-decoder” corresponds to a “DNN”).
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the DNN of Ferreira/Tarango to comprise the rolling window technique of Tinawi. (1) The Examiner finds that the prior art included each claim element listed above, although not necessarily in a single prior art reference, with the only difference between the claimed invention and the prior art being the lack of actual combination of the elements in a single prior art reference. (2) The Examiner finds that one of ordinary skill in the art could have combined the elements as claimed by known development methods, and that in combination, each element merely performs the same function as it does separately. (3) The Examiner finds that one of ordinary skill in the art would have recognized that the results of the combination were predictable, namely reconstructing the original time-series data more closely (Tinawi Section 4.5). Therefore, the rationale to support a conclusion that the claim would have been obvious is combining prior art elements according to known methods to yield predictable results to one of ordinary skill in the art. See MPEP § 2143(I)(A).
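For illustration only, the rolling-window preparation quoted from Tinawi Section 4.2.2 may be sketched as follows. The sketch is not reproduced from Tinawi; the function name rolling_windows, the list-based layout, and the use of the window start position as X_index are assumptions made here solely to show how a series X could be divided into equal-sized windows (out_X) with associated indices and targets (out_y).

```python
def rolling_windows(X, window_size, target_size):
    """Split series X into (X_index, out_X, out_y) rolling-window triples.

    The variable names mirror those in the quoted passage; the exact
    layout used by the reference is an assumption of this sketch.
    """
    X_index, out_X, out_y = [], [], []
    # Last start position that still leaves room for a window and its target.
    last_start = len(X) - window_size - target_size
    for start in range(last_start + 1):
        X_index.append(start)  # hypothetical index: window start position
        out_X.append(X[start:start + window_size])
        out_y.append(X[start + window_size:start + window_size + target_size])
    return X_index, out_X, out_y


series = [10, 11, 12, 13, 14, 15]
idx, windows, targets = rolling_windows(series, window_size=3, target_size=1)
```

Under these assumptions every entry of out_X has the same length (window_size), which is the sense in which the windows are equal-sized.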
As to dependent claim 4, the rejection of claim 1 is incorporated. Ferreira/Tarango further teaches a medium wherein the instructions further enable the processing system to substantially reconstruct the time-series datasets by: propagating values through the DNN to substantially decompress the desired time-series dataset at the output of the DNN (“More specifically, the process of compressing/decompressing telemetry data using an auto-encoder 100 as described in FIG. 1 operates by obtaining the input data 102 (e.g., the telemetry data). Next, the input data 102 is passed through a compressor (e.g., an encoder) fθ generating Z. Z is an example of the compressed data 106. The decompressor 108 (decoder) gθ can be used to decompress the compressed data 106 to generate the decompressed or reconstructed data 110,” Ferreira paragraph 0021; “FIG. 1 thus illustrates an autoregressor 120 that includes the autoencoder 100. The autoregressor 120 jointly learns how to compress and predict (regress). In this example, the autoregressor 120 is a deep neural network that uses the result,” Ferreira paragraph 0023; since the autoencoder includes a decoder and the autoencoder is part of an autoregressor, “autoregressor” corresponds to “DNN”; “predict (regress)” corresponds to “decompress…at the output of the DNN”).
Ferreira/Tarango does not appear to expressly teach a medium wherein the instructions further enable the processing system to substantially reconstruct the time-series datasets by:
receiving an index corresponding to a desired time range associated with a desired time-series dataset; and
inputting the index to the trained DNN.
Tinawi teaches a medium wherein the instructions further enable the processing system to substantially reconstruct the time-series datasets by:
receiving an index corresponding to a desired time range associated with a desired time-series dataset (“[Figure: media_image1.png] … Rolling window sequences is a method that is commonly used to prepare time series data for building models for forecasting. 𝑂𝑢𝑡_𝑋 acts as the input to the forecasting problem, giving the model context for the previous values. The model then uses the values in the sequence to predict the next 𝑡𝑎𝑟𝑔𝑒𝑡_𝑠𝑖𝑧𝑒 values,” Section 4.2.2; “In our setup, we implemented around 10 models, and we have selected 4 models to carry out our analysis. The models are: multilayer perceptron, stacked LSTM model, LSTM encoder-decoder, and linear autoregressive model,” Section 4.3; “out_X” corresponds to “a desired time-series dataset”; “X_index” corresponds to “an index corresponding to a desired time range associated with a desired time-series dataset”); and
inputting the index to the trained DNN (“[Figure: media_image1.png] … Rolling window sequences is a method that is commonly used to prepare time series data for building models for forecasting. 𝑂𝑢𝑡_𝑋 acts as the input to the forecasting problem, giving the model context for the previous values. The model then uses the values in the sequence to predict the next 𝑡𝑎𝑟𝑔𝑒𝑡_𝑠𝑖𝑧𝑒 values,” Section 4.2.2; “In our setup, we implemented around 10 models, and we have selected 4 models to carry out our analysis. The models are: multilayer perceptron, stacked LSTM model, LSTM encoder-decoder, and linear autoregressive model,” Section 4.3; since the model uses “X_index” to generate “out_y”, “X_index” corresponds to “inputting the index”).
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the DNN of Ferreira/Tarango to comprise the rolling window technique of Tinawi. (1) The Examiner finds that the prior art included each claim element listed above, although not necessarily in a single prior art reference, with the only difference between the claimed invention and the prior art being the lack of actual combination of the elements in a single prior art reference. (2) The Examiner finds that one of ordinary skill in the art could have combined the elements as claimed by known development methods, and that in combination, each element merely performs the same function as it does separately. (3) The Examiner finds that one of ordinary skill in the art would have recognized that the results of the combination were predictable, namely reconstructing the original time-series data more closely (Tinawi Section 4.5). Therefore, the rationale to support a conclusion that the claim would have been obvious is combining prior art elements according to known methods to yield predictable results to one of ordinary skill in the art. See MPEP § 2143(I)(A).
Claim 5 is rejected under 35 U.S.C. § 103 as being unpatentable over Ferreira in view of Tarango and Komaragiri (“Machine Learning Models for Predictive Maintenance and Performance Optimization in Telecom Infrastructure”).
As to dependent claim 5, the rejection of claim 1 is incorporated.
Ferreira/Tarango does not appear to expressly teach a medium wherein a telemetry device is configured to prune the raw telemetry data before transmitting the raw telemetry data for collection.
Komaragiri teaches a medium wherein a telemetry device is configured to prune the raw telemetry data before transmitting the raw telemetry data for collection (“As a key step for most machine learning models, data collection and data preprocessing will determine the quality of these models…Two types of data are acquired for this study. The first type includes telemetry data that are generated at a base station and/or a cell level every five minutes…Data preprocessing steps consist of cleaning, filtering, and merging the collected telemetry and alarm data to train predictive models. Missing values account for the most data quality issues in this data. Each data recording and value is associated with a clock-time stamped time on a base 10 min granularity. In a day, deleting a recording means losing values for each of its monitored metrics for 288 10 minute blocks. However, when telemetry data recording is missed, an alarm can also be deemed invalid. Thus, there are too many recordings and pruning invalid ones is a desired preprocessing step. Another anticipated data quality issue that needs to be handled is the outliers, specifically in the form of negative numbers in telemetry metrics. This is a domain specific issue, since models are trained based on what insights were gained from the data. The telemetry data to build the baseline models must be decided…in the preprocessing pipeline,” Section 6; “base station and/or a cell level” corresponds to “a telemetry device”; “pruning invalid” data records corresponds to “prune the raw telemetry data”; since the data preprocessing step occurs before the data is passed to the machine learning model, “the telemetry data to build the baseline models must be decided…in the preprocessing pipeline” corresponds to “before transmitting the raw telemetry data for collection”).
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the DNN of Ferreira/Tarango to comprise the preprocessing of Komaragiri. (1) The Examiner finds that the prior art included each claim element listed above, although not necessarily in a single prior art reference, with the only difference between the claimed invention and the prior art being the lack of actual combination of the elements in a single prior art reference. (2) The Examiner finds that one of ordinary skill in the art could have combined the elements as claimed by known development methods, and that in combination, each element merely performs the same function as it does separately. (3) The Examiner finds that one of ordinary skill in the art would have recognized that the results of the combination were predictable, namely improving the performance quality of the machine learning model (Komaragiri Section 6). Therefore, the rationale to support a conclusion that the claim would have been obvious is combining prior art elements according to known methods to yield predictable results to one of ordinary skill in the art. See MPEP § 2143(I)(A).
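For illustration only, the preprocessing described in the quoted passage of Komaragiri Section 6 (pruning invalid recordings and handling negative-valued outliers) may be sketched as follows. The record layout (a dictionary with a "timestamp" key and numeric metric keys), the metric names, and the function name prune_records are hypothetical and are not taken from Komaragiri; the choice to drop, rather than repair, a record containing a negative value is likewise an assumption of this sketch.

```python
import math


def prune_records(records):
    """Drop telemetry records that have a missing or negative metric value.

    Each record is assumed (hypothetically) to be a dict with a
    'timestamp' key plus one numeric value per monitored metric.
    """
    kept = []
    for rec in records:
        metrics = [v for k, v in rec.items() if k != "timestamp"]
        # A value counts as missing if it is None or a float NaN.
        has_missing = any(
            v is None or (isinstance(v, float) and math.isnan(v))
            for v in metrics
        )
        # Negative values are treated as invalid outliers in this sketch.
        has_negative = any(v is not None and v < 0 for v in metrics)
        if not has_missing and not has_negative:
            kept.append(rec)
    return kept


raw = [
    {"timestamp": 0, "latency_ms": 12.5, "packet_count": 100},
    {"timestamp": 5, "latency_ms": None, "packet_count": 98},
    {"timestamp": 10, "latency_ms": -3.0, "packet_count": 101},
    {"timestamp": 15, "latency_ms": 11.9, "packet_count": 97},
]
pruned = prune_records(raw)
```

In this sketch the pruning step runs before any record is handed on, so only the first and last records would be forwarded.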
Claim 6 is rejected under 35 U.S.C. § 103 as being unpatentable over Ferreira in view of Tarango, Komaragiri, and Wang (CN 112906853 A).
As to dependent claim 6, the rejection of claim 5 is incorporated. Ferreira/Tarango/Komaragiri further teaches a medium wherein the instructions further enable the processing system to:
detect a quality factor of a data decompression process related to reconstruction (“The model or the compressor 104 learns through two error signals: an error reconstruction metric of the data (Lc) and a prediction error metric (Lh). The autoencoder 100 shown in FIG. 1 operates as previously described. In addition, the compressed data 106 follows two paths. The compressed data 106 is used as the input of the compressor decoder (gθ(Z)) as previously described and is also used for training a prediction model (h(Z)). The quality of these tasks are measured using the following loss function: L(Lc, Lh) = aLc + bLh. In one example, Lc measures the quality of the decompressed or reconstructed data 110 (i.e., the amount of error introduced into the data by using a lossy compression),” Ferreira paragraphs 0025-0026; “Lc measures the quality of the decompressed or reconstructed data” corresponds to “a quality factor of a data decompression process related to reconstruction”); and
provide a feedback signal to change the parameters of a pruning process associated with the telemetry device in order to reduce a reconstruction error (“As a key step for most machine learning models, data collection and data preprocessing will determine the quality of these models…Two types of data are acquired for this study. The first type includes telemetry data that are generated at a base station and/or a cell level every five minutes…Data preprocessing steps consist of cleaning, filtering, and merging the collected telemetry and alarm data to train predictive models. Missing values account for the most data quality issues in this data. Each data recording and value is associated with a clock-time stamped time on a base 10 min granularity. In a day, deleting a recording means losing values for each of its monitored metrics for 288 10 minute blocks. However, when telemetry data recording is missed, an alarm can also be deemed invalid. Thus, there are too many recordings and pruning invalid ones is a desired preprocessing step. Another anticipated data quality issue that needs to be handled is the outliers, specifically in the form of negative numbers in telemetry metrics. This is a domain specific issue, since models are trained based on what insights were gained from the data. The telemetry data to build the baseline models must be decided…in the preprocessing pipeline,” Komaragiri Section 6; “pruning invalid” data records corresponds to “a pruning process”; “More specifically, the process of compressing/decompressing telemetry data using an auto-encoder 100 as described in FIG. 1 operates by obtaining the input data 102 (e.g., the telemetry data). Next, the input data 102 is passed through a compressor (e.g., an encoder) fθ generating Z. Z is an example of the compressed data 106. The decompressor 108 (decoder) gθ can be used to decompress the compressed data 106 to generate the decompressed or reconstructed data 110,” Ferreira paragraph 0021; “The model or the compressor 104 learns through two error signals: an error reconstruction metric of the data (Lc) and a prediction error metric (Lh). The autoencoder 100 shown in FIG. 1 operates as previously described. In addition, the compressed data 106 follows two paths. The compressed data 106 is used as the input of the compressor decoder (gθ(Z)) as previously described and is also used for training a prediction model (h(Z)). The quality of these tasks are measured using the following loss function: L(Lc, Lh) = aLc + bLh. In one example, Lc measures the quality of the decompressed or reconstructed data 110 (i.e., the amount of error introduced into the data by using a lossy compression), Lh measures the quality of predictions, a is the weight of the compression loss, and b is the weight of the prediction loss, such that a+b=1. The loss function L 114 is used to adjust the weights of the auto-encoder 100. In one example, a and b may be set by default. Initially, for example, a=b=0.5,” Ferreira paragraphs 0025-0026; “…. retraining the…compressor when sufficient compressed data is received from the plurality of data sources or when a prediction error rate exceeds a threshold,” Ferreira paragraph 0065; “error signals” correspond to “provide a feedback signal”; “the loss function…is used to adjust the weights of the auto-encoder” corresponds to “change the parameters”; since the input data is telemetry data, the error signals, loss function and adjusting the weights are “associated with the telemetry device”; since retraining occurs as long as the prediction error exceeds a threshold, this means that once the error is below the threshold, retraining no longer occurs, thus the retraining occurs “in order to reduce a reconstruction error”).
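For illustration only, the weighted loss quoted from Ferreira paragraphs 0025-0026, L(Lc, Lh) = aLc + bLh with a + b = 1 and default a = b = 0.5, together with the error-rate retraining trigger quoted from Ferreira paragraph 0065, may be sketched as follows. The function names combined_loss and needs_retraining are hypothetical, and only the error-rate condition of the retraining trigger is modeled; the "sufficient compressed data" condition is omitted from this sketch.

```python
def combined_loss(l_c, l_h, a=0.5, b=0.5):
    """Weighted sum of reconstruction loss l_c and prediction loss l_h.

    Follows the quoted form L(Lc, Lh) = a*Lc + b*Lh with a + b = 1.
    """
    if abs((a + b) - 1.0) > 1e-9:
        raise ValueError("weights a and b must sum to 1")
    return a * l_c + b * l_h


def needs_retraining(prediction_error_rate, threshold):
    """Return True only while the error rate exceeds the threshold."""
    return prediction_error_rate > threshold


loss = combined_loss(0.2, 0.4)           # default weights a = b = 0.5
retrain_now = needs_retraining(0.10, 0.05)
retrain_later = needs_retraining(0.05, 0.05)
```

Under these assumptions retraining stops once the error rate is at or below the threshold, which is the reading applied in the mapping above.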
Ferreira/Tarango/Komaragiri does not appear to expressly teach a medium wherein the instructions further enable the processing system to: adjust pruning level of the pruning process using Reinforcement Learning.
Wang teaches a medium wherein the instructions further enable the processing system to: adjust pruning level of the pruning process using Reinforcement Learning (“In an embodiment of the present application, by extracting the feature information of each layer, generating the feature vectors of each corresponding layer, and sending them into the decision model based on the continuous iterative optimization of the reinforcement learning mechanism, the channel pruning ratio of each layer is generated, and finally the pruning ratio decision of each layer of the model is determined, and the pruning strategy of the entire model is generated. In this way, by relying on the reinforcement learning of the Actor-Critic algorithm and the design of the feedback function, a channel pruning strategy model is constructed to adaptively and dynamically find the optimal channel pruning ratio. The embodiment of the present application provides a self -learning mechanism that automatically learns decision pruning, avoids traditional manual experience and improves the universality of different models,” paragraph 0067; “based on the continuous iterative optimization of the reinforcement learning mechanism, the channel pruning ratio of each layer is generated” corresponds to “adjust pruning level…using Reinforcement Learning”; “pruning strategy” corresponds to “pruning process”).
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the pruning of Ferreira/Tarango/Komaragiri to comprise the reinforcement learning of Wang. (1) The Examiner finds that the prior art included each claim element listed above, although not necessarily in a single prior art reference, with the only difference between the claimed invention and the prior art being the lack of actual combination of the elements in a single prior art reference. (2) The Examiner finds that one of ordinary skill in the art could have combined the elements as claimed by known development methods, and that in combination, each element merely performs the same function as it does separately. (3) The Examiner finds that one of ordinary skill in the art would have recognized that the results of the combination were predictable, namely adjusting pruning using reinforcement learning (“In an embodiment of the present application, by extracting the feature information of each layer, generating the feature vectors of each corresponding layer, and sending them into the decision model based on the continuous iterative optimization of the reinforcement learning mechanism, the channel pruning ratio of each layer is generated, and finally the pruning ratio decision of each layer of the model is determined, and the pruning strategy of the entire model is generated. In this way, by relying on the reinforcement learning of the Actor-Critic algorithm and the design of the feedback function, a channel pruning strategy model is constructed to adaptively and dynamically find the optimal channel pruning ratio. The embodiment of the present application provides a self-learning mechanism that automatically learns decision pruning, avoids traditional manual experience and improves the universality of different models,” Wang paragraph 0067; “based on the continuous iterative optimization of the reinforcement learning mechanism, the channel pruning ratio of each layer is generated” corresponds to “adjust pruning level…using Reinforcement Learning”; “pruning strategy” corresponds to “pruning process”). Therefore, the rationale to support a conclusion that the claim would have been obvious is combining prior art elements according to known methods to yield predictable results to one of ordinary skill in the art. See MPEP § 2143(I)(A).
Claims 8-10 are rejected under 35 U.S.C. § 103 as being unpatentable over Ferreira in view of Tarango, Song et al. (US 2019/0034497 A1, hereinafter Song), and Li et al. (“Anomaly Detection of Time Series With Smoothness-Inducing Sequential Variational Auto-Encoder,” hereinafter Li).
As to dependent claim 8, the rejection of claim 1 is incorporated.
Ferreira/Tarango does not appear to expressly teach a medium wherein the instructions further enable the processing system to create the DNN with indices and relationships between each index and a respective time-series dataset, and wherein, in response to receiving an index for a query request.
Song teaches a medium wherein the instructions further enable the processing system to create the DNN with indices and relationships between each index and a respective time-series dataset, and wherein, in response to receiving an index for a query request (“In the exemplary embodiments of the present invention, methods and devices are provided for employing a Data2Data engine or module to perform efficient multivariate time series retrieval with respect to large scale historical data (located in a history database). In the training stage, given input multivariate time series segments, an input attention based recurrent neural network (LSTM/GRU) can be employed to extract real value features as well as hash codes (for indexing) supervised by a pairwise loss or a triplet loss. Both real value features and their corresponding hash codes are jointly learned in an end-to-end manner in the deep neural networks. In the test stage, given a multivariate time series segment query, the Data2Data engine or module can automatically generate relevant real value features as well as hash codes of the query and return the most relevant time series segments in the historical data,” paragraph 0019; “In the exemplary embodiments of the present invention, methods and devices are provided for capturing the long-term temporal dependencies of multivariate time series by employing an input attention based LSTM/GRU algorithm. The method can provide effective and compact (higher quality) representations of multivariate time series segments, can generate discriminative binary codes (more effective) for indexing multivariate time segments, and given a query time series segment…,” paragraph 0020; “extract real value features as well as hash codes” and “real value features and their corresponding hash codes are jointly learned…in the deep neural networks” correspond to “create the DNN with indices and relationships between each index”; “time series segments in historical data” corresponds to “respective time-series dataset”; “given a query time series segment” corresponds to “in response to receiving an index for a query request”).
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the DNN of Ferreira/Tarango to comprise the indexing scheme of Song. (1) The Examiner finds that the prior art included each claim element listed above, although not necessarily in a single prior art reference, with the only difference between the claimed invention and the prior art being the lack of actual combination of the elements in a single prior art reference. (2) The Examiner finds that one of ordinary skill in the art could have combined the elements as claimed by known development methods, and that in combination, each element merely performs the same function as it does separately. (3) The Examiner finds that one of ordinary skill in the art would have recognized that the results of the combination were predictable, namely accessing information quickly and efficiently (Song paragraph 0038). Therefore, the rationale to support a conclusion that the claim would have been obvious is combining prior art elements according to known methods to yield predictable results to one of ordinary skill in the art. See MPEP § 2143(I)(A).
Ferreira/Tarango/Song does not appear to expressly teach a medium wherein the DNN is configured to substantially reconstruct a time-series bucket related to the index.
Li teaches a medium wherein the DNN is configured to substantially reconstruct a time-series bucket related to the index (“To solve this problem, we divide the long time series into short chunks, using sliding windows technique. We apply a sliding window to the time series, which slides over multiple time series synchronously. The sliding windows is [Figure: media_image2.png],” Section IV; “d” corresponds to “the index”; hence “Xd” corresponds to “a time-series bucket related to the index”; since the model is decoding each x in “Xd”, it corresponds to “reconstruct a time-series bucket related to the index”).
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the DNN of Ferreira/Tarango/Song to comprise the reconstruction of each time-series bucket of Li. (1) The Examiner finds that the prior art included each claim element listed above, although not necessarily in a single prior art reference, with the only difference between the claimed invention and the prior art being the lack of actual combination of the elements in a single prior art reference. (2) The Examiner finds that one of ordinary skill in the art could have combined the elements as claimed by known development methods, and that in combination, each element merely performs the same function as it does separately. (3) The Examiner finds that one of ordinary skill in the art would have recognized that the results of the combination were predictable, namely robust estimation and detection for correlated time-series datasets (Li Section IV). Therefore, the rationale to support a conclusion that the claim would have been obvious is combining prior art elements according to known methods to yield predictable results to one of ordinary skill in the art. See MPEP § 2143(I)(A).
As to dependent claim 9, the rejection of claim 8 is incorporated. Ferreira/Tarango/Song/Li further teaches a medium wherein creating the DNN includes:
picking the indices randomly or according to a pattern (“In the exemplary embodiments of the present invention, methods and devices are provided for employing a Data2Data engine or module to perform efficient multivariate time series retrieval with respect to large scale historical data (located in a history database). In the training stage, given input multivariate time series segments, an input attention based recurrent neural network (LSTM/GRU) can be employed to extract real value features as well as hash codes (for indexing) supervised by a pairwise loss or a triplet loss,” Song paragraph 0019; “At block 128, hash codes are obtained by utilizing tanh() and sign() function,” Song paragraph 0034; “hash codes are obtained by utilizing tanh() and sign() function” corresponds to “picking the indices…according to a pattern”); and/or
determining the indices with respect to a bottleneck of an autoencoder.
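For illustration only, the derivation of hash codes by a tanh() function followed by a sign() function, as quoted from Song paragraph 0034, may be sketched as follows. The function name hash_code, the example feature values, and the convention of mapping a zero input to +1 are assumptions made here; Song's network, learned features, and loss functions are not reproduced.

```python
import math


def hash_code(features):
    """Binarize real-valued features: squash with tanh, then take the sign.

    Returns a list of +1/-1 entries. Mapping an exact zero to +1 is an
    assumption of this sketch, not something stated in the reference.
    """
    code = []
    for x in features:
        squashed = math.tanh(x)  # bounded to the interval (-1, 1)
        code.append(1 if squashed >= 0 else -1)
    return code


# Hypothetical real-valued features for one time series segment.
segment_features = [0.7, -1.2, 0.0, 3.0]
code = hash_code(segment_features)
```

Because the same deterministic rule is applied to every segment, the resulting codes follow a fixed pattern that could serve as index keys under the assumptions stated above.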
As to dependent claim 10, the rejection of claim 8 is incorporated. Ferreira/Tarango/Song/Li further teaches a medium wherein creating the DNN includes:
forming multiple dense layers; and/or
using a decoder of an autoencoder (“More specifically, the process of compressing/decompressing telemetry data using an auto-encoder 100 as described in FIG. 1 operates by obtaining the input data 102 (e.g., the telemetry data). Next, the input data 102 is passed through a compressor (e.g., an encoder) fθ generating Z. Z is an example of the compressed data 106. The decompressor 108 (decoder) gθ can be used to decompress the compressed data 106 to generate the decompressed or reconstructed data 110,” Ferreira paragraph 0021; “FIG. 1 thus illustrates an autoregressor 120 that includes the autoencoder 100. The autoregressor 120 jointly learns how to compress and predict (regress). In this example, the autoregressor 120 is a deep neural network that uses the result,” Ferreira paragraph 0023; since the autoencoder includes a decoder and the decoder is part of an autoregressor, “autoregressor” corresponds to “autoencoder”).
Claims 11-13 are rejected under 35 U.S.C. § 103 as being unpatentable over Ferreira in view of Tarango and Fajtl et al. (“Latent Bernoulli Autoencoder,” hereinafter Fajtl).
As to dependent claim 11, the rejection of claim 1 is incorporated. Ferreira/Tarango further teaches a medium comprising compressing the time-series datasets (“Embodiments of the invention relate to learning an intelligent compression / decompression model for a specific purpose such that high compression rates are achieved and low error rates are maintained. Embodiments of the invention are discussed in the context of a task for predicting read and write response times based on historical telemetry data. The telemetry data thus reflects actual read and write response times in one example,” Ferreira paragraph 0015; “More specifically, the process of compressing / decompressing telemetry data using an auto - encoder 100 as described in FIG. 1 operates by obtaining the input data 102 (e.g., the telemetry data). Next, the input data 102 is passed through a compressor (e.g., an encoder) fθ generating Z. Z is an example of the compressed data 106. The decompressor 108 (decoder) gθ can be used to decompress the compressed data 106 to generate the decompressed or reconstructed data 110,” Ferreira paragraph 0021; “….data nodes or sources…may be operating as together (e.g. a system of appliances or virtual machines that are performing data protection operations),” Ferreira paragraph 0029; “This invention leverages the availability of real telemetry data coming from different data nodes…,” Ferreira paragraph 0036; since the telemetry data reflects actual read and write response times, it contains “time series datasets”).
Ferreira/Tarango does not appear to expressly teach a medium wherein compressing the [] datasets includes compressing the time-series datasets at multiple different compression rates and at different precisions depending on a size of a numeric value included in the different [] datasets.
Fajtl teaches a medium wherein compressing the [] datasets includes compressing the time-series datasets at multiple different compression rates and at different precisions depending on a size of a numeric value included in the different [] datasets (“[image: media_image3.png] [image: media_image4.png],” Section 3; “We propose a simple, closed form method for sampling from the Bernoulli latent space as well as to perform a smooth interpolation and attribute modification in this space,” Section 6; since the autoencoder is used for sampling from a Bernoulli latent space, the “deterministic autoencoder” corresponds to “a Bernoulli Transformer AutoEncoder (BTAE)”; “produces typically real-valued latent representation z for input X” corresponds to “compress the…datasets into Bernoulli distributed latent states”; “add tanh() before the binarization which limits the gradient flow” in order to reduce noise and speed up training corresponds to “constrain the distortion of reconstructed…datasets”).
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the autoencoder of Ferreira/Tarango to comprise the BTAE autoencoder of Fajtl. (1) The Examiner finds that the prior art included each claim element listed above, although not necessarily in a single prior art reference, with the only difference between the claimed invention and the prior art being the lack of actual combination of the elements in a single prior art reference. (2) The Examiner finds that one of ordinary skill in the art could have combined the elements as claimed by known development methods, and that in combination, each element merely performs the same function as it does separately. (3) The Examiner finds that one of ordinary skill in the art would have recognized that the results of the combination were predictable, namely constraining noise (Fajtl Section 3). Therefore, the rationale to support a conclusion that the claim would have been obvious is that combining prior art elements according to known methods would have yielded predictable results to one of ordinary skill in the art. See MPEP § 2143(I)(A).
As to dependent claim 12, the rejection of claim 1 is incorporated. Ferreira/Tarango further teaches a medium comprising compressing the time-series datasets (“Embodiments of the invention relate to learning an intelligent compression / decompression model for a specific purpose such that high compression rates are achieved and low error rates are maintained. Embodiments of the invention are discussed in the context of a task for predicting read and write response times based on historical telemetry data. The telemetry data thus reflects actual read and write response times in one example,” Ferreira paragraph 0015; “More specifically, the process of compressing / decompressing telemetry data using an auto - encoder 100 as described in FIG. 1 operates by obtaining the input data 102 (e.g., the telemetry data). Next, the input data 102 is passed through a compressor (e.g., an encoder) fθ generating Z. Z is an example of the compressed data 106. The decompressor 108 (decoder) gθ can be used to decompress the compressed data 106 to generate the decompressed or reconstructed data 110,” Ferreira paragraph 0021; “….data nodes or sources…may be operating as together (e.g. a system of appliances or virtual machines that are performing data protection operations),” Ferreira paragraph 0029; “This invention leverages the availability of real telemetry data coming from different data nodes…,” Ferreira paragraph 0036; since the telemetry data reflects actual read and write response times, it contains “time series datasets”).
Ferreira/Tarango does not appear to expressly teach a medium wherein compressing the [] datasets includes increasing the compression rate by using a Bernoulli Transformer AutoEncoder (BTAE) so as to compress the time-series datasets into Bernoulli distributed latent states and constrain the distortion of reconstructed [] datasets.
Fajtl teaches a medium wherein compressing the [] datasets includes increasing the compression rate by using a Bernoulli Transformer AutoEncoder (BTAE) so as to compress the time-series datasets into Bernoulli distributed latent states and constrain the distortion of reconstructed [] datasets (“[image: media_image3.png] [image: media_image4.png],” Section 3; “We propose a simple, closed form method for sampling from the Bernoulli latent space as well as to perform a smooth interpolation and attribute modification in this space,” Section 6; since the autoencoder is used for sampling from a Bernoulli latent space, the “deterministic autoencoder” corresponds to “a Bernoulli Transformer AutoEncoder (BTAE)”; “produces typically real-valued latent representation z for input X” corresponds to “compress the…datasets into Bernoulli distributed latent states”; “add tanh() before the binarization which limits the gradient flow” in order to reduce noise and speed up training corresponds to “constrain the distortion of reconstructed…datasets”).
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the autoencoder of Ferreira/Tarango to comprise the BTAE autoencoder of Fajtl. (1) The Examiner finds that the prior art included each claim element listed above, although not necessarily in a single prior art reference, with the only difference between the claimed invention and the prior art being the lack of actual combination of the elements in a single prior art reference. (2) The Examiner finds that one of ordinary skill in the art could have combined the elements as claimed by known development methods, and that in combination, each element merely performs the same function as it does separately. (3) The Examiner finds that one of ordinary skill in the art would have recognized that the results of the combination were predictable, namely constraining noise (Fajtl Section 3). Therefore, the rationale to support a conclusion that the claim would have been obvious is that combining prior art elements according to known methods would have yielded predictable results to one of ordinary skill in the art. See MPEP § 2143(I)(A).
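For illustration, the binarization step referenced in the quoted Fajtl passage (tanh followed by a threshold) can be sketched as below. This is a minimal sketch under stated assumptions: the function name binarize_latent, the zero threshold, and the sample latent vector are hypothetical and are not taken from Fajtl, Ferreira, or the claims.

```python
import math


def binarize_latent(z):
    """Map a real-valued latent vector to one bit per value.

    Assumption: tanh squashing followed by a sign-style threshold at zero.
    """
    squashed = [math.tanh(v) for v in z]            # bound each value to (-1, 1)
    return [1 if s > 0.0 else 0 for s in squashed]  # keep only the sign as a bit


# Hypothetical real-valued latent produced by some encoder.
latent = [0.8, -1.3, 0.05, -0.2]
bits = binarize_latent(latent)
print(bits)
```

Each latent value is reduced to a single bit, which is the sense in which such a latent can be described as Bernoulli distributed.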
As to dependent claim 13, the rejection of claim 12 is incorporated. Ferreira/Tarango/Fajtl further teaches a medium wherein the BTAE includes an encoder acting as a feed-forward device and the decoder acting as a dictionary (“[image: media_image4.png] … [image: media_image5.png],” Section 3; “the encoder corrects its output in the direction of the binarized quantities read by the decoder” corresponds to “an encoder acting as a feed-forward device”; “an image X’ is decoded from the binary latent b as X’ = fθ(b)” involves mapping the vector b from the binarization encoding space to a reconstructed vector, thus this corresponds to “the decoder acting as a dictionary”).
Claim 14 is rejected under 35 U.S.C. § 103 as being unpatentable over Ferreira in view of Tarango, Fajtl, and Ortiz et al. (US 2021/0173916 A1, hereinafter Ortiz).
As to dependent claim 14, the rejection of claim 12 is incorporated. Ferreira/Tarango/Fajtl further teaches a medium comprising a BTAE (“[image: media_image3.png] [image: media_image4.png],” Fajtl Section 3; “We propose a simple, closed form method for sampling from the Bernoulli latent space as well as to perform a smooth interpolation and attribute modification in this space,” Fajtl Section 6; since the autoencoder is used for sampling from a Bernoulli latent space, the “deterministic autoencoder” corresponds to “a Bernoulli Transformer AutoEncoder (BTAE)”) and values of the time-series datasets (“Embodiments of the invention relate to learning an intelligent compression / decompression model for a specific purpose such that high compression rates are achieved and low error rates are maintained. Embodiments of the invention are discussed in the context of a task for predicting read and write response times based on historical telemetry data. The telemetry data thus reflects actual read and write response times in one example,” Ferreira paragraph 0015; “More specifically, the process of compressing / decompressing telemetry data using an auto - encoder 100 as described in FIG. 1 operates by obtaining the input data 102 (e.g., the telemetry data). Next, the input data 102 is passed through a compressor (e.g., an encoder) fθ generating Z. Z is an example of the compressed data 106. The decompressor 108 (decoder) gθ can be used to decompress the compressed data 106 to generate the decompressed or reconstructed data 110,” Ferreira paragraph 0021; “….data nodes or sources…may be operating as together (e.g. a system of appliances or virtual machines that are performing data protection operations),” Ferreira paragraph 0029; “This invention leverages the availability of real telemetry data coming from different data nodes…,” Ferreira paragraph 0036; since the telemetry data reflects actual read and write response times, it corresponds to “raw telemetry data being collected as time series datasets”; “The input data came from the Sizer reporter database, a tool that allows field engineers to upload performance files from a customer site, in order to generate performance reports. The data are composed of 1 million 68-dimensional observations. Here, the efficiency of training a response time predictor with data previously compressed and decompressed by using the disclosed framework,” Ferreira paragraph 0039; since the telemetry data reflects actual read and write response times, it contains “time series datasets”).
Ferreira/Tarango/Fajtl does not appear to expressly teach a medium [] configured to reduce the size of a latent state by a factor related to the number of bits of floating point numbers used for the values [].
Ortiz teaches a medium [] configured to reduce the size of a latent state by a factor related to the number of bits of floating point numbers used for the values [] (“The neutral network model 2603 may in some embodiments output a 256 bit floating point latent vector. The model 2603 may learn to represent facial features namely eyes, nose, mouth in a lower dimension. For example, it may be a machine learning based system that looks at a picture, or a frame of a video, processes it to determine that the picture contains a face, and identify the facial features. Training of the model may require large amounts of data. The training process teaches the model 2603 to generate a meaningful vector, which may be 256 floating point numbers that reduce a higher dimension (e.g., 256×256×3) image to a lower dimension (256),” paragraph 0187; an image “dimension” corresponds to “size of a latent state” and “values”; “256 floating point numbers that reduce a higher dimension…image” corresponds to “reduce…by a factor related to the number of bits of floating point numbers for the values”).
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the BTAE of Ferreira/Tarango/Fajtl to comprise the encoder of Ortiz. (1) The Examiner finds that the prior art included each claim element listed above, although not necessarily in a single prior art reference, with the only difference between the claimed invention and the prior art being the lack of actual combination of the elements in a single prior art reference. (2) The Examiner finds that one of ordinary skill in the art could have combined the elements as claimed by known development methods, and that in combination, each element merely performs the same function as it does separately. (3) The Examiner finds that one of ordinary skill in the art would have recognized that the results of the combination were predictable, namely increased data security (Ortiz paragraph 0008). Therefore, the rationale to support a conclusion that the claim would have been obvious is that combining prior art elements according to known methods would have yielded predictable results to one of ordinary skill in the art. See MPEP § 2143(I)(A).
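For illustration, one possible reading of “a factor related to the number of bits of floating point numbers” is the ratio between storing a latent state as floating point values and storing it as one bit per value. The sketch below works through that arithmetic. It assumes 32-bit floats and a 256-element latent (the length echoes the Ortiz passage, but the bit width and the helper names are hypothetical).

```python
import struct


def float_size_bytes(values, fmt="<f"):
    """Bytes needed to store each value as a float (assumed 32-bit, little-endian)."""
    return len(values) * struct.calcsize(fmt)


def packed_size_bytes(bits):
    """Bytes needed to store one bit per latent value, rounded up to whole bytes."""
    return (len(bits) + 7) // 8


values = [0.5] * 256   # hypothetical float latent
bits = [1] * 256       # the same latent after binarization
factor = float_size_bytes(values) / packed_size_bytes(bits)
print(factor)
```

Under these assumptions the float latent occupies 1024 bytes and the binary latent 32 bytes, so the reduction factor equals the number of bits per float.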
Claim 15 is rejected under 35 U.S.C. § 103 as being unpatentable over Ferreira in view of Tarango and Ryck et al. (“Change Point Detection in Time Series Data Using Autoencoders with a Time-Invariant Representation,” hereinafter Ryck).
As to dependent claim 15, the rejection of claim 1 is incorporated.
Ferreira/Tarango does not appear to expressly teach a medium wherein substantially reconstructing the time-series datasets includes transforming the time-series datasets to a frequency domain.
Ryck teaches a medium wherein substantially reconstructing the time-series datasets includes transforming the time-series datasets to a frequency domain (“[image: media_image6.png],” Section II; “use the discrete Fourier transform (DFT) on each window” and “bundling all these transformations…we obtain the frequency-domain counterpart” corresponds to “transforming the time-series datasets to the frequency domain”).
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the DNN of Ferreira/Tarango to comprise the preprocessing of Ryck. (1) The Examiner finds that the prior art included each claim element listed above, although not necessarily in a single prior art reference, with the only difference between the claimed invention and the prior art being the lack of actual combination of the elements in a single prior art reference. (2) The Examiner finds that one of ordinary skill in the art could have combined the elements as claimed by known development methods, and that in combination, each element merely performs the same function as it does separately. (3) The Examiner finds that one of ordinary skill in the art would have recognized that the results of the combination were predictable, namely enabling the autoencoder to better capture changes in data (Ryck Section I). Therefore, the rationale to support a conclusion that the claim would have been obvious is that combining prior art elements according to known methods would have yielded predictable results to one of ordinary skill in the art. See MPEP § 2143(I)(A).
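For illustration, applying a DFT to each window of a time series can be sketched as below. This is a minimal sketch only: the non-overlapping windowing, the window length, the function names, and the sample sine series are assumptions and are not taken from Ryck.

```python
import cmath
import math


def dft(window):
    """Plain O(n^2) discrete Fourier transform of one window."""
    n = len(window)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(window))
        for k in range(n)
    ]


def windowed_spectra(series, window_len):
    """Split a series into non-overlapping windows (an assumption) and
    return the magnitude spectrum of each window."""
    spectra = []
    for start in range(0, len(series) - window_len + 1, window_len):
        window = series[start:start + window_len]
        spectra.append([abs(c) for c in dft(window)])
    return spectra


# Hypothetical series: a sine wave with exactly one cycle per 8-sample window.
series = [math.sin(2 * math.pi * t / 8) for t in range(16)]
spectra = windowed_spectra(series, 8)
```

Stacking the per-window spectra gives a frequency-domain counterpart of the original series, with the energy of the sample sine concentrated in bin 1 (and its mirror, bin 7) of each window.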
Claim 16 is rejected under 35 U.S.C. § 103 as being unpatentable over Ferreira in view of Tarango and Toderici et al. (US 10192327 B1, hereinafter Toderici).
As to dependent claim 16, the rejection of claim 1 is incorporated. Ferreira/Tarango further teaches a medium comprising raw telemetry data (“Embodiments of the invention relate to learning an intelligent compression / decompression model for a specific purpose such that high compression rates are achieved and low error rates are maintained. Embodiments of the invention are discussed in the context of a task for predicting read and write response times based on historical telemetry data. The telemetry data thus reflects actual read and write response times in one example,” Ferreira paragraph 0015).
Ferreira/Tarango does not appear to expressly teach a medium wherein the instructions further enable the processing system to:
determine residuals as a difference between outputs of a reconstruction process and the [] data; and
compress the residuals.
Toderici teaches a medium wherein the instructions further enable the processing system to:
determine residuals as a difference between outputs of a reconstruction process and the [] data (“[image: media_image7.png],” column 8 lines 22-42; “rt” corresponds to “a residual” that is “a difference between outputs of a reconstruction process and the raw…data”; since calculations are being done t times, multiple “residuals” are determined); and
compress the residuals (“[image: media_image8.png],” column 10 lines 24-42; since the total loss combines all residuals, it compresses “the residuals”).
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the time-series data processing of Ferreira/Tarango to comprise computing a difference/loss at each decoder output, as taught by Toderici. (1) The Examiner finds that the prior art included each claim element listed above, although not necessarily in a single prior art reference, with the only difference between the claimed invention and the prior art being the lack of actual combination of the elements in a single prior art reference. (2) The Examiner finds that one of ordinary skill in the art could have combined the elements as claimed by known development methods, and that in combination, each element merely performs the same function as it does separately. (3) The Examiner finds that one of ordinary skill in the art would have recognized that the results of the combination were predictable, namely improving compression accuracy (Toderici column 8 lines 15-17). Therefore, the rationale to support a conclusion that the claim would have been obvious is that combining prior art elements according to known methods would have yielded predictable results to one of ordinary skill in the art. See MPEP § 2143(I)(A).
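For illustration, the claimed steps of determining residuals and compressing them can be sketched as below. This is a minimal sketch under stated assumptions: uniform quantization with a hypothetical step size followed by zlib is a swapped-in, generic technique and is not the method of Toderici or Ferreira; all function names and sample values are hypothetical.

```python
import struct
import zlib


def residuals(raw, reconstructed):
    """Element-wise difference between the raw data and the reconstruction."""
    return [r - x for r, x in zip(raw, reconstructed)]


def compress_residuals(res, step=0.01):
    """Quantize residuals to multiples of `step`, then deflate them.

    Assumption: every quantized residual fits in a signed 16-bit integer.
    """
    quantized = [round(r / step) for r in res]
    payload = struct.pack(f"<{len(quantized)}h", *quantized)
    return zlib.compress(payload)


def decompress_residuals(blob, step=0.01):
    """Invert compress_residuals up to the quantization error."""
    payload = zlib.decompress(blob)
    count = len(payload) // 2
    return [q * step for q in struct.unpack(f"<{count}h", payload)]


raw = [1.00, 2.05, 2.98]    # hypothetical raw telemetry values
recon = [1.02, 2.00, 3.00]  # hypothetical decoder output
blob = compress_residuals(residuals(raw, recon))
restored = decompress_residuals(blob)
corrected = [x + r for x, r in zip(recon, restored)]
```

Adding the restored residuals back to the reconstruction recovers the raw values to within half a quantization step.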
Claim 17 is rejected under 35 U.S.C. § 103 as being unpatentable over Ferreira in view of Tarango, Toderici, and Wang.
As to dependent claim 17, the rejection of claim 16 is incorporated. Ferreira/Tarango/Toderici further teaches a medium wherein the instructions further enable the processing system to perform a prediction-quantization-entropy coding scheme, whereby a prediction procedure is related to a decoding element of an autoencoder (“More specifically, the process of compressing / decompressing telemetry data using an auto - encoder 100 as described in FIG. 1 operates by obtaining the input data 102 (e.g., the telemetry data). Next, the input data 102 is passed through a compressor (e.g., an encoder) fθ generating Z. Z is an example of the compressed data 106. The decompressor 108 (decoder) gθ can be used to decompress the compressed data 106 to generate the decompressed or reconstructed data 110,” Ferreira paragraph 0021; “FIG. 1 thus illustrates an autoregressor 120 that includes the autoencoder 100. The autoregressor 120 jointly learns how to compress and predict (regress). In this example, the autoregressor 120 is a deep neural network that uses the result,” Ferreira paragraph 0023; since the autoencoder includes a decoder and the decoder is part of an autoregressor, “autoregressor” corresponds to “autoencoder”; “decoder” corresponds to “decoding element of an autoencoder”), and residuals (“[image: media_image7.png],” Toderici column 8 lines 22-42; “rt” corresponds to “a residual” that is “a difference between outputs of a reconstruction process and the raw…data”; since calculations are being done t times, multiple “residuals” are determined).
Ferreira/Tarango/Toderici does not appear to expressly teach a medium whereby quantization and entropy procedures are related to a distortion constraint element for processing.
Wang teaches a medium whereby quantization and entropy procedures are related to a distortion constraint element for processing (“In an embodiment of the present application, by extracting the feature information of each layer, generating the feature vectors of each corresponding layer, and sending them into the decision model based on the continuous iterative optimization of the reinforcement learning mechanism, the channel pruning ratio of each layer is generated, and finally the pruning ratio decision of each layer of the model is determined, and the pruning strategy of the entire model is generated. In this way, by relying on the reinforcement learning of the Actor-Critic algorithm and the design of the feedback function, a channel pruning strategy model is constructed to adaptively and dynamically find the optimal channel pruning ratio. The embodiment of the present application provides a self -learning mechanism that automatically learns decision pruning, avoids traditional manual experience and improves the universality of different models,” paragraph 0067; “Step S307: If the evaluation result indicates that the intermediate model does not meet the target constraint condition, feedback adjustment is performed on the pruning strategy to obtain a new pruning strategy,” paragraph 0088; “if...the result indicates that the...model does not meet the target constraint condition, feedback adjustment is performed….” corresponds to “a distortion constraint element for processing”).
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the residuals of Ferreira/Tarango/Toderici to comprise the processing of Wang. (1) The Examiner finds that the prior art included each claim element listed above, although not necessarily in a single prior art reference, with the only difference between the claimed invention and the prior art being the lack of actual combination of the elements in a single prior art reference. (2) The Examiner finds that one of ordinary skill in the art could have combined the elements as claimed by known development methods, and that in combination, each element merely performs the same function as it does separately. (3) The Examiner finds that one of ordinary skill in the art would have recognized that the results of the combination were predictable, namely improving the universality of a machine learning model (Wang paragraph 0067). Therefore, the rationale to support a conclusion that the claim would have been obvious is that combining prior art elements according to known methods would have yielded predictable results to one of ordinary skill in the art. See MPEP § 2143(I)(A).
Claim 18 is rejected under 35 U.S.C. § 103 as being unpatentable over Ferreira in view of Tarango and Mentzer et al. (“Conditional Probability Models for Deep Image Compression,” hereinafter Mentzer).
As to dependent claim 18, the rejection of claim 1 is incorporated.
Ferreira/Tarango does not appear to expressly teach a medium wherein the instructions further enable the processing system to determine quantized entropy loss to constrain the size of an encoded residual with respect to total entropy.
Mentzer teaches a medium wherein the instructions further enable the processing system to determine quantized entropy loss to constrain the size of an encoded residual with respect to total entropy (“[image: media_image9.png],” Section 3; “ẑ” corresponds to “an encoded residual”; “We want the encoded representation ẑ to be as compact” corresponds to “constrain the size of an encoded residual”; “average entropy H” corresponds to “with respect to total entropy”; “rate-distortion trade-off” corresponds to “quantized entropy loss”).
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the DNN of Ferreira/Tarango to comprise the entropy loss of Mentzer. (1) The Examiner finds that the prior art included each claim element listed above, although not necessarily in a single prior art reference, with the only difference between the claimed invention and the prior art being the lack of actual combination of the elements in a single prior art reference. (2) The Examiner finds that one of ordinary skill in the art could have combined the elements as claimed by known development methods, and that in combination, each element merely performs the same function as it does separately. (3) The Examiner finds that one of ordinary skill in the art would have recognized that the results of the combination were predictable, namely lossless compression of data (Mentzer Section 2). Therefore, the rationale to support a conclusion that the claim would have been obvious is that combining prior art elements according to known methods would have yielded predictable results to one of ordinary skill in the art. See MPEP § 2143(I)(A).
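For illustration, a loss that trades distortion against the entropy of quantized symbols can be sketched as below. This is a minimal sketch under stated assumptions: it uses the empirical (histogram) entropy of uniformly quantized values as a stand-in for a learned probability model, and the weight beta, the step size, and all function names are hypothetical rather than taken from Mentzer.

```python
import math
from collections import Counter


def quantize(values, step):
    """Uniformly quantize values to integer symbols."""
    return [round(v / step) for v in values]


def entropy_bits(symbols):
    """Empirical entropy of the symbols, in bits per symbol."""
    n = len(symbols)
    counts = Counter(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def rate_distortion_loss(values, step, beta):
    """Mean squared quantization error plus beta times the symbol entropy."""
    symbols = quantize(values, step)
    distortion = sum((v - s * step) ** 2 for v, s in zip(values, symbols)) / len(values)
    return distortion + beta * entropy_bits(symbols)


# Hypothetical encoded residual values.
encoded = [0.0, 0.0, 1.0, 1.0]
loss = rate_distortion_loss(encoded, step=1.0, beta=0.5)
```

A larger beta penalizes high-entropy (larger) encodings more heavily, which is the sense in which such a loss constrains the size of the encoded representation.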
Conclusion
The prior art made of record and not relied upon is considered pertinent to Applicant’s disclosure:
US 2021/0303598 A1 disclosing a time-series DNN deployed to a network node
Applicant is required under 37 C.F.R. § 1.111(c) to consider these references fully when responding to this action.
It is noted that any citation to specific pages, columns, lines, or figures in the prior art references and any interpretation of the references should not be considered to be limiting in any way. A reference is relevant for all it contains and may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art. In re Heck, 699 F.2d 1331, 1332-33, 216 U.S.P.Q. 1038, 1039 (Fed. Cir. 1983) (quoting In re Lemelson, 397 F.2d 1006, 1009, 158 U.S.P.Q. 275, 277 (C.C.P.A. 1968)).
In the interests of compact prosecution, Applicant is invited to contact the examiner via electronic media pursuant to USPTO policy outlined in MPEP § 502.03. All electronic communication must be authorized in writing. Applicant may wish to file an Internet Communications Authorization Form PTO/SB/439. Applicant may wish to request an interview using the Interview Practice website: http://www.uspto.gov/patent/laws-and-regulations/interview-practice.
Applicant is reminded Internet e-mail may not be used for communication for matters under 35 U.S.C. § 132 or which otherwise require a signature. A reply to an Office action may NOT be communicated by Applicant to the USPTO via Internet e-mail. If such a reply is submitted by Applicant via Internet e-mail, a paper copy will be placed in the appropriate patent application file with an indication that the reply is NOT ENTERED. See MPEP § 502.03(II).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Ryan Barrett whose telephone number is 571 270 3311. The examiner can normally be reached 9:00am to 5:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, Applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor Michelle Bechtold can be reached at 571 431 0762. The fax phone number for the organization where this application or proceeding is assigned is 571 273 8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Ryan Barrett/
Primary Examiner, Art Unit 2148