Detailed Action
This Office Action is in response to the remarks entered on 10/27/2025. Claims 9 and 19 have been canceled. Claims 1-8, 10-18 and 20 are currently pending.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 10/27/2025 has been entered.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1-8, 10-18 and 20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
Regarding claim 1, the claim recites “determine, based on the first clustering recommendation and by comparing the first reconstruction of the first time series data for the first domain to one or more reconstructions of other domains assigned to the first cluster, whether or not the first reconstruction comprises an outlier within the first cluster.” It is unclear what constitutes comparing the first reconstruction of the first time series data to one or more reconstructions of other domains assigned to the first cluster, because claim 1 recites that the latent representations, not the reconstructions, are input to the clustering layer and then assigned to a specific cluster.
For purposes of examination, the examiner interprets the claim to mean that the latent space representations are input to the clustering layer and that the first latent space representation is assigned to the first cluster.
Claim 2 is a method claim which implements the same features as the system claim 1, and is rejected for at least the same reasons.
Claim 12 is a non-transitory, computer-readable medium claim which implements the same features as the system claim 1, and is rejected for at least the same reasons.
Claims 3-8, 10-11, 13-18 and 20 depend from independent claims 2 and 12. Therefore, the claims inherit the same deficiency.
Regarding claims 5 and 6, the claims recite “determining a centroid value of the first cluster based on the first reconstruction and the second reconstruction; determining a first distance of the first reconstruction from the centroid value;” and “determining a second distance of the second reconstruction from the centroid value.” It is unclear what constitutes ‘determining a centroid value based on the reconstruction’ and ‘determining a distance of the reconstruction from the centroid value,’ because claim 1 recites that the latent representations, not the reconstructions, are input to the clustering layer and then assigned to a specific cluster. Does this mean that the distances are calculated based on the latent space values of the reconstructions?
For purposes of examination, the examiner interprets the claims to mean that the centroid value and the distances from the centroid value are determined based on the first latent space representation and the second latent space representation.
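For illustration only, the interpretation above corresponds to computing a cluster centroid over latent space representations and measuring each representation's distance from that centroid. The following sketch is not part of the claims or the cited references; all names and values are hypothetical.

```python
import math

def centroid(points):
    """Mean of the latent space representations assigned to a cluster."""
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]

def distance(p, c):
    """Euclidean distance of a latent representation from the centroid."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, c)))

# Hypothetical first and second latent space representations
first_latent = [0.9, 1.1]
second_latent = [1.1, 0.9]

c = centroid([first_latent, second_latent])  # centroid value of the first cluster
d1 = distance(first_latent, c)               # first distance from the centroid
d2 = distance(second_latent, c)              # second distance from the centroid
```

Under this reading, an unusually large distance would mark the corresponding representation as a candidate outlier within the cluster.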
Claims 15 and 16 are non-transitory, computer-readable medium claims which implement the same features as method claims 5 and 6, and are rejected for at least the same reasons.
Claim Rejections - 35 USC § 101
Amended claims were received on 10/27/2025. The rejection under 35 U.S.C. 101 has been withdrawn.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 2-3 and 10-11 are rejected under 35 U.S.C. 103 as being unpatentable over Servajean et al. (US 20200106795 A1, hereinafter ‘Servajean’) in view of Cai et al. (Cai et al., “Stacked sparse auto-encoder for deep clustering,” 2019, hereinafter ‘Cai’) and further in view of Koral et al. (US 20200112574 A1, hereinafter ‘Koral’).
Regarding claim 2, Servajean teaches:
A method for generating network alerts based on detected variances in trends of domain traffic over a given time period for disparate domains in a computer network using machine learning models that generate cluster-specific temporal representations for time series sequences, the method comprising ([Servajean, 0024] Servajean detects anomalous data 226 for a device 202 on the network 200. [Servajean, 0018] An autoencoder trainer 210 is a hardware, software, firmware or combination component for training an autoencoder 212 for each cluster defined by the clustering process 208 such that each cluster has a separately trained autoencoder 212. Thus, an autoencoder 212 is trained for a cluster based on each of the time series in the cluster on a time window by time window basis. The autoencoder trainer 210 operates on the basis of time series generated as training data, such as time series defined on the basis of network communication that is known to reflect normal, typical, non-suspicious and/or safe communication unencumbered by malicious, erroneous or suspicious network traffic or entities):
receiving first time series data for a first domain for a first period of time, wherein the first period of time comprises a currently active monitoring window for the first domain; ([Servajean, 0015] A Network Analyzer 204 is a hardware, software, firmware, or combination component adapted to access and store information about network communication, which provides input to the time series generator 206. [Servajean, 0016] The set of time series comprises characteristics over fixed length time windows)
generating a first feature input based on the first time series data; ([Servajean, 0017] and [Servajean, 0018] collectively disclose generating (clustering) the time series into a plurality of clusters that are input to the autoencoder trainer. An autoencoder 212 is trained for a cluster based on each of the time series in the cluster on a time window by time window basis. The autoencoder trainer 210 operates on the basis of time series generated as training data, such as time series defined on the basis of network communication that is known to reflect normal, typical, non-suspicious and/or safe communication unencumbered by malicious, erroneous or suspicious network traffic or entities. Thus, there are multiple time series that go into a plurality of autoencoders, which correspond to the first and the second time series)
inputting the first feature input into an encoder portion of a machine learning model to generate a first latent representation, wherein the encoder portion of the machine learning model is trained to generate latent representations of inputted feature inputs ([Servajean, 0018] An autoencoder trainer 210 is a hardware, software, firmware or combination component for training an autoencoder 212 for each cluster defined by the clustering process 208 such that each cluster has a separately trained autoencoder 212. Thus, an autoencoder 212 is trained for a cluster based on each of the time series in the cluster on a time window by time window basis. The autoencoder trainer 210 operates on the basis of time series generated as training data, such as time series defined on the basis of network communication that is known to reflect normal, typical, non-suspicious and/or safe communication unencumbered by malicious, erroneous or suspicious network traffic or entities. Thus, such time series can be referred to as training time series. An autoencoder inherently contains an encoder, which generates a latent representation, and a decoder, which decodes the latent representation, as an autoencoder is by definition an encoder-decoder network [Koral, US-20200112574-A1, 0068]. [Servajean, 0019] further discloses identifying a set of reconstruction errors, which implies that the autoencoder contains an encoder and a decoder and generates reconstructions);
inputting the first latent representation into a decoder portion of the machine learning model to generate a first reconstruction of the first time series data, wherein the decoder portion of the machine learning model is trained to generate reconstructions of inputted feature inputs; ([Servajean, 0018] An autoencoder trainer 210 is a hardware, software, firmware or combination component for training an autoencoder 212 for each cluster defined by the clustering process 208 such that each cluster has a separately trained autoencoder 212. Thus, an autoencoder 212 is trained for a cluster based on each of the time series in the cluster on a time window by time window basis. The autoencoder trainer 210 operates on the basis of time series generated as training data, such as time series defined on the basis of network communication that is known to reflect normal, typical, non-suspicious and/or safe communication unencumbered by malicious, erroneous or suspicious network traffic or entities. Thus, such time series can be referred to as training time series. An autoencoder inherently contains an encoder, which generates a latent representation, and a decoder, which decodes the latent representation, as an autoencoder is by definition an encoder-decoder network [Koral, US-20200112574-A1, 0068]. [Servajean, 0019] further discloses identifying a set of reconstruction errors, which implies that the autoencoder contains an encoder and a decoder and generates reconstructions)
determining, based on the first reconstruction of the first time series data for the first domain within the first cluster; ([Servajean, 0025] discloses the anomaly detector 224 determines reconstruction errors for the production time series (for each time window) and compares these errors with the aggregate model 222 of reconstruction errors to determine if there is a distance exceeding a predetermined threshold to determine the anomaly)
Servajean does not specifically disclose:
inputting the first latent representation into a clustering layer of the machine learning model to generate a first clustering recommendation for the first domain, wherein the first clustering recommendation indicates that the first domain corresponds to a first cluster of a plurality of clusters, and wherein the clustering layer of the machine learning model is trained to cluster domains based on respective time series data;
determining, based on the first clustering recommendation and by comparing the first reconstruction of the first time series data for the first domain to one or more reconstructions of other domains assigned to the first cluster, whether or not the first reconstruction comprises an outlier within the first cluster;
generating for display, on a user interface, a network alert based on the determination that the first reconstruction is an outlier within the first cluster;
Cai teaches:
inputting the first latent representation into a clustering layer of the machine learning model to generate a first clustering recommendation for the first domain, wherein the first clustering recommendation indicates that the first domain corresponds to a first cluster of a plurality of clusters, and wherein the clustering layer of the machine learning model is trained to cluster domains based on respective time series data ([Cai, page 1535, Figure 1] and [Cai, page 1534, left col, line 1-9] collectively disclose inputting the latent space representation Z for input data X; the clustering layer captures Z to assign a soft label (i.e., the first clustering recommendation) to each learned embedded point)
determining, based on the first clustering recommendation and by comparing the first reconstruction of the first time series data for the first domain to one or more reconstructions of other domains assigned to the first cluster, whether or not the first reconstruction comprises an outlier within the first cluster ([Cai, page 1534, left col, B. The definition of clustering loss, line 3-15] discloses that the soft label (i.e., the clustering recommendation) is generated, and [Cai, page 1534, right col, D. Optimization strategy, line 1-12] discloses that the cluster centers, which are used for the comparison (loss value), are updated based on the previous latent representations and corresponding soft labels)
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art, having the teachings of Servajean and Cai, to use the method of inputting a latent representation into a clustering layer of Cai to implement the machine learning method of Servajean. The suggestion and/or motivation for doing so is to improve the efficiency of the machine learning method, as utilizing a single model instead of the plurality of models taught in Servajean helps reduce the amount of calculation needed to cluster input data.
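For illustration only, the soft-label assignment described in Cai can be sketched with a Student's t-style kernel of the kind commonly used in deep embedded clustering; the exact kernel, function names and values below are assumptions for illustration, not quotations from Cai.

```python
def soft_labels(z, centers):
    """Soft assignment of latent representation z to cluster centers:
    similarity falls off with squared distance; the labels sum to 1."""
    sims = [1.0 / (1.0 + sum((a - b) ** 2 for a, b in zip(z, c))) for c in centers]
    total = sum(sims)
    return [s / total for s in sims]

z = [0.0, 0.0]                       # hypothetical latent representation
centers = [[0.0, 0.0], [3.0, 4.0]]   # hypothetical cluster centers
q = soft_labels(z, centers)          # soft label: highest weight on cluster 0
```

The index of the largest entry of `q` plays the role of the clustering recommendation (cluster assignment).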
Servajean in view of Cai does not specifically disclose:
generating for display, on a user interface, a network alert based on the determination that the first reconstruction is an outlier within the first cluster;
Koral teaches:
generating for display, on a user interface, a network alert based on the determination that the first reconstruction is an outlier within the first cluster; ([Koral, 0052; Fig. 3] Fig. 3 illustrates an example graph 320 of compressed vector representations of input vectors derived from DNS traffic records. [Koral, 0053] The graph 320 displays which data belongs to which cluster, which indicates which data is anomalous. In the example, the network traffic records grouped within cluster 321 are normal, and the other groups indicate abnormal network traffic (outliers). According to the present specification [005], the broadest reasonable interpretation of ‘generate network alerts’ encompasses indicating abrupt changes, likely changes, and/or other discrepancies in one or more values based on changes of a metric, which amounts to mere display of output data)
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art, having the teachings of Servajean, Cai and Koral, to use the method of utilizing the latent representation and displaying a network alert of Koral to implement the network alert generation system of Servajean. The suggestion and/or motivation for doing so is to improve the visibility of the alert generation system, as Figure 3 of Koral shows which data points are clustered together, grouping similar data points with circles 321, 322, 323, and 324 to better display anomalies to the user.
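For illustration only, the combined teaching mapped above (a shared autoencoder producing reconstructions, a clustering layer assigning domains to clusters, and an alert when a domain's reconstruction error is an outlier among its cluster peers) can be sketched as follows; the threshold rule, names and values are hypothetical assumptions, not taken from any cited reference.

```python
import statistics

def reconstruction_error(series, reconstruction):
    """Mean squared error between a time series and its autoencoder reconstruction."""
    return sum((s - r) ** 2 for s, r in zip(series, reconstruction)) / len(series)

def is_outlier(error, peer_errors, k=2.0):
    """Flag an error lying more than k standard deviations from the peer mean."""
    mean = statistics.mean(peer_errors)
    std = statistics.pstdev(peer_errors) or 1e-9
    return abs(error - mean) > k * std

# Hypothetical reconstruction errors of other domains in the first cluster
peer_errors = [0.10, 0.12, 0.11, 0.09]
first_error = reconstruction_error([1.0, 2.0, 3.0], [1.0, 2.0, 2.0])

if is_outlier(first_error, peer_errors):
    print("network alert: first domain is an outlier within the first cluster")
```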
Regarding claim 3, Servajean in view of Cai and further in view of Koral teaches the method of claim 2. Servajean further teaches:
receiving second time series data for a second domain for the first period of time; ([Servajean, 0018] An autoencoder trainer 210 is a hardware, software, firmware or combination component for training an autoencoder 212 for each cluster defined by the clustering process 208 such that each cluster has a separately trained autoencoder 212. Thus, an autoencoder 212 is trained for a cluster based on each of the time series in the cluster on a time window by time window basis. The autoencoder trainer 210 operates on the basis of time series generated as training data, such as time series defined on the basis of network communication that is known to reflect normal, typical, non-suspicious and/or safe communication unencumbered by malicious, erroneous or suspicious network traffic or entities. Thus, there are multiple time series that go into a plurality of autoencoders, which correspond to the first and the second time series. [Servajean, 0019] further discloses identifying a set of reconstruction errors, which implies that the autoencoder contains an encoder and a decoder and generates reconstructions)
generating a second feature input based on the second time series data; ([Servajean, 0016] A time series generator 206 is a hardware, software, firmware or combination component for generating a time series of network characteristics for each of a plurality of network connected devices 202. Each time series is defined by grouping network characteristics for each of a series of fixed length time windows, most consecutive time windows, for each of which a set of network characteristics are identified based on the output of the network analyzer 204. Thus, a set of time series is generated, each for a different device 202, and each comprising characteristics over fixed length time windows)
inputting the second feature input into the encoder portion of the machine learning model to generate a second latent representation; ([Servajean, 0018] An autoencoder trainer 210 is a hardware, software, firmware or combination component for training an autoencoder 212 for each cluster defined by the clustering process 208 such that each cluster has a separately trained autoencoder 212. Thus, an autoencoder 212 is trained for a cluster based on each of the time series in the cluster on a time window by time window basis. The autoencoder trainer 210 operates on the basis of time series generated as training data, such as time series defined on the basis of network communication that is known to reflect normal, typical, non-suspicious and/or safe communication unencumbered by malicious, erroneous or suspicious network traffic or entities. Thus, such time series can be referred to as training time series. An autoencoder inherently contains an encoder, which generates a latent representation, and a decoder, which decodes the latent representation, as an autoencoder is by definition an encoder-decoder network. [Servajean, 0019] further discloses identifying a set of reconstruction errors, which implies that the autoencoder contains an encoder and a decoder and generates reconstructions)
inputting the second latent representation into a decoder portion of the machine learning model to generate a second reconstruction of the second time-series data; ([Servajean, 0018] An autoencoder trainer 210 is a hardware, software, firmware or combination component for training an autoencoder 212 for each cluster defined by the clustering process 208 such that each cluster has a separately trained autoencoder 212. Thus, an autoencoder 212 is trained for a cluster based on each of the time series in the cluster on a time window by time window basis. The autoencoder trainer 210 operates on the basis of time series generated as training data, such as time series defined on the basis of network communication that is known to reflect normal, typical, non-suspicious and/or safe communication unencumbered by malicious, erroneous or suspicious network traffic or entities. Thus, such time series can be referred to as training time series. An autoencoder inherently contains an encoder, which generates a latent representation, and a decoder, which decodes the latent representation, as an autoencoder is by definition an encoder-decoder network. [Servajean, 0019] further discloses identifying a set of reconstruction errors, which implies that the autoencoder contains an encoder and a decoder and generates reconstructions)
Servajean does not specifically disclose:
inputting the second latent representation into the clustering layer of the machine learning model to generate a second clustering recommendation for the second domain;
determining to generate for display the network alert based on a determination that at least one of the first reconstruction or the second reconstruction is an outlier within the first cluster.
Cai teaches:
inputting the second latent representation into the clustering layer of the machine learning model to generate a second clustering recommendation for the second domain ([Cai, page 1535, Figure 1] and [Cai, page 1534, left col, line 1-9] collectively disclose inputting the latent space representation Z for input data X; the clustering layer captures Z to assign a soft label (i.e., the clustering recommendation) to each learned embedded point. [Cai, page 1535, left col, line 4-9] discloses that at least two label assignment processes (the first clustering and the second clustering) are performed);
Servajean in view of Cai does not specifically disclose:
determining to generate for display the network alert based on a determination that at least one of the first reconstruction or the second reconstruction is an outlier within the first cluster.
Koral teaches:
determining to generate for display the network alert based on a determination that at least one of the first reconstruction or the second reconstruction is an outlier within the first cluster. ([Koral, 0052; Fig. 3] Fig. 3 illustrates an example graph 320 of compressed vector representations of input vectors derived from DNS traffic records. [Koral, 0053] The graph 320 displays which data belongs to which cluster, which indicates which data is anomalous. In the example, the network traffic records grouped within cluster 321 are normal, and the other groups indicate abnormal network traffic (outliers). According to the present specification [005], the broadest reasonable interpretation of ‘generate network alerts’ encompasses indicating abrupt changes, likely changes, and/or other discrepancies in one or more values based on changes of a metric, which amounts to mere display of output data)
Regarding claim 10, Servajean in view of Cai teaches the method of claim 2.
Servajean in view of Cai does not specifically disclose wherein the network alert indicates that the first reconstruction comprises an outlier from respective reconstructions of domains in the first cluster.
Koral teaches:
wherein the network alert indicates that the first reconstruction comprises an outlier from respective reconstructions of domains in the first cluster. ([Koral, 0052; Fig. 3] Fig. 3 illustrates an example graph 320 of compressed vector representations of input vectors derived from DNS traffic records. [Koral, 0053] The graph 320 displays which data belongs to which cluster, which indicates which data is anomalous. In the example, the network traffic records grouped within cluster 321 are normal, and the other groups indicate abnormal network traffic (outliers). According to the present specification [005], the broadest reasonable interpretation of ‘generate network alerts’ encompasses indicating abrupt changes, likely changes, and/or other discrepancies in one or more values based on changes of a metric, which amounts to mere display of output data)
Regarding claim 11, Servajean teaches:
wherein the machine learning model maintains a time dependency for the first time series data. ([Servajean, 0017] A clustering process 208 is performed to cluster the set of time series into a plurality of clusters each constituting a subset of the set. In one embodiment, each cluster is defined based on a random division of the set of time series. In another embodiment, each cluster is defined based on an autoencoder as input to a clustering algorithm such as k-means. For example, an autoencoder can be employed to convert a time series to a feature vector on which basis clustering is performed. Thus, for the set of time series each time series can be converted to a feature vector as input to a clustering algorithm such as k-means. In this way time series with common features determined by the autoencoder can be clustered together. In one embodiment, such clustering results in devices 202 having similar network communication characteristics being clustered together.
[Servajean, 0018] An autoencoder trainer 210 is a hardware, software, firmware or combination component for training an autoencoder 212 for each cluster defined by the clustering process 208 such that each cluster has a separately trained autoencoder 212. Thus, an autoencoder 212 is trained for a cluster based on each of the time series in the cluster on a time window by time window basis)
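For illustration only, the clustering Servajean describes in [0017] (converting each time series to a feature vector over fixed-length time windows and clustering the vectors, e.g., with k-means) can be sketched as below; the windowed mean feature, the function names, and the values are hypothetical assumptions.

```python
def window_features(series, window):
    """One mean feature per fixed-length time window, preserving window order
    so the feature vector retains the series' time dependency."""
    return [sum(series[i:i + window]) / window
            for i in range(0, len(series) - window + 1, window)]

def nearest_centroid(vec, centroids):
    """Single k-means assignment step: index of the closest centroid."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(centroids)), key=lambda j: sq_dist(vec, centroids[j]))

fv = window_features([1, 1, 5, 5, 9, 9], window=2)  # -> [1.0, 5.0, 9.0]
cluster = nearest_centroid(fv, [[1.0, 5.0, 9.0], [0.0, 0.0, 0.0]])
```

Devices whose feature vectors fall near the same centroid end up clustered together, consistent with devices having similar network communication characteristics being grouped as described above.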
Claims 1 and 8 are rejected under 35 U.S.C. 103 as being unpatentable over Servajean in view of Cai in view of Koral and further in view of Migliori et al. (US 10291268 B1, hereinafter ‘Migliori’).
Regarding claim 1, Servajean teaches:
A system for generating network alerts based on detected variances in trends of domain traffic over a given time period for disparate domains in a computer network using machine learning models that generate cluster-specific temporal representations for time series sequences, the system comprising: ([Servajean, 0024] Servajean detects anomalous data 226 for a device 202 on the network 200. [Servajean, 0018] An autoencoder trainer 210 is a hardware, software, firmware or combination component for training an autoencoder 212 for each cluster defined by the clustering process 208 such that each cluster has a separately trained autoencoder 212. Thus, an autoencoder 212 is trained for a cluster based on each of the time series in the cluster on a time window by time window basis. The autoencoder trainer 210 operates on the basis of time series generated as training data, such as time series defined on the basis of network communication that is known to reflect normal, typical, non-suspicious and/or safe communication unencumbered by malicious, erroneous or suspicious network traffic or entities)
cloud-based storage circuitry configured to store a machine learning model, wherein an encoder portion of the machine learning model is trained to generate latent representations of inputted feature inputs, wherein the machine learning model maintains a time dependency for time series data, wherein the machine learning model comprises an autoencoder constructed using a clustering layer, and wherein the clustering layer of the machine learning model is trained to cluster domains based on respective time series data; ([Servajean, 0018] An autoencoder trainer 210 is a hardware, software, firmware or combination component for training an autoencoder 212 for each cluster defined by the clustering process 208 such that each cluster has a separately trained autoencoder 212. Thus, an autoencoder 212 is trained for a cluster based on each of the time series in the cluster on a time window by time window basis. The autoencoder trainer 210 operates on the basis of time series generated as training data, such as time series defined on the basis of network communication that is known to reflect normal, typical, non-suspicious and/or safe communication unencumbered by malicious, erroneous or suspicious network traffic or entities. Thus, such time series can be referred to as training time series. An autoencoder inherently contains an encoder, which generates a latent representation, and a decoder, which decodes the latent representation, as an autoencoder is by definition an encoder-decoder network. [Servajean, 0019] further discloses identifying a set of reconstruction errors, which implies that the autoencoder contains an encoder and a decoder and generates reconstructions)
control circuitry configured to: receive first time series data for a first domain for a first period of time, wherein the first period of time comprises a currently active monitoring window for the first domain; ([Servajean, 0015] A Network Analyzer 204 is a hardware, software, firmware, or combination component adapted to access and store information about network communication, which provides input to the time series generator 206. [Servajean, 0016] The set of time series comprises characteristics over fixed length time windows)
generate a first feature input based on the first time series data; ([Servajean, 0017] and [Servajean, 0018] collectively disclose generating (clustering) the time series into a plurality of clusters that are input to the autoencoder trainer. An autoencoder 212 is trained for a cluster based on each of the time series in the cluster on a time window by time window basis. The autoencoder trainer 210 operates on the basis of time series generated as training data, such as time series defined on the basis of network communication that is known to reflect normal, typical, non-suspicious and/or safe communication unencumbered by malicious, erroneous or suspicious network traffic or entities. Thus, there are multiple time series that go into a plurality of autoencoders, which correspond to the first and the second time series)
input the first feature input into an encoder portion of a machine learning model to generate a first latent representation; ([Servajean, 0018] An autoencoder trainer 210 is a hardware, software, firmware or combination component for training an autoencoder 212 for each cluster defined by the clustering process 208 such that each cluster has a separately trained autoencoder 212. Thus, an autoencoder 212 is trained for a cluster based on each of the time series in the cluster on a time window by time window basis. The autoencoder trainer 210 operates on the basis of time series generated as training data, such as time series defined on the basis of network communication that is known to reflect normal, typical, non-suspicious and/or safe communication unencumbered by malicious, erroneous or suspicious network traffic or entities. Thus, such time series can be referred to as training time series. An autoencoder inherently contains an encoder, which generates a latent representation, and a decoder, which decodes the latent representation, as an autoencoder is by definition an encoder-decoder network. [Servajean, 0019] further discloses identifying a set of reconstruction errors, which implies that the autoencoder contains an encoder and a decoder and generates reconstructions)
input the first latent representation into a decoder portion of the machine learning model to generate a first reconstruction of the first time series data; ([Servajean, 0018] An autoencoder trainer 210 is a hardware, software, firmware or combination component for training an autoencoder 212 for each cluster defined by the clustering process 208 such that each cluster has a separately trained autoencoder 212. Thus, an autoencoder 212 is trained for a cluster based on each of the time series in the cluster on a time window by time window basis. The autoencoder trainer 210 operates on the basis of time series generated as training data, such as time series defined on the basis of network communication that is known to reflect normal, typical, non-suspicious and/or safe communication unencumbered by malicious, erroneous or suspicious network traffic or entities. Thus, such time series can be referred to as training time series. An autoencoder inherently contains an encoder, which generates a latent representation, and a decoder, which decodes the latent representation, as an autoencoder is by definition an encoder-decoder network. [Servajean, 0019] further discloses identifying a set of reconstruction errors, which implies that the autoencoder contains an encoder and a decoder and generates reconstructions)
determine, based on the first reconstruction of the first time series data for the first domain within the first cluster; ([Servajean, 0025] discloses the anomaly detector 224 determines reconstruction errors for the production time series (for each time window) and compares these errors with the aggregate model 222 of reconstruction errors to determine if there is a distance exceeding a predetermined threshold to determine the anomaly)
However, Servajean does not specifically disclose:
input the first latent representation into a clustering layer of the machine learning model to generate a first clustering recommendation for the first domain, wherein the first clustering recommendation indicates that the first domain corresponds to a first cluster of a plurality of clusters; and
determine, based on the first clustering recommendation and by comparing the first reconstruction of the first time series data for the first domain to one or more reconstructions of other domains assigned to the first cluster, whether or not the first reconstruction comprises an outlier within the first cluster;
Cai teaches:
input the first latent representation into a clustering layer of the machine learning model to generate a first clustering recommendation for the first domain, wherein the first clustering recommendation indicates that the first domain corresponds to a first cluster of a plurality of clusters; and ([Cai, page 1535, Figure 1] and [Cai, page 1534, left col, line 1-9] collectively disclose inputting the latent space representation Z for input data X into the clustering layer, which captures Z to assign a soft label (i.e., the first clustering recommendation) to each learned embedded point)
determine, based on the first clustering recommendation and by comparing the first reconstruction of the first time series data for the first domain to one or more reconstructions of other domains assigned to the first cluster, whether or not the first reconstruction comprises an outlier within the first cluster (loss value); ([Cai, page 1534, left col, B. The definition of clustering loss, line 3-15] discloses that the soft label (i.e., the clustering recommendation) is generated, and [Cai, page 1534, right col, D. Optimization strategy, line 1-12] discloses that the cluster centers which are used for the comparison are updated based on the previous latent representations and corresponding soft labels)
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art, having the teachings of Servajean and Cai, to use the method of inputting a latent representation into a clustering layer of Cai to implement the machine learning method of Servajean. The suggestion and/or motivation for doing so is to improve the efficiency of the machine learning method, as utilizing a single model instead of the plurality of models taught in Servajean helps reduce the amount of calculation needed to cluster input data.
Servajean in view of Cai does not specifically disclose:
wherein the machine learning model comprises an autoencoder constructed using a causal sequence convolutional neural network;
input/output circuitry configured to: generate for display, on a user interface, a network alert based on the determination that the first reconstruction is an outlier within the first cluster.
Koral teaches:
input/output circuitry configured to: generate for display, on a user interface, a network alert based on the determination that the first reconstruction is an outlier within the first cluster. ([Koral, 0074] The optional step 565 provides the graph via at least one display, which is the user interface. The graph of [Koral, 0052; Fig. 3] is displayed to users. [Koral, 0053] The graph 320 displays which data belongs to which cluster, which indicates which data is anomalous. In the example, networks grouped within cluster 321 are normal network traffic records, and the other groups indicate abnormal networks (outliers))
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art, having the teachings of Servajean, Cai and Koral, to use the method of utilizing the latent representation and displaying a network alert of Koral to implement the network alert generation system of Servajean. The suggestion and/or motivation for doing so is to improve the visibility of the alert generation system, as Figure 3 of Koral shows which data points are clustered together, grouping similar data points with circles 321, 322, 323, and 324 to better display anomalies to the user.
Servajean in view of Cai and further in view of Koral does not specifically disclose:
wherein the machine learning model comprises an autoencoder constructed using a causal sequence convolutional neural network;
Migliori teaches:
wherein the machine learning model comprises an autoencoder constructed using a causal sequence convolutional neural network ([Migliori, col 5, line 39-51] The convolutional autoencoder in Migliori is used to catch the anomaly in a radio-frequency signal, which is time-series data, and to denoise it [Migliori, col 8, line 28-29]).
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art, having the teachings of Servajean, Cai, Koral, and Migliori, to use the method wherein the machine learning model comprises an autoencoder constructed using a causal sequence convolutional neural network of Migliori to implement the network alert generation system of Servajean and Koral. The suggestion and/or motivation for doing so is to improve the performance of the alert generation system, as utilizing convolutional layers in an autoencoder can avoid the computational cost drawback of image denoising by posing the task within the statistical framework of regression, which constitutes a more tractable computation and thus permits greater representational power than density estimation (Jain & Seung, 2008, “Natural Image Denoising with Convolutional Networks”, page 6-7, 5 Discussion).
Regarding claim 8, Servajean in view of Cai and further in view of Koral teaches:
The method of claim 2.
Servajean in view of Cai and further in view of Koral does not specifically disclose wherein the machine learning model comprises an autoencoder constructed using a causal sequence convolutional neural network.
Migliori teaches:
wherein the machine learning model comprises an autoencoder constructed using a causal sequence convolutional neural network. ([Migliori, col 5, line 39-51] The convolutional autoencoder in Migliori is used to catch the anomaly in a radio-frequency signal, which is time-series data, and to denoise it [Migliori, col 8, line 28-29]. Since Migliori processes real-time data, the convolution cannot draw on future samples; a non-causal sequence cannot exist, and the sequence is therefore causal)
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art, having the teachings of Servajean, Cai, Koral, and Migliori, to use the method wherein the machine learning model comprises an autoencoder constructed using a causal sequence convolutional neural network of Migliori to implement the network alert generation system of Servajean. The suggestion and/or motivation for doing so is to improve the performance of the alert generation system, as utilizing convolutional layers in an autoencoder can avoid the computational cost drawback of image denoising by posing the task within the statistical framework of regression, which constitutes a more tractable computation and thus permits greater representational power than density estimation (Jain & Seung, 2008, “Natural Image Denoising with Convolutional Networks”, page 6-7, 5 Discussion).
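The causality rationale above can be stated precisely. A causal convolution computes each output sample from only the current and past input samples; the following is a general formulation of causal convolution, not an equation quoted from Migliori:

```latex
y_t = \sum_{k=0}^{K-1} w_k \, x_{t-k}
```

Because no term with index t+1 or later appears on the right-hand side, a system operating on real-time data, which has not yet observed future samples, necessarily computes convolutions of this causal form.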
Claims 4-7 are rejected under 35 U.S.C. 103 as being unpatentable over Servajean in view of Cai in view of Koral and further in view of Wen et al. (US 11308365 B2, hereinafter ‘Wen’).
Regarding claim 4, Servajean in view of Cai teaches:
further comprising: comparing the first clustering recommendation to the second clustering recommendation ([Cai, page 1535, left col, line 4-9] discloses comparing the label assignment between two successive iterations (the first clustering and the second clustering) and comparing the result to the pre-defined threshold δ);
determining that the first clustering recommendation and the second clustering recommendation correspond to a first cluster of a plurality of clusters ([Cai, page 1535, left col, line 4-9] discloses comparing the label assignment between two successive iterations (the first clustering and the second clustering), and the iteration is stopped when the difference between the two successive iterations is less than the pre-defined threshold δ (determining that the first and the second label correspond to the first cluster));
Servajean in view of Cai and further in view of Koral does not specifically disclose:
determining that the network alert is generated in response to the first clustering recommendation and the second clustering recommendation corresponding to the first cluster, and that at least one of the first reconstruction or the second reconstruction is an outlier within the first cluster.
Wen teaches:
determining that the network alert is generated in response to the first clustering recommendation and the second clustering recommendation corresponding to the first cluster, and that at least one of the first reconstruction or the second reconstruction is an outlier within the first cluster. ([Wen, claim 18] wherein the one or more processors are programmed by further executable instructions to determine to display the label based at least partly on at least one of: the first confidence score satisfying a threshold, or a difference between the first confidence score and the second confidence score. The first confidence score corresponds to the first clustering recommendation, and the second confidence score corresponds to the second clustering recommendation, as confidence scores are calculated based on how closely the data matches the classification)
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art, having the teachings of Servajean, Cai, Koral and Wen, to use the method of basing the network alert on the first reconstruction and the second reconstruction upon determining that the first clustering recommendation corresponds to the second clustering recommendation of Wen to implement the network alert system of Servajean. The suggestion and/or motivation for doing so is to improve the efficiency of the system, as basing the network alert on both the first and the second clustering recommendation better represents the alert status compared to the graph in Figure 2 of Koral, which is not based on the relationship between the first clustering recommendation and the second clustering recommendation.
Regarding claim 5, Servajean in view of Cai in view of Koral and further in view of Wen teaches:
determining a centroid value of the first cluster based on the first reconstruction and the second reconstruction; ([Cai, page 1534, left col, B. The definition of clustering loss, line 3-15] discloses that the soft label (i.e., the clustering recommendation) is generated, and [Cai, page 1534, right col, D. Optimization strategy, line 1-12] discloses that the cluster centers which are used for the comparison are updated based on the previous latent representations and corresponding soft labels)
determining a first distance of the first reconstruction from the centroid value; ([Cai, page 1534, left col, B. The definition of clustering loss, line 1-13] discloses determining a distance of the data point z_i from the cluster center μ_j to cluster the data point. [Cai, page 1535, left col, line 4-9] discloses comparing the label assignment between two successive iterations (the first clustering and the second clustering) and comparing the result to the pre-defined threshold δ)
comparing the first distance to a threshold distance; ([Cai, page 1534, left col, B. The definition of clustering loss, line 3-13] discloses soft label assignment based on the distance between the center point μ_j and the data point z_i. The data point is assigned to a cluster based on the output value q_ij (threshold distance to assign the data point to a cluster))
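For reference, the soft label assignment discussed above converts the Euclidean distance between an embedded point z_i and a cluster center μ_j into a membership value. In the standard deep-embedded-clustering formulation (reproduced here as a general form; the exact constants should be verified against equation (4) of Cai):

```latex
q_{ij} = \frac{\left(1 + \lVert z_i - \mu_j \rVert^2\right)^{-1}}{\sum_{j'} \left(1 + \lVert z_i - \mu_{j'} \rVert^2\right)^{-1}}
```

Under this form, a larger distance between z_i and μ_j yields a smaller q_ij, so the thresholded cluster assignment described in the mapping above follows directly from the distance computation.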
However, Servajean in view of Cai does not specifically disclose:
wherein determining to generate for display the network alert based on the first reconstruction and the second reconstruction comprises:
determining to generate for display the network alert based on a determination that the first distance equals or exceeds the threshold distance, whereby the first reconstruction is identified as an outlier within the first cluster.
Koral teaches:
wherein determining to generate for display the network alert based on the first reconstruction and the second reconstruction comprises: ([Koral, 0052; Fig. 3] Fig. 3 illustrates an example graph 320 of compressed vector representations of input vectors derived from DNS traffic records. [Koral, 0053] The graph 320 displays which data belongs to which cluster, which indicates which data is anomalous. In the example, networks grouped within cluster 321 are normal network traffic records, and the other groups indicate abnormal networks (outliers). According to the present specification [005], the broadest reasonable interpretation of ‘generate network alerts’ encompasses indicating abrupt changes, likely changes, and/or other discrepancies in one or more values based on changes of a metric, which amounts to mere display of output data)
However, Koral does not specifically disclose:
determining to generate for display the network alert based on a determination that the first distance equals or exceeds the threshold distance, whereby the first reconstruction is identified as an outlier within the first cluster.
Wen teaches:
determining to generate for display the network alert based on a determination that the first distance equals or exceeds the threshold distance, whereby the first reconstruction is identified as an outlier within the first cluster. ([Wen, claim 18] wherein the one or more processors are programmed by further executable instructions to determine to display the label based at least partly on at least one of: the first confidence score satisfying a threshold, or a difference between the first confidence score and the second confidence score. The first confidence score corresponds to the first clustering recommendation, and the second confidence score corresponds to the second clustering recommendation, as confidence scores are calculated based on how closely the data matches the classification).
Regarding claim 6, Servajean in view of Cai in view of Koral and further in view of Wen teaches:
further comprising: determining a second distance of the second reconstruction from the centroid value; ([Cai, page 1534, left col, B. The definition of clustering loss, line 1-13] discloses determining a distance of the data point z_i from the cluster center μ_j to cluster the data point. [Cai, page 1535, left col, line 4-9] discloses comparing the label assignment between two successive iterations (the first clustering and the second clustering) and comparing the result to the pre-defined threshold δ)
comparing the second distance to a threshold distance; ([Cai, page 1534, left col, B. The definition of clustering loss, line 3-13] discloses soft label assignment based on the distance between the center point μ_j and the data point z_i. The data point is assigned to a cluster based on the output value q_ij (threshold distance to assign the data point to a cluster))
Servajean in view of Cai in view of Koral does not specifically disclose:
determining not to generate for display the network alert when it is determined that the second distance does not equal or exceed the threshold distance, whereby the second reconstruction is not identified as an outlier within the first cluster.
Wen teaches:
determining not to generate for display the network alert when it is determined that the second distance does not equal or exceed the threshold distance, whereby the second reconstruction is not identified as an outlier within the first cluster. ([Wen, claim 18] wherein the one or more processors are programmed by further executable instructions to determine to display the label based at least partly on at least one of: the first confidence score satisfying a threshold, or a difference between the first confidence score and the second confidence score. The first confidence score corresponds to the first clustering recommendation, and the second confidence score corresponds to the second clustering recommendation, as confidence scores are calculated based on how closely the data matches the classification)
Regarding claim 7, Servajean in view of Cai teaches:
wherein the first distance is based on a Euclidean distance objective. ([Cai, page 1534, left col, B. The definition of clustering loss, line 3-13] The term ‖z_i − μ_j‖² in equation (4) denotes the squared Euclidean distance between the cluster center and the embedded point)
Claims 12-13 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Servajean in view of Cai in view of Koral and further in view of Cella et al. (US 20190041842 A1, hereinafter ‘Cella’).
Regarding claim 12, Servajean teaches:
A non-transitory, computer-readable medium for improving hardware resiliency during serial processing tasks ([Servajean, 0024] Servajean detects anomalous data 226 for a device 202 on the network 200. [Servajean, 0018] An autoencoder trainer 210 is a hardware, software, firmware or combination component for training an autoencoder 212 for each cluster defined by the clustering process 208 such that each cluster has a separately trained autoencoder 212, which is trained for a cluster based on each of the time series in the cluster. The autoencoder trainer 210 operates on the basis of time series generated as training data, such as time series defined on the basis of network communication that is known to reflect normal communication unencumbered by anomalous network traffic.):
receiving first time series data for a first domain for a first period of time, wherein the first period of time comprises a currently active monitoring window for the first domain; ([Servajean, 0015] A Network Analyzer 204 is a hardware, software, firmware, or combination component adapted to access and store information about network communication, which provides input to the time series generator 206. [Servajean, 0016] The set of time series comprises characteristics over fixed length time windows.)
generating a first feature input based on the first time series data; ([Servajean, 0017] and [Servajean, 0018] collectively disclose generating (clustering) the time series into a plurality of clusters that are input to the autoencoder trainer. An autoencoder 212 is trained for a cluster based on each of the time series in the cluster on a time window by time window basis. The autoencoder trainer 210 operates on the basis of time series generated as training data, such as time series defined on the basis of network communication that is known to reflect normal, typical, non-suspicious and/or safe communication unencumbered by malicious, erroneous or suspicious network traffic or entities. Thus, multiple time series go into a plurality of autoencoders, corresponding to the first and the second time series)
inputting the first feature input into an encoder portion of a machine learning model to generate a first latent representation, wherein the encoder portion of the machine learning model is trained to generate latent representations of inputted feature inputs; ([Servajean, 0018] An autoencoder trainer 210 is a hardware, software, firmware or combination component for training an autoencoder 212 for each cluster defined by the clustering process 208 such that each cluster has a separately trained autoencoder 212. Thus, an autoencoder 212 is trained for a cluster based on each of the time series in the cluster on a time window by time window basis. The autoencoder trainer 210 operates on the basis of time series generated as training data, such as time series defined on the basis of network communication that is known to reflect normal, typical, non-suspicious and/or safe communication unencumbered by malicious, erroneous or suspicious network traffic or entities. Thus, such time series can be referred to as training time series. Autoencoders inherently contain an encoder, which generates a latent representation, and a decoder, which decodes the latent representation, as an autoencoder is by definition an encoder-decoder network. [Servajean, 0019] further discloses identifying a set of reconstruction errors, which implies that the autoencoder contains an encoder and a decoder and generates reconstructions)
inputting the first latent representation into a decoder portion of the machine learning model to generate a first reconstruction of the first time series data, wherein the decoder portion of the machine learning model is trained to generate reconstructions of inputted feature inputs; ([Servajean, 0018] An autoencoder 212 is trained for a cluster based on each of the time series in the cluster on a time window by time window basis. Autoencoders inherently contain an encoder, which generates a latent representation, and a decoder, which decodes the latent representation, as an autoencoder is by definition an encoder-decoder network. [Servajean, 0019] further discloses identifying a set of reconstruction errors, which implies that the autoencoder contains an encoder and a decoder and generates reconstructions)
determining, based on the first reconstruction of the first time series data for the first domain within the first cluster; ([Servajean, 0025] discloses the anomaly detector 224 determines reconstruction errors for the production time series (for each time window) and compares these errors with the aggregate model 222 of reconstruction errors to determine if there is a distance exceeding a predetermined threshold to determine the anomaly)
Servajean does not specifically disclose:
A non-transitory, computer-readable medium for improving hardware resiliency during serial processing tasks in distributed computer networks;
inputting the first latent representation into a clustering layer of the machine learning model to generate a first clustering recommendation for the first domain, wherein the first clustering recommendation indicates that the first domain corresponds to a first cluster of a plurality of clusters, and wherein the clustering layer of the machine learning model is trained to cluster domains based on respective time series data;
determining, based on the first clustering recommendation and by comparing the first reconstruction of the first time series data for the first domain to one or more reconstructions of other domains assigned to the first cluster, whether or not the first reconstruction comprises an outlier within the first cluster;
generating for display, on a user interface, a network alert based on the determination that the first reconstruction is an outlier within the first cluster.
Cai teaches:
inputting the first latent representation into a clustering layer of the machine learning model to generate a first clustering recommendation for the first domain, wherein the first clustering recommendation indicates that the first domain corresponds to a first cluster of a plurality of clusters, and wherein the clustering layer of the machine learning model is trained to cluster domains based on respective time series data; ([Cai, page 1535, Figure 1] and [Cai, page 1534, left col, line 1-9] collectively disclose inputting the latent space representation Z for input data X into the clustering layer, which captures Z to assign a soft label (i.e., the first clustering recommendation) to each learned embedded point)
determining, based on the first clustering recommendation and by comparing the first reconstruction of the first time series data for the first domain to one or more reconstructions of other domains assigned to the first cluster, whether or not the first reconstruction comprises an outlier within the first cluster (loss value); ([Cai, page 1534, left col, B. The definition of clustering loss, line 3-15] discloses that the soft label (i.e., the clustering recommendation) is generated, and [Cai, page 1534, right col, D. Optimization strategy, line 1-12] discloses that the cluster centers which are used for the comparison are updated based on the previous latent representations and corresponding soft labels)
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art, having the teachings of Servajean and Cai, to use the method of inputting a latent representation into a clustering layer of Cai to implement the machine learning method of Servajean. The suggestion and/or motivation for doing so is to improve the efficiency of the machine learning method, as utilizing a single model instead of the plurality of models taught in Servajean helps reduce the amount of calculation needed to cluster input data.
Servajean in view of Cai does not specifically disclose:
A non-transitory, computer-readable medium for improving hardware resiliency during serial processing tasks in distributed computer networks;
generating for display, on a user interface, a network alert based on the determination that the first reconstruction is an outlier within the first cluster.
Koral teaches:
generating for display, on a user interface, a network alert based on the determination that the first reconstruction is an outlier within the first cluster. ([Koral, 0052; Fig. 3] Fig. 3 illustrates an example graph 320 of compressed vector representations of input vectors derived from DNS traffic records. [Koral, 0053] The graph 320 displays which data belongs to which cluster, which indicates which data is anomalous. In the example, networks grouped within cluster 321 are normal network traffic records, and the other groups indicate abnormal networks (outliers). According to the present specification [005], the broadest reasonable interpretation of ‘generate network alerts’ encompasses indicating abrupt changes, likely changes, and/or other discrepancies in one or more values based on changes of a metric, which amounts to mere display of output data)
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art, having the teachings of Servajean, Cai and Koral, to use the method of utilizing the latent representation and displaying a network alert of Koral to implement the network alert generation system of Servajean. The suggestion and/or motivation for doing so is to improve the visibility of the alert generation system, as Figure 3 of Koral shows which data points are clustered together, grouping similar data points with circles 321, 322, 323, and 324 to better display anomalies to the user.
Servajean in view of Cai and further in view of Koral does not specifically disclose:
A non-transitory, computer-readable medium for improving hardware resiliency during serial processing tasks in distributed computer networks.
Cella teaches:
A non-transitory, computer-readable medium for improving hardware resiliency during serial processing tasks in distributed computer networks ([Cella, 0314] In embodiments, the cognitive data marketplace 4102 may use a secure architecture for tracking and resolving transactions, such as a distributed ledger 4004, wherein transactions in data packages are tracked in a chained, distributed data structure, such as a Blockchain™, allowing forensic analysis and validation where individual devices store a portion of the ledger representing transactions in data packages.).
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art, having the teachings of Servajean, Cai, Koral and Cella, to implement the non-transitory, computer-readable medium for improving hardware resiliency during serial processing tasks in distributed computer networks, such as a blockchain, of Cella to implement the network alert generation system of Servajean. The suggestion and/or motivation for doing so is to improve the security of the system, as using distributed computer networks such as blockchains allows forensic analysis and validation ([Cella, 0314]).
Regarding claim 13, Servajean teaches:
wherein the instructions further cause operations comprising: receiving second time-series data for a second domain for the first period of time; generating a second feature input based on the second time series data; ([Servajean, 0018] An autoencoder trainer 210 is a hardware, software, firmware or combination component for training an autoencoder 212 for each cluster defined by the clustering process 208 such that each cluster has a separately trained autoencoder 212. Thus, an autoencoder 212 is trained for a cluster based on each of the time series in the cluster on a time window by time window basis. The autoencoder trainer 210 operates on the basis of time series generated as training data, such as time series defined on the basis of network communication that is known to reflect normal, typical, non-suspicious and/or safe communication unencumbered by malicious, erroneous or suspicious network traffic or entities. Thus, multiple time series go into a plurality of autoencoders, corresponding to the first and the second time series. [Servajean, 0019] further discloses identifying a set of reconstruction errors, which implies that the autoencoder contains an encoder and a decoder and generates reconstructions)
inputting the second feature input into the encoder portion of the machine learning model to generate a second latent representation; ([Servajean, 0018] An autoencoder trainer 210 is a hardware, software, firmware or combination component for training an autoencoder 212 for each cluster defined by the clustering process 208 such that each cluster has a separately trained autoencoder 212. Thus, an autoencoder 212 is trained for a cluster based on each of the time series in the cluster on a time window by time window basis. The autoencoder trainer 210 operates on the basis of time series generated as training data, such as time series defined on the basis of network communication that is known to reflect normal, typical, non-suspicious and/or safe communication unencumbered by malicious, erroneous or suspicious network traffic or entities. Thus, such time series can be referred to as training time series. Autoencoders inherently contain an encoder, which generates a latent representation, and a decoder, which decodes the latent representation, as an autoencoder is by definition an encoder-decoder network [Koral, US-20200112574-A1, 0068]. [Servajean, 0019] further discloses identifying a set of reconstruction errors, which implies that the autoencoder contains an encoder and a decoder and generates reconstructions)
inputting the second latent representation into a decoder portion of the machine learning model to generate a second reconstruction of the second time-series data; ([Servajean, 0018] An autoencoder trainer 210 is a hardware, software, firmware or combination component for training an autoencoder 212 for each cluster defined by the clustering process 208 such that each cluster has a separately trained autoencoder 212. Thus, an autoencoder 212 is trained for a cluster based on each of the time series in the cluster on a time window by time window basis. The autoencoder trainer 210 operates on the basis of time series generated as training data, such as time series defined on the basis of network communication that is known to reflect normal, typical, non-suspicious and/or safe communication unencumbered by malicious, erroneous or suspicious network traffic or entities. Thus, such time series can be referred to as training time series. Autoencoders inherently contain an encoder, which generates a latent representation, and a decoder, which decodes the latent representation, as an autoencoder is by definition an encoder-decoder network [Koral, US-20200112574-A1, 0068]. [Servajean, 0019] further discloses identifying a set of reconstruction errors, which implies that the autoencoder contains an encoder and a decoder and generates reconstructions)
Servajean does not specifically disclose:
inputting the second latent representation into the clustering layer of the machine learning model to generate a second clustering recommendation for the second domain; and
determining to generate for display the network alert based on a determination that at least one of the first reconstruction or the second reconstruction is an outlier within the first cluster.
Cai teaches:
inputting the second latent representation into the clustering layer of the machine learning model to generate a second clustering recommendation for the second domain; ([Cai, page 1535, Figure 1] and [Cai, page 1534, left col, line 1-9] collectively disclose inputting the latent space representation Z for input data X into the clustering layer, which assigns a soft label (i.e., a clustering recommendation) to each learned embedded point. [Cai, page 1535, left col, line 4-9] discloses that at least two successive iterations are performed (the first clustering and the second clustering))
Servajean in view of Cai does not specifically disclose:
determining to generate for display the network alert based on a determination that at least one of the first reconstruction or the second reconstruction is an outlier within the first cluster.
Koral teaches:
determining to generate for display the network alert based on a determination that at least one of the first reconstruction or the second reconstruction is an outlier within the first cluster. ([Koral, 0052; Fig. 3] Fig. 3 illustrates an example graph 320 of compressed vector representations of input vectors derived from DNS traffic records. [Koral, 0053] The graph 320 displays which data belongs to which cluster, which indicates which data is anomalous. In the example, network traffic records grouped within cluster 321 are normal, and the other groupings indicate abnormal network traffic (outliers). According to the present specification [005], the broadest reasonable interpretation of ‘generate network alerts’ encompasses indicating abrupt changes, likely changes, and/or other discrepancies in one or more values based on changes of a metric, which reads on the mere display of output data)
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art, having the teachings of Servajean, Cai and Koral, to use Koral's method of utilizing the latent representation and displaying a network alert to implement the network alert generation system of Servajean. The suggestion and/or motivation for doing so is to improve the visibility of the alert generation system, as Figure 3 of Koral shows which data points are clustered together and groups similar data points with circles 321, 322, 323, and 324, thereby better displaying anomalies to the user.
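For illustration only, a soft-label clustering layer of the kind mapped to Cai can be sketched as below. The Student's-t kernel is a common choice for such deep-clustering layers and is assumed here for illustration; it is not taken verbatim from Cai, and all names are hypothetical.

```python
import numpy as np

def soft_assign(Z, centers):
    """Return q[i, j]: soft label that embedded point z_i belongs to cluster j,
    computed from squared Euclidean distances to the cluster centers mu_j."""
    d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    q = 1.0 / (1.0 + d2)                       # Student's-t style kernel (assumed)
    return q / q.sum(axis=1, keepdims=True)    # normalize over clusters

Z = np.array([[0.0, 0.1], [5.0, 5.1], [0.2, -0.1]])  # latent representations
centers = np.array([[0.0, 0.0], [5.0, 5.0]])         # cluster centers mu_j
q = soft_assign(Z, centers)
labels = q.argmax(axis=1)   # hard "clustering recommendation" per point
```

Points nearest a given center receive the highest soft probability for that center, so successive inputs (the "first" and "second" latent representations) each receive their own clustering recommendation.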
Regarding claim 20, Servajean teaches:
The non-transitory, computer readable medium of claim 12.
Servajean in view of Cai does not specifically disclose:
wherein the network alert indicates that the first reconstruction comprises an outlier from respective reconstructions of domains in the first cluster.
Koral teaches:
wherein the network alert indicates that the first reconstruction comprises an outlier from respective reconstructions of domains in the first cluster ([Koral, 0052; Fig. 3] Fig. 3 illustrates an example graph 320 of compressed vector representations of input vectors derived from DNS traffic records. [Koral, 0053] The graph 320 displays which data belongs to which cluster, which indicates which data is anomalous. In the example, network traffic records grouped within cluster 321 are normal, and the other groupings indicate abnormal network traffic (outliers).).
Claims 14-17 are rejected under 35 U.S.C. 103 as being unpatentable over Servajean in view of Cai in view of Koral in view of Cella and further in view of Wen (US 11308365 B2).
Regarding claim 14, Servajean teaches:
determining that the first clustering recommendation and the second clustering recommendation correspond to a first cluster of a plurality of clusters; ([Servajean, 0018] An autoencoder determines whether the data belongs to various clusters, such as time series defined on the basis of network communication that is known to reflect normal, typical, non-suspicious and/or safe communication unencumbered by malicious, erroneous or suspicious network traffic or entities. A plurality of time series are input into a plurality of autoencoders, and these time series correspond to the first and second time series data)
Servajean in view of Cai does not specifically disclose:
wherein the instructions further cause operations comprising:
comparing the first clustering recommendation to the second clustering recommendation; and
determining that the network alert is generated in response to the first clustering recommendation and the second clustering recommendation corresponding to the first cluster, and that at least one of the first reconstruction or the second reconstruction is an outlier within the first cluster.
Koral teaches:
wherein the instructions further cause operations comprising: comparing the first clustering recommendation to the second clustering recommendation ([Koral, 0097] The additional sample of network traffic data may be associated with the first one of the at least one of the plurality of clusters by determining that a distance between the additional sample of network traffic data and the first one of the at least one of the plurality of clusters is less than the threshold distance. This process compares how close each sample is to the cluster, thereby comparing the cluster assignments);
Servajean in view of Cai in view of Koral and further in view of Cella does not specifically disclose:
determining that the network alert is generated in response to the first clustering recommendation and the second clustering recommendation corresponding to the first cluster, and that at least one of the first reconstruction or the second reconstruction is an outlier within the first cluster.
Wen teaches:
determining that the network alert is generated in response to the first clustering recommendation and the second clustering recommendation corresponding to the first cluster, and that at least one of the first reconstruction or the second reconstruction is an outlier within the first cluster. ([Wen, claim 18] wherein the one or more processors are programmed by further executable instructions to determine to display the label based at least partly on at least one of: the first confidence score satisfying a threshold, or a difference between the first confidence score and the second confidence score. The first confidence score corresponds to the first clustering recommendation, and the second confidence score corresponds to the second clustering recommendation, as confidence scores are calculated based on how closely the data matches the classification)
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art, having the teachings of Servajean, Cai, Koral, and Wen, to use Wen's method of basing the network alert on the first reconstruction and the second reconstruction upon determining that the first clustering recommendation corresponds to the second clustering recommendation to implement the network alert generation system of Servajean. The suggestion and/or motivation for doing so is to improve the efficiency of the system, as basing the network alert on both the first and the second clustering recommendations better represents the alert status than the graph in Figure 2 of Koral, which is not based on the relationship between the first clustering recommendation and the second clustering recommendation.
Regarding claim 15, Servajean in view of Cai teaches:
determining a centroid value of the first cluster based on the first reconstruction and the second reconstruction; ([Cai, page 1534, left col, B. The definition of clustering loss, line 3-15] discloses that the soft label (i.e., the clustering recommendation) is generated, and [Cai, page 1534, right col, D. Optimization strategy, line 1-12] discloses that the cluster centers which are used for the comparison are updated based on the previous latent representations and corresponding soft labels)
determining a first distance of the first reconstruction from the centroid value; ([Cai, page 1534, left col, B. The definition of clustering loss, line 1-13] discloses determining a distance of the data point z_i from the cluster center μ_j to cluster the data point. [Cai, page 1535, left col, line 4-9] discloses comparing the label assignment between two successive iterations (the first clustering and the second clustering) and comparing the result to the pre-defined threshold δ)
comparing the first distance to a threshold distance; ([Cai, page 1534, left col, B. The definition of clustering loss, line 3-13] discloses soft label assignment based on the distance between the center point μ_j and the data point z_i. The data point is assigned to a cluster based on the output value q_ij (a threshold distance to assign the data point to a cluster))
However, Servajean in view of Cai in view of Koral does not specifically disclose:
wherein determining to display the network alert based on the first reconstruction and the second reconstruction comprises:
determining to generate for display the network alert based on a determination that the first distance equals or exceeds the threshold distance, whereby the first reconstruction is identified as an outlier within the first cluster.
Wen teaches:
wherein determining to display the network alert based on the first reconstruction and the second reconstruction comprises:
determining to generate for display the network alert based on a determination that the first distance equals or exceeds the threshold distance, whereby the first reconstruction is identified as an outlier within the first cluster. ([Wen, claim 18] wherein the one or more processors are programmed by further executable instructions to determine to display the label based at least partly on at least one of: the first confidence score satisfying a threshold, or a difference between the first confidence score and the second confidence score. The first confidence score corresponds to the first clustering recommendation, and the second confidence score corresponds to the second clustering recommendation, as confidence scores are calculated based on how closely the data matches the classification)
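For illustration only, the distance-to-centroid alert test addressed for claims 15 and 16 can be sketched as below: compute a centroid from the cluster's member reconstructions, measure each member's Euclidean distance to it, and raise an alert when the distance equals or exceeds a threshold. The data values and the threshold are hypothetical.

```python
import numpy as np

def is_outlier(recons, idx, threshold):
    """Flag member `idx` of a cluster of reconstructions as an outlier
    when its Euclidean distance to the cluster centroid meets or
    exceeds `threshold` (the claimed alert condition)."""
    centroid = recons.mean(axis=0)                 # centroid value of the cluster
    dist = np.linalg.norm(recons[idx] - centroid)  # Euclidean distance
    return bool(dist >= threshold)

cluster = np.array([[1.0, 1.0], [1.1, 0.9], [9.0, 9.0]])
alert_first = is_outlier(cluster, 0, threshold=5.0)  # near centroid -> no alert
alert_third = is_outlier(cluster, 2, threshold=5.0)  # far from centroid -> alert
```

When the distance does not equal or exceed the threshold, no alert is generated, mirroring the negative limitation of claim 16.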
Regarding claim 16, Servajean in view of Cai teaches:
determining a second distance of the second reconstruction from the centroid value; ([Cai, page 1534, left col, B. The definition of clustering loss, line 1-13] discloses determining a distance of the data point z_i from the cluster center μ_j to cluster the data point. [Cai, page 1535, left col, line 4-9] discloses comparing the label assignment between two successive iterations (the first clustering and the second clustering) and comparing the result to the pre-defined threshold δ)
comparing the second distance to the threshold distance; ([Cai, page 1534, left col, B. The definition of clustering loss, line 3-13] discloses soft label assignment based on the distance between the center point μ_j and the data point z_i. The data point is assigned to a cluster based on the output value q_ij (a threshold distance to assign the data point to a cluster))
Servajean in view of Cai in view of Koral does not specifically disclose:
determining not to generate for display the network alert when it is determined that the second distance does not equal or exceed the threshold distance, whereby the second reconstruction is not identified as an outlier within the first cluster.
Wen teaches:
determining not to generate for display the network alert when it is determined that the second distance does not equal or exceed the threshold distance, whereby the second reconstruction is not identified as an outlier within the first cluster. ([Wen, claim 18] wherein the one or more processors are programmed by further executable instructions to determine to display the label based at least partly on at least one of: the first confidence score satisfying a threshold, or a difference between the first confidence score and the second confidence score. The first confidence score corresponds to the first clustering recommendation, and the second confidence score corresponds to the second clustering recommendation, as confidence scores are calculated based on how closely the data matches the classification)
Regarding claim 17, Servajean in view of Cai teaches:
wherein the first distance is based on a Euclidean distance objective. ([Cai, page 1534, left col, B. The definition of clustering loss, line 3-13] The term ‖z_i − μ_j‖² in equation (4) denotes the squared Euclidean distance between the cluster center and the embedded point)
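For reference, a soft assignment of the Student's-t form commonly used with this type of clustering loss (in the deep-embedded-clustering literature) can be written as below; this general form is assumed for illustration and is not quoted verbatim from Cai's equation (4), but it shows the role of the ‖z_i − μ_j‖² squared Euclidean distance term relied upon above.

```latex
q_{ij} = \frac{\left(1 + \lVert z_i - \mu_j \rVert^2\right)^{-1}}
              {\sum_{j'} \left(1 + \lVert z_i - \mu_{j'} \rVert^2\right)^{-1}}
```

Here z_i is the embedded point, μ_j is the cluster center, and q_ij is the soft label assigning z_i to cluster j.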
Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over Servajean in view of Cai in view of Koral in view of Cella and further in view of Migliori.
Regarding claim 18, Servajean in view of Cai in view of Koral and further in view of Cella teaches:
A non-transitory, computer-readable medium of claim 12, … wherein the machine learning model maintains a time dependency for the first time series data ([Servajean, 0018] An autoencoder trainer 210 is a hardware, software, firmware or combination component for training an autoencoder 212 for each cluster defined by the clustering process 208 such that each cluster has a separately trained autoencoder 212. Thus, an autoencoder 212 is trained for a cluster based on each of the time series in the cluster on a time window by time window basis. According to the definition of an autoencoder, an autoencoder is a neural network. Since Servajean processes ‘real time’ data, a non-causal sequence cannot exist; the processing is thus causal)
Servajean in view of Cai in view of Koral and further in view of Cella does not specifically disclose:
wherein the machine learning model comprises an autoencoder constructed using a causal sequence convolutional neural network.
Migliori teaches:
wherein the machine learning model comprises an autoencoder constructed using a causal sequence convolutional neural network. ([Migliori, col 5, line 39-51] The convolutional autoencoder in Migliori is used to catch an anomaly in a radio-frequency signal, which is time-series data, and to denoise it [Migliori, col 8, line 28-29]. Since Migliori processes real-time data, a non-causal sequence cannot exist; the network is thus causal)
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art, having the teachings of Servajean, Cai, Koral, Cella and Migliori, to use Migliori's autoencoder constructed using a causal sequence convolutional neural network to implement the network alert generation system of Servajean. The suggestion and/or motivation for doing so is to improve the performance of the alert generation system, as utilizing convolutional layers in an autoencoder can avoid the computational cost drawback of image denoising by posing the task within the statistical framework of regression, which constitutes a more tractable computation and thus permits greater representational power than density estimation (Jain & Seung, 2008, “Natural Image Denoising with Convolutional Networks”, page 6-7, 5 Discussion).
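For illustration only, the "causal" property discussed for claim 18 can be sketched as a 1-D convolution in which each output depends only on present and past samples, never future ones. This is a generic sketch with a hypothetical two-tap kernel, not Migliori's implementation.

```python
def causal_conv1d(x, kernel):
    """Causal 1-D convolution: y[t] = sum_k kernel[k] * x[t - k],
    treating x[t'] as zero for t' < 0, so no future sample is used."""
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            if t - k >= 0:          # only present and past samples
                acc += w * x[t - k]
        y.append(acc)
    return y

x = [1.0, 2.0, 3.0, 4.0]
y = causal_conv1d(x, kernel=[0.5, 0.5])   # two-tap moving average
# y -> [0.5, 1.5, 2.5, 3.5]
```

Because y[t] never references x[t+1], stacking such layers in an autoencoder preserves the time dependency of real-time (streaming) data.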
Response to Arguments
Response to Arguments under 35 U.S.C. 101
Amended claims were received on 10/27/2025. The rejection under 35 U.S.C. 101 has been withdrawn.
Response to Arguments under 35 U.S.C. 103
Applicant’s arguments with respect to claims 1-8, 10-18 and 20 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JUN KWON whose telephone number is (571)272-2072. The examiner can normally be reached Monday – Friday 7:30AM – 4:30PM ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdullah Kawsar can be reached at (571)270-3169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/JUN KWON/Examiner, Art Unit 2127
/ABDULLAH AL KAWSAR/Supervisory Patent Examiner, Art Unit 2127