DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
Applicant's submission filed on 2025-10-09 has been entered. The status of the claims is as follows:
Claims 1, 5-12, 14-18, and 20-25 are pending in the application.
Claims 1, 10, and 16 are amended.
Claims 2-4, 13, and 19 are cancelled.
Response to Arguments
Applicant's arguments filed in response to rejections under 35 USC 101 have been fully considered but they are not persuasive.
Applicant argues on Remarks Page 24 that the application provides a technical solution to a technical problem, the solution being that “the system dynamically forecasts the probability that the application will enter an anomaly state” and “these real-time, proactive updates empower administrators to take preventive action against imminent failures or undesirable system behaviors, thus significantly enhancing operational reliability and responsiveness.”
Examiner respectfully disagrees that the claimed limitations amount to an improvement to the functioning of a computer. Examiner notes that the computer itself is not improved by the limitations; rather, the limitations “empower administrators to take preventive action”, and it is these subsequent human actions that result in improved reliability. In other words, the claimed limitations do not directly improve the functioning of a computer, but rather improve decision making, which is an abstract idea. Indeed, Applicant states the “technical problem” as “existing methods struggle to forecast the likelihood of future anomalies” because they “depend on models of predefined correct behavior.” Examiner notes that the “improvement” is to a method to “forecast the likelihood of future anomalies”, which is a mental process. MPEP 2106.05(a), regarding improvements to computer technology, states: “It is important to note, the judicial exception alone cannot provide the improvement”. In this case, the “improvement” is realized via constructing a probabilistic graph, and a decision-making process making use thereof, which is itself an abstract idea. While Applicant emphasizes the “real-time” aspect “during execution”, the generalized recitation of these limitations, without any significant detail about how this configuration is accomplished, amounts to merely indicating a field of use or technological environment in which to apply a judicial exception. In other words, the mere recitation of “real-time” and “during execution” does not meaningfully limit the practice of the abstract idea of constructing and analyzing probabilistic graphs in order to predict anomalous behavior. If Applicant were to claim specific technical details about how the “real-time” generation of log files is accomplished by the computer system, this could be a potential avenue for overcoming the 101 rejections.
Applicant argues on Remarks Pages 24-25 that claim 1 “does not recite a judicial exception” because the limitations “cannot practically be performed in the human mind or by a human using pen and paper” as they “involve large-scale, rapid data aggregation and continual probabilistic modeling”.
Examiner respectfully disagrees. A human with pen and paper could analyze logs and generate log templates, as the claimed limitations state that the logs are merely “entries describing events, transitions, and states”. A human with pen and paper could also generate a probabilistic graph and perform probabilistic modeling. Performing such operations on a large scale or more rapidly amounts to mere instructions to implement the stated abstract ideas on a computer.
Applicant argues on Remarks Pages 25-26 that the claim integrates the abstract idea into a practical application because it explicitly recites steps that “significantly improve the functioning of computer systems in the context of real-time anomaly forecasting”.
Examiner respectfully disagrees, and reiterates the response stated above, that this amounts not to an improvement to technology, but to an abstract idea of “anomaly forecasting”, and the mere generic recitation of “real-time” does not sufficiently limit the practice of the abstract idea.
Applicant argues on Remarks Page 26 that the claimed limitations amount to significantly more than any alleged abstract idea, again stating that “the claim encompasses a series of technical steps that improve the functioning of a computer” and comprises steps that are not well-understood, routine, or conventional.
Examiner respectfully disagrees with the assertion of an improvement to technology, for the reasons articulated above. Examiner also notes that the statement that “these elements are not well-understood, routine, or conventional activities” is a mere conclusory statement, with no specific argument given for any particular element as to why it is not well-understood, routine, and conventional. As for the two new limitations regarding “aggregates” and “cleans … by removing at least one of … redundant data”, Berkheimer evidence has been provided in the 101 rejections below showing that these limitations are, in fact, well-understood, routine, and conventional.
Applicant's arguments filed in response to rejections under 35 USC 103 have been fully considered but they are not persuasive and/or are moot. Applicant argues that the new amendments are not taught by the previously applied combination of references. Examiner partially agrees, as the limitation “cleans the aggregated log structure by removing at least one of corrupted data or redundant data from the entries” was not taught by the previous combination of references, and thus a new reference Kent has been added, rendering the argument moot. Some of the other amended limitations are still taught by the previous combination of references, and details are given below in the section on rejections under 35 USC 103.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1, 5-12, 14-18, and 20-25 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1:
Claims 1, 5-9, and 24 are directed to a system, Claims 10-12, 14-15, and 25 are directed to a method, and Claims 16-18 and 20-23 are directed to a computer program product comprising a computer readable storage medium, wherein, as per Specification [0123], “The computer readable storage medium can be a tangible device” and “computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media.” Therefore, each of the claims is directed to one of the four statutory categories of patent eligible subject matter.
Step 2A Prong 1:
Claims 1, 10, and 16 recite:
“analyzes log files”; analyzing is a mental process
“generates standardized log templates for the log entries, wherein the standardized log templates represent uniform structures for generation of sequences of the events, the transitions, and the states”; a human can generate a template with pen and paper, thus this is a mental process
“generates, based on the log entries, event sequences according to the uniform structures of the standardized log template”; a human can generate text in the form of a template with pen and paper, thus this is a mental process
“generates, based on the event sequences, a probabilistic graph that models the event sequences, wherein the probabilistic graph comprises respective branches for the event sequences comprising nodes representing the events, edges representing the transitions between the events of the event sequences, wherein the nodes further comprises respective indications of probabilities of the transitions, wherein the respective branches comprise end nodes with states at which the respective branches ended, wherein the states comprise an anomaly state and a desired state, wherein the anomaly state comprises an occurrence of one or more anomalies, and wherein the desired state does not comprise any occurrence of anomalies”; a human can generate a probabilistic graph with pen and paper, thus this is a mental process
“iteratively … analyzes an additional log entry added to the log files”; analyzing is a mental process
“determines whether the standardized log templates comprises a standardized log template for the additional log entry”; determining is a mental process
“in response to determining that the standardized log templates do not comprise any standardized log template for the additional log entry, generates an additional standardized log template for the additional log entry”; a human can generate a template with pen and paper, thus this is a mental process
“updates, based on the additional log entry and the standardized log templates, an event sequence of the event sequences”; a human can write down updated entries with pen and paper, thus this is a mental process
“identifies, based on the updated event sequence, a node in the probabilistic graph”; updating a node in a graph can be performed by a human with pen and paper, and is thus a mental process
“updates, based on the event sequences, the probabilistic graph”; a human can generate a probabilistic graph with pen and paper, thus this is a mental process
“determines, in real-time based on the respective indications of probabilities of the transitions in the probabilistic graph and the node in the probabilistic graph [corresponding to the current execution of the computer application], a probability of the computer application executing the anomaly state at a future time”; determining a probability can be performed by a human with pen and paper, and is thus a mental process
Step 2A Prong 2:
This judicial exception is not integrated into a practical application because the additional elements are as follows:
“A system, comprising: a memory that stores computer executable components; and a processor that executes at least one of the computer executable components that”; “computer-implemented”; “by a cloud server system”; “computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor of a cloud computing system to cause the processor to”; these limitations amount to “nothing more than an instruction to apply the abstract idea using a generic computer” as per MPEP 2106.05(f)
“generated during historical executions of computer applications, wherein the log files comprise log entries describing events, transitions, and states that occurred during the historical executions of the computer applications”; “during a current execution of one or more of the computer applications”; “by a computer application of the one or more computer applications”; “corresponding to the current execution of the computer application”; “probability of the computer application executing”; these limitations amount to merely indicating a field of use or technological environment in which to apply a judicial exception as per MPEP 2106.05(h); Examiner notes that these limitations serve to merely take the recited abstract idea (“analyzes log files”, “generates … a probabilistic graph”, “determines, in real-time based on the respective indications of probabilities in the probabilistic graph and the node in the probabilistic graph … a probability of … executing the anomaly state”) and merely limit the practice of said abstract idea to computer systems and logs thereof; Examiner notes also from MPEP 2106.05(h): “Examples of limitations that the courts have described as merely indicating a field of use or technological environment in which to apply a judicial exception include … x. Requiring that the abstract idea of creating a contractual relationship that guarantees performance of a transaction (a) be performed using a computer that receives and sends information over a network, or (b) be limited to guaranteeing online transactions, because these limitations simply attempted to limit the use of the abstract idea to computer environments.”
“aggregates the log entries into an aggregated log structure, where the log entries in the aggregated log structure are ordered in a time sequence according to respective times when the log entries were generated”; this amounts to insignificant extra-solution activity, mere data gathering, as per MPEP 2106.05(g)
“cleans the aggregated log structure by removing at least one of corrupted data or redundant data from the log entries”; this amounts to insignificant extra-solution activity, selecting a particular data source or type of data to be manipulated, as per MPEP 2106.05(g)
“displays, on a display device, a visualization comprising the probability of the computer application executing the anomaly state at the future time”; this amounts to insignificant extra-solution activity, specifically mere data gathering and outputting, as per MPEP 2106.05(g)
Step 2B:
The claim(s) does/do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the additional elements are as follows:
“A system, comprising: a memory that stores computer executable components; and a processor that executes at least one of the computer executable components that”; “computer-implemented”; “by a cloud server system”; “computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor of a cloud computing system to cause the processor to”; these limitations amount to “nothing more than an instruction to apply the abstract idea using a generic computer” as per MPEP 2106.05(f)
“generated during historical executions of computer applications, wherein the log files comprise log entries describing events, transitions, and states that occurred during the historical executions of the computer applications”; “during a current execution of one or more of the computer applications”; “by a computer application of the one or more computer applications”; “corresponding to the current execution of the computer application”; “probability of the computer application executing”; these limitations amount to merely indicating a field of use or technological environment in which to apply a judicial exception as per MPEP 2106.05(h); Examiner notes that these limitations serve to merely take the recited abstract idea (“analyzes log files”, “generates … a probabilistic graph”, “determines, in real-time based on the respective indications of probabilities in the probabilistic graph and the node in the probabilistic graph … a probability of … executing the anomaly state”) and merely limit the practice of said abstract idea to computer systems and logs thereof; Examiner notes also from MPEP 2106.05(h): “Examples of limitations that the courts have described as merely indicating a field of use or technological environment in which to apply a judicial exception include … x. Requiring that the abstract idea of creating a contractual relationship that guarantees performance of a transaction (a) be performed using a computer that receives and sends information over a network, or (b) be limited to guaranteeing online transactions, because these limitations simply attempted to limit the use of the abstract idea to computer environments.”
“aggregates the log entries into an aggregated log structure, where the log entries in the aggregated log structure are ordered in a time sequence according to respective times when the log entries were generated”; this amounts to insignificant extra solution activity, mere data gathering, as per MPEP 2106.05(g); furthermore, this amounts to well-understood, routine, and conventional activity as per MPEP 2106.05(d) and as evidenced by Berkheimer reference Rinnan (“Benefits of Centralized Log file Correlation”), which is a 2005 paper that cites earlier papers that discuss the aggregation, normalization, and de-duplication of logs and log entries – see Pages 5-7, particularly noting “The centralization process may be defined as gathering the log files/entries in one server” and “Data aggregation [2, 23] organizes normalized data by category, for instance IT systems, applications, etc. [4] states that it is necessary to eliminate redundant or duplicate event data from the security event data stream … The reduction of data can be done by examining incoming events from multiple sources for duplicate information and removing redundancies. Then processing rules can filter the arriving data and decide what to keep and what to eliminate”; Examiner also notes Kent et al. (“Guide to Computer Security Log Management”), a 2006 paper which states on Page 3-3: “Event filtering is the suppression of log entries from analysis, reporting, or long-term storage because their characteristics indicate that they are unlikely to contain information of interest. For example, duplicate entries and standard informational entries might be filtered because they do not provide useful information to log analysts … In event aggregation, similar entries are consolidated into a single entry containing a count of the number of occurrences of the event”
“cleans the aggregated log structure by removing at least one of corrupted data or redundant data from the log entries”; this amounts to insignificant extra-solution activity, selecting a particular data source or type of data to be manipulated, as per MPEP 2106.05(g); furthermore, this amounts to well-understood, routine, and conventional activity as per MPEP 2106.05(d) and as evidenced by Berkheimer reference Rinnan (“Benefits of Centralized Log file Correlation”), which is a 2005 paper that cites earlier papers that discuss the aggregation, normalization, and de-duplication of logs and log entries – see Pages 5-7, particularly noting “The centralization process may be defined as gathering the log files/entries in one server” and “Data aggregation [2, 23] organizes normalized data by category, for instance IT systems, applications, etc. [4] states that it is necessary to eliminate redundant or duplicate event data from the security event data stream … The reduction of data can be done by examining incoming events from multiple sources for duplicate information and removing redundancies. Then processing rules can filter the arriving data and decide what to keep and what to eliminate”; Examiner also notes Kent et al. (“Guide to Computer Security Log Management”), a 2006 paper which states on Page 3-3: “Event filtering is the suppression of log entries from analysis, reporting, or long-term storage because their characteristics indicate that they are unlikely to contain information of interest. For example, duplicate entries and standard informational entries might be filtered because they do not provide useful information to log analysts … In event aggregation, similar entries are consolidated into a single entry containing a count of the number of occurrences of the event”
“displays, on a display device, a visualization comprising the probability of the computer application executing the anomaly state at the future time”; this amounts to insignificant extra-solution activity, specifically mere data gathering and outputting, as per MPEP 2106.05(g); furthermore, sending output to a user amounts to well-understood, routine, and conventional activity as per MPEP 2106.05(d): “Receiving or transmitting data over a network.”
Dependent Claims:
Claims 5, 11, and 17 recite: “wherein the probabilistic graph is a type selected from a group consisting of a Markov chain, a probabilistic tree, Bayesian network, and Markov Random fields”; as explained above in the rejection to Claim 1, a human can generate a probabilistic graph with pen and paper, and this is thus a mental process
Claims 6, 14, and 22 recite: “wherein the at least one of the computer executable components further determines a probability that a last event delineated by one of the event sequences will be executed by the computer application by aggregating probability values associated with the transitions of the event sequence”; determining a probability that an event will occur can be performed by a human with pen and paper, and is thus a mental process; as explained above in the rejection to Claim 1, the recitation of the events being related to a computer application merely links the abstract idea to a field of use under Steps 2A Prong 2 and 2B
Claims 7, 12, and 18 recite: “wherein a first event sequence from the plurality of event sequences characterizes a first order of events that achieves the anomaly state, and wherein a second event sequence from the plurality of event sequences characterizes a second order of events that achieves the desired state”; as stated above, a human can generate event sequences from a log file using pen and paper, thus this is a mental process
Claims 8 and 20 recite: “wherein the at least one of the computer executable components further maps a current state of the computer application to a first position on the probabilistic graph”; mapping a state to a position on a graph can be performed by a human with pen and paper, and is thus a mental process; as explained above in the rejection to Claim 1, the recitation of the events being related to a computer application merely links the abstract idea to a field of use under Steps 2A Prong 2 and 2B
Claims 9 and 21 recite: “wherein the at least one of the computer executable components forecasts whether the computer application will execute the anomaly state by aggregating probability values associated with a subset of the transitions between the first position of the computer application on the probabilistic graph and a second position on the probabilistic graph corresponding to the anomaly state, and wherein the last event is associated with the anomaly state”; forecasting based on probability values can be performed by a human with pen and paper, and is thus a mental process; as explained above in the rejection to Claim 1, the recitation of the events being related to a computer application merely links the abstract idea to a field of use under Steps 2A Prong 2 and 2B
Claim 15 recites: “mapping, by the cloud server system, a current state of the computer application to a first position on the probabilistic graph; and forecasting, by the cloud server system, whether the computer application will execute the anomaly state by aggregating probability values associated with a subset of the transitions between the first position of the computer application on the probabilistic graph and a second position on the probabilistic graph corresponding to the anomaly state”; mapping and forecasting based on probability are processes that can be performed by a human with pen and paper and are thus mental processes; as explained above in the rejection to Claim 1, the recitation of the events being related to a computer application merely links the abstract idea to a field of use under Steps 2A Prong 2 and 2B
Claims 23, 24, and 25 recite: “wherein the probabilistic graph describes relationships between the event sequences”; as explained above, a human can generate a probabilistic graph with pen and paper, thus this is a mental process
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 5-12, 14-18, and 20-25 are rejected under 35 U.S.C. 103 as being unpatentable over Puri et al. (US 2018/0322283 A1; hereinafter “Puri”) in view of Kent et al. (“Guide to Computer Security Log Management”; hereinafter “Kent”), further in view of Kowalski (US 2011/0320228 A1), and further in view of Liu (“Data Analysis of Minimally-Structured Heterogeneous Logs: An experimental study of log template extraction and anomaly detection based on Recurrent Neural Network and Naive Bayes”).
As per Claim 1, Puri teaches a system, comprising: a memory that stores computer executable components; and a processor that executes at least one of the computer executable components that: (Puri, [0047], teaches the claimed “memory that stores computer executable components” that is associated with the disclosed “non-transitory computer readable medium,” disclosing “In some examples, the elements of the apparatus 100 may be machine readable instructions stored on a non-transitory computer readable medium. In this regard, the apparatus 100 may include or be a non-transitory computer readable medium. In some examples, the elements of the apparatus 100 may be hardware or a combination of machine readable instructions and hardware.”)
analyzes log files generated during historical executions of computer applications, wherein the log files comprise log entries describing events, transitions, and states that occurred during the historical executions of the computer applications (Puri [0018] teaches the claimed “analyzes one or more log files generated during historical executions of computer applications” that is associated with the disclosed “performing analytics with low latency and rapid results with streaming data is needed when finding relevant security events and being operationally aware in real-time.” Puri [0018] also teaches the claimed “the log files comprise log entries describing events” that is associated with the disclosed “data present in log files, or trace data, generated from a device source is characterized by attributes that include unique identifiers, timestamps, events, and actions,” disclosing “Many enterprise environments need to manage copious amounts of log files where forensic evidence of those threats and suspect anomalies reside unnoticed in logs until it may be too late. Analyzing log data from many heterogeneous sources to find errors and anomalies can be difficult, both in terms of computation and information technology (IT) coordination. Learning the behavior of applications through log traces, understanding the flow of events that occur within many applications, performing analytics at massive scales, and performing analytics with low latency and rapid results with streaming data is needed when finding relevant security events and being operationally aware in real-time. Often data present in log files, or trace data, generated from a device source is characterized by attributes that include unique identifiers, timestamps, events, and actions. These unique attributes can be indicative of application behaviors, processes, and patterns created by a series of events. 
Data contained within the trace sources can be modeled as a graph containing information about states and transitions between them.”)
aggregates the log entries into an aggregated log structure (Puri [0081]: “The CEP 252 may be used for performing real-time analytics and driving real-time insights. As new data streams in from its sources, pre-processing and aggregation may respectively perform the initial pre-processing and transformations to count parts of the data and use the totals to expedite future processing of data batches. The pre-processing and aggregation may be performed by combining historical data with new data, matching the data against pre-determined patterns as well as inferring new patterns in the data, and triggering events and actions based on the detected patterns, delivering real-time insights for decision making.”)
where the log entries in the aggregated log structure are ordered in a time sequence according to respective times when the log entries were generated (Puri [0051]: “Data present in log files may be characterized by log traces containing unique identifiers, timestamps, events, and actions.” Puri [0061]: “Additionally, the data anomaly analyzer 116 may ensure that all data will be in sequential temporal order for rapid data ingestion and optimal memory usage.”)
generates standardized log templates for the log entries, wherein the standardized log templates represent uniform structures for generation of sequences of the events, the transitions, and the states (Puri [0067] discloses: “With respect to data normalization for data ingestion at 202, the data anomaly analyzer 116 may provide for the normalization of data into an agnostic format that the apparatus 100 may consume. For example, data may be normalized by converting timestamps to ZULU time. Normalization of data may be performed via customizable connectors that allow for reuse. Additionally, normalization may occur in real-time.”)
generates, based on the log entries, event sequences according to the uniform structures of the standardized log templates (Puri [0085] teaches the claimed “one or more event sequences according to the uniform structure” that is associated with the disclosed “Once normalized, the framework may proceed to mine the log traces for their placement in a behavior graph model with a mining algorithm … The framework may index the event and identifier pair to create a series of compound event sequences,” disclosing “With respect to processing of log files in the trace analysis framework, the apparatus 100 may include a preprocessing stage that may include the selection and ingestion of trace event information into a query-able format for normalization. Once normalized, the framework may proceed to mine the log traces for their placement in a behavior graph model with a mining algorithm. The mining algorithm may discover the temporal ordering between trace events based on an ordering or timestamp that may be present in a log file and an identifier that groups log traces together. That is, the mining algorithm may extract an event type and an identifier for each ingested trace entry. The framework may index the event and identifier pair to create a series of compound event sequences.”)
generates, based on the event sequences, a probabilistic graph that models the event sequences (Puri [0021-0022] teaches the claimed “probabilistic graph that models the event sequences” that is associated with the disclosed “probabilistic event graphs,” disclosing “Armed with an ever-watching tool, capable of evolving over time providing context to events, an analyst may be confident that the tool will generate alerts, quarantine and control agents, and stop malicious behavior before irreparable damage occurs to the Enterprise and its assets. With respect to the apparatus and methods disclosed herein, behavior learning may denote learning common behaviors that occur within an Enterprise network and transforming the behaviors into probabilistic event graphs (based on extract-transform-load or ETL, distributed storage, distributed processing, and machine learning).”)
wherein the probabilistic graph comprises respective branches for the event sequences comprising nodes representing the events, edges representing the transitions between the events of the event sequences, wherein the nodes further comprises respective indications of probabilities of the transitions, wherein the respective branches comprise end nodes with states at which the respective branches ended (Puri, Figs. 3 and 4 disclose:
[Puri FIG. 3 and FIG. 4 reproduced: media_image1.png, media_image2.png]
wherein the states comprise an anomaly state and a desired state, wherein the anomaly state comprises an occurrence of one or more anomalies, and wherein the desired state does not comprises any occurrence of anomalies (See Puri Fig. 4 above, specifically the dashed lines. Puri [0102] discloses: “For example, as shown in FIG. 4, the data anomaly analyzer 116 may categorize how a real-time activity graph for a user-1 at 400 differs from a user-1 baseline at 402, which is determined from the master directed graph 104 that represents known or pre-established events. For example, the “dashed” lines for the real-time activity graph for a user-1 at 400 represent anomalies with respect to the master directed graph 104. In this regard, based on the rules 114, an event such as the “dashed” lines for the real-time activity graph for the user-1 at 400 may have been characterized as a very-high anomalous event (since no corresponding event such as the “dashed” lines exists in the user-1 baseline at 402 or in the master directed graph 104). In this regard, any event that is not present in the user-1 baseline at 402 or in the master directed graph 104 may have been categorized as highly anomalous. Alternatively, assuming that the master directed graph 104 includes an anomalous categorized event (not shown in FIG. 4) such as the event including the “dashed” lines, based on a match of the event including the “dashed” lines with the corresponding anomalous categorized event from the master directed graph 104, the event including the “dashed” lines may be categorized accordingly. That is, the data anomaly analyzer 116 may determine a bounded metric to characterize the degree of contextual fitness or anomalousness of an incoming walk of trace events or graph (e.g., the real-time activity graph for the user-1 at 400) compared to that of another walk or graph (e.g., the user-1 baseline at 402). 
Accordingly, based on the rules 114, the data anomaly analyzer 116 may characterize the “dashed” lines for the real-time activity graph for the user-1 at 400 as a very-high anomalous event.” Thus, Examiner notes that the dashed nodes are anomaly states, and solid nodes are desired states.)
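Examiner notes, for illustration only, that the mapped graph-building step (aggregating event sequences into a graph with statistical likelihoods of transitions between events, where some branches end in anomaly states and others in desired states) could be sketched as follows. The event names are Examiner's illustrative assumptions:

```python
from collections import defaultdict

def build_probabilistic_graph(sequences):
    """Count transitions between consecutive events across all sequences,
    then normalize the counts into per-node transition probabilities."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    return {node: {nxt: c / sum(nxts.values()) for nxt, c in nxts.items()}
            for node, nxts in counts.items()}

graph = build_probabilistic_graph([
    ["connect", "login", "read"],        # branch ending in a desired state
    ["connect", "login", "exfiltrate"],  # branch ending in an anomaly state
])
# graph["login"] == {"read": 0.5, "exfiltrate": 0.5}
```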
iteratively, during a current execution of one or more of the computer applications: analyzes an additional log entry added to the log files by a computer application of the one or more computer applications (Puri [0062] teaches the claimed “iteratively, during a current execution of one or more of the computer applications: analyzes an additional log entry” that is associated with the disclosed “receive a nightly feed of encrypted, compressed logs into a staging server. The data 118 may be analyzed for anomalies,” disclosing “According to an example of network security events, the data 118 for the data anomaly analyzer 116 may include log files of network security events from all the devices on a network such as laptops, workstations, servers, routers, switches and intrusion detection systems and antivirus systems. The data anomaly analyzer 116 may receive a nightly feed of encrypted, compressed logs into a staging server. The data 118 may be analyzed for anomalies as disclosed herein.”);
updates, based on the additional log entry and the standardized log templates, an event sequence of the event sequences (Puri [0085] teaches the claimed “updates…the one or more event sequences” that is associated with the disclosed “event traces may be combined to create event sequences,” disclosing “The framework may index the event and identifier pair to create a series of compound event sequences. The power of the framework may come from the ability to execute several mapper functions with the capability to normalize data on-the-fly, with each producing a localized version of a network graph model for the data. Additionally, the algorithm may discover and sort the temporal relation between events based on an ordering that may be present in the log file and an identifier that groups the security log traces together (e.g., internet address). Security event traces may be combined to create event sequences that aggregate to create a master graph model with statistical information on the likelihood of transitions between events, likelihood of the occurrence of events, and other relevant metadata.”)
identifies, based on the updated event sequence, a node in the probabilistic graph corresponding to the current execution of the computer application (Puri [0043]: “In this regard, all edges for a master directed graph may be tracked according to the number of times an event sequence has gone, for example, from a node A to a node B.” Puri [0086]: “FIG. 3 depicts an example of a master directed graph (i.e., a directed cyclic graph (DCG)) from a mined input log file where each of the nodes corresponds to an event within the log with overlapping identifier features.” Here, Puri indicates that nodes in the graph correspond to events; because Puri also discloses real-time activity graphs, Puri discloses a node corresponding to the current execution of the computer application.)
determines, in real-time based on the respective indications of probabilities of the transitions in the probabilistic graph and the node in the probabilistic graph corresponding to the current execution of the computer application, a probability of the computer application executing the anomaly state [at a future time] (Puri, Para [0103], discloses, “Thus, the data anomaly analyzer 116 may grade an incoming or emerging (in-flight) sequence of events against the probabilistic rankings of all known event walks that are contained within the master directed graph 104.” Puri was shown above to teach that the nodes correspond to a current execution of the computer application.)
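Examiner notes, for illustration only, that grading an in-flight sequence against the graph to forecast the likelihood of reaching an anomaly state could be sketched as a bounded walk over the transition probabilities. The graph and state names below are Examiner's illustrative assumptions:

```python
def p_anomaly(graph, node, anomaly_states, depth=10):
    """Probability that, starting at `node`, a walk over the probabilistic
    graph reaches an anomaly state within `depth` transitions."""
    if node in anomaly_states:
        return 1.0
    if depth == 0 or node not in graph:
        return 0.0
    # Sum over outgoing edges: P(edge) * P(anomaly from the next node)
    return sum(p * p_anomaly(graph, nxt, anomaly_states, depth - 1)
               for nxt, p in graph[node].items())

graph = {"connect": {"login": 1.0},
         "login": {"read": 0.5, "exfiltrate": 0.5}}
p = p_anomaly(graph, "connect", {"exfiltrate"})
# p == 0.5
```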
displays, on a display device, a visualization comprising the probability of the computer application executing the anomaly state [at the future time]. (Puri, Para [0107], discloses: “Referring to FIG. 5, with respect to the example of network security events disclosed herein, the apparatus 100 may be applied to three months (e.g., three petabyte) of security data to generate graphs with nodes representing the events, edges connecting events that are related to each other, the size representing the anomalousness (i.e., the very high probability of anomalousness events being displayed on the outer bounds as shown in FIG. 5 at 500, to the very-low probability of anomalousness events being displayed towards the middle), and different colors (e.g., red, yellow, orange, etc.) representing the probability of occurrence of the events” and in [0110] discloses: “The output of graph analysis may provide input into dashboards and exploratory visualizations. For example, ranked event anomalies may be stored and streaming events may also be compared against a stored set of the anomaly rankings. Any streamed event that falls within the highest anomaly category may be marked, aggregated, and cumulative event information may be streamed to an in-memory database from which polling will occur at a constant rate to update the visualization for quick display.”)
However, Puri does not teach cleans the aggregated log structure by removing at least one of corrupted data or redundant data from the log entries; determines whether the standardized log templates comprises a standardized log template for the additional log entry; in response to determining that the standardized log templates do not comprise any standardized template for the additional log entry, generates an additional standardized log template for the additional log entry; updates, based on the event sequences, the probabilistic graph; at a future time.
Kent teaches cleans the aggregated log structure by removing at least one of corrupted data or redundant data from the log entries (Kent, Page 3-3, discloses: “Event filtering is the suppression of log entries from analysis, reporting, or long-term storage because their characteristics indicate that they are unlikely to contain information of interest. For example, duplicate entries and standard informational entries might be filtered because they do not provide useful information to log analysts.”)
Kent, like Puri, also teaches generates standardized log templates for the log entries, wherein the standardized log templates represent uniform structures for generation of sequences of the events, the transitions, and the states (Kent, Page 3-5 to 3-6: “In a logging infrastructure based on the syslog protocol, each log generator uses the same high-level format for its logs … Many log sources either use syslog as their native logging format or offer features that allow their log formats to be converted to syslog format. Section 3.3.1 describes the format of syslog messages … Syslog is intended to be very simple, and each syslog message has only three parts. The first part specifies the facility and severity as numerical values. The second part of the message contains a timestamp and the hostname or IP address of the source of the log. The third part is the actual log message content.”)
Kent, like Puri, also teaches generates, based on the log entries, event sequences according to the uniform structures of the standardized log templates (Kent, Page 3-5 to 3-6: “Many log sources either use syslog as their native logging format or offer features that allow their log formats to be converted to syslog format … Syslog is intended to be very simple, and each syslog message has only three parts.”)
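Examiner notes, for illustration only, that the three-part syslog message structure Kent describes (facility/severity value, timestamp and hostname header, and message content) could be parsed as sketched below. This is a simplified sketch of the classic BSD-syslog layout, not the full RFC 3164/5424 grammar, and the field names are Examiner's illustrative assumptions:

```python
import re

def parse_syslog(line):
    """Split a classic BSD-syslog line into its three high-level parts:
    the <PRI> value, the header (timestamp + hostname), and the content."""
    m = re.match(r"<(\d+)>(\w{3}\s+\d+\s[\d:]{8})\s(\S+)\s(.*)", line)
    if not m:
        return None
    pri, timestamp, host, content = m.groups()
    pri = int(pri)
    # PRI encodes facility and severity as facility*8 + severity
    return {"facility": pri // 8, "severity": pri % 8,
            "timestamp": timestamp, "host": host, "message": content}

rec = parse_syslog("<34>Oct 11 22:14:15 mymachine su: 'su root' failed")
# rec["facility"] == 4, rec["severity"] == 2, rec["host"] == "mymachine"
```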
Kent is analogous art because it is in the field of endeavor of log file analysis. It would have been obvious before the effective filing date of the claimed invention to combine the log file analysis of Puri with the event filtering of Kent. One of ordinary skill in the art would have been motivated to do so in order to increase efficiency by avoiding the need to analyze superfluous data (Kent, Page 3-3: “because their characteristics indicate that they are unlikely to contain information of interest”).
However, the combination of Puri and Kent does not teach determines whether the standardized log templates comprises a standardized log template for the additional log entry; in response to determining that the standardized log templates do not comprise any standardized template for the additional log entry, generates an additional standardized log template for the additional log entry; updates, based on the event sequences, the probabilistic graph; at a future time.
Kowalski teaches updates, based on the event sequences, the probabilistic graph (Kowalski, Para [0014-0015], discloses: “Also, as more data becomes available the probability of state transitions may need to be updated based on the new information. For example, events or system log entries can be interpreted and used to alter probabilities associated with state transitions … Once built, the Markov Chain can be maintained and updated by subsequent monitoring of information available in the IT environment.”)
determines, in real-time based on the respective indications of probabilities of the transitions in the probabilistic graph and the node in the probabilistic graph corresponding to the current execution of the computer application, a probability of the computer application executing the anomaly state at a future time (Kowalski, Para [0021], discloses: “In the context of change management within an enterprise system, having a model that can provide probabilities of certain outcomes given a current state, can allow administrators to pro-actively take action if certain undesirable future states have probabilities that rise above certain thresholds.”)
Puri and Kowalski disclose methods related to the field of endeavor of utilizing probabilistic graphs to predict and/or forecast outcomes and are therefore analogous. It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the system of probabilistic graph analysis of a “master directed graph” to identify anomalies of Puri with the ongoing updating of the graph probabilities of Kowalski to forecast future anomalies, resulting in ongoing updates of a “master directed graph” to forecast future anomalies. One of ordinary skill in the art would have been motivated to do so in order to take proactive action and achieve better planning, anticipation, and root cause analysis (Kowalski [0014]: “Once constructed, a Markov Chain can be used to determine, for a given state, what the probability is for future states of CIs in a data center. Such a method and system could be useful in proactive datacenter management, problem anticipation, determination and remediation, system and application provisioning, capacity planning, etc. The disclosed systems and methods could also be valuable in root cause analysis, forensics and post mortem determination of system outages.”)
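Examiner notes, for illustration only, that the ongoing updating Kowalski describes (new events or log entries altering the probabilities associated with state transitions) could be sketched as maintaining transition counts that are re-normalized as each new observation arrives. The class and event names are Examiner's illustrative assumptions:

```python
from collections import defaultdict

class TransitionModel:
    """Maintain transition counts and derive probabilities on demand,
    so each newly observed event updates the model incrementally."""
    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, prev_event, event):
        # A new log entry alters the counts, and hence the probabilities
        self.counts[prev_event][event] += 1

    def probability(self, prev_event, event):
        total = sum(self.counts[prev_event].values())
        return self.counts[prev_event][event] / total if total else 0.0

m = TransitionModel()
m.observe("deploy", "healthy")
m.observe("deploy", "healthy")
m.observe("deploy", "outage")
# m.probability("deploy", "outage") == 1/3
```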
However, the combination of Puri and Kowalski does not explicitly teach determines whether the standardized log templates comprises a standardized log template for the additional log entry; in response to determining that the standardized log templates do not comprise any standardized template for the additional log entry, generates an additional standardized log template for the additional log entry.
Liu teaches determines whether the standardized log templates comprises a standardized log template for the additional log entry (Liu, Page 5 Chapter 2, states: “Definition 2.0.2. Log Template (or Template) is the common format of a group of log entries, which are sharing the same layout but filled with different parameters.” Liu, Page 46, discloses: “Log matching is the process, where new logs are matched with the obtained template representation stored in the search dictionary for obtaining their own IDs. The log matching is actually divided into two steps. First, the command token of the new log extracted together with the log length in order to create a tuple (command, length). Using this tuple as the key, a partition of templates referring different clusters will be found for further matching. Then, the log will be compared with each of the template representations for similarity check. If there is a match, the new log will be given with the corresponding key ID of the value template; otherwise, the ID 0 will be given meaning that this is an unknown log.”
Here, Liu uses clustering, and for a given log, determines if a standardized log template for it exists within the clusters. If it does not match any cluster, it is unknown).
in response to determining that the standardized log templates do not comprise any standardized template for the additional log entry, generates an additional standardized log template for the additional log entry (Liu, Page 62, discloses: “When a new log is compared with all the created cluster representations, if the obtained dissimilarity value is lower than or equal to the predefined threshold, that means this log could belong to the cluster. If more than one cluster are meeting this requirement, the new log will be grouped into the cluster with lowest result. If all results are higher than the threshold, a new cluster will be created and the new log will become the initial representation of this cluster. Now, another experiment is conducted to evaluate the effect of this preset threshold.”
Here, Liu discloses that if there is no existing log template (“dissimilarity value … higher than the threshold”), then a new log template is generated (“a new cluster will be created and the new log will become the initial representation of this cluster”).)
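Examiner notes, for illustration only, that Liu's match-or-create process (keying candidate templates by a (command, length) tuple, checking dissimilarity against a threshold, and creating a new template when nothing matches) could be sketched as follows. The dissimilarity measure and data structures here are Examiner's simplified assumptions, not Liu's exact algorithm:

```python
def match_or_create(log_tokens, templates, threshold=0.5):
    """Look up candidate templates by (command, length), compare token-wise
    dissimilarity, and register the log as a new template if none matches."""
    key = (log_tokens[0], len(log_tokens))  # (command, length) tuple
    best_id, best_dissim = 0, 1.0           # ID 0 denotes an unknown log
    for tid, tmpl in templates.get(key, {}).items():
        mismatches = sum(a != b for a, b in zip(log_tokens, tmpl))
        dissim = mismatches / len(log_tokens)
        if dissim <= threshold and dissim < best_dissim:
            best_id, best_dissim = tid, dissim
    if best_id == 0:
        # No template matched: the new log becomes a new template
        best_id = 1 + sum(len(v) for v in templates.values())
        templates.setdefault(key, {})[best_id] = list(log_tokens)
    return best_id
```

For example, a first log `["open", "file", "a.txt"]` would create template ID 1, and a later log `["open", "file", "b.txt"]` (one token of three differing, dissimilarity about 0.33, below the threshold) would match that same template.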
Liu is analogous art because it is directed to “How to represent the minimally-structured heterogeneous log data into decent formats in order to conduct learning process?” (Liu Page 2), which is similarly analogous to the problem faced by Puri (Puri [0051]: “The data anomaly analyzer 116 may support computing system transparency by applying analytics to (semi-) automatically consume and analyze the data 118 that includes heterogeneous computer generated trace entries.”) It would have been obvious before the effective filing date of the claimed invention to combine the system log analysis of Puri with the updating of log templates of Liu. One of ordinary skill in the art would have been motivated to do so in order to overcome the problem of analyzing data from different sources, as noted by Puri in [0018]: “Analyzing log data from many heterogeneous sources to find errors and anomalies can be difficult” and to use the heterogeneous logs for better diagnosis, as noted in Liu Page 8: “The problem here is to understand this heterogeneous descriptive part and extract features and templates from it, in order to make the massive log messages well-regulated, since it contains most of the useful information needed for diagnosis.”)
As per Claim 5, the combination of Puri, Kent, Kowalski, and Liu teaches the system of Claim 1. Puri teaches wherein the probabilistic graph is a type selected from the group consisting of a Markov chain, a probabilistic tree, Bayesian network, and Markov Random fields (Puri, Figure 4, discloses:
[Puri FIG. 4 reproduced: media_image3.png]
Examiner notes that this starts from a root and branches off at several levels, and is in the structure of a tree, with probabilities for each branch. This is a probabilistic tree.)
Furthermore, Examiner notes that Kowalski teaches a Markov Chain (Kowalski, Para [0008], discloses: “FIG. 2 illustrates, diagrams representing a method of modeling probabilities associated with states and potential state changes between states (e.g., a Markov Chain or a Directed Graph).”)
As per Claim 6, the