Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 01/21/2026 has been entered.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-14 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Grilli (US 10,805,327) in view of He et al. (US 2015/0317327) and Nemirofsky et al. (US 2020/0329233).
For claim 1, Grilli teaches a method (reducing the size of messages) for log files of a plurality of devices coupled with at least one processor (abstract), the method comprising: receiving a plurality of log files generated by the plurality of devices (Grilli teaches that a computing resource service provider provides an anomaly detection service that obtains log information from a plurality of computing resources within a computing environment, as Grilli teaches in col.1, lines 50-65), creating a table mapping each string of the plurality of repeated strings to a unique value of the unique values (Grilli teaches that if the result of the spatial cosine similarity algorithm is within a value relative to a threshold, the message may be classified as "known but still rare" or another classification indicating that the message includes information that has previously been analyzed by the anomaly detection service but is uncommon relative to other messages; for example, the result of the spatial cosine similarity algorithm may indicate that the message is common and may be disregarded by the system. In one example, the anomaly detection service maintains a global dictionary of tokens included in messages and frequencies associated with the tokens, and this information may be used to generate the virtual messages and determine the classification of messages, and the data includes several separate data tables, databases, data documents, dynamic data storage schemes and/or other data storage mechanisms and media for storing data relating to a particular aspect of the present disclosure, as Grilli teaches in col.2, lines 30-45), storing the table in a storage medium (Grilli teaches that a training phase is performed to generate a global dictionary and a virtual message. The global dictionary may include content of the log information such as tokens, characters, words, components of a message and/or entry, or any other information that may be obtained from the log information, as Grilli teaches in col.5, lines 5-15); creating a vector encoding the unique value mapped to each string of the plurality of repeated strings using the stored table (Grilli teaches generating vectors or other data formats that are suitable for analysis using the spatial cosine similarity algorithm, which may be utilized to measure the similarity between two or more vectors, and the training phase 228 may be repeated after a certain number of messages are processed during the detection and update phase 226. In various embodiments, the frequencies included in the global dictionary may be halved prior to generating the virtual message, as Grilli teaches in col.5, lines 60-68 and col.7, lines 15-30); inputting the classification of the plurality of lines into a trained machine learning (ML) model (Grilli teaches a classification for messages and other information obtained from log information generated by computing resources, such that the classified messages may be provided to various end points such as a system engineer or a machine learning algorithm, and a computing resource service provider provides computing resources to users and other entities, such that the users may use the classified message information to generate regular expressions or other information that may be used by the anomaly detection service or other systems to detect anomalous activity, and the anomaly detection subsystem may use a spatial cosine similarity algorithm to identify and/or classify messages, log entries, or other information included in the log information obtained from the computing resources, as Grilli teaches in col.2, lines 10-30, 50-60, and col.4, lines 40-45), and receiving as an output of the ML model a security relevance score for each unique value (Grilli teaches that the anomaly detection subsystem 106 may execute a detection phase during which messages obtained from the log information are compared to the virtual message using the spatial cosine similarity algorithm. The anomaly detection service 110 may determine a threshold value for the results of comparing messages obtained from the log information to the virtual message using the spatial cosine similarity algorithm, based at least in part on an amount of classified message information, which can be machine learning information, to be provided to the user, as Grilli teaches in col.5, lines 1-15); filtering the encoded unique values in the vector according to the security relevance score of each unique value and removing non-security-relevant encoded values (Grilli teaches that the anomaly detection subsystem may filter messages based at least in part on a severity level associated with the message, such that any message obtained from the log information with a severity level of 4 or higher is filtered so that the messages are not used by the anomaly detection service 110 or a component thereof, such as the anomaly detection subsystem, and the messages are encoded, as Grilli teaches in col.4, lines 60-68 and col.6, lines 30-40) and selecting a subset of the filtered encoded unique values to generate a compressed representation of the plurality of log files (Grilli teaches that the anomaly detection subsystem may filter messages based at least in part on a severity level associated with the message, and the interface 220 obtains log information (e.g., messages generated by the computing resources 204) and provides the log information to a filter 222. As described in greater detail below, the filter 222 may reduce a number of messages (i.e., a form of compression) provided to the anomaly detection subsystem 206, such that log information 316.sub.k-i may be filtered, cleaned, tokenized, and/or otherwise modified or formatted for use by the anomaly detection subsystem. Messages that are not discarded by the filter 222 may be provided to the message processor 224.
In various embodiments, the message processor 224 removes English stop words, articles, pronouns, numbers, special characters, or any other information in the message not required by the anomaly detection subsystem as Grilli teaches in col.4, lines 55-68, col.6, lines 15-25, 43-50 and col.9, lines 20-43).
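For illustration only (this is not Grilli's code; all names, thresholds, and the stop-word list below are hypothetical), the following minimal Python sketch shows the kind of mechanism the cited passages describe: messages are tokenized, represented as token-frequency vectors, compared to a virtual message with a spatial cosine similarity measure, and filtered by severity level before analysis.

```python
# Illustrative sketch only; function and variable names are hypothetical and
# are not taken from Grilli. Shows a token-frequency comparison of the kind
# the cited passages describe (tokenization, cosine similarity to a virtual
# message, and severity-based filtering).
import math
from collections import Counter

def tokenize(message):
    # Drop stop words, numbers, and special characters (cf. message processor 224).
    stop = {"the", "a", "an", "of", "to", "and", "in", "on", "from"}
    return [t for t in message.lower().split() if t.isalpha() and t not in stop]

def cosine_similarity(freq_a, freq_b):
    # 1.0 -> same tokens; 0.0 -> no tokens in common (frequencies are non-negative).
    dot = sum(freq_a[t] * freq_b[t] for t in freq_a)
    na = math.sqrt(sum(v * v for v in freq_a.values()))
    nb = math.sqrt(sum(v * v for v in freq_b.values()))
    return dot / (na * nb) if na and nb else 0.0

def classify(message, virtual_freq, severity, severity_cutoff=4, threshold=0.8):
    if severity >= severity_cutoff:
        return "filtered"            # discarded before analysis (hypothetical cutoff)
    sim = cosine_similarity(Counter(tokenize(message)), virtual_freq)
    return "common" if sim >= threshold else "anomalous"

virtual = Counter(tokenize("connection established from host port"))
print(classify("Connection established from host 10.0.0.5 port 22", virtual, severity=6))
print(classify("unexpected kernel panic in module xyz", virtual, severity=2))
```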
Grilli fails to teach for lossy compression, performing a first level of hierarchical clustering of the plurality of log files, where line-based text log files are clustered together based on the device or component that generated a respective log file; performing one or more additional levels of hierarchical clustering of the plurality of log files to classify each line of a plurality of lines of the plurality of log files, where repeated strings in each line at any level of the one or more additional levels are encoded as unique values and permanently removing non-security-relevant encoded values.
He teaches, in a similar system, performing a first level of hierarchical clustering of the plurality of log files, where line-based text log files are clustered together based on the device or component that generated a respective log file (He teaches that log files may be organized as a tree structure or hierarchical structure of data having rows of similar data with different levels; a raw string index and a level 1 index are constructed for nodes of the logfile, and then, in a second pass, key-value pairs are checked for correlation, as He teaches in par.36 and 42); performing one or more additional levels of hierarchical clustering of the plurality of log files to classify each line of a plurality of lines of the plurality of log files, where repeated strings in each line at any level of the one or more additional levels are encoded as unique values (He teaches that log files may be organized as a tree structure or hierarchical structure of data having rows of similar data with different levels; a raw string index and a level 1 index are constructed for nodes of the logfile, then, in a second pass, key-value pairs are checked for correlation, and if key-value pairs are correlated, then a level 2 index is built based on the level 1 index for those correlated pairs in that node. This may be repeated for the nodes and all of the rows of the logfile 236. The resulting three layers of the index tree 232 (raw string, level 1, and level 2) can cover the whole logfile 236 and may be used to compress logfile 236, as He teaches in par.36 and 42) and (He teaches that keys or items that repeat more frequently (such as a frequently occurring key-value pair) may be replaced with a shorter expression, such as a unigram (e.g., "1"), while items appearing less frequently may be assigned longer expressions, as He teaches in par.45 and 46). It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Grilli to include two levels of hierarchical clustering and replacing each string with a unique value, as taught and suggested by He, in order to reduce some data logs by twenty-three percent on top of existing compression and to provide better reliability and lower maintenance costs (He, par.16).
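As a minimal sketch of the kind of frequency-based substitution He describes (all names are hypothetical; this is not He's implementation), frequently repeated strings can be mapped to the shortest available symbols while rarer strings receive longer symbols, with the mapping table retained so the log remains reconstructible:

```python
# Illustrative sketch only; not He's implementation. Most frequent tokens get
# the shortest symbols, rarer tokens get longer ones, and the mapping table is
# kept so the log can later be reconstructed.
from collections import Counter
from itertools import count, product
import string

def symbol_stream():
    # "0", "1", ..., then two-character symbols, and so on (shortest first).
    for length in count(1):
        for chars in product(string.digits + string.ascii_lowercase, repeat=length):
            yield "".join(chars)

def build_table(lines):
    freq = Counter(tok for line in lines for tok in line.split())
    symbols = symbol_stream()
    # Most frequent token -> shortest available symbol.
    return {tok: next(symbols) for tok, _ in freq.most_common()}

def encode(lines, table):
    return [" ".join(table[tok] for tok in line.split()) for line in lines]

lines = ["login ok user alice", "login ok user bob", "disk failure on node7"]
table = build_table(lines)
print(encode(lines, table))
```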
Nemirofsky teaches, in a similar system, lossy compression (par.31) and permanently removing non-security-relevant encoded values (lossy compression means permanent, irreversible loss of data: once a file is compressed using a lossy algorithm (such as JPEG, MP3, or AAC), the "removed" data is discarded forever, and the file cannot be restored to its exact original state; therefore, Nemirofsky teaches removing unnecessary or less important information, as Nemirofsky teaches in par.31). It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Grilli to include lossy compression as taught and suggested by Nemirofsky, because increased optimization processing is required in order to find computationally feasible compression with acceptable space, encoding, and decoding times (Nemirofsky, par.35).
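A minimal sketch of the lossy step, under the assumption (labeled here, not taken from Nemirofsky) that each encoded value carries a security relevance score: values scoring below a threshold are dropped from the compressed output and cannot be recovered from it.

```python
# Illustrative sketch only; threshold and scores are hypothetical. Entries whose
# security relevance score falls below the threshold are dropped and cannot be
# recovered from the compressed output (lossy, irreversible).
def lossy_filter(encoded_values, scores, threshold=0.5):
    return [v for v in encoded_values if scores.get(v, 0.0) >= threshold]

scores = {"1": 0.9, "2": 0.1, "3": 0.7}
print(lossy_filter(["1", "2", "3", "2"], scores))   # "2" entries are permanently removed
```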
For claim 2, Grilli in view of He and Nemirofsky discloses the method of claim 1, and further teaches inputting the vector into a detector to detect anomalies in the plurality of log files, detecting an anomaly, and in response, indicating the detection of malicious behavior by activating an electrical indicator (Grilli teaches detecting anomalous activity of the computing resources within the servers in sets of racks 322A-322B. The system 300 includes the anomaly detection subsystem 306 of a computing resource service provider 312, as described above, that classifies messages, log entries, or other information generated by the server computer systems or other components of the sets of racks 322A-322B. The log information 316.sub.k-i obtained from different server computer systems in the sets of racks 322A-322B may include messages or any other information generated by the server computer systems as described above. In one example, the messages include syslog messages generated by network devices included in the sets of racks 322A-322B. In another example, the log information 316.sub.k-i includes any information that may be converted into a vector and used in connection with the spatial cosine similarity algorithm, as Grilli teaches in col.9, lines 10-40).
For claim 3, Grilli in view of He and Nemirofsky discloses the method of claim 1, and further teaches wherein assigning each encoded unique value in the vector the security relevance score according to the classification of the line including the unique value further comprises (col.3, lines 15-30): training at least one model with the plurality of log files to classify each line of the plurality of lines in the plurality of log files and assign each line of the plurality of lines a corresponding security relevance score according to the classification of each line of the plurality of lines (Grilli teaches that if the result of the spatial cosine similarity algorithm is within a value relative to a threshold, the message may be classified as "known but still rare" or another classification indicating that the message includes information that has previously been analyzed by the anomaly detection service but is uncommon relative to other messages; for example, the result of the spatial cosine similarity algorithm may indicate that the message is common and may be disregarded by the system. In one example, the anomaly detection service maintains a global dictionary of tokens included in messages and frequencies associated with the tokens, and this information may be used to generate the virtual messages and determine the classification of messages, and the data includes several separate data tables, databases, data documents, dynamic data storage schemes and/or other data storage mechanisms and media for storing data relating to a particular aspect of the present disclosure, as Grilli teaches in col.2, lines 30-45); and assigning each line of the plurality of lines the security relevance score using the trained model (Grilli teaches that if the two vectors overlap, the spatial cosine similarity value will be 1. A spatial cosine similarity value of −1 indicates that the vectors are directionally opposed. In such embodiments, however, the minimum possible spatial cosine similarity value is 0 because the frequencies of the tokens cannot be negative. Accordingly, a value of 1 means that the two messages have the same tokens while a value of 0 indicates that the two messages have no tokens in common, as Grilli teaches in col.4, lines 55-68 and col.6, lines 15-25).
For claim 4, Grilli in view of He and Nemirofsky discloses the method of claim 3, and further teaches extracting string parameters from each repeated string of the plurality of repeated strings and storing the string parameters in a separate file (the global dictionary is sorted and the first m tokens are extracted and used to generate the virtual message, and executable instructions and/or other data may be stored on a non-transitory computer-readable storage medium, as Grilli teaches in col.7, lines 25-32).
For claim 5, Grilli in view of He and Nemirofsky discloses the method of claim 1, but fails to teach that the unique values correspond to symbols, a most repeated string of the plurality of repeated strings is assigned a shortest symbol of the symbols, and a least repeated string of the plurality of repeated strings is assigned a longest symbol of the symbols.
He further teaches that the unique values correspond to symbols, a most repeated string of the plurality of repeated strings is assigned a shortest symbol of the symbols, and a least repeated string of the plurality of repeated strings is assigned a longest symbol of the symbols (He, par.45-46). It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Grilli to include unique values corresponding to symbols as taught and suggested by He in order to reduce some data logs by twenty-three percent on top of existing compression and to provide better reliability and lower maintenance costs (He, par.16).
For claim 6, Grilli in view of He and Nemirofsky discloses the method of claim 1, and further teaches compressing the selected subset of the unique values mapped to the plurality of repeated strings with a binary compression algorithm (Grilli, col.4, lines 45-60).
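As a minimal illustration (hypothetical; neither the claim nor Grilli is limited to any particular codec), a selected subset of encoded values can additionally be passed through a general-purpose binary compressor such as zlib:

```python
# Illustrative only: binary compression of an already-encoded subset of values.
import zlib

encoded_subset = " ".join(["1", "3", "1", "7"]).encode("utf-8")
compressed = zlib.compress(encoded_subset, 9)          # highest compression level
restored = zlib.decompress(compressed).decode("utf-8")
assert restored == "1 3 1 7"
print(len(encoded_subset), len(compressed))
```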
For claim 7, Grilli in view of He and Nemirofsky discloses the method of claim 1, and further teaches creating a second vector that counts appearances of each string of the plurality of repeated strings in a log file of the plurality of log files (Grilli teaches a classification for messages and other information obtained from log information generated by computing resources, and a computing resource service provider provides computing resources to users and other entities, such that the users may use the classified message information to generate regular expressions or other information that may be used by the anomaly detection service or other systems to detect anomalous activity, and the anomaly detection subsystem may use a spatial cosine similarity algorithm to identify and/or classify messages, log entries, or other information included in the log information obtained from the computing resources, as Grilli teaches in col.2, lines 10-30 and col.4, lines 40-45).
For claim 8, Grilli in view of He and Nemirofsky discloses the method of claim 2, and further teaches wherein the detector includes a supervised machine learning algorithm that is trained with labelled log lines, including labelled malicious and benign behavior, to detect malicious behavior in other log lines (as Grilli teaches in col.2, lines 1-5 and lines 50-55).
For claim 9, Grilli in view of He and Nemirofsky discloses the method of claim 8, and further teaches wherein the supervised machine learning algorithm is a member of the following list: decision tree, neural network, and support vector machines (SVM) (as Grilli teaches in col.2, lines 50-55 and col.7, lines 15-25).
For claim 10, Grilli in view of He and Nemirofsky discloses the method of claim 2, and further teaches wherein the detector includes an unsupervised machine learning algorithm that is trained with unlabeled log lines to detect and distinguish anomaly behavior from normal behavior of other log lines (as Grilli teaches in col.2, lines 50-55, col.7, lines 15-25 and col.8, lines 45-53).
For claim 11, Grilli in view of He and Nemirofsky discloses the method of claim 10, and further teaches wherein the unsupervised machine learning algorithm is a member of the following list: one class support vector machines (SVM) or auto-encoder (as Grilli teaches in col.2, lines 50-55, col.7, lines 15-25 and col.8, lines 45-53).
For claim 12, Grilli in view of He and Nemirofsky discloses the method of claim 1, and further teaches wherein the log files include vehicular data (as Grilli teaches in col.14, lines 3-15).
For claim 13, Grilli in view of He and Nemirofsky discloses the method of claim 1, but fails to teach wherein the table is a hash table.
He further teaches wherein the table is a hash table (He, par.17). It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Grilli to include a hash table as taught and suggested by He in order to reduce some data logs by twenty-three percent on top of existing compression and to provide better reliability and lower maintenance costs (He, par.16).
For claim 14, Grilli in view of He and Nemirofsky discloses the method of claim 2, and further teaches that the vector is used to analyze a time period within a log file of the plurality of log files (as Grilli teaches in col.7, lines 20-30).
For claim 16, Grilli teaches an apparatus for compressing log files (reducing the size of messages), comprising at least one processor configured to execute a code, and a storage medium storing instructions that when executed (col.15, lines 55-62) cause the processor to: receive a plurality of line-based text log files generated by a plurality of devices (Grilli teaches that a computing resource service provider provides an anomaly detection service that obtains log information from a plurality of computing resources within a computing environment, as Grilli teaches in col.1, lines 50-65), input each line of each log file of the plurality of log files into a classification model, and receive a classification of the line as an output of the classification model (Grilli teaches a classification for messages and other information obtained from log information generated by computing resources, and a computing resource service provider provides computing resources to users and other entities, such that the users may use the classified message information to generate regular expressions or other information that may be used by the anomaly detection service or other systems to detect anomalous activity, and the anomaly detection subsystem may use a spatial cosine similarity algorithm to identify and/or classify messages, log entries, or other information included in the log information obtained from the computing resources, as Grilli teaches in col.2, lines 10-30 and col.4, lines 40-45), identify a plurality of repeated strings in the plurality of log files (Grilli teaches that the anomaly detection system provides a technical advantage by providing improved mechanisms to detect anomalies by at least measuring the similarity between messages and a virtual message. In one example, the virtual message is generated during a training phase and is representative of a majority or other percentage of messages observed by the anomaly detection service over an interval of time. During a detection phase the anomaly detection service may compare messages to the virtual message using a spatial cosine similarity algorithm described in greater detail below. In an example, the result of the spatial cosine similarity algorithm indicates an amount of similarity between the message and the virtual message. The anomaly detection server may then categorize or otherwise indicate a classification of the message, as Grilli teaches in col.2, lines 10-25); create a table mapping each string of the plurality of repeated strings to a unique value (Grilli teaches that if the result of the spatial cosine similarity algorithm is within a value relative to a threshold, the message may be classified as "known but still rare" or another classification indicating that the message includes information that has previously been analyzed by the anomaly detection service but is uncommon relative to other messages; for example, the result of the spatial cosine similarity algorithm may indicate that the message is common and may be disregarded by the system. In one example, the anomaly detection service maintains a global dictionary of tokens included in messages and frequencies associated with the tokens, and this information may be used to generate the virtual messages and determine the classification of messages, and the data includes several separate data tables, databases, data documents, dynamic data storage schemes and/or other data storage mechanisms and media for storing data relating to a particular aspect of the present disclosure, as Grilli teaches in col.2, lines 30-45); store the table in the storage medium (Grilli teaches generating vectors or other data formats that are suitable for analysis using the spatial cosine similarity algorithm, which may be utilized to measure the similarity between two or more vectors, and the training phase 228 may be repeated after a certain number of messages are processed during the detection and update phase 226. In various embodiments, the frequencies included in the global dictionary may be halved prior to generating the virtual message, as Grilli teaches in col.5, lines 60-68 and col.7, lines 15-30); create a vector encoding the unique values mapped to each string of the plurality of repeated strings, using the stored table; assign each encoded unique value in the vector a security relevance score according to the classification of the line including the unique value (Grilli teaches that if the two vectors overlap, the spatial cosine similarity value will be 1. A spatial cosine similarity value of −1 indicates that the vectors are directionally opposed. In such embodiments, however, the minimum possible spatial cosine similarity value is 0 because the frequencies of the tokens cannot be negative. Accordingly, a value of 1 means that the two messages have the same tokens while a value of 0 indicates that the two messages have no tokens in common, as Grilli teaches in col.4, lines 55-68 and col.6, lines 15-25); filter the encoded unique values in the vector according to the security relevance score of each unique value and remove filtered encoded values below a security relevance threshold (Grilli teaches that the anomaly detection subsystem may filter messages based at least in part on a severity level associated with the message, such that any message obtained from the log information with a severity level of 4 or higher is filtered so that the messages are not used by the anomaly detection service 110 or a component thereof, such as the anomaly detection subsystem, and the messages are encoded, as Grilli teaches in col.4, lines 60-68 and col.6, lines 30-40); select a subset of the filtered encoded unique values as a compressed representation of the plurality of log files, the subset selected based on the security relevance scores of the filtered encoded unique values (Grilli teaches that the anomaly detection subsystem may filter messages based at least in part on a severity level associated with the message, and the interface 220 obtains log information (e.g., messages generated by the computing resources 204) and provides the log information to a filter 222. As described in greater detail below, the filter 222 may reduce a number of messages (i.e., a form of compression) provided to the anomaly detection subsystem 206, such that log information 316.sub.k-i may be filtered, cleaned, tokenized, and/or otherwise modified or formatted for use by the anomaly detection subsystem, as Grilli teaches in col.4, lines 55-68, col.6, lines 15-25 and col.9, lines 20-43).
Grilli fails to teach the classification based on a first level of hierarchical clustering based on a device of the plurality of devices that generated the log file, and a second level of hierarchical clustering based on a similarity of strings within the plurality of log files and replace each string of the plurality of repeated strings with a corresponding unique value in each log file of the plurality of log files and permanently remove filtered encoded values below a security relevance threshold.
He teaches, in a similar system, the classification based on a first level of hierarchical clustering based on a device of the plurality of devices that generated the log file, and a second level of hierarchical clustering based on a similarity of strings within the plurality of log files (He teaches that log files may be organized as a tree structure or hierarchical structure of data having rows of similar data with different levels; a raw string index and a level 1 index are constructed for nodes of the logfile, then, in a second pass, key-value pairs are checked for correlation, and if key-value pairs are correlated, then a level 2 index is built based on the level 1 index for those correlated pairs in that node. This may be repeated for the nodes and all of the rows of the logfile 236. The resulting three layers of the index tree 232 (raw string, level 1, and level 2) can cover the whole logfile 236 and may be used to compress logfile 236, as He teaches in par.36 and 42), and replace each string of the plurality of repeated strings with a corresponding unique value in each log file of the plurality of log files (He teaches that keys or items that repeat more frequently (such as a frequently occurring key-value pair) may be replaced with a shorter expression, such as a unigram (e.g., "1"), while items appearing less frequently may be assigned longer expressions, as He teaches in par.45 and 46). It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Grilli to include two levels of hierarchical clustering and replacing each string with a unique value, as taught and suggested by He, in order to reduce some data logs by twenty-three percent on top of existing compression and to provide better reliability and lower maintenance costs (He, par.16).
Nemirofsky teaches permanently removing filtered encoded values below a security relevance threshold (lossy compression means permanent, irreversible loss of data: once a file is compressed using a lossy algorithm (such as JPEG, MP3, or AAC), the "removed" data is discarded forever, and the file cannot be restored to its exact original state; therefore, Nemirofsky teaches removing unnecessary or less important information, as Nemirofsky teaches in par.31 and 37). It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Grilli to include permanently removing filtered encoded values as taught and suggested by Nemirofsky, because increased optimization processing is required in order to find computationally feasible compression with acceptable space, encoding, and decoding times (Nemirofsky, par.35).
Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Grilli (US 10,805,327) in view of He et al. (US 2015/0317327), Kalevo et al. (US 2014/0164419), and Nemirofsky et al. (US 2020/0329233).
For claim 15, Grilli teaches a method for log files of data (abstract), comprising: receiving an encoded log file including a plurality of unique values (col.4, lines 50-68 and col.6, lines 30-40), wherein reconstructing the original log file further comprises: for each unique value in the encoded log file (Grilli teaches that the interface 220 or other component of the computing resource service provider 212, such as a stream service, collects, parses, and encodes messages obtained from the log information; in one example, the interface 220 encodes the messages in JavaScript Object Notation (JSON) format, as Grilli teaches in col.6, lines 25-40), retrieving a string corresponding to the unique value from a table stored in a storage medium (Grilli teaches a combination of devices capable of storing, accessing and retrieving data, which may include any combination and number of data servers, databases, data storage devices and data storage media, in any standard, distributed, virtual or clustered system, as Grilli teaches in col.14, lines 35-50); and inserting the retrieved parameters into the string (Grilli teaches that if the two vectors overlap, the spatial cosine similarity value will be 1. A spatial cosine similarity value of −1 indicates that the vectors are directionally opposed. In such embodiments, however, the minimum possible spatial cosine similarity value is 0 because the frequencies of the tokens cannot be negative. Accordingly, a value of 1 means that the two messages have the same tokens while a value of 0 indicates that the two messages have no tokens in common, as Grilli teaches in col.4, lines 55-68 and col.6, lines 15-25).
Grilli fails to teach the encoded log file being compressed using lossy compression; reconstructing, from data of the encoded log file, an original log file comprising lines of text prior to encoding, wherein reconstructing the original log file comprises replacing the unique value with the string; and replacing the unique value with the string.
Kalevo teaches, in a similar system, reconstructing, from the encoded log file, an original log file comprising lines of text prior to encoding, wherein reconstructing the original log file includes replacing the unique value with the string (par.102 and 138). It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Grilli to include decoding the encoded file and reconstructing an original line of the encoded file before encoding, as taught and suggested by Kalevo, for the purpose of generating corresponding encoded data for transmission or storage, which includes matching one or more portions of the source data to one or more elements in one or more databases, wherein the one or more elements are representative of corresponding one or more data blocks, and recording reference values which relate the one or more portions of the source data to the one or more matched elements (Kalevo, abstract). Grilli, as modified by Kalevo, does not explicitly teach replacing the unique value with the string.
Grilli and Kalevo fail to explicitly teach replacing the unique value with the string; however, He teaches, in a similar system, replacing the unique value with the string (He teaches that keys or items that repeat more frequently (such as a frequently occurring key-value pair) may be replaced with a shorter expression, such as a unigram (e.g., "1"), while items appearing less frequently may be assigned longer expressions, as He teaches in par.45 and 46). It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Grilli and Kalevo to include replacing the unique value with the string as taught and suggested by He in order to reduce some data logs by twenty-three percent on top of existing compression and to provide better reliability and lower maintenance costs (He, par.16).
Nemirofsky teaches the encoded log file being compressed using lossy compression (lossy compression, as Nemirofsky teaches in par.31 and 37). It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Grilli to include lossy compression as taught and suggested by Nemirofsky, because increased optimization processing is required in order to find computationally feasible compression with acceptable space, encoding, and decoding times (Nemirofsky, par.35).
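For illustration of the reconstruction flow discussed for claim 15 (hypothetical names; this is not the code of Grilli, He, Kalevo, or Nemirofsky), a minimal sketch that inverts a string-to-symbol table and re-inserts separately stored parameters to restore the original log lines:

```python
# Illustrative sketch only; names are hypothetical. Inverts a string-to-symbol
# table and re-inserts separately stored parameters to reconstruct log lines
# from their encoded form (the reverse of the encoding sketch shown earlier).
def decode(encoded_lines, table, parameters):
    inverse = {symbol: text for text, symbol in table.items()}
    restored = []
    for line, params in zip(encoded_lines, parameters):
        template = " ".join(inverse[sym] for sym in line.split())
        # Re-insert the extracted parameters (e.g., user names, node ids).
        restored.append(template.format(*params))
    return restored

table = {"login ok user {}": "1", "disk failure on {}": "2"}
print(decode(["1", "2"], table, [("alice",), ("node7",)]))
```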
Response to Amendments/Arguments
Applicant’s arguments with respect to claim(s) 1-16 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Applicant argues in page 3 of the remarks that the Office improperly combines Grilli and He by treating their unrelated mechanisms as directly interchangeable. However, the examiner respectfully disagrees with applicant, because Grilli teaches that the anomaly detection subsystem or other computer system performing the process filters the messages included in the log information. As described above, messages may be filtered based at least in part on information included in the message, such as a severity level. A filter, in various embodiments, is used to filter messages to reduce a number of messages or the size of messages processed by the anomaly detection subsystem, which serves as a means of compressing the log files. The secondary reference, He, teaches hierarchical index-based compression in which an index tree is built for compressing a logfile, which may be indexed using multiple levels of detail including a raw-string level for raw string representations of the node, a first level for indexing keys and common values, and a second level for indexing correlated key-value pairs. The index tree may be used to compress rows of the data log and also used to decompress and restore the log. Therefore, it would have been obvious to one of ordinary skill in the art to modify the system of Grilli, which reduces the size of the file, with the system of He, which performs compression using hierarchical index-based compression with different levels, for the purpose of preserving structural information in the data log, such as a global index, and providing better reliability and lower maintenance costs. Thus, the combination of Grilli with He provides a better system.
Applicant argues in page 3 of the remarks that neither Grilli nor He teaches using a machine learning model to generate security relevance scores, as recited in the amended claim 1 element of "inputting the classification of the plurality of lines into a trained machine learning (ML) model, and receiving as an output of the ML model a security relevance score for each unique value." However, the examiner respectfully disagrees with applicant, because Grilli teaches a classification for messages and other information obtained from log information generated by computing resources, such that the classified messages may be provided to various end points such as a system engineer or a machine learning algorithm, and a computing resource service provider provides computing resources to users and other entities, such that the users may use the classified message information to generate regular expressions or other information that may be used by the anomaly detection service or other systems to detect anomalous activity, and the anomaly detection subsystem may use a spatial cosine similarity algorithm to identify and/or classify messages, log entries, or other information included in the log information obtained from the computing resources. Therefore, Grilli uses computing resources, such as a machine learning system, for inputting the classification and outputting a security relevance score for each unique value.
Applicant argues in page 4 of the remarks that there is no teaching in He of performing a first level of hierarchical clustering of the plurality of log files, where line-based text log files are clustered together based on the device or component that generated a respective log file, and of performing one or more additional levels of hierarchical clustering of the plurality of log files to classify each line of a plurality of lines of the plurality of log files. However, the examiner respectfully disagrees with applicant, because He teaches hierarchical index-based compression in which an index tree is built for compressing a logfile, which may be indexed using multiple levels of detail including a raw-string level for raw string representations of the node, a first level for indexing keys and common values, and a second level for indexing correlated key-value pairs. The index tree may be used to compress rows of the data log and also used to decompress and restore the log. Therefore, He meets the claim limitations.
The applicant's arguments regarding the newly amended limitations in claims 1, 15, and 16 have been considered but are moot, because the examiner has applied new art, Nemirofsky et al. (US 2020/0329233), that covers the newly amended limitations.
Regarding the dependent claim arguments, said arguments are moot because the applied references are not considered to have the alleged deficiencies, and therefore are considered to properly show that for which they were cited.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AYUB A MAYE whose telephone number is (571)270-5037. The examiner can normally be reached Monday-Friday 9AM-5PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, SHEWAYE GELAGAY can be reached at 571-272-4219. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/AYUB A MAYE/
Examiner, Art Unit 2436

/SHEWAYE GELAGAY/
Supervisory Patent Examiner, Art Unit 2436