Last updated: May 29, 2026
Application No. 18/115,758
ERROR DEDUPLICATION AND REPORTING FOR A DATA MANAGEMENT SYSTEM BASED ON NATURAL LANGUAGE PROCESSING OF ERROR MESSAGES

Non-Final OA §103
Filed: Feb 28, 2023
Examiner: HICKS, SHIRLEY D.
Art Unit: 2168
Tech Center: 2100 (Computer Architecture & Software)
Assignee: Rubrik Inc.
OA Round: 5 (Non-Final)
Interview: Optional

Interview already conducted in this application's prosecution history. This examiner has a 63% grant rate with a +54.2% interview lift. Since an interview has already been tried, a written response with narrowed claims, based on precedent claim evolution patterns, is recommended.
Based on 109 resolved cases, 2023–2026
Examiner Intelligence

HICKS, SHIRLEY D.
Career Allowance Rate: 63% of resolved cases granted (69 granted / 109 resolved), +8.3% vs TC avg
Interview Lift: +54.2% (resolved cases with interview)
Avg Prosecution (typical timeline): 2y 10m
Currently pending: 25
Total Applications (career history, across all art units): 145
Statute-Specific Performance

§103: 74.3% (+34.3% vs TC avg)
§102: 25.5% (-14.5% vs TC avg)
§112: 0.2% (-39.8% vs TC avg)
Based on career data from 109 resolved cases
Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
2.	A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 2/17/2026 has been entered.
Accordingly, claims 1, 3-16, 18-21, and 23 are pending in this application. Claims 1, 4, 7, 10, 16, 19, and 20 are currently amended.

Response to Arguments
Applicant’s arguments with respect to amended pending claims filed on 2/17/2026 have been fully considered but they are not persuasive. In view of the claim amendments, the rejections are being updated accordingly.
With regard to independent claim 1, Applicant argued that the cited reference Goldberg does not teach the amended limitations of “identifying a set of unique errors associated with the set of error logs based at least in part on comparison of the set of natural language error strings generated by the natural language processing,” and the “corresponding metadata that is extracted based at least in part on performing the natural language processing.”
In response to the arguments, it is submitted the cited limitations are being properly addressed by Goldberg based at least on Goldberg disclosing the following:
Goldberg discloses corresponding metadata that is extracted based at least in part on performing the natural language processing in Figs. 3-4; [0034]-[0038], which state, “the log parser includes log parsing rules to extract and format lines of the log message into log message fields described below… For example, the time stamp 304, thread 306, and IP address 308 arguments of the log write instruction 302 are assigned corresponding numerical parameters 404, 406, and 408 in the log message 402.” The instant specification describes examples of extracted metadata type information as, e.g., account, timestamp, pod ID, etc. Goldberg teaches the timestamp in paragraph [0062], which states, “FIGS. 11A-11B show… assigning the character string 2019-07-31T10:13:03.1926 1108 in the log message 1104 to the variable identifier timestamp_iso8601 1110.” Thus, for at least the reasons set forth above, it is submitted that the amended limitations are properly addressed.
Applicant also argued that Goldberg does not teach or suggest "the set of natural language error strings generated by the natural language processing" for at least the reason that Goldberg does not teach "generating a set of natural language error strings ..., based at least in part on performing natural language processing on the set of error logs [that] . . . extracts the corresponding metadata from the set of error logs," as in amended independent claim 1. As stated above, Goldberg teaches the extraction of the time stamp, which is an example of the corresponding metadata from the set of error logs. Thus, for at least the reasons as set forth above, it is submitted that the amended limitations are properly addressed.
Additionally, Applicant argues that Jha, Goldberg, Yang, and Cohen, alone or in combination, do not teach or suggest “wherein performing the natural language processing on the set of error logs extracts the corresponding metadata from the set of error logs.” However, each reference teaches that a time stamp is extracted from log entries. According to the specification, the timestamp corresponds to the corresponding metadata extracted from the set of error logs. See at least paragraphs [0012]: “remove and extract metadata type information (e.g., account, timestamp, pod ID, etc.)”, [0025]: “a snapshot 135 may include metadata that defines a state of the computing object as of a particular point in time”, [0036]: “metadata associated with the error (e.g., account, timestamp, pod ID, etc.)”, and [0040]: “remove and extract metadata type information (e.g., account, timestamp, pod ID, etc.).” Thus, for at least the reasons set forth above, it is submitted that the amended limitations are properly addressed.
With regard to independent claims 16 and 20 and dependent claims 4 and 19, the emphasized limitations that Applicant argues in claims 16 and 20 are similar to the emphasized limitations of claim 1, which have been addressed above. See the response to claim 1 above for explanation.
Furthermore, it is also submitted that all limitations in pending claims, including those not specifically argued, are properly addressed. The reason is set forth in the rejections. See claim analysis below for detail.
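
For illustration only, the kind of timestamp extraction discussed in the response above can be sketched as a pattern with named groups, where each group name plays the role of a variable identifier such as timestamp_iso8601. This is a minimal sketch under stated assumptions, not the parser of any cited reference or of the instant application: the pattern, the function name split_log_line, the "level" field, and the sample log line are all hypothetical. Only the timestamp string and the identifier name come from the quoted passage.

```python
import re

# Hypothetical Grok-like pattern; the named groups act as variable identifiers.
LOG_PATTERN = re.compile(
    r"(?P<timestamp_iso8601>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?)\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"(?P<message>.*)"
)


def split_log_line(line: str) -> tuple[str, dict[str, str]]:
    """Return (message text, extracted metadata) for one log line."""
    match = LOG_PATTERN.match(line)
    if match is None:
        # Lines that do not fit the pattern keep their text and yield no metadata.
        return line.strip(), {}
    fields = match.groupdict()
    message = fields.pop("message").strip()
    return message, fields


# Hypothetical log line built around the timestamp quoted in the response.
message, metadata = split_log_line(
    "2019-07-31T10:13:03.1926 ERROR Repair session failed for pod"
)
```

Here the message text and the metadata come out of a single parsing pass. Whether such pattern matching amounts to "natural language processing" is the point in dispute, and the sketch takes no position on it.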





Claim Rejections - 35 USC § 103
4.	In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
5.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

6.	Claims 1, 3-16, 18-21, and 23 are rejected under 35 U.S.C. 103 as being unpatentable over Jha et al. (US 20230128244 A1) in view of Goldberg et al. (US 20180218004 A1), Yang (US 20140164376 A1), and Cohen et al. (US 20110185234 A1).


7.	Regarding Claim 1, Jha discloses a method, comprising: 
receiving, from a first database, a set of error logs associated with a data management system (Fig. 7; [0043]: In response to receiving a request via the GUI 704, the log management server 702 queries the log message database 706 for log messages… The log DBMS responds to the request by reading a representative log message from each class of log messages with time stamps in the user-selected time interval from the application log files stored on the data-storage device); 
generating a set of natural language error strings and corresponding metadata for the set of error logs based at least in part on performing natural language processing on the set of error logs (Fig. 4; [0038]: The text strings and natural-language words and phrases of the log write instruction 302 also appear unchanged in the log message 402 and are used to describe the type of event (e.g., informative, warning, error, or fatal) that occurred during execution of the event source; Fig. 7; [0043]: The log management server 702 performs log message curation, as described below, on the representative log messages to obtain corresponding curated text statements; [0074]: the log management server uses a natural language processor (“NLP”) engine), 
wherein performing the natural language processing on the set of error logs extracts the corresponding metadata from the set of error logs (Figs. 3-4; [0034]-[0038]: the log parser includes log parsing rules to extract and format lines of the log message into log message fields described below… . For example, the time stamp 304, thread 306, and IP address 308 arguments of the log write instruction 302 are assigned corresponding numerical parameters 404, 406, and 408 in the log message 402 [extracted metadata type information (e.g., account, timestamp, pod ID, etc.)]; Fig. 7; [0043]-[0045]: A curated text statement of a log message are character strings extracted from the log message by the log management server 702… A Grok expression… is used by the log management server 702 to extract character strings (e.g., words, terms, and alphanumeric character strings) and parameters from log messages; [0062]: FIGS. 11A-11B show… assigning the character string 2019-07-31T10:13:03.1926 1108 in the log message 1104 to the variable identifier timestamp_iso8601 1110 [metadata type information (e.g., account, timestamp, pod ID, etc.)]);
storing the set of natural language error strings and corresponding extracted metadata for the set of error logs in a second database (Fig. 7; [0043]: The log management server 702… stores the curated text statements in a curated text statements database 710); 
However, Jha does not explicitly teach “identifying a set of unique errors associated with the set of error logs based at least in part on comparison of the set of natural language error strings generated by the natural language processing, and the corresponding metadata that is extracted based at least in part on performing the natural language processing, wherein identifying the set of unique errors comprises: determining whether two or more error logs of the set of error logs correspond to a single unique error of the set of unique errors based at least in part on determining whether two or more natural language error strings generated by performance of the natural language processing on the two or more error logs satisfy a similarity threshold; and generating an error report based on the set of natural language error strings and corresponding extracted metadata stored in the second database, wherein the error report identifies the set of unique errors.”
On the other hand, in the same field of endeavor, Goldberg teaches 
identifying a set of unique errors associated with the set of error logs based at least in part on comparison of the set of natural language error strings generated by the natural language processing (Figs. 1-3; [0017]-[0018]: The log scanning component 135 determines whether log entries are duplicates of one another by comparing the machine-encoded text 127 of the entries. For example, the log scanning component 135 can detect duplicates when log entries each contain some or all of the same words; [0024]-[0028]: The text processing device 232 includes a log scanning component 135 and a log analysis component 140, which monitor, deduplicate, analyze, and flag system logs… the log scanning component can locate redundant copies by comparing the machine-encoded text 127 of multiple log entries, and locating matching portions… all redundant copies can be removed [natural language error strings generated by the natural language processing corresponds to machine-encoded text 127]),
and the corresponding metadata that is extracted based at least in part on performing the natural language processing ([0011]: System log entries can be records of system events, changes, operations, etc. The log entries can contain information about a problem with the system, and analyzing the logs can help a user or technical support group determine the best solution to the problem [log entries must contain corresponding metadata, i.e. timestamps, information representing the date and time of the problem with the system]; [0030]: an email sent when a log entry is flagged because multiple copies of the log entry have been detected in a designated time span [based on extracted metadata, i.e. the timestamp]; [0036]: text processing component 130 include data in addition to instructions or statements).
Additionally, Yang teaches wherein identifying the set of unique errors comprises:
determining whether two or more error logs of the set of error logs correspond to a single unique error of the set of unique errors ([0016]-[0017]: Referring initially to FIG. 1, cluster system 100 is illustrated. The cluster system receives a set of strings as input… the string can correspond to an event message from a diagnostic log… the preprocess component can be configured to filter out duplicate messages such that the resulting output are unique strings) based at least in part on determining whether two or more natural language error strings generated by performance of the natural language processing on the two or more error logs satisfy a similarity threshold (Figs. 5-7; [0040]-[0044]: Referring to FIG. 5… At reference numeral 510, a set of strings is assigned to one of a plurality of clusters based on similarity… the method continues at reference 740, where a determination is made concerning whether the pattern length is less than or equal to a threshold length); and
Furthermore, Cohen teaches generating an error report based on the set of natural language error strings and corresponding extracted metadata stored in the second database, wherein the error report identifies the set of unique errors (Fig. 1a; [0024]: An event 104 typically has a timestamp 105; [0036]: For the following purposes, each log event, e, will be denoted by a tuple (t,msg), where t is the timestamp 105 of the message; [0071]: The table in FIG. 8 summarizes the results of running each of the datasets through the template generator module 412. For every data set the table shows the timeframe of the messages in the log, how many messages were processed (number of messages), how many distinct messages were in the logs (number of unique messages) [The timestamps correspond to the corresponding extracted metadata]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method of Jha to incorporate the teachings of Goldberg, Yang, and Cohen to include identifying a set of unique errors associated with the set of error logs and generating an error report.
The motivation for doing so would be to remove redundant copies of log entries, as recognized by Goldberg ([0018] of Goldberg: In some embodiments, all redundant copies of log entries are removed during deduplication), analyze clusters based on string similarity, as recognized by Yang (Abstract of Yang: clusters can be analyzed based on the similarity or difference of strings in a cluster), and generate an error report based on the set of natural language error strings ([0012] of Cohen: FIG. 8 is a table showing results of applying an embodiment of the invention to various source computer system log files).
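
For orientation, the deduplication and reporting steps recited in claim 1 (grouping error strings that satisfy a similarity threshold, then reporting the unique errors with their metadata) can be sketched as below. This is a minimal sketch under stated assumptions and does not reproduce the method of Jha, Goldberg, Yang, Cohen, or the instant application. The similarity measure (difflib's SequenceMatcher ratio), the 0.8 threshold, the greedy grouping strategy, the names UniqueError, identify_unique_errors, and build_report, and the sample records are all hypothetical.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher


@dataclass
class UniqueError:
    """One unique error: a representative string plus metadata of its instances."""
    representative: str
    timestamps: list[str] = field(default_factory=list)

    @property
    def count(self) -> int:
        return len(self.timestamps)


def identify_unique_errors(records, threshold=0.8):
    """Greedily group (error_string, timestamp) records by string similarity."""
    groups: list[UniqueError] = []
    for text, timestamp in records:
        for group in groups:
            # Assumed similarity test: ratio in [0, 1] compared to the threshold.
            if SequenceMatcher(None, group.representative, text).ratio() >= threshold:
                group.timestamps.append(timestamp)
                break
        else:
            # No existing group is similar enough, so this is a new unique error.
            groups.append(UniqueError(text, [timestamp]))
    return groups


def build_report(groups):
    """One report line per unique error, with its instance count and first timestamp."""
    return [
        f"{g.count}x {g.representative} (first seen {min(g.timestamps)})"
        for g in groups
    ]


# Hypothetical records: error strings with ISO-8601 timestamps as metadata.
records = [
    ("snapshot upload failed for object", "2023-02-01T10:00:00"),
    ("snapshot upload failed for objects", "2023-02-01T10:05:00"),
    ("disk quota exceeded on node", "2023-02-01T10:07:00"),
]
groups = identify_unique_errors(records)
report = build_report(groups)
```

The first two records differ by one character and fall into the same group, while the third forms its own group. Taking the earliest timestamp with min works here only because ISO-8601 strings of equal format sort chronologically.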



8.	Regarding Claim 3, the combined teachings of Jha, Goldberg, and Cohen disclose the method of claim 1.
Jha further teaches wherein generating the error report comprises: generating the error report indicating a quantity of instances each unique error occurred in the set of error logs ([0044]: FIG. 8A shows an example GUI 802… The GUI 802 also includes a window 808 that displays examples of different classes of log messages. A representative log message of each class is displayed along with a count of the number of log messages in each class… FIG. 20 includes a window 2006 that displays a count of log messages represented by selected curated text statements within the user-selected time window).

9.	Regarding Claim 4, the combined teachings of Jha, Goldberg, and Cohen disclose the method of claim 1.
Yang further teaches, further comprising: identifying that two or more natural language error strings in different error logs of the set of error logs correspond to a first same error based on a quantity of characters that are the same between the two or more natural language error strings satisfying a threshold (Figs. 5-7; [0040]-[0044]: Referring to FIG. 5… At reference numeral 510, a set of strings is assigned to one of a plurality of clusters based on similarity… the method continues at reference 740, where a determination is made concerning whether the pattern length is less than or equal to a threshold length).

Regarding Claim 5, the combined teachings of Jha, Goldberg, and Cohen disclose the method of claim 1.
Goldberg further teaches wherein generating the error report comprises: generating the error report based on a duration since generation of a previous error report satisfying a threshold (Fig. 3; [0018]: If the log scanning component 135 records that a log entry has a number of redundant copies that surpasses a threshold number of copies over a designated time span, this log entry may be considered significant or in some way indicative of a problem with the system).

Regarding Claim 6, the combined teachings of Jha, Goldberg, and Cohen disclose the method of claim 5.
Goldberg further teaches, further comprising: receiving, from a user interface view associated with an administrator of the data management system, an indication of the threshold (Fig. 3; [0029]-[0030]: In some cases, there can be a threshold number of log entries for a period of time. For example, a threshold can be set so that, if 50 or more copies of a log entry are produced in 30 minutes or less, the log entry is flagged… The user is alerted to flagged log entries in operation 350).
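
The threshold example quoted from Goldberg above (flag a log entry if 50 or more copies are produced in 30 minutes or less) can be illustrated with a sliding time window. This is a minimal sketch assuming the copies of one log entry are available as a list of datetime values; the function name should_flag and its parameters are hypothetical, and nothing here is taken from Goldberg's actual implementation.

```python
from collections import deque
from datetime import datetime, timedelta


def should_flag(timestamps, min_copies=50, window=timedelta(minutes=30)):
    """Return True if some span of at most `window` holds at least `min_copies` copies."""
    recent = deque()
    for ts in sorted(timestamps):
        recent.append(ts)
        # Drop copies that fall outside the window ending at the current copy.
        while ts - recent[0] > window:
            recent.popleft()
        if len(recent) >= min_copies:
            return True
    return False


start = datetime(2023, 2, 28, 12, 0)
# Hypothetical burst: 50 copies, one every 30 seconds (24.5 minutes in total).
burst = [start + timedelta(seconds=30 * i) for i in range(50)]
# Hypothetical slow trickle: 50 copies, one per minute (49 minutes in total).
trickle = [start + timedelta(minutes=i) for i in range(50)]
```

Under these assumptions the burst is flagged and the trickle is not, since no 30-minute span of the trickle contains more than 31 copies. The two default arguments correspond to the values a user could supply through an interface.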

Regarding Claim 7, the combined teachings of Jha, Goldberg, and Cohen disclose the method of claim 5.
Goldberg further teaches wherein generating the error report comprises: 
generating the error report based on additional natural language error strings and additional corresponding extracted metadata associated with additional sets of error logs stored in the second database since the previous error report ([0009]: Text in the photographic images of the logs can be converted to machine-encoded text, stored, and analyzed in the same device or in another device. The system logs can additionally be deduplicated, and displayed to a user.), 
wherein sets of error logs are periodically received (Fig. 3; [0027]: Additionally, the log scanning component 135 can record the number of copies of a log entry, as well as the time period in which the copies appeared), and wherein the natural language processing is periodically performed on the sets of error logs that are periodically received to generate sets of natural language error strings and corresponding extracted metadata for the set of error logs that are periodically received (Fig. 3; [0029]: The log analysis component 140 can flag log entries that have been repeated in great quantities or over long time spans, as was discussed with respect to FIG. 1. These quantities and time spans can be designated by user-input and/or preprogrammed settings. In some cases, there can be a threshold number of log entries for a period of time), and wherein the sets of natural language error strings and corresponding extracted metadata for the set of error logs that are periodically received are periodically stored in the second database (Fig. 3; [0009]: Text in the photographic images of the logs can be converted to machine-encoded text, stored, and analyzed in the same device or in another device).

Regarding Claim 8, the combined teachings of Jha, Goldberg, and Cohen disclose the method of claim 1.
Goldberg further teaches wherein generating the error report comprises: generating the error report based on a quantity of instances of a first same error occurring in different error logs of the set of error logs satisfying a threshold (Figs. 1-3; [0018]: For example, a user may specify that log entries should be deduplicated when the number of log entries surpasses a threshold number. The portion size can be any number of system log copies, and the number can be preprogrammed or specified by the user. Additionally, instructions could specify that log entries recorded at particular times be deduplicated).

Regarding Claim 9, the combined teachings of Jha, Goldberg, and Cohen disclose the method of claim 8.
Goldberg further teaches, further comprising: receiving, from a user interface view associated with an administrator of the data management system, an indication of the threshold ([0018]: For example, a user may specify that log entries should be deduplicated when the number of log entries surpasses a threshold number. The portion size can be any number of system log copies, and the number can be preprogrammed or specified by the user. Additionally, instructions could specify that log entries recorded at particular times be deduplicated; [0029]: For example, a threshold can be set so that, if 50 or more copies of a log entry are produced in 30 minutes or less, the log entry is flagged; Fig. 1; [0010]: Environment 100 includes a console display 105).

Regarding Claim 10, the combined teachings of Jha, Goldberg, and Cohen disclose the method of claim 1.
Jha further teaches wherein generating the error report comprises: generating the error report based on the corresponding extracted metadata satisfying a triggering condition, the triggering condition comprising an account identifier, an object identifier, a job type, or a customer identifier (Fig. 3; [0037]: The example log write instruction 302 also includes text strings and natural-language words and phrases that identify the level of importance of the log message 310 and type of event that triggered the log write instruction, such as “Repair session” argument 312).

Regarding Claim 11, the combined teachings of Jha, Goldberg, and Cohen disclose the method of claim 1.
Jha further teaches wherein performing the natural language processing on the set of error logs comprises: removing punctuation in error strings of the set of error logs, removing stop words in the error strings of the set of error logs, removing numerals in error strings of the set of error logs, removing uniform resource locators in the error strings of the set of error logs, removing geographic information in the error strings of the set of error logs ([0028]: FIG. 22 is a flow diagram illustrating an example implementation of the “filter unacceptable character strings from the log messages to obtain curated text statements based on the Grok expressions and acceptable character strings” procedure performed in FIG. 21), 
performing tokenization on the error strings of the set of error logs, performing stemming on the error strings of the set of error logs ([0019]: FIG. 13 shows an example of tokens formed from character strings of a log message and corresponding Grok patterns of a Grok expression for the log message), 
performing lemmatization on the error strings of the set of error logs, or a combination thereof (Figs. 22-25; [0028]: FIG. 22 is a flow diagram illustrating an example implementation of the “filter unacceptable character strings from the log messages to obtain curated text statements based on the Grok expressions and acceptable character strings” procedure performed in FIG. 21.; Fig. 7; [0045]: A Grok expression is… used by the log management server 702 to extract character strings (e.g., words, terms, and alphanumeric character strings) and parameters from log messages that match the format of the Grok expression).
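
Several of the preprocessing operations recited in claim 11 (removing uniform resource locators, punctuation, numerals, and stop words, and performing tokenization) can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not the curation procedure of Jha or the processing of the instant application: the stop-word list, the URL pattern, the function name normalize_error_string, and the sample error string are hypothetical. Stemming, lemmatization, and removal of geographic information are omitted because they would normally rely on a third-party NLP library.

```python
import re
import string

# Hypothetical stop-word list; a claimed system could receive this from an administrator.
STOP_WORDS = frozenset({"a", "an", "the", "for", "of", "on", "to", "in", "at"})
URL_RE = re.compile(r"https?://\S+")
# Map every ASCII punctuation character to a space.
PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def normalize_error_string(raw: str) -> list[str]:
    """Lowercase, strip URLs, punctuation, and numerals, tokenize, drop stop words."""
    text = URL_RE.sub(" ", raw.lower())   # URLs go first, while "://" is still intact
    text = text.translate(PUNCT_TABLE)    # punctuation becomes whitespace
    text = re.sub(r"\d+", " ", text)      # numerals
    tokens = text.split()                 # whitespace tokenization
    return [token for token in tokens if token not in STOP_WORDS]


# Hypothetical error string.
tokens = normalize_error_string(
    "Failed to upload snapshot 4821 for the account at https://example.com/api/v1: timeout!"
)
```

The order of operations matters in this sketch: stripping punctuation before URLs would break a URL into ordinary-looking words that survive the later steps.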

Regarding Claim 12, the combined teachings of Jha, Goldberg, and Cohen disclose the method of claim 11.
Jha further teaches further comprising: receiving, from a user interface view associated with an administrator of the data management system, an indication of the stop words (Fig. 6A; [0041]: The physical data center 604 comprises physical objects, including an administration computer system 608, any of various computers, such as PC 610, on which a virtual-data-center (“VDC”) management interface may be displayed to system administrators and other users; [0085]: FIG. 23 is a flow diagram illustrating an example implementation of the “filter disallowed character strings from the log message based on Grok patterns of the Grok expression” procedure performed in block 2202).

Regarding Claim 13, the combined teachings of Jha, Goldberg, and Cohen disclose the method of claim 1.
Jha further teaches further comprising: presenting, at user interface view associated with an administrator of the data management system, the error report ([0083]: In block 2105, the curated text statements output in block 2104 are displayed in a GUI as described above with reference to FIG. 20).

Regarding Claim 14, the combined teachings of Jha, Goldberg, and Cohen disclose the method of claim 1. Jha further teaches wherein the first database and the second database comprise a same database ([0082]: The methods described below with reference to FIGS. 21-25 are stored in one or more data-storage devices as machine-readable instructions and are executed by one or more processors of a computer system, such as the computer system shown in FIG. 26. See also paras [0088]-[0089]).

Regarding Claim 15, the combined teachings of Jha, Goldberg, and Cohen disclose the method of claim 1.
Jha further teaches wherein the first database is different from the second database (Fig. 7; [0043]: In response to receiving a request via the GUI 704, the log management server 702 queries the log message database 706… The curated text database 710 includes a curated text DBMS and one or more data-storage devices; [0082]: The methods described below with reference to FIGS. 21-25 are stored in one or more data-storage devices as machine-readable instructions and are executed by one or more processors of a computer system, such as the computer system shown in FIG. 26).

Regarding Claim 16, Jha discloses an apparatus, comprising: 
a processor; memory coupled with the processor; and instructions stored in the memory and executable by the processor to cause the apparatus to: 
receive, from a first database, a set of error logs associated with a data management system (Fig. 7; [0043]: In response to receiving a request via the GUI 704, the log management server 702 queries the log message database 706 for log messages… The log DBMS responds to the request by reading a representative log message from each class of log messages with time stamps in the user-selected time interval from the application log files stored on the data-storage device); 
generate a set of natural language error strings and corresponding metadata for the set of error logs based at least in part on performing natural language processing on the set of error logs (Fig. 4; [0038]: The text strings and natural-language words and phrases of the log write instruction 302 also appear unchanged in the log message 402 and are used to describe the type of event (e.g., informative, warning, error, or fatal) that occurred during execution of the event source; Fig. 7; [0043]: The log management server 702 performs log message curation, as described below, on the representative log messages to obtain corresponding curated text statements; [0074]: the log management server uses a natural language processor (“NLP”) engine); 
wherein the performance of the natural language processing on the set of error logs extracts the corresponding metadata from the set of error logs (Figs. 3-4; [0034]-[0038]: the log parser includes log parsing rules to extract and format lines of the log message into log message fields described below… . For example, the time stamp 304, thread 306, and IP address 308 arguments of the log write instruction 302 are assigned corresponding numerical parameters 404, 406, and 408 in the log message 402 [extracted metadata type information (e.g., account, timestamp, pod ID, etc.)]; Fig. 7; [0043]-[0045]: A curated text statement of a log message are character strings extracted from the log message by the log management server 702… A Grok expression… is used by the log management server 702 to extract character strings (e.g., words, terms, and alphanumeric character strings) and parameters from log messages; [0062]: FIGS. 11A-11B show… assigning the character string 2019-07-31T10:13:03.1926 1108 in the log message 1104 to the variable identifier timestamp_iso8601 1110 [metadata type information (e.g., account, timestamp, pod ID, etc.)]); 
store the set of natural language error strings and corresponding extracted metadata for the set of error logs in a second database (Fig. 7; [0043]: The log management server 702… stores the curated text statements in a curated text statements database 710); 
However, Jha does not explicitly teach “identify a set of unique errors associated with the set of error logs based at least in part on comparison of the set of natural language error strings generated by the natural language processing, and the corresponding metadata that is extracted based at least in part on performing the natural language processing, wherein identifying the set of unique errors comprises: determining whether two or more error logs of the set of error logs correspond to a single unique error of the set of unique errors based at least in part on determining whether two or more natural language error strings generated by performance of the natural language processing on the two or more error logs satisfy a similarity threshold; and generating an error report based on the set of natural language error strings and corresponding extracted metadata stored in the second database, wherein the error report identifies the set of unique errors.”
On the other hand, in the same field of endeavor, Goldberg teaches 
identify a set of unique errors associated with the set of error logs based at least in part on comparison of the set of natural language error strings generated by the natural language processing (Figs. 1-3; [0017]-[0018]: The log scanning component 135 determines whether log entries are duplicates of one another by comparing the machine-encoded text 127 of the entries. For example, the log scanning component 135 can detect duplicates when log entries each contain some or all of the same words; [0024]-[0028]: The text processing device 232 includes a log scanning component 135 and a log analysis component 140, which monitor, deduplicate, analyze, and flag system logs… the log scanning component can locate redundant copies by comparing the machine-encoded text 127 of multiple log entries, and locating matching portions… all redundant copies can be removed [natural language error strings generated by the natural language processing corresponds to machine-encoded text]),
and the corresponding metadata that is extracted based at least in part on performing the natural language processing ([0011]: System log entries can be records of system events, changes, operations, etc. The log entries can contain information about a problem with the system, and analyzing the logs can help a user or technical support group determine the best solution to the problem [log entries must contain corresponding metadata, i.e. timestamps, information representing the date and time of the problem with the system]; [0030]: an email sent when a log entry is flagged because multiple copies of the log entry have been detected in a designated time span [based on extracted metadata, i.e. the timestamp]; [0036]: text processing component 130 include data in addition to instructions or statements).
Additionally, Yang teaches wherein identifying the set of unique errors comprises:
determining whether two or more error logs of the set of error logs correspond to a single unique error of the set of unique errors ([0016]-[0017]: Referring initially to FIG. 1, cluster system 100 is illustrated. The cluster system receives a set of strings as input… the string can correspond to an event message from a diagnostic log… the preprocess component can be configured to filter out duplicate messages such that the resulting output are unique strings) based at least in part on determining whether two or more natural language error strings generated by performance of the natural language processing on the two or more error logs satisfy a similarity threshold (Figs. 5-7; [0040]-[0044]: Referring to FIG. 5… At reference numeral 510, a set of strings is assigned to one of a plurality of clusters based on similarity… the method continues at reference 740, where a determination is made concerning whether the pattern length is less than or equal to a threshold length); and
Furthermore, Cohen teaches generate an error report based on the set of natural language error strings and corresponding extracted metadata stored in the second database, wherein the error report identifies the set of unique errors (Fig. 1a; [0024]: An event 104 typically has a timestamp 105; [0036]: For the following purposes, each log event, e, will be denoted by a tuple (t,msg), where t is the timestamp 105 of the message; [0071]: The table in FIG. 8 summarizes the results of running each of the datasets through the template generator module 412. For every data set the table shows the timeframe of the messages in the log, how many messages were processed (number of messages), how many distinct messages were in the logs (number of unique messages) [The timestamps correspond to the corresponding extracted metadata]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method of Jha to incorporate the teachings of Goldberg, Yang, and Cohen to include identifying a set of unique errors associated with the set of error logs and generating an error report.
The motivation for doing so would be to remove redundant copies of log entries, as recognized by Goldberg ([0018] of Goldberg: In some embodiments, all redundant copies of log entries are removed during deduplication), analyze clusters based on string similarity, as recognized by Yang (Abstract of Yang: clusters can be analyzed based on the similarity or difference of strings in a cluster), and generate an error report based on the set of natural language error strings ([0012] of Cohen: FIG. 8 is a table showing results of applying an embodiment of the invention to various source computer system log files).

Regarding Claim 18, the combined teachings of Jha, Goldberg, and Cohen disclose the apparatus of claim 16.
Jha further teaches wherein the instructions to generate the error report are executable by the at least one processor to cause the apparatus to: generate the error report indicating a quantity of instances each unique error occurred in the set of error logs ([0076]: FIGS. 17A-17C show an example of the process for forming a set of curated text in FIG. 16 applied to three character strings of the log message 1402 in FIG. 14 that satisfy the condition in Equation (1)… [0081]: FIG. 20 shows an example GUI 2002 that displays the curated text statements in FIG. 19… In this example, a user has clicked on boxes 2008-2010 and counts of the corresponding log messages generated by event sources of the application at different time stamps are plotted in the window 2006).
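
As a rough illustration of the claim 18 limitation (a report indicating how many times each unique error occurred), the following Python sketch counts occurrences per unique string. It is a hypothetical example, not taken from Jha or from the application; the function name `build_error_report` and the whitespace and case normalization are assumptions made only for illustration.

```python
from collections import Counter


def build_error_report(error_strings):
    """Return (message, count) rows, most frequent first.

    Hypothetical helper: strings are treated as the same error when they
    match after lowercasing and collapsing runs of whitespace.
    """
    normalized = [" ".join(s.lower().split()) for s in error_strings]
    counts = Counter(normalized)
    # most_common() sorts by descending count.
    return counts.most_common()


report = build_error_report([
    "Disk quota exceeded",
    "disk quota  exceeded",
    "Connection timed out",
])
for message, count in report:
    print(f"{count:>3}  {message}")
```

Here the first two inputs collapse into one unique error with a count of two, which is the kind of per-error tally the limitation describes.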

Regarding Claim 19, the combined teachings of Jha, Goldberg, and Cohen disclose the apparatus of claim 16.
Yang further teaches wherein the instructions are further executable by the at least one processor to cause the apparatus to: identify that two or more natural language error strings correspond to a same error based on a quantity of characters that are the same between the two or more natural language error strings satisfying a threshold (Figs. 5-7; [0040]-[0044]: Referring to FIG. 5… At reference numeral 510, a set of strings is assigned to one of a plurality of clusters based on similarity… the method continues at reference 740, where a determination is made concerning whether the pattern length is less than or equal to a threshold length).
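
To make the claim 19 limitation concrete (two strings treated as the same error when the quantity of matching characters satisfies a threshold), a minimal Python sketch follows. It is an assumption-laden illustration, not Yang's clustering method and not the applicant's implementation: it uses the standard-library `difflib.SequenceMatcher` to count matching characters, and the names `matching_chars`, `same_error`, and the 0.8 default fraction are hypothetical.

```python
from difflib import SequenceMatcher


def matching_chars(a, b):
    """Count characters covered by the matching blocks difflib finds."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return sum(block.size for block in matcher.get_matching_blocks())


def same_error(a, b, min_fraction=0.8):
    """Hypothetical rule: same error if enough characters match.

    The matched-character count is compared against a fraction of the
    longer string's length; 0.8 is an arbitrary illustrative value.
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return True  # two empty strings are trivially the same
    return matching_chars(a, b) / longest >= min_fraction


print(same_error("failed to mount volume vol-17",
                 "failed to mount volume vol-42"))
print(same_error("failed to mount volume vol-17",
                 "connection timed out"))
```

A character-count test of this kind differs from the passage of Yang cited above, which compares a pattern length against a threshold length; the sketch only shows what the claimed comparison could look like.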

Regarding Claim 20, Jha discloses a non-transitory computer-readable medium storing code, the code comprising instructions executable by a processor to: 
receive, from a first database, a set of error logs associated with a data management system (Fig. 7; [0043]: In response to receiving a request via the GUI 704, the log management server 702 queries the log message database 706 for log messages… The log DBMS responds to the request by reading a representative log message from each class of log messages with time stamps in the user-selected time interval from the application log files stored on the data-storage device); 
generate a set of natural language error strings and corresponding metadata for the set of error logs based at least in part on performing natural language processing on the set of error logs (Fig. 4; [0038]: The text strings and natural-language words and phrases of the log write instruction 302 also appear unchanged in the log message 402 and are used to describe the type of event (e.g., informative, warning, error, or fatal) that occurred during execution of the event source; Fig. 7; [0043]: The log management server 702 performs log message curation, as described below, on the representative log messages to obtain corresponding curated text statements; [0074]: the log management server uses a natural language processor (“NLP”) engine),
wherein performing the natural language processing on the set of error logs extracts the corresponding metadata from the set of error logs (Figs. 3-4; [0034]-[0038]: the log parser includes log parsing rules to extract and format lines of the log message into log message fields described below… For example, the time stamp 304, thread 306, and IP address 308 arguments of the log write instruction 302 are assigned corresponding numerical parameters 404, 406, and 408 in the log message 402 [extracted metadata type information (e.g., account, timestamp, pod ID, etc.)]; Fig. 7; [0043]-[0045]: A curated text statement of a log message are character strings extracted from the log message by the log management server 702… A Grok expression… is used by the log management server 702 to extract character strings (e.g., words, terms, and alphanumeric character strings) and parameters from log messages; [0062]: FIGS. 11A-11B show… assigning the character string 2019-07-31T10:13:03.1926 1108 in the log message 1104 to the variable identifier timestamp_iso8601 1110 [metadata type information (e.g., account, timestamp, pod ID, etc.)]);
store the set of natural language error strings and corresponding extracted metadata for the set of error logs in a second database (Fig. 7; [0043]: The log management server 702… stores the curated text statements in a curated text statements database 710); 
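
The extraction step mapped to Jha above (pulling a message string and parameters such as a timestamp out of a raw log line) can be sketched with a plain regular expression. This is a hypothetical stand-in for a Grok expression, not Jha's parser: the log line layout, the `level` field, and the function name `parse_log_line` are assumptions, while the timestamp value and the `timestamp_iso8601` identifier come from the passage quoted above.

```python
import re

# Assumed layout: "<ISO-8601 timestamp> <LEVEL> <free-text message>"
LOG_LINE = re.compile(
    r"^(?P<timestamp_iso8601>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?)\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"(?P<message>.*)$"
)


def parse_log_line(line):
    """Split one log line into (message, metadata), or None if no match."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    message = fields.pop("message")  # what remains is the metadata
    return message, fields


parsed = parse_log_line("2019-07-31T10:13:03.1926 ERROR failed to mount volume")
print(parsed)
```

The returned pair keeps the message text separate from the extracted fields, mirroring the claimed split between natural language error strings and their corresponding metadata.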
However, Jha does not explicitly teach “identify a set of unique errors associated with the set of error logs based at least in part on comparison of the set of natural language error strings generated by the natural language processing, and the corresponding metadata that is extracted based at least in part on performing the natural language processing, wherein identifying the set of unique errors comprises: determining whether two or more error logs of the set of error logs correspond to a single unique error of the set of unique errors based at least in part on determining whether two or more natural language error strings generated by performance of the natural language processing on the two or more error logs satisfy a similarity threshold; and generating an error report based on the set of natural language error strings and corresponding extracted metadata stored in the second database, wherein the error report identifies the set of unique errors.”
On the other hand, in the same field of endeavor, Goldberg teaches 
identify a set of unique errors associated with the set of error logs based at least in part on comparison of the set of natural language error strings generated by the natural language processing (Figs. 1-3; [0017]-[0018]: The log scanning component 135 determines whether log entries are duplicates of one another by comparing the machine-encoded text 127 of the entries. For example, the log scanning component 135 can detect duplicates when log entries each contain some or all of the same words; [0024]-[0028]: The text processing device 232 includes a log scanning component 135 and a log analysis component 140, which monitor, deduplicate, analyze, and flag system logs… the log scanning component can locate redundant copies by comparing the machine-encoded text 127 of multiple log entries, and locating matching portions… all redundant copies can be removed; [natural language error strings generated by the natural language processing correspond to machine-encoded text]),
and the corresponding metadata that is extracted based at least in part on performing the natural language processing ([0011]: System log entries can be records of system events, changes, operations, etc. The log entries can contain information about a problem with the system, and analyzing the logs can help a user or technical support group determine the best solution to the problem [log entries must contain corresponding metadata, i.e. timestamps, information representing the date and time of the problem with the system]; [0030]: an email sent when a log entry is flagged because multiple copies of the log entry have been detected in a designated time span [based on extracted metadata, i.e. the timestamp]; [0036]: text processing component 130 include data in addition to instructions or statements).
Additionally, Yang teaches wherein identifying the set of unique errors comprises:
determining whether two or more error logs of the set of error logs correspond to a single unique error of the set of unique errors ([0016]-[0017]: Referring initially to FIG. 1, cluster system 100 is illustrated. The cluster system receives a set of strings as input… the string can correspond to an event message from a diagnostic log… the preprocess component can be configured to filter out duplicate messages such that the resulting output are unique strings) based at least in part on determining whether two or more natural language error strings generated by performance of the natural language processing on the two or more error logs satisfy a similarity threshold (Figs. 5-7; [0040]-[0044]: Referring to FIG. 5… At reference numeral 510, a set of strings is assigned to one of a plurality of clusters based on similarity… the method continues at reference 740, where a determination is made concerning whether the pattern length is less than or equal to a threshold length); and
Furthermore, Cohen teaches generate an error report based on the set of natural language error strings and corresponding extracted metadata stored in the second database, wherein the error report identifies the set of unique errors ([0071]: The table in FIG. 8 summarizes the results of running each of the datasets through the template generator module 412. For every data set the table shows the timeframe of the messages in the log, how many messages were processed (number of messages), how many distinct messages were in the logs (number of unique messages)).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method of Jha to incorporate the teachings of Goldberg, Yang, and Cohen to include identifying a set of unique errors associated with the set of error logs and generating an error report.
The motivation for doing so would be to remove redundant copies of log entries, as recognized by Goldberg ([0018] of Goldberg: In some embodiments, all redundant copies of log entries are removed during deduplication), analyze clusters based on string similarity, as recognized by Yang (Abstract of Yang: clusters can be analyzed based on the similarity or difference of strings in a cluster), and generate an error report based on the set of natural language error strings ([0012] of Cohen: FIG. 8 is a table showing results of applying an embodiment of the invention to various source computer system log files).

Regarding Claim 21, the combined teachings of Jha, Goldberg, and Cohen disclose the method of claim 1.
Goldberg further teaches wherein identifying the set of unique errors is based at least in part on a triggering condition for generation of the error report ([0018]-[0020]: Instructions directing the extent of deduplication can be input by a user or organization. For example, a user may specify that log entries should be deduplicated when the number of log entries surpasses a threshold number… If the user is alerted to the presence of a significant log entry, the user can examine the deduplicated system logs and decide what, if any, actions to take).
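
The triggering condition cited from Goldberg (deduplicating once the number of log entries surpasses a user-specified threshold) can be illustrated with a short Python sketch. The function name `maybe_deduplicate`, the default of 1000 entries, and the exact-match deduplication rule are all assumptions chosen for illustration; Goldberg is not quoted as specifying any of them.

```python
def maybe_deduplicate(entries, trigger_count=1000):
    """Deduplicate only when the entry count surpasses trigger_count.

    Hypothetical sketch: duplicates are exact string matches, and the
    first occurrence of each entry is kept in its original position.
    """
    if len(entries) <= trigger_count:
        return list(entries)  # condition not met, leave the log as-is
    # dict.fromkeys preserves first-seen order while dropping repeats.
    return list(dict.fromkeys(entries))


print(maybe_deduplicate(["a", "a", "b"], trigger_count=2))
print(maybe_deduplicate(["a", "a"], trigger_count=2))
```

With a threshold of two, the three-entry log is deduplicated while the two-entry log is returned unchanged.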

Regarding Claim 23, the combined teachings of Jha, Goldberg, and Cohen disclose the method of claim 1.
Cohen further teaches wherein the natural language error strings are simplified with respect to the set of error logs based at least in part on the performance of natural language processing on the set of error logs ([0022]: According to an embodiment of the invention, identifying groups of related events is achieved by applying a pattern-finding mechanism to provide a compressed/concise representation of processes represented in the logs… [0071]: The table in FIG. 8 summarizes the results of running each of the datasets through the template generator module 412. For every data set the table shows the timeframe of the messages in the log, how many messages were processed (number of messages), how many distinct messages were in the logs (number of unique messages)).

Conclusion
27.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHIRLEY D. HICKS whose telephone number is (571)272-3304. The examiner can normally be reached Mon - Fri 7:30 - 4:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Charles Rones can be reached on (571) 272-4085. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/S.D.H./Examiner, Art Unit 2168  

/CHARLES RONES/Supervisory Patent Examiner, Art Unit 2168
Prosecution Timeline

Feb 13, 2025: Request for Continued Examination
Feb 14, 2025: Response after Non-Final Action
May 16, 2025: Non-Final Rejection mailed (§103)
Aug 12, 2025: Response Filed
Nov 14, 2025: Final Rejection mailed (§103)
Feb 17, 2026: Request for Continued Examination
Feb 24, 2026: Response after Non-Final Action
Apr 13, 2026: Non-Final Rejection mailed (§103, current)
Precedent Cases

Applications granted by this same examiner with similar technology

17/506,722 (Patent 12639380): WORK INCOME VISUALIZATION AND OPTIMIZATION PLATFORM. 4y 7m to grant; granted May 26, 2026.
18/351,876 (Patent 12596682): SYSTEM AND METHOD FOR OBJECT STORE FEDERATION. 2y 8m to grant; granted Apr 07, 2026.
18/218,986 (Patent 12499102): HIERARCHICAL DELIMITER IDENTIFICATION FOR PARSING OF RAW DATA. 2y 5m to grant; granted Dec 16, 2025.
18/340,771 (Patent 12499146): MACHINE LEARNING AND NATURAL LANGUAGE PROCESSING (NLP)-BASED SYSTEM FOR SYSTEM-ON-CHIP (SoC) TROUBLESHOOTING. 2y 5m to grant; granted Dec 16, 2025.
18/396,455 (Patent 12405818): BATCHING WAVEFORM DATA. 1y 8m to grant; granted Sep 02, 2025.
Study what changed to get past this examiner. Based on 5 most recent grants.
Prosecution Projections

Expected OA Rounds: 5-6
Grant Probability: 63%
With Interview (+54.2%): 99%
Median Time to Grant: 2y 10m (~0m remaining)
PTA Risk: High
Based on 109 resolved cases by this examiner. Grant probability derived from career allowance rate.