DETAILED ACTION
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 1/30/2026 has been entered.
Claims 1-6, 8-13, 15-19 and 21-23 are pending, with claims 1, 3-6, 8, 10-13, 15 and 17-19 having been amended and claims 21-23 newly added.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant's arguments filed 7/22/2025 have been fully considered.
A) Applicant's arguments with respect to the 101 rejection of claim 1 have been fully considered and are persuasive. The 101 rejection of claims 1-5, 8-12 and 15-19 has been withdrawn.
B) Applicant’s arguments, with respect to the rejection(s) of claim(s) 1, 8 and 15 under 103, have been fully considered and are persuasive. Therefore, the rejection has been withdrawn. However, upon further consideration, a new ground(s) of rejection is made in view of a different interpretation of the previously applied reference.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-4, 8-11, 15-18 and 21-23 are rejected under 35 U.S.C. 103 as being unpatentable over Ahmed et al. (US 2020/0081978) in view of Hachey (US 2021/0256160).
With respect to claim 1 Ahmed teaches a method comprising:
identifying one or more sub-documents of one or more documents that match one or more patterns for sensitive data (see figure 3 and paragraph 0052-0055 i.e. Application 302 receives one or more unstructured documents 304. Application 302 includes a text-based feature extraction component 306, a user-specific model component 308, a classifier component 310, and a PII labelling component 312…. Classifier component 310 is configured to receive the text-based features from text-based feature extraction component 306 and the user-specific features from user-specific model component 308 to classify the extracted features as either PII data (PII) or non-PII data. In particular embodiments, classifier component 310 includes one or more of a DNN or a maximum entropy framework to classify the extracted features as either PII data or non-PII data…PII labelling component 312 is configured to label the PII elements within unstructured document(s) 304 based upon the classified extracted features. Application 302 is configured to output a labelled unstructured document 314 in which PII information in unstructured document(s) 304 that is detected and classified by application 302 is labelled as PII. Application 302 is further configured to output classification result 316 indicative of the results of classification of the PII data by classifier component 310. In one or more embodiments, application 302 is configured to further train classifier component 310 based upon classification result 316);
inputting first data corresponding to the one or more patterns into a first language model to obtain first embeddings as output (see Ahmed figure 7 step 704 paragraph 0078 i.e. In block 702, application 105 receives an unstructured document for a known author. In block 704, application 105 extracts text-based features as a first set of features using natural language processing) and
second data corresponding to context of the one or more patterns in the one or more sub-documents into a second language model to obtain second embeddings as output (see Ahmed figure 7 step 706 paragraph 0078 i.e. In block 706, application 105 extracts user-specific features of unstructured document as a second set of features based upon one or more past documents for the known author using a recurrent neural network);
concatenating the first embeddings and the second embeddings; providing the concatenated embeddings as input to a classification model to obtain one or more verdicts for each of the one or more sub-documents as output, wherein each verdict of the one or more verdicts indicates whether a corresponding sub-document in the one or more sub-documents comprises sensitive data (see Ahmed figure 7 step 708-710 paragraph 0079 i.e. In block 708, application 105 classifies the first set of features and the second set of features using a machine learning classifier to produce classified extracted features. In an embodiment, the classifier includes one or more of a DNN classifier and a maximum entropy classifier); and
filtering, from the one or more documents, those documents that do not comprise a sub-document in the one or more sub-documents having a verdict in the one or more verdicts that indicates sensitive data (see Ahmed figure 7 step 710-712 paragraph 0079 i.e. In block 710, application 105 labels personally identifiable information in the unstructured document based upon the classified extracted features. In block 712, application 105 outputs the labelled personally identifiable information and paragraph 0068 i.e. As discussed above, one or more embodiments use a maximum entropy classifier to classify elements of an unstructured document as containing either PII data or non-PII data).
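For illustration only, the pipeline recited in claim 1 (pattern matching, dual embeddings, concatenation, classification, filtering) can be sketched in Python. The `embed` and `classify` functions below are simplified stand-ins (a hash-style vector and a fixed threshold), not the claimed language models or gradient boosting model and not the classifiers of Ahmed, and the SSN-style pattern is a hypothetical example.

```python
import re

# Hypothetical pattern for sensitive data (SSN-like strings).
PATTERNS = {"ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b")}

def embed(text, dim=8):
    """Stand-in for a language model: fold characters into a fixed-size vector."""
    vec = [0.0] * dim
    for i, ch in enumerate(text):
        vec[i % dim] += ord(ch) / 1000.0
    return vec

def classify(concatenated):
    """Stand-in for the classification model: verdict from a fixed threshold."""
    return sum(concatenated) > 0.5  # True => sensitive

def filter_documents(documents):
    """Keep only documents with at least one sub-document judged sensitive."""
    kept = []
    for doc in documents:
        verdicts = []
        for name, pattern in PATTERNS.items():
            for match in pattern.finditer(doc):
                # First data: the matched pattern text itself.
                first = embed(match.group())
                # Second data: surrounding context of the match.
                context = doc[max(0, match.start() - 20):match.end() + 20]
                second = embed(context)
                # Concatenate the two embeddings and obtain a verdict.
                verdicts.append(classify(first + second))
        if any(verdicts):
            kept.append(doc)
    return kept
```

A document with no sub-document receiving a sensitive verdict is filtered out, matching the final limitation of claim 1.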
Ahmed does not disclose wherein the classification model comprises a gradient boosting model.
Hachey teaches wherein the classification model comprises a gradient boosting model (see Hachey paragraph 0035 i.e. model may be an AI machine learning model based on classifiers including an extreme gradient boost classifier, a light gradient boost machine, a gradient boosting classifier, naïve bayes, an ada boost classifier, a K-neighbors classifier, a decision tree classifier, a ridge classifier, natural language processing logic, recurrent neural networks (RNN), convolutional neural networks (CNN), multi-level perceptrons, feedforward neural networks, or a combination thereof. The AI model training component 128 may also store and execute the AI models (i.e. the zoning model) on the servers 120. The AI model training component 128 may train, store, and execute the machine learning tools).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Ahmed in view of Hachey to have used a gradient boosting model as one of the many different AI models that could be used to identify personally identifiable information in a document (see Hachey paragraph 0035). Therefore one would have been motivated to have used a gradient boosting model as one of many different classifiers known to be suitable for detecting personally identifiable information.
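For background, the gradient boosting classifiers listed in Hachey paragraph 0035 share a common structure: an additive ensemble of weak learners, each fit to the residual error of the ensemble so far. A minimal squared-loss sketch using one-dimensional decision stumps (illustrative only, not Hachey's implementation) is:

```python
def fit_stump(xs, residuals):
    """Fit a 1-D decision stump minimizing squared error against residuals."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x < threshold]
        right = [r for x, r in zip(xs, residuals) if x >= threshold]
        lv = sum(left) / len(left) if left else 0.0
        rv = sum(right) / len(right) if right else 0.0
        err = sum((r - (lv if x < threshold else rv)) ** 2
                  for x, r in zip(xs, residuals))
        if best is None or err < best[0]:
            best = (err, threshold, lv, rv)
    _, threshold, lv, rv = best
    return lambda x: lv if x < threshold else rv

def gradient_boost(xs, ys, rounds=20, lr=0.3):
    """Squared-loss boosting: each new stump fits the current residuals."""
    prediction = [0.0] * len(xs)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, prediction)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        prediction = [p + lr * stump(x) for p, x in zip(prediction, xs)]
    return lambda x: sum(lr * s(x) for s in stumps)

# Toy data: label 1 when the feature exceeds 0.5.
xs = [0.1, 0.2, 0.3, 0.6, 0.8, 0.9]
ys = [0, 0, 0, 1, 1, 1]
model = gradient_boost(xs, ys)
```

Thresholding the ensemble output at 0.5 yields a binary verdict, which is how such a model would serve as the classification model of claim 1.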
With respect to claim 2 Ahmed in view of Hachey teaches the method of claim 1. Hachey further teaches wherein the sensitive data comprises fields in driver's license data (see Hachey paragraph 0016 i.e. It is one advantage of at least one embodiment of the present invention to reasonably accurately detect and mask the following personal health information data types: person's name, date of birth, consultation date, other dates, patient's age/carer's age, any personal identification number (social security number, driver's license, passport number, etc), any residential or postal address, any phone or fax number, any e-mail address, any website/URL address, or any text corresponding to a profession).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Ahmed in view of Hachey to have included driver’s license data as personally identifiable information as one of many different types of personally identifiable information that can be detected in a document (see Hachey paragraph 0016). Therefore one would have been motivated to have included driver’s license data as personally identifiable information as one of many different types of personally identifiable information that can be detected.
With respect to claim 3 Ahmed in view of Hachey teaches the method of claim 1, wherein identifying the one or more sub-documents comprises, extracting text data from the one or more documents; and applying the one or more patterns to the text data to identify the one or more sub-documents as subsets of the text data (see Ahmed Paragraph 0053 i.e. Application 302 receives one or more unstructured documents 304. Application 302 includes a text-based feature extraction component 306, a user-specific model component 308, a classifier component 310, and a PII labelling component 312. Text-based feature extraction component 306 is configured to extract text-based features from unstructured document(s) 304 using natural language processing (NLP) and paragraph 0057 i.e. In an embodiment, text-based feature extraction component 404 receives an unstructured document 402 potentially containing PII in which the author of unstructured document 402 is presumed to be known. Text-based feature extraction component 404 extracts natural language processing (NLP) features from unstructured document 402 using an array of text-based feature extractors. In the embodiment illustrated in FIG. 4, the text-based feature extractors include an n-gram feature extractor, a dictionary-based feature extractor, a word embedding feature extractor, and a part of speech (e.g., verb, noun, adjective, etc.)).
With respect to claim 4 Ahmed in view of Hachey teaches the method of claim 1, wherein the one or more sub-documents comprise fields of text data in the one or more documents encoded in a schema, wherein inputting the first data into the first language model and the second data into the second language model comprises: identifying fields in the one or more sub-documents to input into each of the first language model and second language model, wherein identifying fields in the one or more sub-documents is according to the schema where the fields are encoded; and for each sub-document in the one or more sub-documents, inputting identified fields for the sub-document into corresponding ones of the first language model and second language model (see Ahmed paragraph 0058 i.e. In the embodiment, text-based feature extraction component 404 provides the current document features to classifier component 408 and user-specific model component 406 provides the user-specific features to classifier component 408. Classifier component 408 detects, classifies, and labels the PII elements in the current text of unstructured document 402 using one or more classifiers to produce labeled unstructured document 410. In particular embodiments, the one or more classifiers include one or more of a deep neural network classifier or a maximum entropy classifier as further described herein).
With respect to claim 8 Ahmed teaches one or more non-transitory machine-readable media having program code stored thereon, the program code comprising instructions to:
classify a document as comprising sensitive data, wherein the program code to classify the document as comprising sensitive data comprises instructions to identify one or more sub-documents of the document that comprise sensitive data, wherein each of the one or more sub-documents comprises text data that matches one or more patterns of sensitive data (see Ahmed paragraph 0052-0055 i.e. Application 302 receives one or more unstructured documents 304. Application 302 includes a text-based feature extraction component 306, a user-specific model component 308, a classifier component 310, and a PII labelling component 312. Text-based feature extraction component 306 is configured to extract text-based features from unstructured document(s) 304 using natural language processing (NLP) and paragraph 0060-0061 i.e. In the embodiment of FIG. 5, models are trained using a number of different data sources. In the embodiment, a main source of domain-specific data, a set of unstructured documents including PII labels 502, is received by text-based feature extraction component 508. In addition, a historical set of unstructured documents with PII labels by a particular author is received by a user-specific model component 510 to develop patterns of word and/or phrase usage for a particular user. In the embodiment, sets of unlabeled unstructured text data 506 is also received by text-based feature extraction component 508 to provide the models with an understanding of a large number of different word and/or phrase characteristics including, but not limited to, word contexts, parts of speech, and grammar. Text-based feature extraction component 508 extracts natural language processing (NLP) features from unstructured documents with PII labels 502 using an array of text-based feature extractors based upon the unlabeled unstructured text data 506. In the embodiment illustrated in FIG. 5, the text-based feature extractors include an n-gram feature extractor, a dictionary-based feature extractor, a word embedding feature extractor, and a part of speech (e.g., verb, noun, adjective, etc.));
input at least a subset of each of the one or more sub-documents into an ensemble of a first language model, a second language model and a classification model to obtain one or more verdicts for each of the one or more sub-documents as output (see Ahmed paragraph 0058 i.e. In the embodiment, text-based feature extraction component 404 provides the current document features to classifier component 408 and user-specific model component 406 provides the user-specific features to classifier component 408. Classifier component 408 detects, classifies, and labels the PII elements in the current text of unstructured document 402 using one or more classifiers to produce labeled unstructured document 410. In particular embodiments, the one or more classifiers include one or more of a deep neural network classifier or a maximum entropy classifier as further described herein);
wherein the first language model takes first data corresponding to the one or more patterns as input to output first embeddings (see Ahmed figure 7 step 704 paragraph 0078 i.e. In block 702, application 105 receives an unstructured document for a known author. In block 704, application 105 extracts text-based features as a first set of features using natural language processing) and
wherein the second language model takes second data corresponding to context of the one or more patterns in the one or more sub-documents as input to output second embeddings (see Ahmed figure 7 step 706 paragraph 0078 i.e. In block 706, application 105 extracts user-specific features of unstructured document as a second set of features based upon one or more past documents for the known author using a recurrent neural network);
wherein the classification model takes concatenations of the first embeddings and second embeddings as input to output one or more verdicts (see Ahmed figure 7 step 708-710 paragraph 0079 i.e. In block 708, application 105 classifies the first set of features and the second set of features using a machine learning classifier to produce classified extracted features. In an embodiment, the classifier includes one or more of a DNN classifier and a maximum entropy classifier); and
based on a determination that each of the one or more verdicts indicates that a corresponding sub-document of the one or more sub-documents does not comprise sensitive data, indicating the document as having a false positive sensitive data classification (see Ahmed Paragraph 0063-0065 i.e. Classifier component 512 detects and classifies the extracted features to produce a classification output 514. In particular embodiments, the one or more classifiers include one or more of a deep neural network classifier or a maximum entropy classifier as further described herein. In the embodiment, the classification output 514 is used to retrain classifier component 512. In a particular embodiment, application 105 calculates an error value based upon the classification results and adjusts a model of classifier component 512 based upon the error. In an embodiment, application 105 receives feedback associated with the classified extracted features indicated by classification output 514 from a PII subject matter expert (SME) and modifies the training of the classifier component 512 based upon the feedback. Accordingly, one or more embodiments provides for a framework that allows for a semi-supervised feedback loop in order to retrain the classifier. The framework allows for input from a PII subject matter expert, incorporating expert knowledge through the correcting of classification errors and paragraph 0068 i.e. As discussed above, one or more embodiments use a maximum entropy classifier to classify elements of an unstructured document as containing either PII data or non-PII data).
Ahmed does not disclose wherein the classification model comprises a gradient boosting model.
Hachey teaches wherein the classification model comprises a gradient boosting model (see Hachey paragraph 0035 i.e. model may be an AI machine learning model based on classifiers including an extreme gradient boost classifier, a light gradient boost machine, a gradient boosting classifier, naïve bayes, an ada boost classifier, a K-neighbors classifier, a decision tree classifier, a ridge classifier, natural language processing logic, recurrent neural networks (RNN), convolutional neural networks (CNN), multi-level perceptrons, feedforward neural networks, or a combination thereof. The AI model training component 128 may also store and execute the AI models (i.e. the zoning model) on the servers 120. The AI model training component 128 may train, store, and execute the machine learning tools).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Ahmed in view of Hachey to have used a gradient boosting model as one of the many different AI models that could be used to identify personally identifiable information in a document (see Hachey paragraph 0035). Therefore one would have been motivated to have used a gradient boosting model as one of many different classifiers known to be suitable for detecting personally identifiable information.
With respect to claim 9 Ahmed in view of Hachey teaches the non-transitory machine-readable media of claim 8. Hachey further teaches wherein the sensitive data comprises fields in driver's license data (see Hachey paragraph 0016 i.e. It is one advantage of at least one embodiment of the present invention to reasonably accurately detect and mask the following personal health information data types: person's name, date of birth, consultation date, other dates, patient's age/carer's age, any personal identification number (social security number, driver's license, passport number, etc), any residential or postal address, any phone or fax number, any e-mail address, any website/URL address, or any text corresponding to a profession).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Ahmed in view of Hachey to have included driver’s license data as personally identifiable information as one of many different types of personally identifiable information that can be detected in a document (see Hachey paragraph 0016). Therefore one would have been motivated to have included driver’s license data as personally identifiable information as one of many different types of personally identifiable information that can be detected.
With respect to claim 10 Ahmed in view of Hachey teaches the non-transitory machine-readable media of claim 8, wherein instructions to identify the one or more sub-documents comprise instructions to, extract text data from the document; and apply the one or more patterns to the text data to identify the one or more sub-documents as subsets of the text data (see Ahmed Paragraph 0053 i.e. Application 302 receives one or more unstructured documents 304. Application 302 includes a text-based feature extraction component 306, a user-specific model component 308, a classifier component 310, and a PII labelling component 312. Text-based feature extraction component 306 is configured to extract text-based features from unstructured document(s) 304 using natural language processing (NLP) and paragraph 0057 i.e. In an embodiment, text-based feature extraction component 404 receives an unstructured document 402 potentially containing PII in which the author of unstructured document 402 is presumed to be known. Text-based feature extraction component 404 extracts natural language processing (NLP) features from unstructured document 402 using an array of text-based feature extractors. In the embodiment illustrated in FIG. 4, the text-based feature extractors include an n-gram feature extractor, a dictionary-based feature extractor, a word embedding feature extractor, and a part of speech (e.g., verb, noun, adjective, etc.)).
With respect to claim 11 Ahmed in view of Hachey teaches the non-transitory machine-readable media of claim 8, wherein the one or more sub-documents comprise fields of text data in the document encoded in a schema, wherein the program code to input the at least a subset of each of the one or more sub-documents into the ensemble comprises instructions to, identify fields in the one or more sub-documents to input into each of the first language model and the second language model, wherein identifying fields in the one or more sub-documents is according to the schema where the fields are encoded; and for each sub-document in the one or more sub-documents, input identified fields for the sub-document into corresponding ones of the first language model and the second language model (see Ahmed paragraph 0058 i.e. In the embodiment, text-based feature extraction component 404 provides the current document features to classifier component 408 and user-specific model component 406 provides the user-specific features to classifier component 408. Classifier component 408 detects, classifies, and labels the PII elements in the current text of unstructured document 402 using one or more classifiers to produce labeled unstructured document 410. In particular embodiments, the one or more classifiers include one or more of a deep neural network classifier or a maximum entropy classifier as further described herein).
With respect to claim 15 Ahmed teaches an apparatus comprising:
a processor; and a machine-readable medium having instructions stored thereon that are executable by the processor to cause the apparatus to,
preprocess documents to identify one or more documents comprising sensitive data for data loss prevention, wherein the instructions to identify the one or more documents comprise instructions executable by the processor to cause the apparatus to identify one or more patterns of sensitive data in text data of one or more sub-documents of the one or more documents (see Ahmed paragraph 0052-0055 i.e. Application 302 receives one or more unstructured documents 304. Application 302 includes a text-based feature extraction component 306, a user-specific model component 308, a classifier component 310, and a PII labelling component 312. Text-based feature extraction component 306 is configured to extract text-based features from unstructured document(s) 304 using natural language processing (NLP) and paragraph 0060-0061 i.e. In the embodiment of FIG. 5, models are trained using a number of different data sources. In the embodiment, a main source of domain-specific data, a set of unstructured documents including PII labels 502, is received by text-based feature extraction component 508. In addition, a historical set of unstructured documents with PII labels by a particular author is received by a user-specific model component 510 to develop patterns of word and/or phrase usage for a particular user. In the embodiment, sets of unlabeled unstructured text data 506 is also received by text-based feature extraction component 508 to provide the models with an understanding of a large number of different word and/or phrase characteristics including, but not limited to, word contexts, parts of speech, and grammar. Text-based feature extraction component 508 extracts natural language processing (NLP) features from unstructured documents with PII labels 502 using an array of text-based feature extractors based upon the unlabeled unstructured text data 506. In the embodiment illustrated in FIG. 5, the text-based feature extractors include an n-gram feature extractor, a dictionary-based feature extractor, a word embedding feature extractor, and a part of speech (e.g., verb, noun, adjective, etc.));
input at least a subset of each of the one or more sub-documents into an ensemble of a first language model, a second language model and a classification model to obtain one or more verdicts for each of the one or more sub-documents as output (see Ahmed paragraph 0058 i.e. In the embodiment, text-based feature extraction component 404 provides the current document features to classifier component 408 and user-specific model component 406 provides the user-specific features to classifier component 408. Classifier component 408 detects, classifies, and labels the PII elements in the current text of unstructured document 402 using one or more classifiers to produce labeled unstructured document 410. In particular embodiments, the one or more classifiers include one or more of a deep neural network classifier or a maximum entropy classifier as further described herein);
wherein the first language model takes first data corresponding to the one or more patterns as input to output first embeddings (see Ahmed figure 7 step 704 paragraph 0078 i.e. In block 702, application 105 receives an unstructured document for a known author. In block 704, application 105 extracts text-based features as a first set of features using natural language processing) and
wherein the second language model takes second data corresponding to context of the one or more patterns in the one or more sub-documents as input to output second embeddings (see Ahmed figure 7 step 706 paragraph 0078 i.e. In block 706, application 105 extracts user-specific features of unstructured document as a second set of features based upon one or more past documents for the known author using a recurrent neural network);
wherein the classification model takes concatenations of the first embeddings and second embeddings as input to output one or more verdicts (see Ahmed figure 7 step 708-710 paragraph 0079 i.e. In block 708, application 105 classifies the first set of features and the second set of features using a machine learning classifier to produce classified extracted features. In an embodiment, the classifier includes one or more of a DNN classifier and a maximum entropy classifier); and
filter, from the one or more documents, those documents that do not comprise a sub-document in the one or more sub-documents having a verdict in the one or more verdicts indicating sensitive data (see Ahmed Paragraph 0063-0065 i.e. Classifier component 512 detects and classifies the extracted features to produce a classification output 514. In particular embodiments, the one or more classifiers include one or more of a deep neural network classifier or a maximum entropy classifier as further described herein. In the embodiment, the classification output 514 is used to retrain classifier component 512. In a particular embodiment, application 105 calculates an error value based upon the classification results and adjusts a model of classifier component 512 based upon the error. In an embodiment, application 105 receives feedback associated with the classified extracted features indicated by classification output 514 from a PII subject matter expert (SME) and modifies the training of the classifier component 512 based upon the feedback. Accordingly, one or more embodiments provides for a framework that allows for a semi-supervised feedback loop in order to retrain the classifier. The framework allows for input from a PII subject matter expert, incorporating expert knowledge through the correcting of classification errors and paragraph 0068 i.e. As discussed above, one or more embodiments use a maximum entropy classifier to classify elements of an unstructured document as containing either PII data or non-PII data).
Ahmed does not disclose wherein the classification model comprises a gradient boosting model.
Hachey teaches wherein the classification model comprises a gradient boosting model (see Hachey paragraph 0035 i.e. model may be an AI machine learning model based on classifiers including an extreme gradient boost classifier, a light gradient boost machine, a gradient boosting classifier, naïve bayes, an ada boost classifier, a K-neighbors classifier, a decision tree classifier, a ridge classifier, natural language processing logic, recurrent neural networks (RNN), convolutional neural networks (CNN), multi-level perceptrons, feedforward neural networks, or a combination thereof. The AI model training component 128 may also store and execute the AI models (i.e. the zoning model) on the servers 120. The AI model training component 128 may train, store, and execute the machine learning tools).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Ahmed in view of Hachey to have used a gradient boosting model as one of the many different AI models that could be used to identify personally identifiable information in a document (see Hachey paragraph 0035). Therefore one would have been motivated to have used a gradient boosting model as one of many different classifiers known to be suitable for detecting personally identifiable information.
With respect to claim 16 Ahmed in view of Hachey teaches the apparatus of claim 15. Hachey further teaches wherein the sensitive data comprises fields in driver's license data (see Hachey paragraph 0016 i.e. It is one advantage of at least one embodiment of the present invention to reasonably accurately detect and mask the following personal health information data types: person's name, date of birth, consultation date, other dates, patient's age/carer's age, any personal identification number (social security number, driver's license, passport number, etc), any residential or postal address, any phone or fax number, any e-mail address, any website/URL address, or any text corresponding to a profession).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Ahmed in view of Hachey to have included driver’s license data as personally identifiable information as one of many different types of personally identifiable information that can be detected in a document (see Hachey paragraph 0016). Therefore one would have been motivated to have included driver’s license data as personally identifiable information as one of many different types of personally identifiable information that can be detected.
With respect to claim 17 Ahmed in view of Hachey teaches the apparatus of claim 15, wherein instructions to preprocess documents to identify the one or more documents comprise instructions executable by the processor to cause the apparatus to, extract text data from the one or more documents; and apply the one or more patterns to the text data to identify the one or more sub-documents as subsets of the text data (see Ahmed Paragraph 0053 i.e. Application 302 receives one or more unstructured documents 304. Application 302 includes a text-based feature extraction component 306, a user-specific model component 308, a classifier component 310, and a PII labelling component 312. Text-based feature extraction component 306 is configured to extract text-based features from unstructured document(s) 304 using natural language processing (NLP) and paragraph 0057 i.e. In an embodiment, text-based feature extraction component 404 receives an unstructured document 402 potentially containing PII in which the author of unstructured document 402 is presumed to be known. Text-based feature extraction component 404 extracts natural language processing (NLP) features from unstructured document 402 using an array of text-based feature extractors. In the embodiment illustrated in FIG. 4, the text-based feature extractors include an n-gram feature extractor, a dictionary-based feature extractor, a word embedding feature extractor, and a part of speech (e.g., verb, noun, adjective, etc.)).
With respect to claim 18 Ahmed in view of Hachey teaches the apparatus of claim 15, wherein the one or more sub-documents comprise fields of text data in the documents encoded in a schema, wherein the instructions to input the at least subset of each of the one or more sub-documents into the ensemble comprises instructions to, identify fields in the one or more sub-documents to input into each of the first language model and the second language model, wherein identifying fields in the one or more sub-documents is according to the schema where the fields are encoded; and for each sub-document in the one or more sub-documents, input identified fields for the sub-document into corresponding ones of the first language model and the second language model (see Ahmed paragraph 0058 i.e. In the embodiment, text-based feature extraction component 404 provides the current document features to classifier component 408 and user-specific model component 406 provides the user-specific features to classifier component 408. Classifier component 408 detects, classifies, and labels the PII elements in the current text of unstructured document 402 using one or more classifiers to produce labeled unstructured document 410. In particular embodiments, the one or more classifiers include one or more of a deep neural network classifier or a maximum entropy classifier as further described herein).
With respect to claim 21 Ahmed in view of Hachey teaches the method of claim 1, wherein identifying the one or more sub-documents of the one or more documents that match the one or more patterns for sensitive data comprises: identifying sub-documents of the one or more documents that match patterns including the one or more patterns and corresponding confidence values of each matched pattern; and filtering sub-documents having low confidence of matched patterns according to the confidence values to obtain the one or more sub-documents (see figure 3 and paragraphs 0052-0055 i.e. Application 302 receives one or more unstructured documents 304. Application 302 includes a text-based feature extraction component 306, a user-specific model component 308, a classifier component 310, and a PII labelling component 312…. Classifier component 310 is configured to receive the text-based features from text-based feature extraction component 306 and the user-specific features from user-specific model component 308 to classify the extracted features as either PII data (PII) or non-PII data. In particular embodiments, classifier component 310 includes one or more of a DNN or a maximum entropy framework to classify the extracted features as either PII data or non-PII data…PII labelling component 312 is configured to label the PII elements within unstructured document(s) 304 based upon the classified extracted features. Application 302 is configured to output a labelled unstructured document 314 in which PII information in unstructured document(s) 304 that is detected and classified by application 302 is labelled as PII. Application 302 is further configured to output classification result 316 indicative of the results of classification of the PII data by classifier component 310. In one or more embodiments, application 302 is configured to further train classifier component 310 based upon classification result 316).
With respect to claim 22 Ahmed in view of Hachey teaches the non-transitory machine-readable media of claim 8, wherein the instructions to identify the one or more sub-documents of the document that comprise sensitive data comprise instructions to: identify sub-documents of the document that match patterns including the one or more patterns and corresponding confidence values of each matched pattern; and filter sub-documents having low confidence of matched patterns according to the confidence values to obtain the one or more sub-documents (see figure 3 and paragraphs 0052-0055 i.e. Application 302 receives one or more unstructured documents 304. Application 302 includes a text-based feature extraction component 306, a user-specific model component 308, a classifier component 310, and a PII labelling component 312…. Classifier component 310 is configured to receive the text-based features from text-based feature extraction component 306 and the user-specific features from user-specific model component 308 to classify the extracted features as either PII data (PII) or non-PII data. In particular embodiments, classifier component 310 includes one or more of a DNN or a maximum entropy framework to classify the extracted features as either PII data or non-PII data…PII labelling component 312 is configured to label the PII elements within unstructured document(s) 304 based upon the classified extracted features. Application 302 is configured to output a labelled unstructured document 314 in which PII information in unstructured document(s) 304 that is detected and classified by application 302 is labelled as PII. Application 302 is further configured to output classification result 316 indicative of the results of classification of the PII data by classifier component 310. In one or more embodiments, application 302 is configured to further train classifier component 310 based upon classification result 316).
With respect to claim 23 Ahmed in view of Hachey teaches the apparatus of claim 15, wherein the instructions to identify the one or more documents that comprise sensitive data comprise instructions executable by the processor to cause the apparatus to: identify sub-documents of the one or more documents that match patterns including the one or more patterns and corresponding confidence values of each matched pattern; and filter sub-documents having low confidence of matched patterns according to the confidence values to obtain the one or more sub-documents (see figure 3 and paragraphs 0052-0055 i.e. Application 302 receives one or more unstructured documents 304. Application 302 includes a text-based feature extraction component 306, a user-specific model component 308, a classifier component 310, and a PII labelling component 312…. Classifier component 310 is configured to receive the text-based features from text-based feature extraction component 306 and the user-specific features from user-specific model component 308 to classify the extracted features as either PII data (PII) or non-PII data. In particular embodiments, classifier component 310 includes one or more of a DNN or a maximum entropy framework to classify the extracted features as either PII data or non-PII data…PII labelling component 312 is configured to label the PII elements within unstructured document(s) 304 based upon the classified extracted features. Application 302 is configured to output a labelled unstructured document 314 in which PII information in unstructured document(s) 304 that is detected and classified by application 302 is labelled as PII. Application 302 is further configured to output classification result 316 indicative of the results of classification of the PII data by classifier component 310. In one or more embodiments, application 302 is configured to further train classifier component 310 based upon classification result 316).
Claims 5, 12 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Ahmed et al (US 2020/0081978) in view of Hachey (US 2021/0256160) further in view of S Nayar.
With respect to claim 5 Ahmed in view of Hachey teaches the method of claim 1, but does not disclose wherein the first language models comprise a one-dimensional convolutional neural network.
S Nayar teaches wherein the first language models comprise a one-dimensional convolutional neural network (See S Nayar paragraph 0051 i.e. FIG. 6 illustrates a pictorial representation 600 of classification of the input dataset 222 using the convolutional neural network modeler 226 of the identification classifier 140, according to an example embodiment of present disclosure. As mentioned above, the data manipulator 130 may obtain the input dataset 222 defined in a one-dimensional data structure and convert the input dataset 222 into a formatted dataset of a two-dimensional data structure, wherein the format of the formatted dataset may be defined in accordance to a type of a deep neural network component. Also, the neural network component selector 150 may select the deep neural network based on the identification of characteristics associated with the input dataset 222 using a predefined parameter 256. The predefined parameter 256 may be a parameter that may be used to determine a characteristic associated with data of the input dataset 222. In an example, the predefined parameter 256 may be defined by a user. In an example, the predefined parameter 256 may include a size of the input dataset 222 and/or a length of individual elements in the dataset. As mentioned above, the neural network component selector 150 may select a convolutional neural network component to be used for processing the input dataset 222 when the input dataset 222 is identified to be associated with the first characteristic data 258. The pictorial representation 600 may illustrate the processing of the input dataset 222 based on the selection of the convolutional neural network component. The pictorial representation 600 illustrates the processing of the input dataset 222 by the CNN modeler 226).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Ahmed in view of S Nayar to have used a one-dimensional convolutional neural network as the first language model to process the input data. One would have been motivated to do so because S Nayar teaches selecting a convolutional neural network component based on the characteristics of the input dataset (see S Nayar paragraph 0051).
With respect to claim 12 Ahmed in view of Hachey teaches the non-transitory machine-readable media of claim 8, but does not disclose wherein the first language model comprises a one-dimensional convolutional neural network.
S Nayar teaches wherein the first language models comprise a one-dimensional convolutional neural network (See S Nayar paragraph 0051 i.e. FIG. 6 illustrates a pictorial representation 600 of classification of the input dataset 222 using the convolutional neural network modeler 226 of the identification classifier 140, according to an example embodiment of present disclosure. As mentioned above, the data manipulator 130 may obtain the input dataset 222 defined in a one-dimensional data structure and convert the input dataset 222 into a formatted dataset of a two-dimensional data structure, wherein the format of the formatted dataset may be defined in accordance to a type of a deep neural network component. Also, the neural network component selector 150 may select the deep neural network based on the identification of characteristics associated with the input dataset 222 using a predefined parameter 256. The predefined parameter 256 may be a parameter that may be used to determine a characteristic associated with data of the input dataset 222. In an example, the predefined parameter 256 may be defined by a user. In an example, the predefined parameter 256 may include a size of the input dataset 222 and/or a length of individual elements in the dataset. As mentioned above, the neural network component selector 150 may select a convolutional neural network component to be used for processing the input dataset 222 when the input dataset 222 is identified to be associated with the first characteristic data 258. The pictorial representation 600 may illustrate the processing of the input dataset 222 based on the selection of the convolutional neural network component. The pictorial representation 600 illustrates the processing of the input dataset 222 by the CNN modeler 226).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Ahmed in view of S Nayar to have used a one-dimensional convolutional neural network as the first language model to process the input data. One would have been motivated to do so because S Nayar teaches selecting a convolutional neural network component based on the characteristics of the input dataset (see S Nayar paragraph 0051).
With respect to claim 19 Ahmed in view of Hachey teaches the apparatus of claim 15, but does not disclose wherein the first language model comprises a one-dimensional convolutional neural network.
S Nayar teaches wherein the first language models comprise a one-dimensional convolutional neural network (See S Nayar paragraph 0051 i.e. FIG. 6 illustrates a pictorial representation 600 of classification of the input dataset 222 using the convolutional neural network modeler 226 of the identification classifier 140, according to an example embodiment of present disclosure. As mentioned above, the data manipulator 130 may obtain the input dataset 222 defined in a one-dimensional data structure and convert the input dataset 222 into a formatted dataset of a two-dimensional data structure, wherein the format of the formatted dataset may be defined in accordance to a type of a deep neural network component. Also, the neural network component selector 150 may select the deep neural network based on the identification of characteristics associated with the input dataset 222 using a predefined parameter 256. The predefined parameter 256 may be a parameter that may be used to determine a characteristic associated with data of the input dataset 222. In an example, the predefined parameter 256 may be defined by a user. In an example, the predefined parameter 256 may include a size of the input dataset 222 and/or a length of individual elements in the dataset. As mentioned above, the neural network component selector 150 may select a convolutional neural network component to be used for processing the input dataset 222 when the input dataset 222 is identified to be associated with the first characteristic data 258. The pictorial representation 600 may illustrate the processing of the input dataset 222 based on the selection of the convolutional neural network component. The pictorial representation 600 illustrates the processing of the input dataset 222 by the CNN modeler 226).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Ahmed in view of S Nayar to have used a one-dimensional convolutional neural network as the first language model to process the input data. One would have been motivated to do so because S Nayar teaches selecting a convolutional neural network component based on the characteristics of the input dataset (see S Nayar paragraph 0051).
Claims 6 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Ahmed et al (US 2020/0081978) in view of Hachey (US 2021/0256160) in view of Sahu et al (US 2023/0128136).
With respect to claim 6 Ahmed in view of Hachey teaches the method of claim 5, but does not disclose wherein the second language model comprises Sentence-Bidirectional Encoder Representations from Transformers.
Sahu teaches wherein the second language model comprises Sentence-Bidirectional Encoder Representations from Transformers (see Sahu paragraph 0076 i.e. In some implementations, the CCE 101 includes a bi-directional deep neural network. The CCE 101 includes a lexical parser to parse data from different sources. NLP techniques may also be utilized for some aspects of lexical parsing and validation, such as a Bi-directional Encoder Representations from Transformers (BERT)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Ahmed in view of Sahu to have used Sentence-Bidirectional Encoder Representations from Transformers as one of the many different AI models that could be used to identify personally identifiable information in data (see Sahu paragraph 0076). One would have been motivated to do so because Sahu teaches that such transformer-based models are among the known techniques for identifying personally identifiable information.
With respect to claim 13 Ahmed in view of Hachey teaches the non-transitory machine-readable media of claim 12, but does not disclose wherein the second language model comprises Sentence-Bidirectional Encoder Representations from Transformers.
Sahu teaches wherein the second language model comprises Sentence-Bidirectional Encoder Representations from Transformers (see Sahu paragraph 0076 i.e. In some implementations, the CCE 101 includes a bi-directional deep neural network. The CCE 101 includes a lexical parser to parse data from different sources. NLP techniques may also be utilized for some aspects of lexical parsing and validation, such as a Bi-directional Encoder Representations from Transformers (BERT)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Ahmed in view of Sahu to have used Sentence-Bidirectional Encoder Representations from Transformers as one of the many different AI models that could be used to identify personally identifiable information in data (see Sahu paragraph 0076). One would have been motivated to do so because Sahu teaches that such transformer-based models are among the known techniques for identifying personally identifiable information.
Prior Art
Kim et al (US 2025/0094600) titled “MACHINE LEARNING-BASED FILTERING OF FALSE POSITIVE PATTERN MATCHES FOR PERSONALLY IDENTIFIABLE INFORMATION”.
Gupta et al (US 2024/0289492) titled “MACHINE LEARNING MODELING TO IDENTIFY SENSITIVE DATA”.
Sirhani et al (US 2023/0385407) titled “SYSTEM AND METHOD FOR INTEGRATING MACHINE LEARNING IN DATA LEAKAGE DETECTION SOLUTION THROUGH KEYWORD POLICY PREDICTION”.
Dan (US 11,755,848) titled “Processing Structured And Unstructured Text To Identify Sensitive Information”.
Dancewicz et al (US 2023/0259709) titled “ITERATIVE TRAINING FOR TEXT-IMAGE-LAYOUT DATA IN NATURAL LANGUAGE PROCESSING”.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DEVIN E ALMEIDA whose telephone number is (571)270-1018. The examiner can normally be reached on Monday-Thursday from 7:30 A.M. to 5:00 P.M. The examiner can also be reached on alternate Fridays from 7:30 A.M. to 4:00 P.M.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Rupal Dharia, can be reached on 571-272-3880. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
/DEVIN E ALMEIDA/Examiner, Art Unit 2492