Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
This is in reply to the application filed on 11/18/2024, in which claims 1-30 are pending.
Specification
The lengthy specification has not been checked to the extent necessary to determine the presence of all possible minor errors. Applicant’s cooperation is requested in correcting any errors of which applicant may become aware in the specification.
Drawings
The drawings are objected to because, in Figure 7, the line out of box 98 is behind box 102. Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claim 5 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite. Claim 5 recites the limitation “the computer-implemented method of claim 3, wherein the step of detecting further comprises applying a second machine learning model to the textual data and the second textual data for identifying the contextually hidden confidential information therein, wherein the second machine learning model generates second model data that includes the contextually hidden confidential information.” (emphasis added). There is insufficient antecedent basis for the limitation “the contextually hidden confidential information” in the claim.
Dependent claims 6-13 are rejected at least in part for incorporating the deficiency stated above.
Claims 14-22 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim limitations “a data extraction unit for extracting…, a determination unit for determining…, a parsing unit having a parsing engine for parsing…, a text processing unit having a text processing engine for…, a confidential information detecting unit employing…, an anonymization unit for selectively…, an image processing unit for processing…” in claim 14 are limitations that invoke 35 U.S.C. 112, sixth paragraph. The written description only implicitly or inherently sets forth the corresponding structure, material, or acts that perform the claimed function, and/or the specification fails to disclose the corresponding structure, including the steps for achieving any claimed function or step.
Pursuant to 37 CFR 1.75(d) and MPEP §§ 608.01(o) and 2181, applicant should:
(a) Amend the claim so that the claim limitation will no longer be interpreted as a limitation under 35 U.S.C. 112, sixth paragraph; or
(b) Amend the written description of the specification such that it expressly recites the corresponding structure, material, or acts that perform the claimed function and clearly links or associates the structure, material, or acts to the claimed function, without introducing any new matter (35 U.S.C. 132(a)); or
(c) State on the record what corresponding structure, material, or acts, which are implicitly or inherently set forth in the written description of the specification, perform the claimed function.
Dependent claims 15-22 recite the modules of claim 14 configured to perform additional features, and are thus rejected under the same rationale.
Claims 1-30 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Independent claims 1, 14, and 23 recite “extracting one or more of textual data and document data from the input data, wherein the document data includes one or more of second textual data and image data,
processing the textual data and the second textual data to identify the confidential information therein…”. Based on figure 7 and the support in the specification, the extracting process extracts either textual data or document data from the input. Applicant’s own claim language recites this. However, the subsequent steps follow one particular branch of the processing (e.g., either for textual data or for document data, but not both), yet they are recited as if both have been extracted.
It is unclear what distinguishes the “textual data” from the “document data”. The initial input of textual data is still a document file that contains text, whereas the “document data” is a document file that contains text and an image; however, both are a “document”.
It is unclear how both the textual data and the document data are processed together in both instances. As suggested by Fig. 7, step 160 determines whether the input is either textual data or document data, but not both.
In the event that only one is extracted (the textual data), how does one process the image data, since the textual data contains only text and no image?
If only one of the textual data or the document data is extracted, as suggested by figure 7, step 160, how is it possible to process both the textual data and the second textual data together?
For the purpose of advancing prosecution, and because the specification is unclear regarding the textual data and the document data, the Examiner’s broadest reasonable interpretation (BRI) is as follows: the textual data is a file with text only, the document data is a file with text and an image, and the extraction of each is performed at a different time, not simultaneously.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1-4 and 23 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Beveridge et al. (US 20180260734 A1; hereinafter Beveridge).
Regarding claim 1, Beveridge discloses a computer-implemented method for identifying and anonymizing confidential information in input data, the method performed by at least one computer processor executing computer-readable instructions tangibly stored on at least one computer-readable medium, the method comprising the steps of:
extracting one or more of textual data and document data from the input data, wherein the document data includes one or more of second textual data and image data (inputting an unredacted document comprising a plurality of objects, with an unredacted document containing text in fig. 3A, and an unredacted document containing both text and an image in fig. 6 [Beveridge; ¶46-47; Figs. 3s, 6-8 and associated texts]),
processing the textual data and the second textual data to identify the confidential information therein (The unredacted document is parsed to identify objects either directly or relationally containing user sensitive information using a predetermined rule set, redacted document containing text in fig. 3B, and redacted documents containing both text and an image in fig. 7 [Beveridge; ¶46-47; Figs. 3s, 6-8 and associated texts]),
anonymizing the confidential information (The user sensitive information within the unredacted document is substituted with placeholder information to generate a redacted document, in this instance, redacted document containing text in fig. 3B [Beveridge; ¶46-47; Figs. 3s, 6-8 and associated texts]),
processing the image data to identify image-based confidential information therein (The unredacted document is parsed to identify objects either directly or relationally containing user sensitive information using a predetermined rule set, which includes both texts and image in fig. 6 [Beveridge; ¶46-47; Figs. 3s, 6-8 and associated texts]), and
anonymizing the image-based confidential information (The user sensitive information within the unredacted document is substituted with placeholder information to generate a redacted document, in this instance, redacted documents containing both text and an image in fig. 7 [Beveridge; ¶46-47; Figs. 3s, 6-8 and associated texts]).
Regarding claim 2, Beveridge discloses the computer-implemented method of claim 1, wherein the step of extracting comprises determining whether the textual data or the document data forms part of the input data, and parsing with a parsing engine the document data into the second textual data and the image data (parsing the documents to obtain text and image [Beveridge; ¶46-47; Figs. 3s, 6-8 and associated texts]).
Regarding claim 3, Beveridge discloses the computer-implemented method of claim 2, wherein the step of processing the textual data and the document data comprises detecting and identifying the confidential information in the textual data and the second textual data, and anonymizing the confidential information (identify objects either directly or relationally containing user sensitive information using a predetermined rule set. The user sensitive information within the unredacted document is substituted [Beveridge; ¶46-47; Figs. 3s, 6-8 and associated texts]).
Regarding claim 4, Beveridge discloses the computer-implemented method of claim 3, wherein the confidential information includes identifiable confidential information and contextually hidden confidential information, wherein the step of detecting and identifying comprises applying a first machine learning model to the confidential information to identify the identifiable confidential information (allows for an AI/machine learning model to be trained on a substantially similar redacted document absent user sensitive information [Beveridge; ¶12]).
Regarding claim 23, Beveridge discloses a non-transitory, computer readable medium comprising computer program instructions tangibly stored on the computer readable medium, wherein the computer program instructions are executable by at least one computer processor to perform a method for anonymizing confidential information in input data, the method comprising:
extracting one or more of textual data and document data from the input data, wherein the document data includes one or more of second textual data and image data, including determining whether the textual data or the document data forms part of the input data (inputting an unredacted document comprising a plurality of objects, with an unredacted document containing text in fig. 3A, and an unredacted document containing both text and an image in fig. 6 [Beveridge; ¶46-47; Figs. 3s, 6-8 and associated texts]), and
parsing with a parsing engine the document data into the second textual data and the image data (The unredacted document is parsed to identify objects either directly or relationally containing user sensitive information using a predetermined rule set, which includes both texts and image in fig. 6 [Beveridge; ¶46-47; Figs. 3s, 6-8 and associated texts]),
processing the textual data and the second textual data to identify the confidential information therein, including detecting and identifying, with a transformer-type machine learning model, the confidential information in the textual data and the second textual data (The unredacted document is parsed to identify objects either directly or relationally containing user sensitive information using a predetermined rule set, redacted document containing text in fig. 3B, and redacted documents containing both text and an image in fig. 7 [Beveridge; ¶46-47; Figs. 3s, 6-8 and associated texts]), and
anonymizing the confidential information (The user sensitive information within the unredacted document is substituted with placeholder information to generate a redacted document, in this instance, redacted document containing text in fig. 3B [Beveridge; ¶46-47; Figs. 3s, 6-8 and associated texts]),
processing the image data to identify image-based confidential information therein (The unredacted document is parsed to identify objects either directly or relationally containing user sensitive information using a predetermined rule set, which includes both texts and image in fig. 6 [Beveridge; ¶46-47; Figs. 3s, 6-8 and associated texts]), and
anonymizing the image-based confidential information (The user sensitive information within the unredacted document is substituted with placeholder information to generate a redacted document, in this instance, redacted documents containing both text and an image in fig. 7 [Beveridge; ¶46-47; Figs. 3s, 6-8 and associated texts]).
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 5-22 and 24-30 are rejected under 35 U.S.C. 103 as being unpatentable over Beveridge et al. (US 20180260734 A1; hereinafter Beveridge) in view of Zakour (US 20200110902 A1).
Regarding claim 5, Beveridge does not explicitly disclose the computer-implemented method of claim 3, wherein the step of detecting further comprises applying a second machine learning model to the textual data and the second textual data for identifying the contextually hidden confidential information therein, wherein the second machine learning model generates second model data that includes the contextually hidden confidential information; however, in a related and analogous art, Zakour teaches this feature.
In particular, Zakour teaches automated redaction of data and metadata based on heuristic rules or machine learning techniques that utilize statistical models [Zakour; ¶50-55; Fig. 2 and associated texts]. It would have been obvious before the effective filing date of the claimed invention to modify Beveridge in view of Zakour to redact files using metadata parameters and heuristic rules or machine learning techniques, with the motivation to identify hidden sensitive information.
Regarding claim 6, the Beveridge-Zakour combination discloses the computer-implemented method of claim 5, wherein the step of detecting further comprises applying a hierarchical recursive segment analysis technique to the second model data for identifying in the contextually hidden confidential information data that contributes the most to the decisions of the second machine learning model (heuristic rules or machine learning techniques that utilize statistical models [Zakour; ¶50-55; Fig. 2 and associated texts]). The motivation remains to identify hidden sensitive information.
Regarding claim 7, the Beveridge-Zakour combination discloses the computer-implemented method of claim 6, wherein the step of anonymizing the confidential information comprises applying a third machine learning model to the confidential information to anonymize the confidential information (using heuristic rules or machine learning techniques for redacting sensitive information [Zakour; ¶50-55; Fig. 2 and associated texts]). The motivation remains to identify hidden sensitive information.
Regarding claim 8, the Beveridge-Zakour combination discloses the computer-implemented method of claim 7, wherein the third machine learning model comprises a named entity recognition (NER) model, a regular expression (Regex) model, or a text classification model (a particular model, to facilitate automation of redaction flows that generate different work products with different information that is redacted, including named [Zakour; ¶41-44, 49-50]). The motivation remains to identify hidden sensitive information.
Regarding claim 9, the Beveridge-Zakour combination discloses the computer-implemented method of claim 7, wherein the step of anonymizing comprises highlighting the confidential information (Using at least one AI model trained using a plurality of redacted documents, it is determined whether each document comprises malicious information based on the score and a degree of confidence associated with the score. Data identifying those documents determined to comprise malicious information is provided [Beveridge; ¶9, 34, 50], redact electronic information associated with blacklisted terms (e.g., confidential, sensitive, secret, top secret, classified, privileged, etc.), or rules to not redact electronic information associated with whitelisted terms [Zakour; ¶44]). The motivation remains to identify hidden sensitive information.
Regarding claim 10, the Beveridge-Zakour combination discloses the computer-implemented method of claim 9, wherein the step of processing the image data comprises processing the image data with an optical character recognition engine for extracting third textual data from the image data and identifying confidential information therein (AI system that can analyze text, images, metadata, and/or graphic operations [Beveridge; ¶32; Fig. 2, 6-7 and associated texts], image exploitation tool that may analyze and process images. Image exploitation tools may perform object recognition on images, motion tracking on objects, as well as other types of image analysis [Zakour; ¶23, 55]). The motivation remains to identify hidden sensitive information.
Regarding claim 11, the Beveridge-Zakour combination discloses the computer-implemented method of claim 10, wherein the step of processing the image data further comprises detecting fingerprint data in the image data, or detecting signature data in the image data, wherein the signature data and the fingerprint data form part of the image-based confidential information (a specialized template built on a set of pre-defined rules entered by a user, may redact specific key words, phrases, objects associated with embedded images, and corresponding metadata from source artifacts stored in electronic format (e.g., images 110, images metadata 115, video 120, video metadata 125, documents 130, documents metadata 135), including but not limited to, names, locations, titles, personally identifying information, dates, phrases, objects associated with an image or with frames of a video [Zakour; ¶23, 55]). The motivation remains to identify hidden sensitive information.
Regarding claim 12, the Beveridge-Zakour combination discloses the computer-implemented method of claim 11, wherein the step of anonymizing the image-based confidential information comprises redacting the image-based confidential information (redacted documents containing both text and an image in fig. 7 [Beveridge; ¶46-47; Figs. 3s, 6-8 and associated texts], redact specific key words, phrases, objects associated with embedded images, and corresponding metadata from source artifacts stored in electronic format (e.g., images 110, images metadata 115, video 120, video metadata 125, documents 130, documents metadata 135), including but not limited to, names, locations, titles, personally identifying information, dates, phrases, objects associated with an image or with frames of a video [Zakour; ¶23, 55]). The motivation remains to identify hidden sensitive information.
Regarding claim 13, the Beveridge-Zakour combination discloses the computer-implemented method of claim 12, further comprising anonymizing the confidential information or the image-based confidential information by replacing one or more portions thereof with synthetic data (Images can be redacted through the substitution of a placeholder image. When substituting in a placeholder image, instructions and/or parameters around that image can be left intact. Streams of the image can be padded such that the size of the object containing the image remains the same. Redacted component is an example output of a graphics operator stream redaction. A mask has been relocated and resized in redacted document using the techniques previously described [Beveridge; ¶46-47; Figs. 3s, 6-8 and associated texts], redact specific key words, phrases, objects associated with embedded images, and corresponding metadata from source artifacts stored in electronic format (e.g., images 110, images metadata 115, video 120, video metadata 125, documents 130, documents metadata 135), including but not limited to, names, locations, titles, personally identifying information, dates, phrases, objects associated with an image or with frames of a video [Zakour; ¶23, 55]). The motivation remains to identify hidden sensitive information.
Regarding claim 14, Beveridge discloses a system for identifying and anonymizing confidential information in input data, comprising
a data extraction unit for extracting one or more of the textual data and the document data from the input data, wherein the document data includes one or more of second textual data and image data (machine learning system receiving an unredacted document comprising a plurality of objects, an unredacted document containing text in fig. 3A, and an unredacted document containing both text and an image in fig. 6 [Beveridge; ¶46-47; Figs. 3s, 6-8 and associated texts]),
wherein the data extraction unit includes a determination unit for determining whether the textual data or the document data forms part of the input data, and a parsing unit having a parsing engine for parsing the document data into the second textual data and the image data (unredacted document containing text in fig. 3A, and unredacted document containing both text and an image in fig. 6 [Beveridge; ¶46-47; Figs. 3s, 6-8 and associated texts]),
a text processing unit having a text processing engine for processing one or more of the textual data and the second textual data and for identifying confidential information therein and for anonymizing the confidential information (machine learning system parsing the unredacted document to identify objects either directly or relationally containing user sensitive information using a predetermined rule set; redacted document containing text in fig. 3B, and redacted documents containing both text and an image in fig. 7 [Beveridge; ¶46-47; Figs. 3s, 6-8 and associated texts]),
wherein the text processing unit includes a confidential information detection unit employing a first machine learning model configured for detecting and identifying confidential information in the textual data and the second textual data (detected and redacted document containing text in fig. 3B, and redacted documents containing both text and an image in fig. 7 [Beveridge; ¶46-47; Figs. 3s, 6-8 and associated texts]),
wherein the confidential information includes identifiable confidential information [[and contextually hidden confidential information]] (identify objects either directly or relationally containing user sensitive information using a predetermined rule [Beveridge; ¶46-47; Figs. 3s, 6-8 and associated texts]). Beveridge does not disclose contextually hidden confidential information; however, in a related and analogous art, Zakour teaches this feature.
In particular, Zakour teaches automated redaction of data and metadata based on heuristic rules or machine learning techniques that utilize statistical models [Zakour; ¶50-55; Fig. 2 and associated texts]. It would have been obvious before the effective filing date of the claimed invention to modify Beveridge in view of Zakour to redact files using metadata parameters and heuristic rules or machine learning techniques, with the motivation to identify hidden sensitive information.
an anonymization unit for selectively anonymizing the confidential information (machine learning system redacting sensitive information in text and image [Beveridge; ¶46-47; Figs. 3s, 6-8 and associated texts]), and
an image processing unit for processing the image data and for identifying image-based confidential information therein (AI system that can analyze text, images, metadata, and/or graphic operations [Beveridge; ¶32; Fig. 2, 6-7 and associated texts], image exploitation tool that may analyze and process images. Image exploitation tools may perform object recognition on images, motion tracking on objects, as well as other types of image analysis [Zakour; ¶23, 55]).
Regarding claim 15, Beveridge-Zakour combination discloses the system of claim 14, wherein the first machine learning model comprises a transformer-type model, including a natural language processing model (machine learning with a natural language processing (NLP) for detection [Zakour; ¶50]). The motivation to identify hidden sensitive information.
Regarding claim 16, Beveridge-Zakour combination discloses the system of claim 15, wherein the confidential information detection unit applies a second machine learning model to the textual data and the second textual data for identifying the contextually hidden confidential information therein, wherein the second machine learning model generates second model data that includes the contextually hidden confidential information (automate redaction of data and metadata based on heuristic rules or machine learning techniques that utilize statistical models [Zakour; ¶50-55; Fig. 2 and associated texts]). It would have been obvious before the effective filing date of the claimed invention to modify Beveridge in view of Zakour redacting files using metadata parameters using heuristic rules or machine learning techniques with the motivation to identify hidden sensitive information.
Regarding claim 17, Beveridge-Zakour combination discloses the system of claim 16, wherein the confidential information detection unit further applies a hierarchical recursive segment analysis technique to the second model data for identifying in the contextually hidden confidential information data that contributes the most to the decisions of the second machine learning model (heuristic rules or machine learning techniques that utilize statistical models [Zakour; ¶50-55; Fig. 2 and associated texts]). The motivation to identify hidden sensitive information.
Regarding claim 18, Beveridge-Zakour combination discloses the system of claim 17, wherein the image processing unit comprises a text recognition unit employing an optical character recognition engine for extracting third textual data from the image data, and then processing the third textual data with the confidential information detection unit to identify confidential information in the third textual data (AI system that can analyze text, images, metadata, and/or graphic operations [Beveridge; ¶32; Fig. 2, 6-7 and associated texts], image exploitation tool that may analyze and process images. Image exploitation tools may perform object recognition on images, motion tracking on objects, as well as other types of image analysis [Zakour; ¶23, 55]). The motivation to identify hidden sensitive information.
Regarding claim 19, Beveridge-Zakour combination discloses the system of claim 18, wherein the image processing unit further comprises a fingerprint detection unit for detecting fingerprint data in the image data, or a signature detection unit for detecting signature data in the image data, wherein the signature data and the fingerprint data form part of the image-based confidential information (a specialized template built on a set of pre-defined rules entered by a user, may redact specific key words, phrases, objects associated with embedded images, and corresponding metadata from source artifacts stored in electronic format (e.g., images 110, images metadata 115, video 120, video metadata 125, documents 130, documents, metadata 135), including but not limited to, names, locations, titles, personally identifying information, dates, phrases, objects associated with an image or with frames of a video [Zakour; ¶23, 55]). The motivation to identify hidden sensitive information.
Regarding claim 20, Beveridge-Zakour combination discloses the system of claim 19, further comprising a redaction unit for anonymizing the image-based confidential information (redacted documents containing both text and an image in fig. 7 [Beveridge; ¶46-47; Figs. 3s, 6-8 and associated texts], redact specific key words, phrases, objects associated with embedded images, and corresponding metadata from source artifacts stored in electronic format (e.g., images 110, images metadata 115, video 120, video metadata 125, documents 130, documents, metadata 135), including but not limited to, names, locations, titles, personally identifying information, dates, phrases, objects associated with an image or with frames of a video [Zakour; ¶23, 55]). The motivation to identify hidden sensitive information.
Regarding claim 21, the Beveridge-Zakour combination discloses the system of claim 20, wherein the redaction unit is configured to redact the image-based confidential information (redacted documents containing both text and an image in fig. 7 [Beveridge; ¶46-47; Figs. 3s, 6-8 and associated texts], redact specific key words, phrases, objects associated with embedded images, and corresponding metadata from source artifacts stored in electronic format (e.g., images 110, images metadata 115, video 120, video metadata 125, documents 130, documents metadata 135), including but not limited to, names, locations, titles, personally identifying information, dates, phrases, objects associated with an image or with frames of a video [Zakour; ¶23, 55]). The motivation would have been to identify hidden sensitive information.
Regarding claim 22, the Beveridge-Zakour combination discloses the system of claim 20, wherein the anonymization unit or the redaction unit can be configured to replace at least portions of the confidential information and the image-based confidential information with synthetic data (Images can be redacted through the substitution of a placeholder image. When substituting in a placeholder image, instructions and/or parameters around that image can be left intact. Streams of the image can be padded such that the size of the object containing the image remains the same. Redacted component is an example output of a graphics operator stream redaction. A mask has been relocated and resized in redacted document using the techniques previously described [Beveridge; ¶46-47; Figs. 3s, 6-8 and associated texts], redact specific key words, phrases, objects associated with embedded images, and corresponding metadata from source artifacts stored in electronic format (e.g., images 110, images metadata 115, video 120, video metadata 125, documents 130, documents metadata 135), including but not limited to, names, locations, titles, personally identifying information, dates, phrases, objects associated with an image or with frames of a video [Zakour; ¶23, 55]). The motivation would have been to identify hidden sensitive information.
Regarding claim 24, Beveridge discloses the computer readable medium of claim 23, wherein the confidential information includes identifiable confidential information and contextually hidden confidential information, wherein the step of detecting and identifying comprises applying a first machine learning model to the confidential information in the textual data and the second textual data for identifying the identifiable confidential information therein (allows for an AI/machine learning model to be trained on a substantially similar redacted document absent user sensitive information [Beveridge; ¶12]),
applying a second machine learning model to the textual data and the second textual data for identifying the contextually hidden confidential information therein, wherein the second machine learning model generates second model data that includes the contextually hidden confidential information (allows for an AI/machine learning model to be trained on a substantially similar redacted document absent user sensitive information [Beveridge; ¶12]). Beveridge does not disclose contextually hidden confidential information; however, in a related and analogous art, Zakour teaches this feature.
In particular, Zakour teaches automated redaction of data and metadata based on heuristic rules or machine learning techniques that utilize statistical models [Zakour; ¶50-55; Fig. 2 and associated texts]. It would have been obvious before the effective filing date of the claimed invention to modify Beveridge in view of Zakour to redact files using metadata parameters with heuristic rules or machine learning techniques, with the motivation to identify hidden sensitive information.
applying a hierarchical recursive segment analysis technique to the second model data for identifying in the contextually hidden confidential information data that contributes the most to the decisions of the second machine learning model (heuristic rules or machine learning techniques that utilize statistical models [Zakour; ¶50-55; Fig. 2 and associated texts]).
Regarding claim 25, the Beveridge-Zakour combination discloses the computer readable medium of claim 24, wherein anonymizing the confidential information comprises applying a third machine learning model to the confidential information to anonymize the confidential information (using heuristic rules or machine learning techniques for redacting sensitive information [Zakour; ¶50-55; Fig. 2 and associated texts]). The motivation would have been to identify hidden sensitive information.
Regarding claim 26, the Beveridge-Zakour combination discloses the computer readable medium of claim 25, wherein the third machine learning model includes a named entity recognition (NER) model, a regular expression (Regex) model, or a text classification model (a particular model, to facilitate automation of redaction flows that generate different work products with different information that is redacted, including named [Zakour; ¶41-44, 49-50]). The motivation would have been to identify hidden sensitive information.
Regarding claim 27, the Beveridge-Zakour combination discloses the computer readable medium of claim 26, wherein anonymizing the confidential information comprises highlighting the confidential information (Using at least one AI model trained using a plurality of redacted documents, it is determined whether each document comprises malicious information based on the score and a degree of confidence associated with the score. Data identifying those documents determined to comprise malicious information is provided [Beveridge; ¶9, 34, 50], redact electronic information associated with blacklisted terms (e.g., confidential, sensitive, secret, top secret, classified, privileged, etc.), or rules to not redact electronic information associated with whitelisted terms [Zakour; ¶44]). The motivation would have been to identify hidden sensitive information.
Regarding claim 28, the Beveridge-Zakour combination discloses the computer readable medium of claim 27, wherein processing the image data comprises processing the image data with an optical character recognition engine for extracting third textual data from the image data and identifying confidential information therein (AI system that can analyze text, images, metadata, and/or graphic operations [Beveridge; ¶32; Figs. 2, 6-7 and associated texts], image exploitation tool that may analyze and process images. Image exploitation tools may perform object recognition on images, motion tracking on objects, as well as other types of image analysis [Zakour; ¶23, 55]). The motivation would have been to identify hidden sensitive information.
Regarding claim 29, the Beveridge-Zakour combination discloses the computer readable medium of claim 28, wherein processing the image data further comprises detecting fingerprint data in the image data, or detecting signature data in the image data, wherein the signature data and the fingerprint data form part of the image-based confidential information (a specialized template built on a set of pre-defined rules entered by a user, may redact specific key words, phrases, objects associated with embedded images, and corresponding metadata from source artifacts stored in electronic format (e.g., images 110, images metadata 115, video 120, video metadata 125, documents 130, documents metadata 135), including but not limited to, names, locations, titles, personally identifying information, dates, phrases, objects associated with an image or with frames of a video [Zakour; ¶23, 55]). The motivation would have been to identify hidden sensitive information.
Regarding claim 30, the Beveridge-Zakour combination discloses the computer readable medium of claim 29, further comprising anonymizing the confidential information or the image-based confidential information by replacing one or more portions thereof with synthetic data (Images can be redacted through the substitution of a placeholder image. When substituting in a placeholder image, instructions and/or parameters around that image can be left intact. Streams of the image can be padded such that the size of the object containing the image remains the same. Redacted component is an example output of a graphics operator stream redaction. A mask has been relocated and resized in redacted document using the techniques previously described [Beveridge; ¶46-47; Figs. 3s, 6-8 and associated texts], redact specific key words, phrases, objects associated with embedded images, and corresponding metadata from source artifacts stored in electronic format (e.g., images 110, images metadata 115, video 120, video metadata 125, documents 130, documents metadata 135), including but not limited to, names, locations, titles, personally identifying information, dates, phrases, objects associated with an image or with frames of a video [Zakour; ¶23, 55]). The motivation would have been to identify hidden sensitive information.
Internet Communications
Applicant is encouraged to submit a written authorization for Internet communications (PTO/SB/439, http://www.uspto.gov/sites/default/files/documents/sb0439.pdf) in the instant patent application to authorize the examiner to communicate with the applicant via email. The authorization will allow the examiner to better practice compact prosecution. The written authorization can be submitted via one of the following methods only: (1) Central Fax which can be found in the Conclusion section of this Office action; (2) regular postal mail; (3) EFS-Web; or (4) the service window on the Alexandria campus. EFS-Web is the recommended way to submit the form since this allows the form to be entered into the file wrapper within the same day (system dependent). Written authorization submitted via other methods, such as direct fax to the examiner or email, will not be accepted. See MPEP § 502.03.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DAO Q HO whose telephone number is (571)270-5998. The examiner can normally be reached from 7:00am to 5:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jeffrey Nickerson, can be reached at (469) 295-9235. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/DAO Q HO/Primary Examiner, Art Unit 2432