Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
DETAILED ACTION
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 10/10/2024 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.
Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP §§ 706.02(l)(1) - 706.02(l)(3) for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1-20 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-19 of U.S. Patent No. 12118087B2. Although the claims at issue are not identical, they are not patentably distinct from each other because the claims of the instant application, while broader, encompass the subject matter claimed in the U.S. patent, as indicated in the claim comparison below.
Application 18/912413, claim 1:
1. A method of training a machine learning model to identify electronic documents with potential malware, the method comprising:
providing Portable Document Format (PDF) files known to contain malware in a malicious group;
providing PDF files known not to contain malware in a non-malicious group;
creating training data from PDF files in both groups, wherein creating the training data comprises:
determining a first feature associated with a file structure of a PDF file;
determining a second feature associated with a potential embedded image in the PDF file;
determining a third feature associated with a potential embedded resource locator in the PDF file; and
determining a fourth feature associated with text content in the PDF file; and
training a machine learning model with the training data, wherein the machine learning model is configured to receive input features associated with a new PDF file and output a probability that the new PDF file is associated with malware, wherein the machine learning model is configured to classify unknown malware at least because the first feature, the second feature, the third feature, and the fourth feature are based on file content or file structure.
U.S. Patent No. 12118087B2, claim 1:
1. A method of training a machine learning algorithm for addressing malware, the method comprising:
providing files known to contain malware in a malicious group;
providing files known not to contain malware in a non-malicious group;
performing content disarm and reconstruction (CDR) analysis on files in both groups by:
parsing the analyzed files into standard and non-standard components,
re-creating standard components from known-good data according to standardized specifications,
combining the re-created standard components with information from the non-standard components to create a substitute electronic file visually identical to the analyzed file, and
creating a machine readable summary report for each analyzed file, each machine readable summary report including multiple report items based on file content and structure;
training the machine learning algorithm with the machine readable summary reports and computing a probability function with the machine learning algorithm to estimate risk of the files that will generalize to new data and be useful for classifying unknown malware at least because the machine readable summary reports are based on file content and structure.
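For orientation only, the training steps recited in claim 1 of the instant application can be sketched in Python as shown here. This is a minimal illustration under stated assumptions, not the applicant's implementation: the byte patterns used as proxies for the four features, the function names, and the use of a plain logistic regression (which is not one of the model types listed in claims 2, 8 and 15) are all hypothetical choices made for brevity.

```python
import math
import re


def extract_features(pdf_bytes: bytes) -> list[float]:
    """Return four hypothetical features as crude proxies for the claimed ones."""
    # First feature (file structure): number of indirect objects.
    n_objects = len(re.findall(rb"\bobj\b", pdf_bytes))
    # Second feature (potential embedded image): occurrences of the /Image name.
    n_images = pdf_bytes.count(b"/Image")
    # Third feature (potential embedded resource locator): occurrences of /URI.
    n_uris = pdf_bytes.count(b"/URI")
    # Fourth feature (text content): bytes of literal strings shown with Tj.
    text_len = sum(len(m) for m in re.findall(rb"\(([^()]*)\)\s*Tj", pdf_bytes))
    # log1p keeps the raw counts on a comparable scale.
    return [math.log1p(v) for v in (n_objects, n_images, n_uris, text_len)]


def predict_proba(weights: list[float], bias: float, features: list[float]) -> float:
    """Output a probability that the file is associated with malware."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train(samples: list[list[float]], labels: list[int],
          epochs: int = 300, lr: float = 0.5) -> tuple[list[float], float]:
    """Fit a logistic regression by stochastic gradient descent."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            err = predict_proba(weights, bias, x) - y  # gradient of the log loss
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias
```

The malicious and non-malicious groups would be passed to `train` as feature vectors labeled 1 and 0 respectively, and `predict_proba` would then be applied to the features of a new file.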
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e. an abstract idea) without significantly more.
Step 1: This part of the eligibility analysis evaluates whether the claim falls within any statutory category. See MPEP 2106.03. The claims recite a method and a system. These are directed to a process (a series of steps or acts) and a machine, respectively, and thus fall within the statutory categories of invention. (Step 1: YES).
Step 2A, Prong One: This part of the eligibility analysis evaluates whether the claim recites a judicial exception, i.e., an abstract idea, a law of nature, or a natural phenomenon. See MPEP 2106.04(a).
Claims 1, 7 and 14 are directed to an abstract idea because the following claim limitations recite an abstract idea:
A method and system for training a learning model to identify potentially malicious files comprising:
Creating training data from files comprising: determining a first feature associated with a file structure, determining a second feature associated with a potential embedded image, determining a third feature associated with a potential embedded resource locator, and determining a fourth feature associated with text content (mental process: a human being reading documents and extracting or taking notes on specific structural and content-based features);
Training a learning model with the training data (mental process: a human being trained with training material that helps the human being make decisions using a decision model);
Receiving input features associated with a new file and outputting a probability that the new file is associated with malware, wherein the learning model is configured to classify unknown malware (mental process: a human being mentally considering the read data of a new document in light of the remembered decision model to predict the likelihood of the new document being malicious).
Claims 1, 7 and 14 recite the following additional elements:
Wherein the files are “Portable Document Format (PDF) files”;
Providing the files known to contain malware in a malicious group and providing files known not to contain malware in a non-malicious group;
Wherein the learning model is a “machine learning model”;
Wherein the system (Claims 7 and 14) comprises “a non-transitory data storage medium” and “one or more first computer hardware processors”.
Step 2A, Prong Two: This part of the eligibility analysis evaluates whether the claim as a whole integrates the recited judicial exception into a practical application of the exception or whether the claim is “directed to” the judicial exception. This evaluation is performed by (1) identifying whether there are any additional elements recited in the claim beyond the judicial exception, and (2) evaluating those additional elements individually and in combination to determine whether the claim as a whole integrates the exception into a practical application.
The claims fail to achieve a technical solution to a technical problem. Thus, the claims fail to provide an improvement to the functioning of a computer or to a technology itself. The claims culminate in outputting a probability that a new file is associated with malware and classifying unknown malware. See MPEP 2106.04(d)(1) and 2106.05(a). The additional elements are recited at a high level of generality and amount to merely using computers as a tool to implement the abstract idea. Thus, the additional elements are considered mere instructions to apply the abstract idea. See MPEP 2106.05(f). Even when viewed in combination, these additional elements do not integrate the recited judicial exception into a practical application (Step 2A, Prong Two: NO), and the claims are directed to the judicial exception (Step 2A: YES). Therefore, the examiner must find that the claims fail to integrate the abstract idea into a practical application.
Step 2B:
This part of the eligibility analysis evaluates whether the claim as a whole amounts to significantly more than the recited exception i.e., whether any additional element, or combination of additional elements, adds an inventive concept to the claim. See MPEP 2106.05.
One way to determine integration into a practical application is when the claimed invention improves the functioning of a computer or improves another technology or technical field. To evaluate an improvement to a computer or technical field, the specification must set forth an improvement in technology and the claim itself must reflect the disclosed improvement. See MPEP 2106.04(d)(1) and 2106.05(a).
As in Step 2A, Prong Two, the claims fail to achieve a technical solution to a technical problem. Thus, the claims fail to provide an improvement to the functioning of a computer or to a technology itself. The claims culminate in outputting a probability that a new file is associated with malware and classifying unknown malware. See MPEP 2106.04(d)(1) and 2106.05(a). The additional elements are recited at a high level of generality and amount to merely using computers as a tool to implement the abstract idea. Thus, the additional elements are considered mere instructions to apply the abstract idea. See MPEP 2106.05(f). Even when viewed in combination, these additional elements do not integrate the recited judicial exception into a practical application (Step 2A, Prong Two: NO), and the claims are directed to the judicial exception (Step 2A: YES). Therefore, the examiner must find that the claims fail to amount to significantly more than the abstract idea itself, even when the additional elements are considered alone and in combination with the abstract idea. (Step 2B: NO).
Therefore, the claims are directed to an abstract idea without significantly more and are unpatentable.
Claims 2, 8 and 15
Regarding claims 2, 8 and 15, the following claim limitation recites an abstract idea:
wherein the learning model comprises at least one of a neural network, a random forest model, or a boosted gradient model (mental process: the human being’s learned decision model includes one of these algorithms/models).
Claims 2, 8 and 15 recite the following additional elements:
wherein the learning models are “machine learning” models.
Step 2A, Prong Two and Step 2B
Claims 2, 8 and 15 fail to recite any new additional elements relative to base claims 1, 7 and 14. Thus, the analysis and findings for Step 2A, Prong Two and Step 2B incorporate the analysis and findings for claims 1, 7 and 14; however, the analysis and findings include consideration of claims 2, 8 and 15 as a whole. Therefore, claims 2, 8 and 15 are directed to an abstract idea without significantly more and are unpatentable.
Claims 3, 9 and 16
Regarding claims 3, 9 and 16, the following claim limitations recite an abstract idea:
determining first input features from an incoming file (mental process: a human being reading a document and extracting features);
generating a first probability via a learning model that the incoming file is associated with malware based on the first input features (mental process: a human being determining a probability that the document is malicious based on the previously read features).
Claims 3, 9 and 16 recite the following additional elements:
wherein the file is a “PDF” file;
receiving the PDF file;
wherein the learning models are “machine learning” models.
Step 2A, Prong Two: This part of the eligibility analysis evaluates whether the claim as a whole integrates the recited judicial exception into a practical application of the exception or whether the claim is “directed to” the judicial exception. This evaluation is performed by (1) identifying whether there are any additional elements recited in the claim beyond the judicial exception, and (2) evaluating those additional elements individually and in combination to determine whether the claim as a whole integrates the exception into a practical application.
The claims fail to achieve a technical solution to a technical problem. Thus, the claims fail to provide an improvement to the functioning of a computer or to a technology itself. The claims culminate in outputting a probability that a new file is associated with malware. See MPEP 2106.04(d)(1) and 2106.05(a). The new additional element of receiving an incoming PDF file is recited at a high level of generality and amounts to mere data gathering, which is insignificant extra-solution activity necessary to perform the abstract idea. See MPEP 2106.05(g). Thus, the additional elements are considered mere instructions to apply the abstract idea. See MPEP 2106.05(f). Even when viewed in combination, these additional elements do not integrate the recited judicial exception into a practical application (Step 2A, Prong Two: NO), and the claims are directed to the judicial exception (Step 2A: YES). Therefore, the examiner must find that the claims fail to integrate the abstract idea into a practical application.
Step 2B:
This part of the eligibility analysis evaluates whether the claim as a whole amounts to significantly more than the recited exception i.e., whether any additional element, or combination of additional elements, adds an inventive concept to the claim. See MPEP 2106.05.
One way to determine integration into a practical application is when the claimed invention improves the functioning of a computer or improves another technology or technical field. To evaluate an improvement to a computer or technical field, the specification must set forth an improvement in technology and the claim itself must reflect the disclosed improvement. See MPEP 2106.04(d)(1) and 2106.05(a).
As in Step 2A, Prong Two, the claims fail to achieve a technical solution to a technical problem. Thus, the claims fail to provide an improvement to the functioning of a computer or to a technology itself. The claims culminate in outputting a probability that a new file is associated with malware. See MPEP 2106.04(d)(1) and 2106.05(a). The new additional element of receiving an incoming PDF file is recited at a high level of generality and amounts to mere data gathering, which is insignificant extra-solution activity necessary to perform the abstract idea. See MPEP 2106.05(g). Thus, the additional elements are considered mere instructions to apply the abstract idea. See MPEP 2106.05(f). Furthermore, receiving an electronic file is a well-understood, routine and conventional form of data gathering within the art. The courts have recognized the following computer functions as well-understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity: receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362. See MPEP 2106.05(d)(II). Even when viewed in combination, these additional elements do not integrate the recited judicial exception into a practical application (Step 2A, Prong Two: NO), and the claims are directed to the judicial exception (Step 2A: YES). Therefore, the examiner must find that the claims fail to amount to significantly more than the abstract idea itself, even when the additional elements are considered alone and in combination with the abstract idea. (Step 2B: NO).
Therefore, the claims are directed to an abstract idea without significantly more and are unpatentable.
Claims 4, 10 and 17
Regarding claims 4, 10 and 17, the following claim limitation recites an abstract idea:
extracting a file from an incoming communication (mental process: a human being retrieving a document from a larger stack of documents).
Claims 4, 10 and 17 recite the following additional elements:
wherein the incoming communication is an “email”;
wherein the “email” has an “attachment” containing a file;
wherein the file is a “PDF” file;
intercepting the “email” via an “email system”.
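To make concrete what the additional elements of claims 4, 10 and 17 describe, the following minimal Python sketch extracts PDF attachments from a raw email message using only the standard-library `email` package. The function names and the sample message are hypothetical; how an email system would intercept the message in the first place is not shown and is not specified here.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def extract_pdf_attachments(raw_email: bytes) -> list[tuple[str, bytes]]:
    """Return (filename, content) pairs for every PDF attachment in the email."""
    msg = message_from_bytes(raw_email, policy=policy.default)
    found = []
    for part in msg.iter_attachments():
        if part.get_content_type() == "application/pdf":
            name = part.get_filename() or "unnamed.pdf"
            # decode=True undoes the transfer encoding (e.g. base64).
            found.append((name, part.get_payload(decode=True)))
    return found


def build_sample_email() -> bytes:
    """Build a hypothetical email with one PDF attachment, for demonstration."""
    msg = EmailMessage()
    msg["Subject"] = "Invoice"
    msg.set_content("See attached.")
    msg.add_attachment(b"%PDF-1.7 fake", maintype="application",
                       subtype="pdf", filename="invoice.pdf")
    return msg.as_bytes()
```

Calling `extract_pdf_attachments(build_sample_email())` yields the attached file's name and bytes, which could then be passed on for feature extraction.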
Step 2A, Prong Two: This part of the eligibility analysis evaluates whether the claim as a whole integrates the recited judicial exception into a practical application of the exception or whether the claim is “directed to” the judicial exception. This evaluation is performed by (1) identifying whether there are any additional elements recited in the claim beyond the judicial exception, and (2) evaluating those additional elements individually and in combination to determine whether the claim as a whole integrates the exception into a practical application.
The claims fail to achieve a technical solution to a technical problem. Thus, the claims fail to provide an improvement to the functioning of a computer or to a technology itself. The claims culminate in outputting a probability that a new file is associated with malware. See MPEP 2106.04(d)(1) and 2106.05(a). The new additional elements of an email system, an email, an attachment, and intercepting the email are recited at a high level of generality. They merely establish a generic technological environment or field of use for mere data gathering, which is insignificant extra-solution activity necessary to perform the abstract idea. See MPEP 2106.05(g) and 2106.05(h). The additional elements amount to merely using computers as a tool to implement the abstract idea; thus, the additional elements are considered mere instructions to apply the abstract idea. See MPEP 2106.05(f). Even when viewed in combination, these additional elements do not integrate the recited judicial exception into a practical application (Step 2A, Prong Two: NO), and the claims are directed to the judicial exception (Step 2A: YES). Therefore, the examiner must find that the claims fail to integrate the abstract idea into a practical application.
Step 2B:
This part of the eligibility analysis evaluates whether the claim as a whole amounts to significantly more than the recited exception i.e., whether any additional element, or combination of additional elements, adds an inventive concept to the claim. See MPEP 2106.05.
One way to determine integration into a practical application is when the claimed invention improves the functioning of a computer or improves another technology or technical field. To evaluate an improvement to a computer or technical field, the specification must set forth an improvement in technology and the claim itself must reflect the disclosed improvement. See MPEP 2106.04(d)(1) and 2106.05(a).
As in Step 2A, Prong Two, the claims fail to achieve a technical solution to a technical problem. Thus, the claims fail to provide an improvement to the functioning of a computer or to a technology itself. The claims culminate in outputting a probability that a new file is associated with malware. See MPEP 2106.04(d)(1) and 2106.05(a). The new additional elements of an email system, an email, an attachment, and intercepting the email are recited at a high level of generality. They merely establish a generic technological environment or field of use for mere data gathering, which is insignificant extra-solution activity necessary to perform the abstract idea. See MPEP 2106.05(g) and 2106.05(h). The additional elements amount to merely using computers as a tool to implement the abstract idea; thus, the additional elements are considered mere instructions to apply the abstract idea. See MPEP 2106.05(f). Furthermore, intercepting emails with attachments is a well-understood, routine and conventional method of data gathering within the art. The courts have recognized the following computer functions as well-understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity: receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362. See MPEP 2106.05(d)(II). Even when viewed in combination, these additional elements do not integrate the recited judicial exception into a practical application (Step 2A, Prong Two: NO), and the claims are directed to the judicial exception (Step 2A: YES). Therefore, the examiner must find that the claims fail to amount to significantly more than the abstract idea itself, even when the additional elements are considered alone and in combination with the abstract idea. (Step 2B: NO).
Therefore, the claims are directed to an abstract idea without significantly more and are unpatentable.
Claims 5, 11 and 18
Regarding claims 5, 11 and 18, the following claim limitations recite an abstract idea:
identifying specific content in an incoming file (mental process: a human being locating unwanted items in a document);
removing the specific content (mental process: a human being removing the unwanted items from the document).
Claims 5, 11 and 18 recite the following additional elements:
wherein the specific content is “active content”;
wherein the file is a “PDF” file;
wherein the removal of active content results in “a sanitized file”.
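As a rough illustration of what identifying and removing active content from a PDF file could look like, the Python sketch below neutralizes a few PDF name tokens commonly associated with active content by lowercasing them, so that a reader no longer recognizes them (PDF names are case-sensitive). The list of names and the byte-level approach are assumptions chosen for brevity; this is a crude sketch, does not parse the PDF, and is not presented as the method disclosed in the application.

```python
import re

# Hypothetical list of PDF names treated as indicators of active content.
ACTIVE_NAMES = re.compile(rb"/(JavaScript|JS|OpenAction|Launch|AA)(?![A-Za-z0-9])")


def neutralize_active_content(pdf_bytes: bytes) -> tuple[bytes, int]:
    """Return a same-length copy with active-content names disabled, plus a count.

    Lowercasing keeps every byte offset unchanged, so cross-reference
    tables that point at byte positions are not invalidated.
    """
    return ACTIVE_NAMES.subn(lambda m: m.group(0).lower(), pdf_bytes)
```

For example, applying the function to `b"<< /S /JavaScript /JS (app.alert(1)) >>"` rewrites the two matching names and reports a count of 2, while leaving the `/S` key untouched.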
Step 2A, Prong Two: This part of the eligibility analysis evaluates whether the claim as a whole integrates the recited judicial exception into a practical application of the exception or whether the claim is “directed to” the judicial exception. This evaluation is performed by (1) identifying whether there are any additional elements recited in the claim beyond the judicial exception, and (2) evaluating those additional elements individually and in combination to determine whether the claim as a whole integrates the exception into a practical application.
The claims fail to achieve a technical solution to a technical problem. Thus, the claims fail to provide an improvement to the functioning of a computer or to a technology itself. The claims culminate in outputting a probability that a new file is associated with malware. See MPEP 2106.04(d)(1) and 2106.05(a). The new additional elements of active content and a sanitized PDF file are recited at a high level of generality and amount to generic data objects being manipulated by the abstract idea. Thus, the additional elements are considered mere instructions to apply the abstract idea. See MPEP 2106.05(f). Even when viewed in combination, these additional elements do not integrate the recited judicial exception into a practical application (Step 2A, Prong Two: NO), and the claims are directed to the judicial exception (Step 2A: YES). Therefore, the examiner must find that the claims fail to integrate the abstract idea into a practical application.
Step 2B:
This part of the eligibility analysis evaluates whether the claim as a whole amounts to significantly more than the recited exception i.e., whether any additional element, or combination of additional elements, adds an inventive concept to the claim. See MPEP 2106.05.
One way to determine integration into a practical application is when the claimed invention improves the functioning of a computer or improves another technology or technical field. To evaluate an improvement to a computer or technical field, the specification must set forth an improvement in technology and the claim itself must reflect the disclosed improvement. See MPEP 2106.04(d)(1) and 2106.05(a).
As in Step 2A, Prong Two, the claims fail to achieve a technical solution to a technical problem. Thus, the claims fail to provide an improvement to the functioning of a computer or to a technology itself. The claims culminate in outputting a probability that a new file is associated with malware. See MPEP 2106.04(d)(1) and 2106.05(a). The new additional elements of active content and a sanitized PDF file are recited at a high level of generality and amount to generic data objects being manipulated by the abstract idea. Thus, the additional elements are considered mere instructions to apply the abstract idea. See MPEP 2106.05(f). Even when viewed in combination, these additional elements do not integrate the recited judicial exception into a practical application (Step 2A, Prong Two: NO), and the claims are directed to the judicial exception (Step 2A: YES). Therefore, the examiner must find that the claims fail to amount to significantly more than the abstract idea itself, even when the additional elements are considered alone and in combination with the abstract idea. (Step 2B: NO).
Therefore, the claims are directed to an abstract idea without significantly more and are unpatentable.
Claims 12 and 19
Regarding claims 12 and 19, the following claim limitation recites an abstract idea:
wherein the content comprises instructions (mental process: a human being categorizing/classifying a specific type of information or rule).
Claims 12 and 19 recite the following additional elements:
wherein the instructions are “JavaScript instructions”.
Step 2A, Prong Two: This part of the eligibility analysis evaluates whether the claim as a whole integrates the recited judicial exception into a practical application of the exception or whether the claim is “directed to” the judicial exception. This evaluation is performed by (1) identifying whether there are any additional elements recited in the claim beyond the judicial exception, and (2) evaluating those additional elements individually and in combination to determine whether the claim as a whole integrates the exception into a practical application.
The claims fail to achieve a technical solution to a technical problem. Thus, the claims fail to provide an improvement to the functioning of a computer or to a technology itself. The claims culminate in outputting a probability that a new file is associated with malware. See MPEP 2106.04(d)(1) and 2106.05(a). The new additional element of JavaScript instructions is recited at a high level of generality and merely identifies the generic data format or programming language of the data being manipulated by the abstract idea. Thus, the additional elements are considered mere instructions to apply the abstract idea to a specific data format. See MPEP 2106.05(f). Even when viewed in combination, these additional elements do not integrate the recited judicial exception into a practical application (Step 2A, Prong Two: NO), and the claims are directed to the judicial exception (Step 2A: YES). Therefore, the examiner must find that the claims fail to integrate the abstract idea into a practical application.
Step 2B:
This part of the eligibility analysis evaluates whether the claim as a whole amounts to significantly more than the recited exception i.e., whether any additional element, or combination of additional elements, adds an inventive concept to the claim. See MPEP 2106.05.
One way to determine integration into a practical application is when the claimed invention improves the functioning of a computer or improves another technology or technical field. To evaluate an improvement to a computer or technical field, the specification must set forth an improvement in technology and the claim itself must reflect the disclosed improvement. See MPEP 2106.04(d)(1) and 2106.05(a).
As in Step 2A, Prong Two, the claims fail to achieve a technical solution to a technical problem. Thus, the claims fail to provide an improvement to the functioning of a computer or to a technology itself. The claims culminate in outputting a probability that a new file is associated with malware. See MPEP 2106.04(d)(1) and 2106.05(a). The new additional element of JavaScript instructions is recited at a high level of generality and merely identifies the generic data format or programming language of the data being manipulated by the abstract idea. Thus, the additional elements are considered mere instructions to apply the abstract idea to a specific data format. See MPEP 2106.05(f). Even when viewed in combination, these additional elements do not integrate the recited judicial exception into a practical application (Step 2A, Prong Two: NO), and the claims are directed to the judicial exception (Step 2A: YES). Therefore, the examiner must find that the claims fail to amount to significantly more than the abstract idea itself, even when the additional elements are considered alone and in combination with the abstract idea. (Step 2B: NO).
Therefore, the claims are directed to an abstract idea without significantly more and are unpatentable.
Claims 6, 13 and 20
Regarding claims 6, 13 and 20, the following claim limitation recites an abstract idea:
determining performance of the learning model from a data set by applying a metric selected from the following group: recall, accuracy, precision, Area Under a Curve for a Receiver Operator Characteristic graph, and a harmonic mean of precision and recall (F1 score) (mathematical concept: a human being performing mathematical calculations and computing performance statistics using mathematical formulas).
Claims 6, 13 and 20 recite the following additional elements:
wherein the data set is a “test set”.
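The metrics named in claims 6, 13 and 20 are standard formulas over a confusion matrix and a ranking of scores. The Python sketch below computes each of them from scratch for a small test set; the function names are hypothetical, and the AUC is computed with the pairwise-comparison definition (the fraction of positive/negative pairs ranked correctly, with ties counted as one half).

```python
def confusion_counts(y_true: list[int], y_pred: list[int]) -> tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels where 1 means malicious."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def _ratio(num: float, den: float) -> float:
    # Report 0.0 when a metric is undefined (empty denominator).
    return num / den if den else 0.0


def evaluate(y_true: list[int], scores: list[float], threshold: float = 0.5) -> dict[str, float]:
    """Compute accuracy, precision, recall, F1 and ROC AUC on a test set."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    # F1 is the harmonic mean of precision and recall.
    f1 = _ratio(2 * precision * recall, precision + recall)
    # AUC: probability that a random positive outscores a random negative.
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return {
        "accuracy": _ratio(tp + tn, len(y_true)),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "auc": _ratio(wins, len(pos) * len(neg)),
    }
```

With labels `[1, 1, 0, 0, 1, 0]` and scores `[0.9, 0.6, 0.4, 0.2, 0.3, 0.7]`, every one of the five metrics works out to 2/3 at the default threshold.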
Step 2A, Prong Two: This part of the eligibility analysis evaluates whether the claim as a whole integrates the recited judicial exception into a practical application of the exception or whether the claim is “directed to” the judicial exception. This evaluation is performed by (1) identifying whether there are any additional elements recited in the claim beyond the judicial exception, and (2) evaluating those additional elements individually and in combination to determine whether the claim as a whole integrates the exception into a practical application.
The claims fail to achieve a technical solution to a technical problem. Thus the claims fail to provide an improvement to the function of a computer or to a technology itself. The claims culminate with outputting a probability that a new file is associated with malware. See MPEP 2106.04(d)(1) and 2106.05(a). The new additional elements of a test set are recited at a high level of generality and amount to merely using computers as a tool to implement the abstract idea. Thus the additional elements are considered mere instructions to apply the abstract idea. See MPEP 2106.05(f). Even when viewed in combination, these additional elements do not integrate the recited judicial exception into a practical application (Step 2A, Prong Two: NO), and the claim is directed to the judicial exception (Step 2A: YES). Therefore, the examiner must find that the claims fail to integrate the abstract idea into a practical application.
Step 2B:
This part of the eligibility analysis evaluates whether the claim as a whole amounts to significantly more than the recited exception, i.e., whether any additional element, or combination of additional elements, adds an inventive concept to the claim. See MPEP 2106.05.
One way to determine integration into a practical application is when the claimed invention improves the functioning of a computer or improves another technology or technical field. To evaluate an improvement to a computer or technical field, the specification must set forth an improvement in technology and the claim itself must reflect the disclosed improvement. See MPEP 2106.04(d)(1) and 2106.05(a).
As with Step 2A, Prong Two, the claims fail to achieve a technical solution to a technical problem. Thus the claims fail to provide an improvement to the function of a computer or to a technology itself. The claims culminate with outputting a probability that a new file is associated with malware. See MPEP 2106.04(d)(1) and 2106.05(a). The new additional elements of a test set are recited at a high level of generality and amount to merely using computers as a tool to implement the abstract idea. Thus the additional elements are considered mere instructions to apply the abstract idea. See MPEP 2106.05(f). Even when viewed in combination, these additional elements do not integrate the recited judicial exception into a practical application (Step 2A, Prong Two: NO), and the claim is directed to the judicial exception (Step 2A: YES). Therefore, the examiner must find that the claims fail to amount to significantly more than the abstract idea itself, even when the additional elements are considered alone and in combination with the abstract idea (Step 2B: NO).
Therefore, the claims are directed to an abstract idea without significantly more and are unpatentable.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA the applicant regards as the invention.
Regarding claims 1, 7 and 14, the limitation “classify unknown malware at least because the first feature, the second feature, the third feature, and the fourth feature are based on file content or file structure” renders the claim indefinite because the phrase “at least because” leaves the scope of the claim unclear: it is unclear which specific acts, algorithms or structures used to classify malware fall within the scope of “at least because.”
Dependent claims are rejected for inheriting the deficiencies from the base claims from which they depend.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-4, 6-10, 12-17, 19 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over US 20130036231 to Smutz et al. (hereinafter “Smutz”) in view of US 20160373917 to Zeppenfeld et al. (hereinafter “Zeppenfeld”).
Claim 1
Smutz teaches a method of training a machine learning model to identify electronic documents with potential malware, [e.g. Smutz; Page 239, Abstract – Smutz discloses a framework for detection of malicious documents through machine learning] the method comprising:
providing Portable Document Format (PDF) files known to contain malware in a malicious group; [e.g. Smutz; Page 239 Abstract, Page 241 3.2 Data Sources, Table 1 – Smutz discloses a training set of equal parts benign and malicious Portable Document Format (PDF) files]
providing PDF files known not to contain malware in a non-malicious group; [e.g. Smutz; Page 239 Abstract, Page 241 3.2 Data Sources, Table 1 – Smutz discloses a training set of equal parts benign and malicious Portable Document Format (PDF) files]
creating training data from PDF files in both groups, [e.g. Smutz; Page 239 Abstract, Page 241 3.2 Data Sources, Table 1 – Smutz discloses a training set of equal parts benign and malicious Portable Document Format (PDF) files] creating the training data comprises:
determining a first feature associated with a file structure of a PDF file; [e.g. Smutz; Page 239 Abstract, Page 240 Introduction, Page 242 4.2 Feature Selection, Table 1 – Smutz discloses extracting features based on the metadata and structure of the PDF]
determining a second feature associated with a potential embedded image in the PDF file; [e.g. Smutz; Page 239 Abstract, Page 240 Introduction, Page 242 4.2 Feature Selection, Table 1 – Smutz discloses extracting features based on images in the document of the PDF]
determining a fourth feature associated with text content in the PDF file; [e.g. Smutz; Page 239 Abstract, Page 240 Introduction, Page 242 4.2 Feature Selection, Table 1 – Smutz discloses extracting features based on text content] and
training a machine learning model with the training data, wherein the machine learning model is configured to receive input features associated with a new PDF file and output a probability that the new PDF file is associated with malware, wherein the machine learning model is configured to classify unknown malware at least because the first feature, the second feature, the third feature, and the fourth feature are based on file content or file structure. [e.g. Smutz; Page 239 Abstract, Page 240 Introduction, Page 242 4.2 Feature Selection, Page 243 4.4 Classification technique, Table 1 – Smutz discloses producing a classifier using the training data based on the features (e.g. training a machine learning model with the training data) and predicting an output that is malicious or benign (e.g. a probability that a new PDF file is associated with malware) using the classifier.]
While Smutz teaches the method of claim 1, Smutz fails to explicitly teach extracting features related to embedded URLs for training data. More specifically, Smutz fails to teach the claimed limitation of determining a third feature associated with a potential embedded resource locator in the PDF file; however, Zeppenfeld, directed to machine learning-based classification of documents that are attached to electronic communications on a computer network, teaches a machine learning-based model that is trained to recognize, and learns over time, combinations and patterns of features of electronic documents that are strong malware signals. In an embodiment, a pre-processor performs a static analysis of a document that is a message attachment. The machine learning-based model analyzes features (e.g. URLs) that are output by the pre-processor. The models (e.g. a tree-based algorithm, a random forest algorithm, a deep learning algorithm, a neural network, a deep convolutional neural network) are trained to recognize malware signals that are common across multiple different document types and/or document characteristics. [e.g. Zeppenfeld; Abstract, Col 4 Ln 51 – Col 11 Ln 11 – Zeppenfeld discloses detecting messages, such as emails, containing attachments and pre-processing the attachment for training the machine learning model on various features, such as identified URLs and images, for classifying attachments as malicious or benign.]
Therefore, it would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to include the features above in the invention as disclosed by Smutz, with the advantage of accurately and reliably detecting suspicious message attachments, and preemptively handling those message attachments while maintaining the operational requirements of a messaging system, as disclosed by Zeppenfeld, Col 2 Ln 30-33.
Claim 2:
Smutz and Zeppenfeld teach the method of claim 1, wherein the machine learning model comprises at least one of a neural network, a random forest model, or a boosted gradient machine learning model. [e.g. Smutz; Page 239 Abstract, Page 240 Introduction, Page 242 4.2 Feature Selection, Page 243 4.4 Classification technique, Table 1 – Smutz discloses using a random forest model.] [e.g. Zeppenfeld; Abstract, Col 4 Ln 51 – Col 11 Ln 11 – Zeppenfeld discloses the machine learning model may be a tree-based algorithm, a random forest algorithm, a deep learning algorithm, a neural network, a deep convolutional neural network.]
Claim 3:
Smutz and Zeppenfeld teach the method of claim 1, further comprising: receiving an incoming PDF file; determining first input features from the incoming PDF file; and generating, via the machine learning model, a first probability that the incoming PDF file is associated with malware from the first input features. [e.g. Smutz; Page 239 Abstract, Page 240 Introduction, Page 242 4.2 Feature Selection, Page 243 4.4 Classification technique, Table 1 – Smutz discloses receiving unknown PDF files (e.g. incoming PDF files), extracting features from the PDF (e.g. determining first input features) and inputting the features through the classifier to determine if the file is malicious or benign.] [e.g. Zeppenfeld; Abstract, Col 4 Ln 51 – Col 11 Ln 11 – Zeppenfeld discloses the machine learning model generating classification decisions (e.g. probability).]
Claim 4:
While Smutz teaches the method of claim 1 and identifying documents in SMTP traffic, Smutz fails to explicitly teach extracting features related to embedded URLs for training data. More specifically, Smutz fails to teach the claimed limitations of intercepting, via an email system, an email that includes the incoming PDF file as an attachment and extracting the incoming PDF file from the email; however, Zeppenfeld, directed to machine learning-based classification of documents that are attached to electronic communications on a computer network, teaches a machine learning-based model that is trained to recognize, and learns over time, combinations and patterns of features of electronic documents that are strong malware signals. In an embodiment, a pre-processor performs a static analysis of a document that is a message attachment. The machine learning-based model analyzes features (e.g. URLs) that are output by the pre-processor. The models (e.g. a tree-based algorithm, a random forest algorithm, a deep learning algorithm, a neural network, a deep convolutional neural network) are trained to recognize malware signals that are common across multiple different document types and/or document characteristics. [e.g. Zeppenfeld; Abstract, Col 4 Ln 51 – Col 11 Ln 11 – Zeppenfeld discloses detecting messages, such as emails, containing attachments and pre-processing the attachment for training the machine learning model on various features, such as identified URLs and images, for classifying attachments as malicious or benign.]
Therefore, it would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to include the features above in the invention as disclosed by Smutz, with the advantage of accurately and reliably detecting suspicious message attachments, and preemptively handling those message attachments while maintaining the operational requirements of a messaging system, as disclosed by Zeppenfeld, Col 2 Ln 30-33.
Claim 6:
Smutz teaches the method of claim 1, further comprising: determining performance of the machine learning model from a test set, wherein determining performance further comprises applying at least one metric selected from the following group: recall, accuracy, precision, Area Under a Curve for a Receiver Operator Characteristic graph, and a harmonic mean of precision and recall (F1 score). [e.g. Smutz; Page 239 Abstract, Page 240 Introduction, Page 242 4.2 Feature Selection, Page 243 4.4 Classification technique, Page 243-244 5.1 Classification & Detection Performance, Figure 3, Figure 4, Table 1 – Smutz discloses determining performance of the classifier using ROC graphs. ]
Regarding claims 9, 10, 13-17, and 20, they are system claims essentially corresponding to the above recitations, and they are rejected, at least, for the same reasons.
Claim 12:
Smutz teaches the system of claim 11, wherein the active content comprises JavaScript instructions. [e.g. Smutz; Page 240 2 Related Work, Page 241, Page 242 4.2 Feature Selection, Table 1 – Smutz discloses identifying JavaScript features in a PDF where the PDF was identified in HTTP/SMTP traffic. Smutz further discloses in related work that identifying JavaScript in PDFs was well known at the time.]
Regarding claim 19, it is a system claim essentially corresponding to the above recitations, and it is rejected, at least, for the same reasons.
Claims 5, 11 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over US 20130036231 to Smutz et al. (hereinafter “Smutz”) in view of US 20160373917 to Zeppenfeld et al. (hereinafter “Zeppenfeld”) and further in view of US 20090138972 to Scales.
Claim 5:
While Smutz teaches the method of claim 3, further comprising: identifying active content in the incoming PDF file; [e.g. Smutz; Page 240 2 Related Work, Page 241, Page 242 4.2 Feature Selection, Table 1 – Smutz discloses identifying JavaScript features in a PDF where the PDF was identified in HTTP/SMTP traffic. Smutz further discloses in related work that identifying JavaScript in PDFs was well known at the time.],
Smutz and Zeppenfeld fail to explicitly teach sanitizing PDFs. More specifically, the combination fails to teach the claimed limitation of removing, from the incoming PDF file, the active content that results in a sanitized PDF file; however, Scales, directed to systems for resisting the spread of unwanted code and data, teaches a method or system of receiving an electronic file containing content data in a predetermined data format, the method comprising the steps of: receiving the electronic file, determining the data format, parsing the content data to determine whether it conforms to the predetermined data format, and if the content data does conform to the predetermined data format, regenerating the parsed data to create a regenerated electronic file in the data format. [e.g. Scales; Abstract, Para. 0004, 0012, 0024, 0042, 0047, 0115 – Scales discloses identifying active content such as code, macros and scripts and cleaning (e.g. sanitizing) the file by removal of the active content.]
Therefore, it would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to include the features above in the invention as disclosed by the combination, with the advantage of resisting the spread of unwanted code and data as disclosed by Scales, Para. 0001.
Regarding claims 11 and 18, they are system claims essentially corresponding to the above recitations, and they are rejected, at least, for the same reasons.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHRISTOPHER C HARRIS whose telephone number is (571)270-7841. The examiner can normally be reached Monday through Friday between 8:00 AM and 4:00 PM CST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jeffrey L Nickerson, can be reached at (469) 295-9235. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/CHRISTOPHER C HARRIS/Primary Examiner, Art Unit 2432