DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Objections
Claims 2, 9, and 16 are objected to because of the following informalities:
Claims 2, 9, and 16: “…second machine learning model being different from the first machine learning model”. Recommend deletion of the intended-use language so as to positively recite the limitation.
Appropriate correction is required.
Response to Amendment
This action is in response to the communications and remarks filed on 02/27/2026. Claims 2, 9, and 16 have been amended. Claim 22 is newly added. Claims 1-22 have been examined and are pending.
Response to Arguments
Applicants’ arguments in the instant Amendment, filed on 02/27/2026, with respect to limitations listed below, have been fully considered but they are not persuasive.
Applicant’s arguments: “The Office Action rejects claims 2-7, 9 and 16 under 35 U.S.C. § 102(a)(1) as being anticipated by Moskovitch et al., "Unknown Malcode Detecting Using OPCODE Representation", LNCS 5376, 2008, pp 204-215, (hereinafter "Moskovitch"). The Office Action further rejects claims 8 and 15 under 35 U.S.C. § 102(a)(1) as being anticipated by Moskovitch in view of U.S. Publication No. 2018/0197089 to Krasser et al. (hereinafter "Krasser").
Claim 2 as amended recites, "providing the vector of weights as input to a first machine learning model for processing to generate a first likelihood output, the first likelihood output representing a first likelihood as predicted by the first machine learning model based on the vector of weights that the file is malicious; providing the first likelihood output and the vector of weights as input to a second machine learning model for processing to generate a second likelihood output, the second likelihood output representing a second likelihood as predicted by the second machine learning model based at least on the first likelihood that the file is malicious, the second
machine learning model being different from the first machine learning model." Applicant respectfully submits that the cited portion of Moskovitch does not disclose or suggest this combination of features recited in amended claim 2.
The Office Action cited the portion of Moskovitch that reads as follows:
We employed four commonly used classification algorithms: Artificial Neural Networks (ANN) [14], Decision Trees (DT) [15], Naive Bayes (NB) [16], and
their boosted versions, BDT and BNB, respectively, consisting of the Adaboost.M1
We used the Weka [18] implementation of these methods.
[ Moskovitch, 3.4 Classification Algorithms ]
In particular, the cited portion of Moskovitch does not disclose or suggest using two
different machine learning models that process different inputs, namely, "providing the vector of weights as input to a first machine learning model," and "providing the first likelihood output and the vector of weights as input to a second machine learning model."
Nor does the cited portion of Moskovitch disclose or suggest generating "a second likelihood as predicted by the second machine learning model based at least on the first likelihood."
Rather, the cited portion of Moskovitch merely discusses "four commonly used classification algorithms." Id.
Accordingly, Applicant respectfully submits that claim 2 and its dependent claims are in condition for allowance. Independent claims 9 and 16 and their respective dependent claims are allowable for corresponding reasons.”
The Examiner disagrees with the Applicant’s argument. The specification describes how the invention optionally selects a machine learning model from a group consisting of: generalized linear models, ordinary least squares, ridge regression, lasso, multi-task lasso, etc. [specification ¶0009]. Moskovitch does indeed teach applying multiple machine learning classification algorithms, and the Examiner broadly interprets this as different classifiers of the machine learning models being applied with different outputs. However, the claim language does not specify when or how to select a model from a group of models to perform the statistical analysis of the vector of weights. The Examiner recommends further clarification of independent claims 2, 9, and 16. Thus, the rejection over Moskovitch is maintained below.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 2-22 are rejected under 35 U.S.C. 103 as being unpatentable over Moskovitch et al., “Unknown Malcode Detecting Using OPCODE Representation”, LNCS 5376, 2008, pp. 204-215 (submitted in the 10/30/2024 IDS), in view of Krasser et al. (hereinafter “Krasser”), US PG Publication 20180198800 A1 (submitted in the 10/30/2024 IDS).
Regarding currently amended claims 2, 9, and 16, Moskovitch teaches a computer-implemented method, comprising; a non-transitory computer-readable medium containing instructions which, when executed, cause a computing device to perform operations comprising; and a computer-implemented system, comprising: at least one hardware processor; and one or more computer-readable storage media coupled to the at least one hardware processor and storing programming instructions for execution by the at least one hardware processor, wherein the programming instructions, when executed, cause the system to perform operations comprising: [Moskovitch Introduction ¶3]
extracting, from a file, sequential data comprising discrete tokens, wherein the discrete tokens comprise syllables of machine language instructions within operation code (opcode); [Moskovitch 2.1 Detecting Unknown Malcode Using Byte Sequence N-Grams ¶4 the study’s largest test collection includes more than 30,000 files represented by byte-sequence n-grams]
generating n-grams of the discrete tokens, wherein n is a positive integer; [Moskovitch 2.1 Detecting Unknown Malcode Using Byte Sequence N-Grams ¶4 3 to 6 n-grams]
generating, by using a bag of words algorithm, a vector of weights based on inverse document frequencies of the n-grams; [Moskovitch 3.1 Text Categorization ¶1 vector space model to represent a textual file as a bag of words; the extended representation calculates IDF = log(N/DF), where N is the number of documents in the entire file collection and the Document Frequency (DF) is the number of files in which the term appears; document frequency is used for “term weighting.”]
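For illustration only (not part of the record or the cited reference), the document-frequency weighting quoted above, IDF = log(N/DF), can be sketched as follows. The opcode token sequences below are hypothetical stand-ins for disassembled files.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Slide a window of length n over the token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Hypothetical "files" represented as opcode token sequences.
files = [
    ["push", "mov", "call", "ret"],
    ["mov", "mov", "call", "ret"],
    ["push", "pop", "ret"],
]

n = 2
grams_per_file = [ngrams(f, n) for f in files]

# Document frequency: the number of files in which each n-gram appears.
df = Counter()
for grams in grams_per_file:
    for g in set(grams):
        df[g] += 1

N = len(files)
# IDF = log(N / DF), as in the cited passage of Moskovitch.
idf = {g: math.log(N / d) for g, d in df.items()}
```

An n-gram appearing in every file receives IDF = log(1) = 0, so ubiquitous opcode sequences carry no weight.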
determining, based on a statistical analysis of the vector of weights, that the file is likely to be malicious; [Moskovitch 3.3 Data Preparation and Feature Selection ¶¶1-3 classifying files using disassembler processing, where malware writers often try to prevent the disassembly process]
providing the vector of weights as input to a first machine learning model for processing to generate a first likelihood output, the first likelihood output representing a first likelihood as predicted by the first machine learning model based on the vector of weights that the file is malicious; [Moskovitch 3.2 Dataset Creation ¶1 acquired 7688 malicious files (5677 files after OpCode-converted representation) and a 22,735-file benign set (26,093 after OpCode-converted representation), respectively, from the VX Heaven website and executable/DLL files gathered from Windows XP-based machines. 3.3 Data Preparation and Feature Selection ¶1 the converted files are put into a vectorial representation in order to classify the two representations: n-grams consisting of byte sequences extracted from binary code. 3.4 Classification Algorithms ¶1 Artificial Neural Networks (ANN), Decision Trees (DT), Naïve Bayes (NB), etc. ¶5 in machine learning applications, binary files are disassembled and parsed, and n-grams are extracted. See Moskovitch 5.1.3 Classifiers Table 1, which depicts the accuracy of the results, where the ANN and Decision Tree classifiers achieve accuracies of 92.13% and 93%, respectively, both with low levels of false positives (FP).]
While Moskovitch teaches the vector of weights as input to a second machine learning model for processing to generate a second likelihood output, the second likelihood output representing a second likelihood as predicted by the second machine learning model based at least on the first likelihood that the file is malicious, the second machine learning model being different from the first machine learning model [See Moskovitch 3.2 Dataset Creation ¶1 acquired 7688 malicious files (5677 files after OpCode-converted representation) and a 22,735-file benign set (26,093 after OpCode-converted representation), respectively, from the VX Heaven website and executable/DLL files gathered from Windows XP-based machines. 3.3 Data Preparation and Feature Selection ¶1 the converted files are put into a vectorial representation in order to classify the two representations: n-grams consisting of byte sequences extracted from binary code. ¶4 The second form of representation extracted sequences of OpCode expressions, termed OpCode n-grams. 3.4 Classification Algorithms ¶1 Artificial Neural Networks (ANN), Decision Trees (DT), Naïve Bayes (NB), etc. ¶5 in machine learning applications, binary files are disassembled and parsed, and n-grams are extracted.
See Moskovitch 5.1.3 Classifiers Table 1, which depicts the accuracy of the results, where the ANN and Decision Tree classifiers achieve accuracies of 92.13% and 93%, respectively, both with low levels of false positives (FP).]; however, Moskovitch fails to explicitly teach, but Krasser teaches, providing the first likelihood output and the vector of weights as input to a second machine learning model for processing to generate a second likelihood output, the second likelihood output representing a second likelihood as predicted by the second machine learning model based at least on the first likelihood that the file is malicious, the second machine learning model being different from the first machine learning model; [Krasser ¶¶0038 and 0040 CM 112 performs classification of training data stream 114 and trial data stream 116 data, indicating the likelihood of spyware or a virus based on attributes. ¶¶0132 and 0135-0137 Fig. 8 shows process 800 for determining two computational models (CMs): CM 804 and second CM 816, where the training module 226 determines a second CM 816 based at least in part on the training set 610 and a second hyperparameter value 818, e.g., a float value or tuple, different from the first hyperparameter value 806. The second CM 816 can be determined as discussed herein with reference to, e.g., operation 802, operation 620, or operation 318. In some examples, the training module 226 can determine the second hyperparameter value 818 based at least in part on at least one of the model outputs. ¶0137 malware/non-malware classifier] and determining that the file is likely to be malicious based on the second likelihood output; [Krasser ¶0137 provide a CM 622 whose outputs satisfy the predetermined completion criterion; malware/non-malware classifier] and initiating, in response to determining that the file is likely to be malicious, a corrective action.
[Krasser ¶0033 determined parameter values of trained CM(s) 112 to, e.g., categorize a file with respect to malware type, and/or to perform other data analysis and/or processing; a request to computing device(s) 102 for an output of the CM(s) 112, receive a response, and take action based on that response (i.e. quarantine or delete file(s) indicated in the response as being associated with malware).]
Moskovitch does not explicitly teach providing the first likelihood output and the vector of weights as input to a second machine learning model for processing to generate a second likelihood output, the second likelihood output representing a second likelihood as predicted by the second machine learning model based at least on the first likelihood that the file is malicious, the second machine learning model being different from the first machine learning model; however, in an analogous art, Krasser discloses a second CM that intakes different hyperparameter values or value sets for training (Krasser, ¶0132). It would have been obvious to one of ordinary skill in the art, before the effective filing date of the invention, to modify the system of Moskovitch to incorporate the concept of a predetermined completion criterion along with a second classification hyperparameter value; using mathematical techniques to traverse the hyperparameter space can provide a classifier, e.g., a malware/non-malware classifier, that performs effectively or that most effectively generalizes to new malware families or other data beyond the training set 610, as suggested by Krasser (Krasser, ¶¶0132 and 0135-0137).
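The disputed limitation, feeding the first model’s likelihood output together with the vector of weights into a different second model, describes a stacking arrangement that can be sketched as follows. This sketch is illustrative only; the models, thresholds, and data are hypothetical and are not drawn from Moskovitch, Krasser, or the claims.

```python
import math

def first_model(weights):
    """Hypothetical first classifier: a fixed linear score squashed to [0, 1]."""
    score = sum(weights) - 0.5
    return 1.0 / (1.0 + math.exp(-score))

def second_model(first_likelihood, weights):
    """Hypothetical second, structurally different classifier: a decision rule
    whose input is the first model's likelihood output *and* the weight vector."""
    if first_likelihood > 0.6 and max(weights) > 0.3:
        return 1.0
    return first_likelihood * 0.5

weights = [0.4, 0.35, 0.2]        # illustrative vector of weights
p1 = first_model(weights)         # first likelihood output
p2 = second_model(p1, weights)    # second likelihood, based at least on p1
is_malicious = p2 > 0.5           # basis for initiating a corrective action
```

The second model here differs from the first both in form (a rule rather than a squashed linear score) and in inputs, since it consumes the first likelihood in addition to the weight vector.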
Regarding claims 3, 10, and 17, the combination of Moskovitch and Krasser teaches claim 2 as described above.
Moskovitch teaches wherein the vector of weights comprises normalized term frequencies (TFs). [Moskovitch 3.1 Text Categorization ¶1 normalized TF]
Regarding claims 4, 11, and 18, the combination of Moskovitch and Krasser teaches claim 2 as described above.
Moskovitch teaches wherein the vector of weights comprises inverse document frequencies (IDFs). [Moskovitch 3.1 Text Categorization ¶1 TF IDF]
Regarding claims 5, 12, and 19, the combination of Moskovitch and Krasser teaches claim 2 as described above.
Moskovitch teaches wherein generating the vector of weights comprises: generating a term frequency (TF) of each of the n-grams; [Moskovitch 3.1 Text Categorization ¶1 after parsing text and extracting words, vocabulary can be described by its frequency for term weighting – term frequency (TF)]
generating an inverse document frequency (IDF) of each of the n-grams; [Moskovitch 3.1 Text Categorization ¶1 IDF] and generating a dot product of the TF and the IDF for each of the n-grams. [Moskovitch 3.1 Text Categorization ¶1 TF and IDF]
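The claimed TF, IDF, and TF-with-IDF product per n-gram can be sketched as below. This is an illustrative sketch only; the vocabulary and IDF values are hypothetical, and the per-term product reading of the claimed “dot product of the TF and the IDF” is an assumption.

```python
from collections import Counter

def tfidf_vector(grams, idf, vocab):
    """Normalized term frequency times IDF for each n-gram in a fixed vocabulary."""
    counts = Counter(grams)
    total = sum(counts.values()) or 1
    return [(counts[g] / total) * idf.get(g, 0.0) for g in vocab]

# Hypothetical vocabulary of opcode bigrams with assumed IDF values.
vocab = [("push", "mov"), ("mov", "call"), ("call", "ret")]
idf = {("push", "mov"): 1.10, ("mov", "call"): 0.41, ("call", "ret"): 0.41}

# One file's extracted bigrams, one occurrence of each vocabulary entry.
weights = tfidf_vector(list(vocab), idf, vocab)
```

Each position in the resulting vector is TF(g) × IDF(g) for one n-gram g, yielding the vector of weights fed to the classifiers.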
Regarding claims 6, 13, and 20, the combination of Moskovitch and Krasser teaches claim 2 as described above.
Moskovitch teaches wherein n is greater than or equal to 2. [Moskovitch 3.3 Data Preparation and Feature Selection ¶4 1-gram, 2-gram, etc.]
Regarding claims 7, 14, and 21, the combination of Moskovitch and Krasser teaches claim 2 as described above.
Moskovitch teaches wherein the corrective action comprises at least one of quarantining the file, stopping execution of the file, providing a notification that the file likely is malicious, flagging the file, storing the file, generating a hash of the file, transmitting the file or a hash of the file, or reverting to an earlier version of the file.
[Moskovitch 6 Discussion and Conclusions ¶1 …maintaining low levels of false alarms]
Regarding new claim 22, the combination of Moskovitch and Krasser teaches claim 2 as described above.
While Moskovitch teaches wherein the first machine learning model comprises a logistic regression model [Moskovitch p. 206 k nearest neighbor (KNN)], Moskovitch fails to explicitly teach, but Krasser teaches, wherein the first machine learning model comprises a logistic regression model, and the second machine learning model comprises one of: a generalized linear model, a neural network, a Perceptron, a support vector machine, a decision tree model, or a random forest model. [Krasser ¶0030 e.g., of CM(s) 112 for determining signatures of files, classifying files, determining whether files contain malware, or other use cases noted herein, the computational model(s) 112 may include: multilayer perceptrons (MLPs)]
Moskovitch does not explicitly teach the second machine learning model comprising one of: a generalized linear model, a neural network, a Perceptron, a support vector machine, a decision tree model, or a random forest model; however, in an analogous art, Krasser discloses a second CM that intakes different hyperparameter values or value sets for training (Krasser, ¶0132). It would have been obvious to one of ordinary skill in the art, before the effective filing date of the invention, to modify the system of Moskovitch to incorporate the concept of a predetermined completion criterion along with a second classification hyperparameter value; using mathematical techniques to traverse the hyperparameter space can provide a classifier, e.g., a malware/non-malware classifier, that performs effectively or that most effectively generalizes to new malware families or other data beyond the training set 610, as suggested by Krasser (Krasser, ¶¶0132 and 0135-0137).
Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SAKINAH WHITE-TAYLOR whose telephone number is (571)270-0682. The examiner can normally be reached Monday-Friday, 10:45a-6:45p.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, CATHERINE THIAW can be reached at 571-270-1138. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
SAKINAH WHITE-TAYLOR
Primary Examiner
Art Unit 2407
/Sakinah White-Taylor/Primary Examiner, Art Unit 2407