DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
Claims 1, 2, 4, 6, 7, 9, 11-13 are pending.
Claims 1, 2, 6, 7, 11, and 12 are amended.
Claims 3, 5, 8 and 10 are cancelled.
Claims 14-16 are newly added.
Response to Arguments
Applicant’s arguments, see page 5, filed 06/13/2025, with respect to the rejection(s) of claims 1, 2, 6, 7, 9, and 11-12 under 35 U.S.C. § 103 have been fully considered and are persuasive. Therefore, the rejection has been withdrawn. However, upon further consideration, a new ground(s) of rejection is made in view of newly found prior art.
Applicant’s arguments, see page 8, filed 11/12/2025, with respect to claims 1-2, 4, 6-7, 9, 11-13 have been fully considered and are persuasive. The rejections under 35 U.S.C. 112(a) and 35 U.S.C. 112(b) have been withdrawn.
Applicant's arguments filed 11/12/2025 on pages 6-8, with respect to claims 1-2, 4, 6-7, 9, 11-13 rejected under 35 U.S.C. 101, have been fully considered but they are not persuasive.
On pg. 6, applicant argues:
“The memorandum issued by the office on August 4, 2025 confirms that the mental process grouping is not without limit. The memorandum further explains that claims reciting an exception should be distinguished from claims that merely involve an exception, which are eligible and do not require further eligibility analysis. The claims recite claims do not recite any mathematical concept or mental process such as comparing or categorizing information that can be performed in the human mind. Moreover, the claims do not recite any method of organizing human activity such as a fundamental economic concept or managing interactions between people. Thus, claims are eligible because they do not recite a judicial exception.”
Examiner respectfully disagrees. Applicant’s reliance on the August 4, 2025 reminder memorandum on evaluating subject matter eligibility is misplaced because that memorandum does not narrow the judicial exceptions and does not support the conclusion that these claims avoid Step 2A simply because they are implemented in a technical field. The August 4, 2025 reminder and the USPTO’s subject matter eligibility guidance in MPEP 2106.04 explain that examiners must distinguish claims that merely involve a judicial exception from claims that recite a judicial exception, and a claim “recites” an abstract idea when a mathematical concept or mental process is set forth or described in the claim, even if it is phrased in words rather than in symbols. Under MPEP 2106.04(a)(2)(I), mathematical concepts include organizing information through mathematical correlations and series of mathematical calculations based on selected information, and MPEP 2106.04(a)(2) further explains that words operating on data to solve a problem can serve the same purpose as a formula for Step 2A Prong One.
In this application, the limitations that convert reconstructed disassembled code into a hash value, convert the hash value into N-gram data, and perform ensemble machine learning with a decision tree to classify identifiers of attack techniques and attackers describe mathematical data transformations and calculations. They therefore fall within the mathematical concepts grouping even though no explicit equation is written in symbolic form.
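For illustration only, the kind of data transformation these limitations describe can be sketched as follows. This is a minimal sketch and not the applicant's implementation: the choice of SHA-256, the N-gram length of 4, the sample assembly text, and the function names `code_to_hash` and `hash_to_ngrams` are all assumptions that appear neither in the claims nor in the record.

```python
import hashlib
from collections import Counter


def code_to_hash(disassembled_code: str) -> str:
    """Map code text to a fixed-length hex digest (SHA-256 is an assumed choice)."""
    return hashlib.sha256(disassembled_code.encode("utf-8")).hexdigest()


def hash_to_ngrams(hash_value: str, n: int = 4) -> Counter:
    """Count every overlapping character N-gram in the hash string."""
    return Counter(hash_value[i:i + n] for i in range(len(hash_value) - n + 1))


# Hypothetical reconstructed disassembled code, used only as sample input.
code = "push ebp\nmov ebp, esp\ncall sub_401000\nret"
digest = code_to_hash(code)
features = hash_to_ngrams(digest, n=4)
```

Both steps are deterministic calculations on selected information: the same input text always yields the same digest and the same N-gram counts.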
The same guidance also addresses mental processes. MPEP 2106.04(a)(2)(III) explains that concepts performed in the human mind, such as observations and evaluations, fall within the mental process grouping of abstract ideas, and that computer-implemented sequences of collecting information, analyzing it, and displaying or using results are treated as abstract ideas when recited at a high level of generality. The August 4, 2025 reminder, as discussed in the Subject Matter Eligibility Declarations memorandum, uses examples where evidence may show that certain distributed processing cannot practically be performed in the human mind, but it does not state that computer-based data analysis in a technical field falls outside the mental process grouping, and it reaffirms that the Step 2A framework in MPEP 2106 still controls.
Here, the claim as a whole is directed to disassembling an executable file, converting that code into feature representations, filtering out certain code patterns, classifying the code into labels for attack technique and attacker using ensemble machine learning, and providing the classification results through a user interface. That is the type of collect, analyze, and present information pattern that the Federal Circuit has treated as abstract in decisions such as Electric Power Group and Recentive, which held that applying generic machine learning to specific data environments was directed to abstract ideas, even though the tasks could not realistically be carried out by a human at scale. Thus, the fact that a human could not practically process all the malware data does not remove the claim from the mathematical concepts or mental process groupings under MPEP 2106.04(a)(2), because the claim still describes what data is being transformed and classified at a functional level, rather than reciting a specific improvement to the underlying computer technology.
The December 4, 2025 Subject Matter Eligibility Declarations memorandum further explains that applicants may submit declarations to show, for example, how an invention improves the functioning of a computer or other technology, or how a judicial exception is integrated into a practical application, but it also stresses that the controlling analysis remains the Alice and Mayo framework and the guidance in MPEP 2103 through 2106.07. In this case, even accepting that the claimed subject matter resides in the cybersecurity field, the current claim language is still drafted in terms of generic data transformations and generic machine learning classification of code data implemented on conventional computing components, and does not recite a specific improvement to computer functionality or to another technology in the way contemplated by MPEP 2106.04(d) and the examples in the Subject Matter Eligibility Declarations memorandum. Accordingly, under the August 4, 2025 reminder memorandum, the Subject Matter Eligibility Declarations memorandum, and the MPEP 2106 guidance, the claims continue to recite at least one abstract idea at Step 2A Prong One and remain subject to further analysis under Step 2A Prong Two and Step 2B. Applicant’s assertion that the claims are automatically eligible because they do not literally recite a human mental step or a method of organizing human activity is not consistent with these authorities, and the rejection under 35 U.S.C. 101 is properly maintained.
On pg. 6-7, applicant argues:
“The elements recited by Applicant's independent claims integrate any allegedly abstract idea into a practical application. Claims 1, 6 and 11 have been amended to align with USPTO guideline and the memorandum. The memorandum confirms that in computer-technologies, claims are eligible in Step 2A Prong Two by finding that a claim reflects an improvement to the functioning of a computer or to another technology or technical field, integrating a recited judicial exception into a practical application of the exception. The memorandum further states that the specification does not need to explicitly set forth the improvement, but it must describe the invention such that the improvement would be apparent to one of ordinary skill in the art. The features of "performing ensemble machine learning on the code blocks based on a decision tree having one or more nodes with respect to the code blocks," "wherein two parts of labels are assigned to the code blocks," "wherein a first label of the labels is indexed to the identifier of the attack technique and a second label of the labels is indexed to an attacker implementing the attack technique," "classifying both an identifier of the attack technique and the attacker implementing the attack technique based on the two parts of labels" reflect technical improvements in the technical field of cyber threat information processing system. Unlike conventional cyber security processing system that requires stages for attack identification and attacker attribution, the system and method recited in the claims performs joint classification using a decision-tree-based ensemble trained on code blocks. See Abstract; Paragraphs 0004, 0006, 0650-0651 of the original specification. The system and method provide faster and more efficient profiling by reducing redundant analyses and correlating multiple code-analysis results in a unified step. Such a technical solution to a technical problem was found to be patent eligible. 
See MPEP §2016.05(a) ("An important consideration in determining whether a claim improves technology is the extent to which the claim covers a particular solution to a problem or a particular way to achieve a desired outcome, as opposed to merely claiming the idea of a solution or outcome"); Alice Corp. v. CLS Bank Int'l, 573 U.S. 208, 225 (2014) (citing Diamond v. Diehr, 450 U.S. 175, 177-78 (1981)) ("The claims at issue do not, for example, purport to improve the functioning of the computer itself or effect an improvement in any other technology or technical field."); DDR Holdings, LLC v. Hotels.com, L.P., 773 F.3d 1245, 1258-59, 113 USPQ2d 1097, 1106-07 (Fed. Cir. 2014) ("The claimed solution was necessarily rooted in computer technology in order to overcome a problem specifically arising in the realm of computer networks."); Finjan, Inc. v. Blue Coat Systems, Inc., 879 F.3d 1299, 125 USPQ2d at 1285-87 (Fed. Cir. 2018) ("[T]he method of claim 1 employs a new kind of file that enables a computer security system to do things it could not do before. The security profile approach allows access to be tailored for different users and ensures that threats are identified before a file reaches a user's computer"). Accordingly, the pending claims are not directed to the abstract idea and thus should be patentable.”
Examiner notes that applicant’s citation appears to contain a typographical error. The quoted language corresponds to MPEP 2106.05(a) and not MPEP 2016.05(a).
Examiner respectfully disagrees. Applicant’s argument conflates a technical field of use with a claimed technological improvement. The August 4, 2025 reminder memorandum and MPEP 2106.04(d) explain that at Step 2A Prong Two a claim integrates a judicial exception into a practical application when the additional elements reflect a specific asserted improvement to the functioning of a computer or to another technology or technical field, as shown in the claim language itself, rather than merely using a computer to perform generic data processing in a particular domain. The same guidance also explains that while the specification need not explicitly label something as an “improvement,” it must describe the invention such that a person of ordinary skill would recognize a concrete technological improvement, and the claims must “reflect the disclosed improvement.” Here the portions cited by applicant in the specification do discuss goals such as faster profiling, reduced redundant analyses, and joint classification of attack technique and attacker, but the independent claims are drafted in purely functional terms that state what is to be done, not how the computer or any specific component is improved.
The particular features applicant relies on, namely performing ensemble machine learning on code blocks using a decision tree, assigning two labels to the code blocks, indexing a first label to an attack technique identifier and a second label to an attacker, and classifying both the attack technique and attacker based on those labels, all describe a desired result of the data analysis and classification pipeline at a high level. Under MPEP 2106.04(a)(2), such steps fall within the mathematical concepts and mental process groupings, because they amount to organizing information through mathematical correlations and performing series of calculations on data, even though the data happens to be disassembled code and the labels happen to be cybersecurity identifiers. The August 4, 2025 reminder and the Subject Matter Eligibility Declarations memorandum make clear that merely performing such analysis in a technical field like cybersecurity, and asserting speed or efficiency benefits, is not enough for Prong Two unless the claim recites a particular technical solution in terms of specific structures or operations rather than an abstract goal such as “joint classification” or “reducing redundant analyses.” The current claims do not recite any specific modification to the computer architecture, memory structures, data paths, or model architecture beyond generic “ensemble machine learning” and “decision tree” components, which the Federal Circuit has treated as conventional ML tools when claimed functionally.
The authorities applicant cites underscore this distinction. In DDR Holdings the claims recited a specific, non-conventional web page structure and redirect mechanism that changed how the network operated, and in Finjan the claims recited a new type of security file that allowed the system to perform capabilities it previously could not, both of which are concrete technical implementations reflected in the claim language. In contrast, in Recentive and similar cases the Federal Circuit held that claims which simply apply generic machine learning models such as decision trees or neural networks to particular data sets, with results like optimized schedules or network maps, are directed to abstract ideas and do not integrate the exception into a practical application, even though they operate in technical environments and purport to provide faster or more accurate outcomes. MPEP 2106.05(a) itself cautions that “an important consideration” is whether the claim covers a particular way of achieving a result as opposed to merely claiming the idea of the solution. The present claims read at that high level of abstraction and do not recite a particular technical implementation that improves the functioning of the computer or the underlying cybersecurity technology.
Accordingly, under Step 2A Prong Two and the August 4, 2025 reminder memorandum, the additional elements identified by applicant do not integrate the abstract idea into a practical application. They reflect the use of conventional data processing and machine learning techniques to jointly classify attack techniques and attackers and to present those results to a user, which is analogous to the abstract data analytic patterns discussed in MPEP 2106.04(a)(2) and Recentive. Because the claims as drafted do not recite a specific improvement to computer functioning or to another technology or technical field, they remain directed to an abstract idea and must proceed to Step 2B, and the rejection under 35 U.S.C. 101 is appropriately maintained.
On pg. 7, applicant argues:
“Independent Claims 1, 6 and 11 which integrate the features into the practical application are sufficient to amount to significantly more than the judicial exception. The claimed invention provides a non-conventional and inventive combination of known elements of the cyber security processing, which constitutes "inventive concept" under Step 2B. See MPEP § 2106.05(d); BASCOM Global Internet v. AT&T Mobility LLC, 827 F.3d 1341, 1350-51, 119 USPQ2d 1236, 1243 (Fed. Cir. 2016) ("Although the individual components recited in the claims were well-known, the claims were held patent-eligible because they recited a specific, non-conventional arrangement of these components that provided an inventive solution to filtering Internet content."). The convention method of classifying malware has been highly subjective and inconsistent, depending on analysts or the expertise of the analysts. See Paragraphs 0008-0009 of the original specification. As an example, if a different attacker generates malware using similar techniques to existing ones, a separate and additional classification method is required to identify the attacker, which results in a time-consuming process and leads to incorrect results. As another example, in case when the same attacker creates a variant of malware modifying a known attack technique, the attacker remains the same, but the attack techniques differ, which makes a unified classification difficult and giving confusion to users. However, as noted in paragraphs 0650-0653 of the original specification, even when the same attacker produces variant malware or a different attacker uses similar techniques, two parts of labels are assigned to the extracted code blocks, and it enables to identity an exact attack technique and its attacker at the same time. The features lie in the concept that an attacker's technique is tied to its distinctive cyber actions much like a criminal's signature, which enables precise identification of both the attack technique and the attacker. Thus, the feature significantly reduces confusion in classification results and enables to provide immediate and meaningful improvements in cyber intelligence services through a user interface, which amounts to significantly more than the judicial exception according to the examples. Taking all the additional elements individually, and in combination, claims as a whole amount to significantly more than the abstract idea. Accordingly, the pending claims should be patentable in view of the additional elements of the pending claims.”
Examiner respectfully disagrees. Applicant’s Step 2B argument does not identify an “inventive concept” beyond the abstract idea itself, and the combination of elements recited in the independent claims remains a conventional analytics pipeline rather than the kind of specific, non-conventional arrangement discussed in BASCOM and MPEP 2106.05(d). The specification explains that the building blocks of the invention are known techniques in cybersecurity and machine learning, including disassembly of executable files, extraction of opcode and assembly code, hashing and N-gram feature generation, fuzzy hashing, ensemble machine learning models such as decision trees, and use of standardized cyber threat taxonomies like MITRE ATT&CK. At Step 2B, MPEP 2106.05(d) instructs that using well-understood, routine, and conventional components in their ordinary capacities does not amount to significantly more, and a mere assertion that known elements have been combined in a “non-conventional” way is not sufficient without a specific departure from conventional practice that is reflected in the claims.
In BASCOM the Federal Circuit found an inventive concept because the claims recited a particular architecture that placed the filtering system at a specific location in the network and used that location to provide user customizable filtering at the ISP server in a way the prior art did not, which is a concrete structural arrangement of known components. Here, the claims do not recite any new architecture or structural arrangement of cybersecurity components. Instead, they recite, in order, disassembling executable code, converting it to hash and N-gram data, removing certain patterns, performing ensemble machine learning using a decision tree, assigning two labels per code block, and displaying the classification results to a user. That sequence is a straightforward implementation of conventional feature extraction and supervised classification on code data, and the “two labels” concept reflects the business logic of what is being predicted, rather than a new technical arrangement comparable to moving filters to a specific server as in BASCOM.
The problems described in paragraphs 0008 and 0009 of the specification, such as subjectivity and inconsistency in analyst-based malware classification, and the narrative in paragraphs 0650 to 0653 describing joint labeling of attack technique and attacker, may highlight the desirability of the claimed result, but Step 2B focuses on whether the claim adds significantly more than the abstract idea of analyzing code with ML to classify techniques and attackers, not on the importance of that result. Recentive confirms that applying generic machine learning models to domain-specific data, even to solve long-standing problems and even when the solution is faster and more accurate, does not by itself supply an inventive concept if the claim does no more than state “use a machine learning model” with standard training and prediction steps. Likewise, here, the claim language does not recite a new model architecture, a new training mechanism, or a modification to computer operation. It recites generic ensemble machine learning and decision trees applied to code blocks with labels, followed by presentation of the classification results in a user interface.
MPEP 2106.05(a) emphasizes that an important consideration is whether the claim covers a particular solution or a particular way to achieve a desired outcome, as opposed to just the idea of the solution. The current claims are drafted in functional terms that capture the idea of jointly classifying attack technique and attacker from malware code using ensemble ML, but they do not specify a particular unconventional way of structuring the computer system or the model beyond what is typical in the art. On this record, the additional elements, taken individually or in combination, reflect the use of well-understood, routine, and conventional ML and cybersecurity processing applied to code data, and therefore do not amount to significantly more than the underlying abstract idea under Step 2B. Accordingly, the argument that the claims recite a non-conventional and inventive combination is not persuasive, and the rejection under 35 U.S.C. 101 is maintained.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-2, 4, 6-7, 9, 11-13 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Applying the subject matter eligibility test, as outlined in MPEP 2106:
Step 1: Statutory Category
The claims fall within a statutory category. Claims 1, 2, and 4 are “method” claims, claims 6, 7, and 9 are “apparatus” claims, and claims 11-13 are “non-transitory computer-readable storage medium” claims. Each of these claim types is a member of the statutory categories. Thus, the analysis moves to Step 2A, Prong One of the subject matter eligibility test.
Step 2A, Prong One: Judicial Exception
The claims recite a judicial exception, specifically an abstract idea. For example, claims 1, 6 and 11 recite “disassembling an input executable file…”; “reconstructing the disassembled code…”; “converting the reconstructed disassembled code into a hash value…”; “converting the hash value into N-gram data…”; “removing code patterns unrelated to an identifier of an attack technique…”; “constructing one or more code blocks…”; “performing ensemble machine learning…”; “classifying both an identifier of the attack technique…”; “identifying the attacker…”; and “providing the cyber intelligence service with malicious activity information for a user through the user interface”. Such processes are akin to mathematical concepts, which have been recognized as abstract ideas. Thus, the analysis moves to Step 2A, Prong Two.
Step 2A, Prong Two: Integration into a Practical Application
The claims do not integrate the abstract idea into a practical application. The additional elements, such as an operating system (OS), ensemble machine learning (claims 1, 6, 11), one or more nodes, a database (claim 6), a processor (claim 6), a disassembly module (claim 6), a data conversion module (claim 6), a profiling module (claim 6), and a non-transitory computer-readable storage medium (claim 11), appear to be generic computer components and general machine learning techniques, which do not constitute meaningful limitations that would amount to significantly more than the abstract idea. The combination of these additional elements is no more than generic computer functions.
Thus, even in combination, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limitations on practicing the abstract idea. In Recentive Analytics, Inc. v. Fox Corp., 2023-2437 (Fed. Cir. Apr. 18, 2025), the Federal Circuit held that applying generic machine learning techniques to a specific field without improving the underlying technology does not constitute a practical application. The court emphasized that claims must delineate how the machine learning technology achieves a technological improvement. Thus, the analysis moves towards step 2B.
Step 2B: Inventive concept
The claims are additionally analyzed under Step 2B to evaluate whether the claim as a whole amounts to significantly more than the recited exception, that is, whether any additional element, or combination of additional elements, adds an inventive concept to the claim. When evaluated under Step 2B, the claims recite no more than well-understood, routine, conventional activity in the field. The specification does not provide any indication of anything other than generic computer components. The mere “disassembling an input executable file…”; “reconstructing the disassembled code…”; “converting the reconstructed disassembled code into a hash value…”; “converting the hash value into N-gram data…”; “removing code patterns unrelated to an identifier of an attack technique…”; “constructing one or more code blocks…”; “performing ensemble machine learning…”; “classifying both an identifier of the attack technique…”; “identifying the attacker…”; and “providing the cyber intelligence service with malicious activity information for a user through the user interface” is a well-understood, routine, and conventional function when it is claimed in a merely generic manner, as it is here.
Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.
The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.
Claims 1-2, 4, 6-7, 9, 11-16 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
Claim 1, as amended, recites “cyber intelligence service” and Claim 14 recites “selectively combining the code”. The specification fails to provide adequate written description of what constitutes “cyber intelligence service” and “selectively combining the code”. Because the claims are not limited to any embodiment described in the specification, they encompass a scope that is not adequately described. Furthermore, newly added claims or claim limitations must be supported in the specification through express, implicit, or inherent disclosure. An amendment to correct an obvious error does not constitute new matter where the ordinary artisan would not only recognize the existence of the error in the specification, but also recognize the appropriate correction.
New or amended claims which introduce elements or limitations that are not supported by the as-filed disclosure violate the written description requirement. See, e.g., In re Lukach, 442 F.2d 967, 169 USPQ 795 (CCPA 1971) (subgenus range was not supported by generic disclosure and specific example within the subgenus range); In re Smith, 458 F.2d 1389, 1395, 173 USPQ 679, 683 (CCPA 1972) (an adequate description of a genus may not support claims to a subgenus or species within the genus).
Appropriate correction is required.
Independent claims 6 and 11 contain limitations identical to those of claim 1. For this reason, the same grounds of rejection applied to claim 1 above are applied to claims 6 and 11.
Claims that depend on rejected base claims (i.e., claims 1, 6, and 11) inherit, by the nature of their dependency, all rejections that are applied to their corresponding base claims. Thus, claims 2, 4, 7, 9, and 12-16 are, in addition to any separate rejection set forth above, also rejected on the same grounds of rejection as their corresponding base claims.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 2, 6, 7, 9, and 11-12 are rejected under 35 U.S.C. 103 as being unpatentable over Moskovitch, Robert, et al. "Unknown malcode detection using opcode representation." European conference on intelligence and security informatics. Berlin, Heidelberg: Springer Berlin Heidelberg, 2008 (hereinafter “Unknown malcode detection using opcode representation”) in view of Kim et al. (U.S. PGPub. No. 2009/0157716 A1) (hereinafter “Kim”) and Rokka Chhetri et al. (U.S. PGPub. No. 2022/0253691 A1) (hereinafter “Rokka”), and further in view of Walkup et al. (U.S. PGPub. No. 2018/0349345 A1) (hereinafter “Walkup”) and Taniguchi et al. (U.S. PGPub. No. 2020/0065482 A1) (hereinafter “Taniguchi”).
Regarding Claim 1, “Unknown malcode detection using opcode representation” discloses:
A cybersecurity threat information processing method comprising (“Unknown malcode detection using opcode representation”: [Section 6], We presented a methodology for the representation of malicious and benign executables for the task of unknown malicious code detection through OpCode. This presentation, which has not previously been proposed, makes possible the highly accurate detection of unknown malicious code, based on previously seen examples, while maintaining low levels of false alarms. [section 3.2], We created a dataset of malicious and benign executables for the Windows operating system, which is the system most commonly used and most commonly attacked nowadays):
disassembling an input executable file to obtain disassembled code (“Unknown malcode detection using opcode representation”: [section 3.3], The process of streamlining an executable starts with disassembling it. The disassembly process consists of translating the machine code instructions stored in the executable to a more human-readable language, namely, Assembly language. [Section 3.3], In our problem, binary files (executables) are disassembled and parsed, and n-grams are extracted. Each n-gram term in our problem is analogous to a word in the textual domain),
and reconstructing the disassembled code to obtain reconstructed disassembled code (“Unknown malcode detection using opcode representation”: [Section 3.2], The collection, which, to the best of our knowledge, is the largest ever assembled and used for research, [Page 209, Section 3.3], The next, and final, step in streamlining the executable is achieved by extracting the sequence of OpCodes generated during the disassembly process, in the same logical order in which the OpCodes appear in the executable, disregarding the extra information available (e.g., memory location, registers, etcetera). [Section 3.3], In our problem, binary files (executables) are disassembled and parsed, and n-grams are extracted. Each n-gram term in our problem is analogous to a word in the textual domain),
wherein functions provided by an operating system (OS) are filtered out from the reconstructed disassembled code (“Unknown malcode detection using opcode representation”: [page 210, section 3.3], We used a filters approach, in which the measure was independent of any classification algorithm, to compare the performances of the different classification algorithms. In a filters approach, a measure is used to quantify the correlation of each feature to the class (malicious or benign) and estimate its expected contribution to the classification task);
The “Unknown malcode detection using opcode representation” does not explicitly disclose:
converting the reconstructed disassembled code into a hash value
However, in an analogous art, Kim discloses:
converting the reconstructed disassembled code into a hash value (Kim: [0010], and a hash calculation unit that calculates a hash value regarding the combined binary data and the case file head. [0034], The hash calculation unit 116 calculates a hash value regarding the combined binary data and the case file head and adds the hash value to the case file head. When the hash calculation unit 116 calculates the hash value, SHA1 and MD5 algorithms may be used. When a copy of the case file is used, the hash value is used to check the integrity of the copied case file),
It would have been obvious to a person having ordinary skill in the art, before the effective filing date of the invention, to modify the method of “Unknown malcode detection using opcode representation” of disassembling the executable code and reconstructing the disassembled code by applying Kim’s method of calculating a hash value of the combined code, in order to form a byte by combining binary code and to check integrity by computing the hash value of the combined code (Kim: [0009]).
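For illustration only, the hash calculation cited from Kim [0034] (SHA1 or MD5 over combined data) can be sketched in Python as applied to an extracted opcode sequence. This is a minimal sketch under stated assumptions, not Kim's implementation: the function name `hash_opcode_sequence`, the space-joined encoding of the opcodes, and the sample opcodes are all hypothetical.

```python
import hashlib

def hash_opcode_sequence(opcodes, algorithm="sha1"):
    """Return the hex digest of an opcode sequence.

    Assumption: the opcodes are joined with single spaces and UTF-8 encoded
    before hashing; neither cited reference prescribes this encoding.
    """
    data = " ".join(opcodes).encode("utf-8")
    return hashlib.new(algorithm, data).hexdigest()

# Hypothetical opcode sequence, as might remain after operands are discarded.
opcodes = ["push", "mov", "call", "ret"]
sha1_digest = hash_opcode_sequence(opcodes, "sha1")
md5_digest = hash_opcode_sequence(opcodes, "md5")
```

The same input always yields the same digest, which is the property an integrity check of the kind described in Kim [0034] relies on.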
The “Unknown malcode detection using opcode representation” in view of Kim does not explicitly disclose:
and converting the hash value into N-gram data, wherein N is a natural number
However, in an analogous art, Rokka teaches:
and converting the hash value into N-gram data, wherein N is a natural number (Rokka: [0021] At stage A2, the second pre-processing pipeline learns the k most informative n-grams for malware detection based on the corpus 101. The n-gram generator 107 generates n-grams 121 from the cleaned/filtered text from the corpus 101. Domain knowledge will guide setting of the range of n for the n-gram generator 107 (e.g., 3<=n <=8). With this range for n, a dynamic malware analysis report can have more than 100,000 n-grams generated. To avoid the performance impact of calculating statistics for 100,000 n-grams for each report, information gain is calculated and then used to reduce the number of n-grams being considered. [0065] In the second trained pre-processing pipeline, the ensemble malware detector generates n-grams from the pre-processed report at block 613);
performing ensemble machine learning on the code blocks based on a decision tree having (Rokka: [0025], This description uses the term “boosting model” to refer generally to an ensemble of weak learners (e.g., decision trees) implemented according to a boosting algorithm (e.g., the Catboost algorithm, the gradient boosting machine algorithm, xgboost algorithm, etc….). [0028] FIG. 3 is an example diagram of an ensemble of trained machine learning models generating a malware classification from a dynamic malware analysis report. A deployed ensemble of trained machine learning models for malware detection (“ensemble malware detector”) includes two pre-processing pipelines—a first pre-processing pipeline that feeds a trained neural network 309 having a trained embedding layer 305 and a second pre-processing pipeline that partially feeds a trained boosting model 319),
(Rokka: [0079] FIG. 7 depicts an example computer system with an ensemble dynamic malware analysis text based malware detector and/or a trainer for an ensemble dynamic malware analysis text based malware detector. The computer system includes a processor 701 (possibly including multiple processors, multiple cores, multiple nodes, and/or implementing multi-threading, etc.)).
A person having ordinary skill in the art, before the effective filing date of the invention, would have found it obvious to modify “Unknown malcode detection using opcode representation” in view of Kim by applying the well-known technique as disclosed by Rokka of generating n-gram data using an ensemble of trained machine learning models, in order to simplify text data for analysis by breaking down complex sentence structures into smaller, more manageable parts. The motivation is to generate a malware detection output (Rokka: [abstract]).
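For illustration only, the n-gram generation and information-gain reduction cited from Rokka [0021] can be sketched as follows. This is a minimal sketch, not Rokka's implementation: the function names, the binary presence/absence features, the tie-breaking rule, and the toy inputs are assumptions; only the range 3 <= n <= 8 and the use of information gain to keep the k most informative n-grams come from the cited paragraph.

```python
import math
from collections import Counter

def ngrams(tokens, n_min=3, n_max=8):
    """Return the set of all n-grams of tokens for n_min <= n <= n_max."""
    grams = set()
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.add(tuple(tokens[i:i + n]))
    return grams

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(present, labels):
    """Information gain of a binary presence feature with respect to labels."""
    with_f = [y for x, y in zip(present, labels) if x]
    without_f = [y for x, y in zip(present, labels) if not x]
    total = len(labels)
    conditional = (len(with_f) / total) * entropy(with_f) \
        + (len(without_f) / total) * entropy(without_f)
    return entropy(labels) - conditional

def top_k_ngrams(docs, labels, k, n_min=3, n_max=8):
    """Rank every n-gram by information gain and keep the k best.

    Ties are broken by the n-gram itself so the result is deterministic.
    """
    doc_grams = [ngrams(d, n_min, n_max) for d in docs]
    vocab = set().union(*doc_grams)
    scored = [(information_gain([g in dg for dg in doc_grams], labels), g)
              for g in vocab]
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [g for _, g in scored[:k]]
```

A call such as `top_k_ngrams(docs, labels, k=2)` takes a list of token lists and a parallel list of class labels (for example 1 for malicious, 0 for benign) and returns the two n-grams that best separate the classes.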
The “Unknown malcode detection using opcode representation” in view of Kim and Rokka does not explicitly disclose:
removing code patterns unrelated to an identifier of an attack technique from the N-gram data and constructing one or more code blocks based on N-gram data with the code patterns removed;
wherein two parts of labels are assigned to the block-unit code, wherein a first label is indexed to the identifier of the attack technique and a second label is indexed to an attacker implementing the attack technique;
However, in an analogous art, Walkup discloses:
removing code patterns unrelated to an identifier of an attack technique (Walkup: [0013], markup language file may be parsed to remove matching repeated instances of the patterns (=removing code patterns) to generate a reduced code set (=constructing one or more code blocks). [0038], template reduce application removes one or more of all of the instances of the identified patterns from the template document to form a reduced template code object (block 324));
A person having ordinary skill in the art, before the effective filing date of the invention, would have found it obvious to modify “Unknown malcode detection using opcode representation” in view of Kim and Rokka by applying the well-known technique as disclosed by Walkup of removing repeated patterns to generate a reduced code set. The motivation is to enhance usability and complexity (e.g., relational associations among code items) of the program code utilized to generate the interfaces (Walkup: [0002]).
The “Unknown malcode detection using opcode representation” in view of Kim and Rokka and Walkup does not explicitly disclose:
wherein two parts of labels are assigned to the code blocks, wherein a first label is indexed to the identifier of the attack technique and a second label is indexed to an attacker implementing the attack technique.
However, in an analogous art, Taniguchi teaches:
(Taniguchi: [0060] The output unit 50 outputs the evaluation results output by the evaluation unit 30, that is, the output data 34 in which the weight (W) of the combination of the elements with respect to the subgroup is calculated to a file, a display, and the like. For example, the output unit 50 displays a display screen of the evaluation results output by the evaluation unit 30 on a display and the like. With this display, the user can confirm the content of the evaluation results):
wherein two parts of labels are assigned to the code blocks (Taniguchi: [0029] FIG. 2 is an explanatory diagram for explaining the cyber threat intelligence. As illustrated in FIG. 2, in a cyber threat intelligence 11, information on cyberattacks is described in a format such as the Structured Threat Information eXpression (STIX) and the like. For example, the STIX includes eight information groups including cyberattack activities (Campaigns), attackers (Threat_Actors), attack ways (TTPs), detection indicators (Indicators), observables (Observables), incidents (Incidents), counter measures (Courses_Of_Action), and attack targets (Exploit_Targets)),
wherein a first label is indexed to the identifier of the attack technique (Taniguchi: [0032] Furthermore, in an area 11c sandwiched by tags of “TTPs”, an attack method that is used, for example, spam mail, malware, a watering hole attack, and the like is described)
and a second label is indexed to an attacker implementing the attack technique (Taniguchi: [0034] Furthermore, in an area 11f sandwiched by tags of “Threat_Actors”, information regarding a person/organization for contributing to the cyberattack is individually described from viewpoints of a type of the attacker of the cyberattack, synchronization of the attacker, a skill of the attacker, an intention of the attacker, and the like)
and classifying both an identifier of the attack technique and the attacker implementing the attack technique with identifier based on the two parts of the labels (Taniguchi: [0029], the STIX includes eight information groups (=classifying) including cyberattack activities (Campaigns), attackers (Threat_Actors), attack ways (TTPs), detection indicators (Indicators), observables (Observables), incidents (Incidents), counter measures (Courses_Of_Action), and attack targets (Exploit_Targets)).
A person having ordinary skill in the art, before the effective filing date of the invention, would have found it obvious to modify “Unknown malcode detection using opcode representation” in view of Kim, Rokka and Walkup by applying the well-known technique as disclosed by Taniguchi of assigning information groups to the code and identifying the attacker and the attack method (TTP). The motivation is to acquire an access history to a communication device to be monitored and to determine a network attack based on the acquired access history (Taniguchi: [0003]).
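For illustration only, the two-part labeling recited in the claim (a first label indexed to an attack technique identifier, a second label indexed to an attacker) can be sketched as a simple data structure. This is a minimal sketch under stated assumptions and is not drawn from Taniguchi or from the application: the class `LabeledBlock`, the function `index_by_labels`, and every identifier value are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class LabeledBlock:
    block_id: str      # hypothetical identifier of a code block
    technique_id: str  # first label: identifier of the attack technique
    attacker: str      # second label: attacker implementing the technique

def index_by_labels(blocks):
    """Build two lookups: technique identifier -> blocks, attacker -> blocks."""
    by_technique = defaultdict(list)
    by_attacker = defaultdict(list)
    for block in blocks:
        by_technique[block.technique_id].append(block.block_id)
        by_attacker[block.attacker].append(block.block_id)
    return dict(by_technique), dict(by_attacker)

# Hypothetical labeled code blocks.
blocks = [
    LabeledBlock("blk-1", "technique-A", "actor-X"),
    LabeledBlock("blk-2", "technique-A", "actor-Y"),
    LabeledBlock("blk-3", "technique-B", "actor-X"),
]
by_technique, by_attacker = index_by_labels(blocks)
```

Keeping the two labels as separate fields lets the same block be retrieved either by technique or by attacker without re-labeling.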
Regarding Claim 2, “Unknown malcode detection using opcode representation” in view of Kim, Rokka, Walkup, and Taniguchi teaches:
The method of claim 1 (see rejection of claim 1 above),
wherein the disassembled code includes opcode corresponding to a function included in the input executable file and assembly code, which is an operand of the function (“Unknown malcode detection using Opcode representation”: [Section 3.3], The process of streamlining an executable starts with disassembling it. The disassembly process consists of translating the machine code instructions stored in the executable to a more human-readable language, namely, Assembly language)
Regarding Claim 6, “Unknown malcode detection using Opcode representation” teaches:
A cybersecurity threat information processing apparatus, comprising (“Unknown malcode detection using Opcode representation”: [Section 6], We presented a methodology for the representation of malicious and benign executables for the task of unknown malicious code detection through OpCode. This presentation, which has not previously been proposed, makes possible the highly accurate detection of unknown malicious code, based on previously seen examples, while maintaining low levels of false alarms. [section 3.2], We created a dataset of malicious and benign executables for the Windows operating system, which is the system most commonly used and most commonly attacked nowadays)
“Unknown malcode detection using opcode representation” does not explicitly disclose:
a database configured to store classified malware
and a processor configured to process an input executable file, wherein the processor executes:
However, in an analogous art, Taniguchi teaches:
a database configured to store classified malware (Taniguchi: [0027], The cyber threat intelligence DB 20 is a database that stores the cyber threat intelligences collected by the cyber threat intelligence collection unit 10)
and a processor configured to process an input executable file, wherein the processor executes (Taniguchi: [0010] According to an aspect of the embodiments, a non-transitory computer-readable storage medium storing a program that cause a processor included in an information processing apparatus to execute a process,)
This claim contains limitations identical to those of claim 1 above, albeit directed to a different statutory category (apparatus). For this reason, the same grounds of rejection are applied to claim 6.
Regarding Claim 7, this claim contains limitations identical to those of claim 2 above, albeit directed to a different statutory category (apparatus). For this reason, the same grounds of rejection are applied to claim 7.
Regarding Claim 9, this claim contains limitations identical to those of claim 4 above, albeit directed to a different statutory category (apparatus). For this reason, the same grounds of rejection are applied to claim 9.
Regarding Claim 11, Taniguchi teaches:
A non-transitory computer-readable storage medium storing a cybersecurity threat information processing program, the program being configured to (Taniguchi: [0010] a non-transitory computer-readable storage medium storing a program that cause a processor included in an information processing apparatus to execute a process):
This claim contains identical limitations found within that of claim 1 above albeit directed to a different statutory category (non-transitory computer-readable storage medium). For this reason, the same grounds of rejection are applied to claim 11.
Regarding Claim 12, “Unknown malcode detection using opcode representation” in view of Kim, Rokka, Walkup, and Taniguchi teaches:
The non-transitory computer-readable storage medium of claim 11 (see rejection of claim 11 above),
wherein the disassembled code includes opcode corresponding to a function included in the input executable file and assembly code, which is an operand of the function (“Unknown malcode detection using opcode representation”: [Section 3.3], The process of streamlining an executable starts with disassembling it. The disassembly process consists of translating the machine code instructions stored in the executable to a more human-readable language, namely, Assembly language)
Claim(s) 4 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Moskovitch, Robert, et al. "Unknown malcode detection using opcode representation." European conference on intelligence and security informatics. Berlin, Heidelberg: Springer Berlin Heidelberg, 2008 (hereinafter “Unknown malcode detection using opcode representation”) in view of Kim et al (U. S. PGPub. No. 2009/0157716 A1) (hereinafter “Kim”), Rokka Chhetri et al. (U. S. PGPub. No. 2022/0253691 A1) (hereinafter “Rokka”) and Walkup et al. (U. S. PGPub. No. 2018/0349345 A1) (hereinafter “Walkup”), TANIGUCHI et al. (U. S. PGPub. No. 2020/0065482 A1) (hereinafter “Taniguchi”) and Craioveanu et al. (U. S. 2010/0235913 A1) (hereinafter “Craioveanu”); and further in view of Shabtai et al (U. S. PGPub. No. 2022/0230070 A1) (hereinafter “Shabtai”).
Regarding Claim 4, the “Unknown malcode detection using opcode representation” in view of Kim, Rokka, Walkup, Taniguchi and Craioveanu teaches:
The method of claim 1 (see rejection of claim 1 above),
The combination of the “Unknown malcode detection using opcode representation” in view of Kim, Rokka, Walkup, Taniguchi and Craioveanu does not explicitly disclose:
wherein the converting the hash value into N-gram data includes: converting the hash value into byte data and converting the byte data into 2-gram data.
However, in an analogous art, Shabtai discloses:
wherein the converting the hash value into N-gram data includes (Shabtai: [0182] Additionally, a PE file can be represented using its actual binary data. For example, using byte n-grams (an n-gram is a data structure, originated in computational linguistics, represented by a contiguous sequence of n items usually drawn from a text or speech) to classify malwares has been suggested by [17]. Thus, instead of generating n-grams out of words or characters, [17] suggested generating n-grams out of bytes, while examining different sizes of n-grams ranging from 3 to 6, as well as three feature selection methods. They conducted numerous experiments with four types of models: artificial neural network (ANN), Decision Tree (DT), naïve Bayes (NB) and Support Vector Machine (SVM). DT was able to achieve the best accuracy of 94.3% with less than 4% of false-positives):
converting the hash value into byte data (Shabtai: [0181] The most common and simple way of representing a PE file is by calculating its hash value [9]. Hash values are generated using special function, namely hash functions, that maps data of arbitrary size onto data of a fixed size (commonly represented by numbers and letters). This method is frequently used by anti-virus engines to “mark” and identify malware, as computing hashes is considered fast and efficient);
and converting the byte data into 2-gram data (Shabtai: [0133] opcode2g: This detector uses features based on the disassembly of the PE file [16]. First, it disassembles the file and extracts the opcode of each instruction. Secondly, it generates bigrams (2-grams) representation of the opcodes. Thirdly, both the TF and DF values are computed for each bigram. Lastly, once again it selects the 300 features with the highest DF values. Using the selected features, a Random Forest classifier with 100 trees was trained).
A person having ordinary skill in the art, before the effective filing date of the invention, would have found it obvious to modify “Unknown malcode detection using opcode representation” in view of Kim, Rokka, Walkup, Taniguchi and Craioveanu by applying the well-known technique as disclosed by Shabtai of converting a hash value into 2-gram data. The motivation is to deal with continuously evolving threats, which have led to significant developments in the malware detection field (Shabtai: [0006]).
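For illustration only, the two steps recited in claim 4 (converting a hash value into byte data, then converting the byte data into 2-gram data) can be sketched as follows. This is a minimal sketch of one possible reading of the claim language, not an implementation taken from Shabtai or from the application: the function names, the use of SHA-1, the overlapping sliding window, and the sample input are assumptions.

```python
import hashlib

def hex_digest_to_bytes(hex_digest: str) -> bytes:
    """Convert a hexadecimal hash string into its raw byte data."""
    return bytes.fromhex(hex_digest)

def byte_bigrams(data: bytes) -> list:
    """Return the overlapping 2-grams (adjacent byte pairs) of data."""
    return [data[i:i + 2] for i in range(len(data) - 1)]

# Hypothetical input: the SHA-1 hex digest of a space-joined opcode sequence.
digest = hashlib.sha1(b"push mov call ret").hexdigest()
bigrams = byte_bigrams(hex_digest_to_bytes(digest))
```

A 20-byte SHA-1 digest yields 19 overlapping byte pairs; a non-overlapping window would be an equally plausible reading and would yield 10.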
Regarding claim 13, “Unknown malcode detection using opcode representation” in view of Kim, Rokka, Walkup, Taniguchi and Craioveanu teaches:
The non-transitory computer-readable storage medium of claim 11 (see rejection of claim 11 above),
wherein the program is configured to: (Taniguchi: [0010] a non-transitory computer-readable storage medium storing a program that cause a processor included in an information processing apparatus to execute a process),
convert the hash value into byte data (Shabtai: [0181] The most common and simple way of representing a PE file is by calculating its hash value [9]. Hash values are generated using special function, namely hash functions, that maps data of arbitrary size onto data of a fixed size (commonly represented by numbers and letters). This method is frequently used by anti-virus engines to “mark” and identify malware, as computing hashes is considered fast and efficient), and convert the byte data into 2-gram data (Shabtai: [0133] opcode2g: This detector uses features based on the disassembly of the PE file [16]. First, it disassembles the file and extracts the opcode of each instruction. Secondly, it generates bigrams (2-grams) representation of the opcodes. Thirdly, both the TF and DF values are computed for each bigram. Lastly, once again it selects the 300 features with the highest DF values. Using the selected features, a Random Forest classifier with 100 trees was trained).
A person having ordinary skill in the art, before the effective filing date of the invention, would have found it obvious to modify “Unknown malcode detection using opcode representation” in view of Kim, Rokka, Walkup, Taniguchi and Craioveanu by applying the well-known technique as disclosed by Shabtai of converting hash values into 2-gram data. The motivation is to deal with continuously evolving threats, which have led to significant developments in the malware detection field (Shabtai: [0006]).
Claim(s) 14-16 are rejected under 35 U.S.C. 103 as being unpatentable over Moskovitch, Robert, et al. "Unknown malcode detection using opcode representation." European conference on intelligence and security informatics. Berlin, Heidelberg: Springer Berlin Heidelberg, 2008 (hereinafter “Unknown malcode detection using opcode representation”) in view of Kim et al (U. S. PGPub. No. 2009/0157716 A1) (hereinafter “Kim”), Rokka Chhetri et al. (U. S. PGPub. No. 2022/0253691 A1) (hereinafter “Rokka”); and further in view of Walkup et al. (U. S. PGPub. No. 2018/0349345 A1) (hereinafter “Walkup”) and TANIGUCHI et al. (U. S. PGPub. No. 2020/0065482 A1) (hereinafter “Taniguchi”) and Craioveanu et al. (U. S. 2010/0235913 A1) (hereinafter “Craioveanu”).
Regarding Claim 14, “Unknown malcode detection using opcode representation” in view of Kim, Rokka and Walkup and Taniguchi disclose:
The method of claim 1 (see rejection of claim 1 above),
wherein the code blocks are selectively-combined using one or more identifiers of attack techniques (Taniguchi: [0032] Furthermore, in an area 11c sandwiched by tags of “TTPs”, an attack method that is used, for example, spam mail, malware, a watering hole attack, and the like is described);
and providing the cyber intelligence service for the user with proactive countermeasures against the predicted new malware through the user interface (Taniguchi: [0029] FIG. 2 is an explanatory diagram for explaining the cyber threat intelligence. As illustrated in FIG. 2, in a cyber threat intelligence 11, information on cyberattacks is described in a format such as the Structured Threat Information eXpression (STIX) and the like. For example, the STIX includes eight information groups including cyberattack activities (Campaigns), attackers (Threat_Actors), attack ways (TTPs), detection indicators (Indicators), observables (Observables), incidents (Incidents), counter measures (Courses_Of_Action), and attack targets (Exploit_Targets). [0030] That is, the cyber threat intelligence 11 is an example of cyberattack information. Furthermore, at the time of STIX version 1.1.1, the cyber threat intelligence 11 is described in an eXtensible Markup Language (XML) format as illustrated in FIG. 2.).
“Unknown malcode detection using opcode representation” in view of Kim, Rokka and Walkup and Taniguchi does not explicitly disclose:
selectively combining the code blocks and predicting new malware based on the combined code blocks
However, in an analogous art, Craioveanu teaches:
selectively combining the code blocks and (Craioveanu: [0031], Applicants have recognized that the analysis of a data file may yield a more accurate indication of suspiciousness when bit patterns appearing to be multiple machine instructions are analyzed collectively and in relation to each other, as opposed to each bit pattern matching a machine instruction being analyzed in isolation. [0039], The malware detection system 200 may receive the data files 135 from the email client 110 and forward them to the application 120 only if and when they are deemed unsuspicious (i.e., unlikely to contain any malicious code))
predicting new malware based on the combined code blocks (Craioveanu: [0039], the data files 135 are analyzed before they are loaded by the application 120, as illustrated in FIG. 2. The malware detection system 200 may receive the data files 135 from the email client 110 and forward them to the application 120 only if and when they are deemed unsuspicious (=predicting new malware) (i.e., unlikely to contain any malicious code). If the data files 135 are deemed to be suspicious, the malware detection system 200 may issue a warning and/or solicit her input from the user before allowing the files to be accessed in a manner that may enable them to take any unauthorized action)
A person having ordinary skill in the art, before the effective filing date of the invention, would have found it obvious to modify “Unknown malcode detection using opcode representation” in view of Kim, Rokka, Walkup and Taniguchi by applying the well-known technique as disclosed by Craioveanu of analyzing data files, using a malware detection system, before they are loaded by the application. The motivation is to detect malicious executable code in data collections (e.g., files) and to maximize the speed of the analysis without unduly sacrificing accuracy (Craioveanu: [0005]).
Regarding claim 15, this claim contains limitations identical to those of claim 14 above, albeit directed to a different statutory category (apparatus). For this reason, the same grounds of rejection are applied to claim 15.
Regarding claim 16, this claim contains identical limitations found within that of claim 14 above albeit directed to a different statutory category (non-transitory storage medium). For this reason the same grounds of rejection are applied to claim 16.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Refer to PTO-892, Notice of References Cited for a listing of analogous art.
Yang (U. S. Pat. No. 11,574,051 B2): Systems and methods for malware detection using multiple neural networks are provided. According to one embodiment, for each training sample, a supervised learning process is performed, including: (i) generating multiple code blocks of assembly language instructions by disassembling machine language instructions contained within the training sample; (ii) extracting dynamic features corresponding to each of the code blocks by executing each of the code blocks within a virtual environment; (iii) feeding each code block into a first neural network and the corresponding dynamic features into a second neural network; (iv) updating weights and biases of the neural networks based on whether the training sample was malware or benign; and (v) after processing a predetermined or configurable number of the training samples, the neural networks criticize each other and unify their respective weights and biases by exchanging their respective weights and biases and adjusting their respective weights and biases accordingly.
Zou et al. (U. S. Pat. No. 11,463,473 B2): The present disclosure provides a large-scale malware classification system, comprising a client for unloading malware and a server for receiving and classifying the malware, wherein the server comprises a deep learning module for classifying the malware.
Compagna et al. (U. S. PGPub. No. 2017/0109534 A1): A security testing framework leverages attack patterns to generate test cases for evaluating security of Multi-Party Web Applications (MPWAs). Attack patterns comprise structured artifacts capturing key information to execute general-purpose attacker strategies. The patterns recognize commonalities between attacks, e.g., abuse of security-critical parameter(s), and the attacker's strategy relating to protocol patterns associated with those parameters. A testing environment is configured to collect several varieties of HTTP traffic. User interaction with the MPWA while running security protocols, is recorded. An inference module executes the recorded symbolic sessions, tagging elements in the HTTP traffic with labels. This labeled HTTP traffic is referenced to determine particular attack patterns that are to be applied, and corresponding specific attack test cases that are to be executed against the MPWA. Attacks are reported back to the tester for evaluation. Embodiments may be implemented with penetration testing tools, in order to automate execution of complex attacker strategies.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RUPALI DHAKAD whose telephone number is (571)270-3743. The examiner can normally be reached M-F 8:30-5:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alexander Lagor, can be reached at (571) 270-5143. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/R.D./Examiner, Art Unit 2437
/ALEXANDER LAGOR/Supervisory Patent Examiner, Art Unit 2437