DETAILED ACTION
This communication is responsive to the Applicant’s arguments filed on 12/23/2025. Claims 1, 3, 6, 7, 9, 12, 13, 15, 17, and 18 are pending examination.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to arguments
REJECTIONS UNDER 35 U.S.C. 103:
Applicants’ arguments regarding the rejection of claims 1, 7, 13, 15, 17, 18 and 3, 5, 6, 9, 11, 12 have been fully considered and are not persuasive. Applicant argues that the claimed features are distinguishable from the cited prior art references.
Examiner disagrees.
Applicant argues (a) generating the adjacency matrix comprises updating the adjacency matrix in response to an API that is executed as the APIs included in the source data are sequentially executed being associated with another API.
Examiner relies on Anderson and Cavazos to disclose this limitation. Applicant states that Anderson discloses only a fully pre-built, static matrix. However, Anderson discloses representing a sequence-derived graph as a Markov chain, where the edge weight between states corresponds to the transition probability from state i to state j, and further teaches representing that graph with an adjacency matrix. Anderson also separately discloses using API-call information as a malware-relevant view. Deriving transition probabilities from an instruction sequence necessarily entails tallying or updating the relevant matrix/edge entries as the sequence is processed. Cavazos further places such graph construction in a malware-detection pipeline and expressly states that the call graphs need not be limited to decompiled binaries, because they can also be derived from source code. Anderson does not need to state “real-time” explicitly to suggest the claim limitation. The cited limitation is satisfied by sequential construction/updating of graph relationships during analysis of the program/API sequence. Anderson’s sequence-to-transition graph, together with Cavazos’ malware-detection framework using code-derived call graphs and machine learning, discloses the incremental population of the adjacency matrix.
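For illustration only, the sequential tallying described above can be sketched as follows. This is a minimal sketch and is not taken from Anderson, Cavazos, or the application: the function name build_transition_matrix and the API names are hypothetical, and counting consecutive pairs followed by row normalization is an assumed way of estimating transition probabilities.

```python
def build_transition_matrix(api_sequence, vocabulary):
    """Tally consecutive API pairs one step at a time, then row-normalize.

    Hypothetical helper for illustration; not an implementation from any
    cited reference.
    """
    index = {name: i for i, name in enumerate(vocabulary)}
    n = len(vocabulary)
    counts = [[0] * n for _ in range(n)]

    # Each consecutive pair (current, following) updates exactly one entry,
    # so the matrix is populated incrementally as the sequence is processed.
    for current, following in zip(api_sequence, api_sequence[1:]):
        counts[index[current]][index[following]] += 1

    # Row-normalize so the outgoing weights of each visited state sum to 1.
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Assumed example data: API names chosen only for illustration.
vocabulary = ["CreateFile", "WriteFile", "CloseHandle"]
sequence = ["CreateFile", "WriteFile", "WriteFile", "CloseHandle"]
matrix = build_transition_matrix(sequence, vocabulary)
```

In this sketch every entry of the matrix exists only because a pair in the sequence was processed, which is the sense in which a sequence-derived transition matrix is built up step by step rather than supplied in finished form.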
Applicant argues (b) activating a region corresponding to APIs connected to each other in the adjacency matrix; and (c) classifying the adjacency matrix using the activated region as an input value for the machine-learning-based analysis model.
Examiner disagrees. Examiner relies on Yin to disclose the argument. Yin teaches reordering vertices so that the connection information elements are concentrated in a diagonal region of a second adjacency matrix, then applying diagonal convolution/filtering with an activation function to extract local graph features from that region. Yin further explains that higher activation values correspond to a higher probability that the filter structure appears at the corresponding position in the graph, and gives an example in which the region corresponding to an activation value of 0.99 maps to a specific subgraph region defined by particular vertices. That is a region-specific activation of connected graph structure, not merely a cosmetic spatial preprocessing step.
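The notion of a region-specific activation can be made concrete with a small sketch. It is illustrative only and is not Yin's implementation: the filter weights, the bias value, the sigmoid activation, and the name diagonal_activations are all assumptions chosen to show how a filter sliding along the diagonal yields a high value where a connected structure sits and a low value elsewhere.

```python
import math


def diagonal_activations(adjacency, filt, bias):
    """Slide a k x k filter along the main diagonal and apply a sigmoid.

    Hypothetical helper for illustration; weights and bias are assumed.
    """
    n, k = len(adjacency), len(filt)
    activations = []
    for start in range(n - k + 1):
        # Weighted sum over the k x k window anchored on the diagonal.
        total = bias + sum(
            adjacency[start + r][start + c] * filt[r][c]
            for r in range(k)
            for c in range(k)
        )
        activations.append(1.0 / (1.0 + math.exp(-total)))
    return activations


# Assumed 5-vertex graph: vertices 0, 1, 2 form a triangle; 3 and 4 are isolated.
adjacency = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
# Assumed filter that responds to a fully connected 3-vertex subgraph.
triangle_filter = [
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
]
activations = diagonal_activations(adjacency, triangle_filter, bias=-3.0)
best_position = activations.index(max(activations))
```

Here the highest activation falls on the window covering vertices 0 to 2, so the activation value identifies a specific subgraph region rather than the matrix as a whole.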
Applicant also argues that Yin’s classifier uses the whole rearranged matrix rather than the activated region. Examiner disagrees. Yin discloses that the second adjacency matrix is input into a feature generation module, which uses filter matrices and activation functions to generate feature vectors corresponding to subgraph classification. These extracted features are then fed into convolution/pooling modules and a class-labeling module, which classifies based on those pooling results. Hence, the operative classifier input is not the untouched matrix but the extracted/activated local feature representation derived from the relevant regions of the matrix. This suggests “classifying the adjacency matrix using the activated region as an input value”. Applicants’ argument of “only activated regions” is not valid.
Applicants’ argument that Yin is concerned solely with computation reduction is also inaccurate. Yin discloses both that concentrating connection information reduces computation and that the graph-classification system can capture larger subgraph structures and deep implicit correlation features, thereby improving classification accuracy. Hence, Yin provides not just an efficiency justification but also a performance rationale directly tied to classification quality.
Regarding the proposed combination of references:
Applicant argues that the combination of Cavazos, Anderson, and Yin would result in, at best, a static call-graph similarity system with generic matrix reordering and that no reference, individually or in combination, teaches or suggests a real-time-updated adjacency matrix reflecting sequential API execution order, followed by semantic activation of interconnected API clusters and classification based specifically on those activated regions.
Examiner disagrees. Cavazos provides the overall malware-detection/classification framework based on call graphs and machine learning, including DNN-based classification of graph-derived inputs. Anderson provides a graph/adjacency-matrix representation grounded in sequence-derived transitions and malware-relevant API information. Yin provides adjacency-matrix regularization, localized convolution/activation over concentrated connected regions, and graph classification based on extracted local subgraph features. These teachings are aimed at the same problem, namely machine-learning classification of graph-structured program information. A person of ordinary skill in the art would have applied Yin’s adjacency-matrix feature extraction/classification techniques to the call-graph representations used in Cavazos and Anderson to obtain the advantages Yin identifies, such as reduced computation and improved extraction of discriminative local graph patterns, within Cavazos’ malware-detection pipeline. Yin is not generic matrix reordering; it is adjacency-matrix-based local feature extraction plus CNN-style classification. Anderson is not a mere graph image; it is a sequence-derived transition representation disclosed as an adjacency matrix. Cavazos is not limited to non-source inputs; it discloses source-code-derived call graphs and machine-learning classification of those graphs. Hence, the references in combination suggest the claimed invention, and Applicants’ arguments are not persuasive. The rejection is maintained.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 3, 6, 7, 9, 12, 13, 15, 17, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Cavazos et al. (US 20170068816 A1), hereinafter referred to as Cavazos, in view of Anderson et al. (US 20160306971 A1), hereinafter referred to as Anderson, in further view of Yin et al. (US 20200110777 A1), hereinafter referred to as Yin.
As per claim 1, an apparatus including a machine-learning unit for detecting and classifying malicious code comprising:
one or more processors; and (Independent processors on a multi-core processor, Cavazos, para [0041]).
a memory for storing one or more instructions, wherein the one or more processors execute the stored one or more instructions to perform operations comprising, (A memory, Cavazos, para [0022])
generating graph information from source data including a plurality of nodes corresponding to APIs included in the source data and one or more edges connecting between the plurality of nodes; (At block 406, call graphs are constructed for the subject executable, Cavazos, para [0047]. The call graphs can include library calls, system calls and other functions. This does not limit the graph to OS APIs and instead uses a complete call graph capturing system and non-system APIs alike).
generating an adjacency matrix between the APIs included in the source data using the graph information; and (At block 408, similarity vectors are generated. The similarity vectors may be generated using a graph kernel (such as a parallelized graph kernel) that represents the similarity between the subject call graphs and the call graphs for each of the plurality of malware binary executables and the plurality of good ware binary executables, Cavazos, para [0048]).
wherein the graph information is generated using only APIs built into an operating system in order to maintain a constant size of the adjacency matrix to be generated, (The call graphs are constructed by disassembling the executable code and identifying the function calls between procedures, Cavazos, para [0027]).
wherein only call graph information is used as the graph information and the call graph information is text data written in a graph modeling language, (Formally, a call graph (CG) can be represented as $G = [V, E]$, where $V$ is a set of nodes and each node $v \in V$ represents one of the functions. $E \subseteq V \times V$ denotes the directed edges, where an edge $e_{i,j} = (v_i, v_j)$ represents a call from the caller function represented by $v_i$ to the callee function represented by $v_j$. Each vertex may be labeled with a feature vector representing a histogram of the instructions in the function, Cavazos, para [0030]. This indicates that the call graph information is structured as data with nodes labeled by feature vectors which can be interpreted as text data or encoded in a graph modeling language for further processing).
However, Cavazos does not explicitly disclose the limitations:
detecting malicious code included in the source data using the adjacency matrix as an input value for a machine-learning-based analysis model,
wherein generating the adjacency matrix comprises updating the adjacency matrix in response to an API that is executed as the APIs included in the source data are sequentially executed being associated with another API,
Anderson discloses:
detecting malicious code included in the source data using the adjacency matrix as an input value for a machine-learning-based analysis model, (Each subroutine is automatically labeled in a function call graph and the use of a probabilistic approach to find signatures of malware, Anderson, para [0028]).
wherein generating the adjacency matrix comprises updating the adjacency matrix in response to an API that is executed as the APIs included in the source data are sequentially executed being associated with another API, (A multiview approach may be used to construct the subroutine kernel (or similarity) matrix for use in the classification method. The different views may include the instructions contained within each subroutine, the Application Programming Interface (API) calls contained within each subroutine, and the subroutine's neighbor information. The API calls are performed within the subroutine. The edge weights for edges originating at $v_i$ are required to sum to 1, $\sum_{i \to j} e_{ij} = 1$. An $n \times n$ ($n = |V|$) adjacency matrix is used to represent the graph, where for each entry $a_{ij}$ in the matrix, $a_{ij} = e_{ij}$, Anderson, para [0036]. This implies that the graph structure and thus the adjacency matrix evolves as APIs are analyzed in their sequential execution context within subroutines. Because the adjacency matrix encodes API call relationships within subroutines and accounts for neighboring subroutines, it naturally updates to represent the sequential execution and associations between APIs as discovered. The kernel matrix constructed via this process is used for classification (i.e. malware detection), which requires incremental or dynamic updates to accurately capture changing API call relationships as the source data is processed).
A person of ordinary skill in the art before the effective filing date of the claimed invention would have combined Cavazos and Anderson by malware detection methods and systems (Cavazos) and automated malware identification (Anderson). It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine Cavazos and Anderson in order to effectively identify a subroutine as potentially indicative of malware in a call graph (See Anderson, para [0036]).
Cavazos in view of Anderson does not explicitly disclose:
wherein detecting the malicious code included in the source data comprises,
activating a region corresponding to APIs connected to each other in the adjacency matrix; and
classifying the adjacency matrix using the activated region as an input value for the machine-learning-based analysis model.
Yin discloses:
wherein detecting the malicious code included in the source data comprises, activating a region corresponding to APIs connected to each other in the adjacency matrix; and (The connection information is concentrated in the adjacency matrix into the diagonal region of the adjacency matrix, and further uses the filter matrix to extract the subgraph structure of the graph in the diagonal direction, greatly reducing the computational complexity. The amount of computation can be reduced to 25% compared with latter, Yin, para [0028])
classifying the adjacency matrix using the activated region as an input value for the machine-learning-based analysis model. (At the same time, the stacked CNN is used for feature extraction to capture large multi-vertex subgraph structures and deep features of the topological structure through smaller windows size, Yin, para [0028]).
A person of ordinary skill in the art before the effective filing date of the claimed invention would have combined Cavazos and Anderson by malware detection methods and systems (Cavazos) and automated malware identification (Anderson) with graph feature extraction based on adjacency matrix (Yin). It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine Cavazos and Anderson with Yin in order to improve the accuracy of graph classification (See Yin, para [0063]).
As per claim 3, Cavazos, Anderson and Yin disclose the apparatus of claim 1, wherein
Furthermore, Yin discloses:
the adjacency matrix is a two-dimensional matrix containing one or more columns corresponding to the API included in the source data and one or more rows corresponding to the API included in the source data (The connection information regularization module is configured to reorder all the vertices in the first adjacency matrix of the graph to obtain a second adjacency matrix, and the connection information elements in the second adjacency matrix are mainly distributed in a diagonal area of width n of the second adjacency matrix, where n is a positive integer, n≥2 and n<|V|, and |V| is the number of rows or columns of the second adjacency matrix. Preferably, the diagonal region refers to the diagonal region from the upper left corner to the lower right corner of the matrix. For example, the shaded region in FIG. 1 is a diagonal region with a width of 3 in a 6x6 adjacency matrix, Yin, para [0063]).
A person of ordinary skill in the art before the effective filing date of the claimed invention would have combined Cavazos and Anderson by malware detection methods and systems (Cavazos) and automated malware identification (Anderson) with graph feature extraction based on adjacency matrix (Yin). It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine Cavazos and Anderson with Yin in order to improve the accuracy of graph classification (See Yin, para [0063]).
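The effect of reordering vertices so that connection information falls in a diagonal region can be shown with a small worked example. It is a sketch under stated assumptions and does not reproduce Yin's reordering procedure: the vertex order is chosen by hand, the helper names are hypothetical, and a diagonal region of width 3 is assumed to mean the entries with |i - j| < 3.

```python
def adjacency_from_edges(n, edges):
    """Build a symmetric 0/1 adjacency matrix from an undirected edge list."""
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        matrix[u][v] = 1
        matrix[v][u] = 1
    return matrix


def permute(adjacency, order):
    """Reorder rows and columns so that new index p holds old vertex order[p]."""
    return [[adjacency[u][v] for v in order] for u in order]


def band_fraction(adjacency, width):
    """Fraction of nonzero entries lying within |i - j| < width (assumed band)."""
    inside = total = 0
    for i, row in enumerate(adjacency):
        for j, value in enumerate(row):
            if value:
                total += 1
                if abs(i - j) < width:
                    inside += 1
    return inside / total if total else 0.0


# Assumed 6-vertex graph whose edges start far from the diagonal.
first = adjacency_from_edges(6, [(0, 5), (1, 4), (2, 3)])
# Hand-picked order that places connected vertices next to each other.
second = permute(first, [0, 5, 1, 4, 2, 3])

before = band_fraction(first, width=3)
after = band_fraction(second, width=3)
```

Both matrices remain two-dimensional with one row and one column per vertex; only the vertex order differs, and in the reordered matrix all connection entries lie inside the assumed diagonal band.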
As per claim 6, Cavazos, Anderson and Yin disclose the apparatus of claim 1, wherein
Furthermore, Yin discloses:
classifying the adjacency matrix is performed by a convolution neural network algorithm using the activated region as an input image (The invention first concentrates the connection information elements in the adjacency matrix into a specific diagonal region of the adjacency matrix which reduces the non-connection information elements in advance. Then the subgraph structure of the graph is further extracted along the diagonal direction using the filter matrix. Further, it uses a stacked convolutional neural network to extract a larger subgraph structure, Yin, para [0029]- [0030]).
A person of ordinary skill in the art before the effective filing date of the claimed invention would have combined Cavazos and Anderson by malware detection methods and systems (Cavazos) and automated malware identification (Anderson) with graph feature extraction based on adjacency matrix (Yin). It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine Cavazos and Anderson with Yin in order to improve the accuracy of graph classification (See Yin, para [0063]).
As per claim 7, Cavazos discloses a method for detecting and classifying malicious code comprising:
generating, by a malicious code detection and classification apparatus, graph information from source data including a plurality of nodes corresponding to APIs included in the source data and one or more edges connecting between the plurality of nodes; (At block 406, call graphs are constructed for the subject executable, Cavazos, para [0047]. The call graphs can include library calls, system calls and other functions. This does not limit the graph to OS APIs and instead uses a complete call graph capturing system and non-system APIs alike)
generating, by the malicious code detection and classification apparatus, an adjacency matrix between the APIs included in the source data using the graph information; and (At block 408, similarity vectors are generated. The similarity vectors may be generated using a graph kernel (such as a parallelized graph kernel) that represents the similarity between the subject call graphs and the call graphs for each of the plurality of malware binary executables and the plurality of good ware binary executables, Cavazos, para [0048]).
wherein the graph information is generated using only APIs built into an operating system in order to maintain a constant size of the adjacency matrix to be generated, (The call graphs are constructed by disassembling the executable code and identifying the function calls between procedures, Cavazos, para [0027]).
wherein only call graph information is used as the graph information and the call graph information is text data written in a graph modeling language, (Formally, a call graph (CG) can be represented as $G = [V, E]$, where $V$ is a set of nodes and each node $v \in V$ represents one of the functions. $E \subseteq V \times V$ denotes the directed edges, where an edge $e_{i,j} = (v_i, v_j)$ represents a call from the caller function represented by $v_i$ to the callee function represented by $v_j$. Each vertex may be labeled with a feature vector representing a histogram of the instructions in the function, Cavazos, para [0030]. This indicates that the call graph information is structured as data with nodes labeled by feature vectors which can be interpreted as text data or encoded in a graph modeling language for further processing).
However, Cavazos does not explicitly disclose the limitations:
detecting, by the malicious code detection and classification apparatus, malicious code included in the source data using the adjacency matrix as an input value for a machine-learning-based analysis model,
wherein generating the adjacency matrix comprises updating, by the malicious code detection and classification apparatus, the adjacency matrix in response to an API that is executed as the APIs included in the source data are sequentially executed being associated with another API,
Anderson discloses:
detecting, by the malicious code detection and classification apparatus, malicious code included in the source data using the adjacency matrix as an input value for a machine-learning-based analysis model, (Each subroutine is automatically labeled in a function call graph and the use of a probabilistic approach to find signatures of malware, Anderson, para [0028]).
wherein generating the adjacency matrix comprises updating, by the malicious code detection and classification apparatus, the adjacency matrix in response to an API that is executed as the APIs included in the source data are sequentially executed being associated with another API, (A multiview approach may be used to construct the subroutine kernel (or similarity) matrix for use in the classification method. The different views may include the instructions contained within each subroutine, the Application Programming Interface (API) calls contained within each subroutine, and the subroutine's neighbor information. The API calls are performed within the subroutine. The edge weights for edges originating at $v_i$ are required to sum to 1, $\sum_{i \to j} e_{ij} = 1$. An $n \times n$ ($n = |V|$) adjacency matrix is used to represent the graph, where for each entry $a_{ij}$ in the matrix, $a_{ij} = e_{ij}$, Anderson, para [0036]. This implies that the graph structure and thus the adjacency matrix evolves as APIs are analyzed in their sequential execution context within subroutines. Because the adjacency matrix encodes API call relationships within subroutines and accounts for neighboring subroutines, it naturally updates to represent the sequential execution and associations between APIs as discovered. The kernel matrix constructed via this process is used for classification (i.e. malware detection), which requires incremental or dynamic updates to accurately capture changing API call relationships as the source data is processed).
A person of ordinary skill in the art before the effective filing date of the claimed invention would have combined Cavazos and Anderson by malware detection methods and systems (Cavazos) and automated malware identification (Anderson). It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine Cavazos and Anderson in order to effectively identify a subroutine as potentially indicative of malware in a call graph (See Anderson, para [0036]).
Cavazos in view of Anderson does not explicitly disclose the limitations:
wherein the malicious code included in the source data comprises, activating, by the malicious code detection and classification apparatus, a region corresponding to APIs connected to each other in the adjacency matrix by a filter; and
classifying, by the malicious code detection and classification apparatus, the adjacency matrix using the activated region as an input value for the machine-learning based analysis model.
Yin discloses:
wherein the malicious code included in the source data comprises, activating, by the malicious code detection and classification apparatus, a region corresponding to APIs connected to each other in the adjacency matrix by a filter; and (The connection information is concentrated in the adjacency matrix into the diagonal region of the adjacency matrix, and further uses the filter matrix to extract the subgraph structure of the graph in the diagonal direction, greatly reducing the computational complexity. The amount of computation can be reduced to 25% compared with latter, Yin, para [0028])
classifying, by the malicious code detection and classification apparatus, the adjacency matrix using the activated region as an input value for the machine-learning based analysis model. (At the same time, the stacked CNN is used for feature extraction to capture large multi-vertex subgraph structures and deep features of the topological structure through smaller windows size, Yin, para [0028]).
A person of ordinary skill in the art before the effective filing date of the claimed invention would have combined Cavazos and Anderson by malware detection methods and systems (Cavazos) and automated malware identification (Anderson) with graph feature extraction based on adjacency matrix (Yin). It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine Cavazos and Anderson with Yin in order to improve the accuracy of graph classification (See Yin, para [0063]).
As per claim 9, Cavazos, Anderson and Yin disclose the method of claim 7, wherein
Furthermore, Yin discloses:
generating the adjacency matrix comprises generating, by the malicious code detection and classification apparatus, a two-dimensional matrix containing one or more columns corresponding to the API included in the source data and one or more rows corresponding to the API included in the source data (The connection information regularization module is configured to reorder all the vertices in the first adjacency matrix of the graph to obtain a second adjacency matrix, and the connection information elements in the second adjacency matrix are mainly distributed in a diagonal area of width n of the second adjacency matrix, where n is a positive integer, n≥2 and n<|V|, and |V| is the number of rows or columns of the second adjacency matrix. Preferably, the diagonal region refers to the diagonal region from the upper left corner to the lower right corner of the matrix. For example, the shaded region in FIG. 1 is a diagonal region with a width of 3 in a 6x6 adjacency matrix, Yin, para [0063]).
A person of ordinary skill in the art before the effective filing date of the claimed invention would have combined Cavazos and Anderson by malware detection methods and systems (Cavazos) and automated malware identification (Anderson) with graph feature extraction based on adjacency matrix (Yin). It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine Cavazos and Anderson with Yin in order to improve the accuracy of graph classification (See Yin, para [0063]).
As per claim 12, Cavazos, Anderson and Yin disclose the method of claim 11, wherein
Furthermore, Yin discloses:
classifying the adjacency matrix is performed by a convolutional neural network algorithm using the activated region as an input image (The invention first concentrates the connection information elements in the adjacency matrix into a specific diagonal region of the adjacency matrix which reduces the non-connection information elements in advance. Then the subgraph structure of the graph is further extracted along the diagonal direction using the filter matrix. Further, it uses a stacked convolutional neural network to extract a larger subgraph structure, Yin, para [0029]- [0030])
A person of ordinary skill in the art before the effective filing date of the claimed invention would have combined Cavazos and Anderson by malware detection methods and systems (Cavazos) and automated malware identification (Anderson) with graph feature extraction based on adjacency matrix (Yin). It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine Cavazos and Anderson with Yin in order to improve the accuracy of graph classification (See Yin, para [0063]).
As per claim 13, Cavazos, Anderson and Yin disclose a computer-readable recording medium
Furthermore, Cavazos discloses:
storing a computer program for executing the malicious code detection and classification method according to claim 7 combined with hardware (Detection system 100 and heterogeneous system 102, Cavazos, para [0021])
As per claim 15, Cavazos and Anderson disclose a computer-readable recording medium
Furthermore, Cavazos discloses:
storing a computer program for executing the malicious code detection and classification method according to claim 9 combined with hardware (Detection system 100 and heterogeneous system 102, Cavazos, para [0021]).
As per claim 17, Cavazos, Anderson and Yin disclose a computer-readable recording medium
Furthermore, Cavazos discloses:
storing a computer program for executing the malicious code detection and classification method according to claim 7 combined with hardware (Detection system 100 and heterogeneous system 102, Cavazos, para [0021]).
As per claim 18, Cavazos and Anderson disclose a computer-readable recording medium
Furthermore, Cavazos discloses:
storing a computer program for executing the malicious code detection and classification method according to claim 12 combined with hardware (Detection system 100 and heterogeneous system 102, Cavazos, para [0021]).
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RAGHAVENDER CHOLLETI whose telephone number is (703) 756-1065. The examiner can normally be reached M-Th 7:30 AM - 4:30 PM EST and variable Fridays.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, RUPAL DHARIA can be reached on (571) 272-3880. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patentcenter for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
Respectfully Submitted,
/RAGHAVENDER NMN CHOLLETI/Examiner, Art Unit 2492
/RUPAL DHARIA/ Supervisory Patent Examiner, Art Unit 2492