Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 02/10/2026 has been entered.
Response to Amendments / Arguments
Regarding the rejection(s) of claims under 35 USC 103:
Applicant’s arguments, filed 02/10/2026, in view of the amended claims, have been fully considered and are not persuasive.
Applicant argues that "the Office has not identified a disclosure in Shuang or Yu of collecting 'function calls' from nodes visited during a static (i.e., without executing the binary file) random walk to form sentences."
In response, paragraphs [0055]-[0058] of Yu explicitly recite that "block 410 starts from running a binary analysis tool to extract a control-flow graph (CFG) for each function in the input program binary" and "The graph walking step 420 traverses the graphs produced by 410, and extracts system-call sequences corresponding to paths including only system-call nodes in those graphs. Block 420 employs different walking strategies, including random walk up to a specified length." Further, [0056] recites "After identifying all the function-call nodes, block 410 performs a CFG reduction... The resulting CFG becomes a function-call graph" and [0057] states "block 420 further performs random walks starting from each system-call node up to a user-specified length for up to a user-specified times to collect system-call sequences." Additionally, paragraphs [0022], [0027], and [0028] confirm this is performed without execution, reciting "embodiments of the present invention use lightweight binary analysis" and "effectively remove the need for runtime instrumentation." Accordingly, Yu teaches collecting function calls (system calls being function calls) from nodes visited during static random walks to form sequences.
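For illustration only, the kind of static, bounded random walk described in the quoted passages can be sketched as follows. The graph contents, the function names `random_walk` and `collect_sentences`, and the parameters `max_length` and `walks_per_node` are hypothetical and are not taken from Yu or from the application; the sketch only shows that sequences of calls can be collected by traversing a graph data structure without executing any binary.

```python
import random

# Hypothetical function-call graph: node -> list of successor nodes.
# Node names stand in for call sites recovered by static analysis;
# no program is executed at any point.
CALL_GRAPH = {
    "socket": ["connect"],
    "connect": ["sendmsg", "recvmsg"],
    "sendmsg": ["recvmsg"],
    "recvmsg": ["recvmsg", "close"],
    "close": [],
}


def random_walk(graph, start, max_length, rng):
    """Walk the graph from `start`, recording the call at each visited node."""
    sentence = [start]
    node = start
    # Stop at the length bound or when the current node has no successors.
    while len(sentence) < max_length and graph[node]:
        node = rng.choice(graph[node])
        sentence.append(node)
    return sentence


def collect_sentences(graph, max_length, walks_per_node, seed=0):
    """Run a fixed number of bounded walks starting from every node."""
    rng = random.Random(seed)
    return [
        random_walk(graph, start, max_length, rng)
        for start in graph
        for _ in range(walks_per_node)
    ]


sentences = collect_sentences(CALL_GRAPH, max_length=4, walks_per_node=2)
for sentence in sentences:
    print(" ".join(sentence))
```

Because `recvmsg` has an edge to itself in this hypothetical graph, walks through it can emit the same call a varying number of times in a row.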
Applicant argues that "Yu does not teach or suggest deduplicating sentences generated by static random walks through recursive functions or through a plurality of functions that 'produce substantially identical behaviors,' based on a determination that two sentences are 'substantially identical ... except for a difference in a count of repeated function calls.'"
In response, paragraph [0057] of Yu recites "The combination of n-gram enumeration and random walks addresses the path explosion problem where all target graphs combined become too large and complex to be fully traversed," which includes handling recursive function calls that generate repetitive patterns. Furthermore, paragraphs [0059]-[0060] recite training "an embedding model using system-call-sequence corpuses sampled from different user functions" where "similar sequences would be in close proximity to each other," and paragraph [0065] describes using "Jaccard similarity index" to measure similarity between sequences. Consequently, sequences differing only in repetition counts (e.g., "recvmsg recvmsg" versus "recvmsg recvmsg recvmsg") would be identified as similar through the embedding space proximity and Jaccard similarity measurements, thereby teaching deduplication based on substantial identity except for differences in repeated function call counts.
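For illustration only, a set-based Jaccard index applied to the two example sequences above can be sketched as follows. Treating each sequence as the set of its distinct calls is an assumption made for this sketch; the helper name `jaccard` is hypothetical and the exact formulation used in Yu is not reproduced here.

```python
def jaccard(a, b):
    """Jaccard index of the sets of distinct calls in two sequences."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0  # two empty sequences are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


s1 = "recvmsg recvmsg".split()
s2 = "recvmsg recvmsg recvmsg".split()
s3 = "sendmsg close".split()

print(jaccard(s1, s2))  # 1.0
print(jaccard(s1, s3))  # 0.0
```

Under this assumption, the two sequences that differ only in the repetition count of `recvmsg` receive the maximum similarity score.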
Applicant argues that Yu does not teach "a plurality of functions produce substantially identical behaviors."
In response, paragraphs [0058]-[0059] of Yu recite "Each system-call-sequence corpus contains one or more system-call sequence... Each corpus represents possible system-call patterns sampled from the related lower-level user function," and paragraph [0066] provides the example where "block 740 creates a candidate set C1 of lower-level user functions from which those identified closest sequences come from," explicitly teaching that multiple different functions can produce similar system-call sequences. The entire purpose of Yu's embedding model in [0059] is to handle this scenario: "similar sequences would be in close proximity to each other." Therefore, Yu teaches that multiple functions produce substantially identical observable behaviors (similar system-call sequences).
Therefore, the identified claim language is considered to be taught by the cited references, and the rejection is maintained. Further, since Applicant has not presented additional arguments concerning the dependent claims, their rejections are likewise maintained.
DETAILED ACTION
This is a reply to the arguments filed on 02/10/2026, in which claims 1-20 are pending. Claims 1 and 11 are independent.
When making claim amendments, the applicant is encouraged to consider the references in their entireties, including those portions that have not been cited by the examiner and their equivalents as they may most broadly and appropriately apply to any particular anticipated claim amendments.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-2, 6-12 and 16-20 are rejected under 35 U.S.C. 103 as being unpatentable over Shuang et al. (CN 113434858 A, referred to as Shuang), in view of Yu et al. (US 20220050895 A1, referred to as Yu) in further view of Klic et al. (US 20140258990 A1, referred to as Klic).
In reference to claim 1, A computer-implemented method for binary file analysis (Shuang: [0024]-[0029] Provides for a computer-implemented method for analyzing binary files (malware).)
Receiving, by a computer system, a binary file, wherein the binary file comprises executable code; (Shuang: [0004], [0007], [0027] and [0047] Provides for analyzing received PE (portable executable) files.)
Generating, by the computer system, assembly code from the binary file, wherein the assembly code comprises a sequence of instructions that can be executed on an other computing system (Shuang: [0004], [0007], [0027] and [0047] Provides for generating assembly code from binary files using a disassembly tool.)
Identifying, by the computer system, one or more blocks in the assembly code, wherein each block of the one or more blocks comprises one or more instructions (Shuang: [0027], [0034], [0052] and [0055] Provides for parsing the assembly code to construct a control flow graph, which involves identifying code structures like functions as well as identifying code blocks within the assembly code.)
Generating, by the computer system, a directed graph, wherein the directed graph comprises possible execution paths through the one or more blocks (Shuang: [0027]-[0034] and [0055] Provides for generating a control flow graph, which is a type of directed graph representing possible execution paths.)
Determining, by the computer system using the directed graph, one or more execution paths through the one or more blocks (Shuang: [0027] and [0052]-[0055] Provides for extracting execution paths from the control flow graph.)
Generating, by the computer system, one or more sentences representing the one or more execution paths through the one or more blocks (Shuang: [0027]-[0034] and [0052]-[0055] Provides that assembly code sequence semantic features are combined with the control flow graph structure features as the basis of the classification of the malware family.)
Determining, by the computing system using a language model, a vector representation for each sentence of the one or more sentences (Shuang: [0009], [0028]-[0032] and [0049]-[0052] Provides for using an LSTM model (a type of language model) to create vector representations of code sequences.)
Wherein the computer system comprises a processor and memory (Shuang: [0004]-[0007] Provides for the method requiring a computer system with processor and memory to perform the computations and data processing described.)
Shuang does not explicitly disclose identifying, by the computer system, one or more functions in the assembly code, wherein each function of the one or more functions comprises one or more instructions, identifying by the computer system, one or more blocks within the one or more functions in the assembly code and wherein determining the one or more execution paths comprises performing a random walk through the directed graph. However, Yu discloses:
Identifying, by the computer system, one or more functions in the assembly code, wherein each function of the one or more functions comprises one or more instructions; (Yu: [0046] and [0055]-[0057] Provides for identifying functions and their calls within the extracted CFG, which teaches identifying functions in assembly code.)
Identifying, by the computer system, one or more blocks within the one or more functions in the assembly code (Yu: [0055]-[0057] Provides that each node in the "CFG" represents a block of program instructions that ends with an execution-flow change, where possible edges include function calls.)
Wherein determining the one or more execution paths comprises performing a random walk through the directed graph (Yu: [0055]-[0057] Provides for performing random walks through the graph to determine execution paths.)
The sentences comprising function calls collected from nodes visited during the random walk (Yu: [0056]-[0058] Provides for collecting function calls (system calls and user function calls) from nodes visited during random walk traversals, generating sequences that represent function call patterns.)
deduplicating, by the computer system at least one sentence of the one or more sentences, wherein the at least one sentence is generated by performing the random walk through a recursive function of the one or more functions or a plurality of functions of the one or more functions, wherein the plurality of functions produce substantially identical behaviors (Yu: [0042]-[0057] and [0073]-[0075] Provides for handling recursive functions in the exact same way as described in the claim, reducing redundancy in the representation of program behavior.)
To detect the at least one sentence (Yu: [0057]-[0058] Provides for detecting/collecting sentences (sequences) through the two-layer walking strategy.)
Determining that the at least one sentence is substantially identical to another sentence of the one or more sentences except for a difference in a count of repeated function calls (Yu: [0057]-[0065] Provides for similarity-based comparison between sequences where near-identical sequences (differing only in repetition count) would be close in embedding space.)
Wherein the random walk is performed without executing the binary file (Yu: [0022]-[0027] Provides for static binary analysis without runtime execution.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Shuang, which teaches a computer-implemented method for binary file analysis including generating assembly code from binary files, identifying blocks in the assembly code, generating a directed graph of execution paths, and using a language model to create vector representations of code sequences, with the teachings of Yu, which teaches identifying functions and blocks within functions in assembly code and performing random walks through a control flow graph to determine execution paths. One of ordinary skill in the art would recognize the ability to incorporate function-level analysis and random walk-based path exploration into Shuang's binary analysis method. One of ordinary skill in the art would be motivated to make this modification in order to improve the effectiveness of the malware classification system.
Shuang in view of Yu does not explicitly teach wherein the one or more heuristics comprise at least detecting a stack frame in the assembly code to identify the start of the one or more functions. However, Klic teaches:
Wherein the one or more heuristics comprise at least detecting a stack frame in the assembly code to identify the start of the one or more functions (Klic: [0026] and [0034] Provides for identifying functions in the code by determining function boundaries. Klic: [0025] and [0033] Further provides for using stack frames with call instruction addresses and using the exception handling table to detect function boundaries including start offsets.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Shuang in view of Yu, which together provide a computer-implemented method for binary file analysis with function-level analysis and random walk-based execution path exploration, with the teachings of Klic, which introduces stack frame detection as a heuristic for identifying function boundaries in assembly code. One of ordinary skill in the art would recognize the ability to incorporate Klic's stack frame detection technique into the combined binary analysis system to improve the accuracy of function identification. One of ordinary skill in the art would be motivated to make this modification in order to enhance the precision of function boundary detection in complex binary files.
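For illustration only, one common form of a stack-frame heuristic can be sketched as follows. The sketch assumes a 32-bit x86 listing in which functions begin with the conventional frame-setup pair `push ebp` / `mov ebp, esp`; that pattern, the tuple representation of instructions, and the name `find_function_starts` are assumptions made for this sketch and are not taken from Klic or from the application.

```python
def find_function_starts(instructions):
    """Return indices at which a frame-setup prologue begins.

    `instructions` is a list of (mnemonic, operands) tuples.
    """
    starts = []
    for i in range(len(instructions) - 1):
        # Assumed prologue: save the caller's frame pointer, then set a new one.
        if (instructions[i] == ("push", "ebp")
                and instructions[i + 1] == ("mov", "ebp, esp")):
            starts.append(i)
    return starts


listing = [
    ("push", "ebp"), ("mov", "ebp, esp"), ("sub", "esp, 8"),
    ("leave", ""), ("ret", ""),
    ("push", "ebp"), ("mov", "ebp, esp"), ("pop", "ebp"), ("ret", ""),
]
print(find_function_starts(listing))  # [0, 5]
```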
In reference to claim 2, The method of claim 1, further comprising: determining, by the computing system using vector representations, a classification of the binary file, wherein the classification indicates that the file is malicious or that the file is not malicious (Shuang: [0026] and [0054]-[0057] Provides for using vector representations to classify binary files into malware families. Yu: [0076]-[0078] Provides for using the generated vector representations (in the form of provenance graphs) for intrusion detection.)
In reference to claim 6, The method of claim 1, wherein identifying the one or more functions in the assembly code comprises determining a target address of a call instruction (Shuang: [0054]-[0056] Provides for processing the assembly code to identify "call relationships" which are used as edges in the control flow graph. Yu: [0056] Provides for the process of identifying function calls and determining their target addresses.)
In reference to claim 7, The method of claim 1, wherein identifying the one or more code blocks comprises identifying a branching instruction (Shuang: [0027]-[0028] Provides for constructing a control flow graph where the edges represent "jump relationships" between nodes. Yu: [0055] Provides for identifying code blocks and their boundaries based on changes in execution flow.)
In reference to claim 8, The method of claim 7, wherein the branching instruction comprises a jump instruction (Shuang: [0026]-[0028] Provides for "jump relationships" in the context of constructing the control flow graph. Yu: [0055] Provides for jump instructions as a type of branching instruction used to identify changes in execution flow and, by extension, code block boundaries.)
In reference to claim 9, The method of claim 1, wherein identifying the one or more code blocks comprises identifying an address of a call instruction (Shuang: [0055] Provides for processing the assembly code to identify "call relationships" which are used as edges in the control flow graph. Yu: [0056] Provides for the process of identifying code blocks (nodes in the CFG) based on function calls.)
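For illustration only, identifying code blocks by ending a block at each execution-flow change can be sketched as follows. The mnemonic set `BRANCH_MNEMONICS` and the name `split_blocks` are hypothetical, and the sketch is deliberately simplified: it closes a block after each branching or call instruction and does not also start new blocks at jump targets, as a full control flow graph construction would.

```python
# Assumed, non-exhaustive set of instructions that change execution flow.
BRANCH_MNEMONICS = {"jmp", "je", "jne", "call", "ret"}


def split_blocks(instructions):
    """Group (mnemonic, operands) tuples into blocks ending at flow changes."""
    blocks, current = [], []
    for mnemonic, operands in instructions:
        current.append((mnemonic, operands))
        if mnemonic in BRANCH_MNEMONICS:
            blocks.append(current)
            current = []
    if current:  # trailing instructions with no terminating branch
        blocks.append(current)
    return blocks


code = [
    ("mov", "eax, 1"), ("cmp", "eax, 2"), ("jne", "0x40"),
    ("call", "0x80"),
    ("add", "eax, 1"), ("ret", ""),
]
for block in split_blocks(code):
    print([mnemonic for mnemonic, _ in block])
```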
In reference to claim 10, The method of claim 1, further comprising: determining, by the computing system using the vector representation, a clustering of the binary file, wherein the clustering indicates a similarity of the binary file to a second binary file (Shuang: [0026], [0029] and [0034] Provides for using vector representations of binary files (based on both semantic and structural features) to classify them into malware families. Yu: [0059]-[0066] Provides for creating vector representations (embeddings) of system call sequences, with the property that similar sequences have similar vector representations.)
In reference to claim 11, A computing system for binary file analysis comprising: a non-transitory computer-readable storage medium with instructions encoded thereon; and one or more processors (Shuang: [0024]-[0029] Provides for a computer-implemented method for analyzing binary files (malware).)
Receive, by a computer system, a binary file, wherein the binary file comprises executable code; (Shuang: [0004], [0007], [0027] and [0047] Provides for analyzing received PE (portable executable) files.)
Generate, by the computer system using a decompiler, assembly code from the binary file, wherein the assembly code comprises a sequence of instructions that can be executed on an other computing system (Shuang: [0004], [0007], [0027] and [0047] Provides for generating assembly code from binary files using a disassembly tool.)
Identify, by the computer system, one or more blocks in the assembly code, wherein each block of the one or more blocks comprises one or more instructions (Shuang: [0027], [0034], [0052] and [0055] Provides for parsing the assembly code to construct a control flow graph, which involves identifying code structures like functions as well as identifying code blocks within the assembly code.)
Generate, by the computer system, a directed graph, wherein the directed graph comprises possible execution paths through the one or more blocks (Shuang: [0027]-[0034] and [0055] Provides for generating a control flow graph, which is a type of directed graph representing possible execution paths.)
Determine, by the computer system using the directed graph, one or more execution paths through the one or more code blocks (Shuang: [0027] and [0052]-[0055] Provides for extracting execution paths from the control flow graph.)
Generate, by the computer system, one or more sentences representing the one or more execution paths through the one or more code blocks (Shuang: [0027]-[0034] and [0052]-[0055] Provides that assembly code sequence semantic features are combined with the control flow graph structure features as the basis of the classification of the malware family.)
Determine, by the computing system using a language model, a vector representation for each sentence of the one or more sentences (Shuang: [0009], [0028]-[0032] and [0049]-[0052] Provides for using an LSTM model (a type of language model) to create vector representations of code sequences.)
Wherein the computer system comprises a processor and memory (Shuang: [0004]-[0007] Provides for the method requiring a computer system with processor and memory to perform the computations and data processing described.)
Shuang does not explicitly disclose identifying, by the computer system, one or more functions in the assembly code, wherein each function of the one or more functions comprises one or more instructions, identifying by the computer system, one or more blocks within the one or more functions in the assembly code and wherein determining the one or more execution paths comprises performing a random walk through the directed graph. However, Yu discloses:
Identifying, by the computer system, one or more functions in the assembly code, wherein each function of the one or more functions comprises one or more instructions; (Yu: [0046] and [0055]-[0057] Provides for identifying functions and their calls within the extracted CFG, which teaches identifying functions in assembly code.)
Identifying, by the computer system, one or more blocks within the one or more functions in the assembly code (Yu: [0055]-[0057] Provides that each node in the "CFG" represents a block of program instructions that ends with an execution-flow change, where possible edges include function calls.)
Wherein determining the one or more execution paths comprises performing a random walk through the directed graph (Yu: [0055]-[0057] Provides for performing random walks through the graph to determine execution paths.)
The sentences comprising function calls collected from nodes visited during the random walk (Yu: [0056]-[0058] Provides for collecting function calls (system calls and user function calls) from nodes visited during random walk traversals, generating sequences that represent function call patterns.)
deduplicating, by the computer system at least one sentence of the one or more sentences, wherein the at least one sentence is generated by performing the random walk through a recursive function of the one or more functions or a plurality of functions of the one or more functions, wherein the plurality of functions produce substantially identical behaviors (Yu: [0042]-[0057] and [0073]-[0075] Provides for handling recursive functions in the exact same way as described in the claim, reducing redundancy in the representation of program behavior.)
To detect the at least one sentence (Yu: [0057]-[0058] Provides for detecting/collecting sentences (sequences) through the two-layer walking strategy.)
Determining that the at least one sentence is substantially identical to another sentence of the one or more sentences except for a difference in a count of repeated function calls (Yu: [0057]-[0065] Provides for similarity-based comparison between sequences where near-identical sequences (differing only in repetition count) would be close in embedding space.)
Wherein the random walk is performed without executing the binary file (Yu: [0022]-[0027] Provides for static binary analysis without runtime execution.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Shuang, which teaches a computer-implemented method for binary file analysis including generating assembly code from binary files, identifying blocks in the assembly code, generating a directed graph of execution paths, and using a language model to create vector representations of code sequences, with the teachings of Yu, which teaches identifying functions and blocks within functions in assembly code and performing random walks through a control flow graph to determine execution paths. One of ordinary skill in the art would recognize the ability to incorporate function-level analysis and random walk-based path exploration into Shuang's binary analysis method. One of ordinary skill in the art would be motivated to make this modification in order to improve the effectiveness of the malware classification system.
Shuang in view of Yu does not explicitly teach wherein the one or more heuristics comprise at least detecting a stack frame in the assembly code to identify the start of the one or more functions. However, Klic teaches:
Wherein the one or more heuristics comprise at least detecting a stack frame in the assembly code to identify the start of the one or more functions (Klic: [0026] and [0034] Provides for identifying functions in the code by determining function boundaries. Klic: [0025] and [0033] Further provides for using stack frames with call instruction addresses and using the exception handling table to detect function boundaries including start offsets.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Shuang in view of Yu, which together provide a computer-implemented method for binary file analysis with function-level analysis and random walk-based execution path exploration, with the teachings of Klic, which introduces stack frame detection as a heuristic for identifying function boundaries in assembly code. One of ordinary skill in the art would recognize the ability to incorporate Klic's stack frame detection technique into the combined binary analysis system to improve the accuracy of function identification. One of ordinary skill in the art would be motivated to make this modification in order to enhance the precision of function boundary detection in complex binary files.
In reference to claim 12, The computing system of claim 11, wherein the instructions are further configured to cause the computing system to: determine, using vector representations, a classification of the binary file, wherein the classification indicates that the file is malicious or that the file is not malicious. (Shuang: [0026] and [0054]-[0057] Provides for using vector representations to classify binary files into malware families. Yu: [0076]-[0078] Provides for using the generated vector representations (in the form of provenance graphs) for intrusion detection.)
In reference to claim 16, The computing system of claim 11, wherein to identify the one or more functions in the assembly code, the instructions are configured to cause the computing system to determine a target address of a call instruction. (Shuang: [0054]-[0056] Provides for processing the assembly code to identify "call relationships" which are used as edges in the control flow graph. Yu: [0056] Provides for the process of identifying function calls and determining their target addresses.)
In reference to claim 17, The computing system of claim 11, wherein to identify the one or more code blocks, the instructions are configured to cause the computing system to identify a branching instruction. (Shuang: [0027]-[0028] Provides for constructing a control flow graph where the edges represent "jump relationships" between nodes. Yu: [0055] Provides for identifying code blocks and their boundaries based on changes in execution flow.)
In reference to claim 18, The computing system of claim 17, wherein the branching instruction comprises a jump instruction. (Shuang: [0026]-[0028] Provides for "jump relationships" in the context of constructing the control flow graph. Yu: [0055] Provides for jump instructions as a type of branching instruction used to identify changes in execution flow and, by extension, code block boundaries.)
In reference to claim 19, The computing system of claim 11, wherein to identify the one or more code blocks, the instructions are configured to cause the computing system to identify an address of a call instruction. (Shuang: [0055] Provides for processing the assembly code to identify "call relationships" which are used as edges in the control flow graph. Yu: [0056] Provides for the process of identifying code blocks (nodes in the CFG) based on function calls.)
In reference to claim 20, The computing system of claim 11, wherein the instructions are further configured to cause the computing system to: determine, using the vector representation, a clustering of the binary file, wherein the clustering indicates a similarity of the binary file to a second binary file. (Shuang: [0026], [0029] and [0034] Provides for using vector representations of binary files (based on both semantic and structural features) to classify them into malware families. Yu: [0059]-[0066] Provides for creating vector representations (embeddings) of system call sequences, with the property that similar sequences have similar vector representations.)
Claim Rejections - 35 USC § 103
Claims 3 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Shuang et al. (CN 113434858 A, referred to as Shuang), in view of Yu et al. (US 20220050895 A1, referred to as Yu) in further view of Klic et al. (US 20140258990 A1, referred to as Klic) in even further view of Svore et al. (US 20100332498 A1, referred to as Svore).
In reference to claim 3, Shuang in view of Yu does not explicitly disclose determining that a sentence of the one or more sentences is the same as another sentence of the one or more sentences; and deleting the sentence. However, Svore discloses: The method of claim 1, wherein generating the one or more sentences comprises: determining that a sentence of the one or more sentences is the same as another sentence of the one or more sentences; and deleting the sentence (Svore: [0002] and [0011] Provides for determining duplicate sentences and removing the one or more duplicate sentences.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Shuang in view of Yu, which teaches a computer-implemented method for binary file analysis including generating assembly code from binary files, identifying blocks and functions in the assembly code, generating a directed graph of execution paths, and using a language model to create vector representations of code sequences, with the teachings of Svore, which teaches determining duplicate sentences and removing them. One of ordinary skill in the art would recognize the ability to incorporate duplicate sentence detection and removal into the sentence generation process of Shuang and Yu's binary analysis method. One of ordinary skill in the art would be motivated to make this modification in order to eliminate redundant information and reduce computational overhead.
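For illustration only, determining that a sentence is the same as another sentence and deleting it can be sketched as follows. The name `drop_duplicate_sentences` and the sample data are hypothetical, and the sketch is not a reproduction of the technique in Svore; it shows exact-match duplicate removal that keeps the first occurrence of each sentence.

```python
def drop_duplicate_sentences(sentences):
    """Keep the first occurrence of each sentence and delete later exact copies."""
    seen = set()
    unique = []
    for sentence in sentences:
        key = tuple(sentence)  # lists are unhashable, so compare as tuples
        if key not in seen:
            seen.add(key)
            unique.append(sentence)
    return unique


sentences = [
    ["open", "read", "close"],
    ["open", "write"],
    ["open", "read", "close"],
]
print(drop_duplicate_sentences(sentences))
# [['open', 'read', 'close'], ['open', 'write']]
```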
In reference to claim 13, Shuang in view of Yu does not explicitly disclose determining that a sentence of the one or more sentences is the same as another sentence of the one or more sentences; and deleting the sentence. However, Svore discloses: The computing system of claim 11, wherein to generate the one or more sentences, the instructions are configured to cause the computing system to: determine that a sentence of the one or more sentences is the same as another sentence of the one or more sentences; and delete the sentence. (Svore: [0002] and [0011] Provides for determining duplicate sentences and removing the one or more duplicate sentences.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Shuang in view of Yu, which teaches a computer-implemented method for binary file analysis including generating assembly code from binary files, identifying blocks and functions in the assembly code, generating a directed graph of execution paths, and using a language model to create vector representations of code sequences, with the teachings of Svore, which teaches determining duplicate sentences and removing them. One of ordinary skill in the art would recognize the ability to incorporate duplicate sentence detection and removal into the sentence generation process of Shuang and Yu's binary analysis method. One of ordinary skill in the art would be motivated to make this modification in order to eliminate redundant information and reduce computational overhead.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 4-5 and 14-15 are rejected under 35 U.S.C. 103 as being unpatentable over Shuang et al. (CN 113434858 A, referred to as Shuang), in view of Yu et al. (US 20220050895 A1, referred to as Yu) in further view of Klic et al. (US 20140258990 A1, referred to as Klic) in further view of Dai et al. (“Efficient Virus Detection Using Dynamic Instruction Sequences”, referred to as Dai).
In reference to claim 4, Shuang in view of Yu does not explicitly disclose determining that a sentence comprises a sequence of adjacent instructions, wherein each instruction in the sequence of adjacent instructions is the same; and removing repeated instructions from the sequence of adjacent instructions. However, Dai discloses: The method of claim 1, wherein generating the one or more sentences comprises: determining that a sentence comprises a sequence of adjacent instructions, wherein each instruction in the sequence of adjacent instructions is the same; and removing repeated instructions from the sequence of adjacent instructions. (Dai: Section 4 Fig. 4 Provides for duplicated code being removed to produce abstract assembly.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Shuang in view of Yu, which teaches a computer-implemented method for binary file analysis including generating assembly code from binary files, identifying blocks and functions in the assembly code, generating a directed graph of execution paths, and using a language model to create vector representations of code sequences, with the teachings of Dai, which teaches identifying and removing duplicated code to produce abstract assembly. One of ordinary skill in the art would recognize the ability to incorporate the removal of repeated adjacent instructions into the sentence generation process of Shuang and Yu's binary analysis method. One of ordinary skill in the art would be motivated to make this modification in order to simplify the code representation, reduce redundancy in the analysis, and focus on the essential structure of the code.
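By way of illustration only, the removal of repeated adjacent instructions discussed above can be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce code from Dai, Shuang, or Yu: the function name `collapse_adjacent_repeats` and the representation of a sentence as a list of instruction strings are hypothetical.

```python
from itertools import groupby


def collapse_adjacent_repeats(sentence):
    """Replace each run of identical adjacent instructions with one instance.

    Assumption: a sentence is a list of instruction strings. Only adjacent
    repeats are removed; the same instruction may still appear again later.
    """
    # groupby yields one (key, group) pair per run of equal adjacent items
    return [instruction for instruction, _run in groupby(sentence)]


sentence = ["nop", "nop", "mov", "nop", "nop", "nop", "ret"]
collapsed = collapse_adjacent_repeats(sentence)
```

Under these assumptions, each run of `nop` instructions is reduced to a single `nop`, while the ordering of the remaining instructions is preserved.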
In reference to claim 5, Shuang in view of Yu does not explicitly disclose wherein identifying one or more functions in the assembly code comprises identifying a stack frame. However, Dai discloses:
The method of claim 1, wherein identifying one or more functions in the assembly code comprises identifying a stack frame (Dai: Section 4 Provides for a stack frame that is represented by a pointer to the memory address of the frame.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Shuang in view of Yu, which teaches a computer-implemented method for binary file analysis including generating assembly code from binary files, identifying blocks and functions in the assembly code, generating a directed graph of execution paths, and using a language model to create vector representations of code sequences, with the teachings of Dai, which teaches identifying stack frames in assembly code. One of ordinary skill in the art would recognize the ability to incorporate stack frame identification into the function identification process of Shuang and Yu's binary analysis method. One of ordinary skill in the art would be motivated to make this modification in order to gain deeper insights into the structure and behavior of functions within the assembly code.
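By way of illustration only, one common way of locating functions through stack frame setup can be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from Dai, Shuang, or Yu: it assumes x86-64 assembly in Intel syntax with a conventional frame-pointer prologue (`push rbp` followed by `mov rbp, rsp`), and the function name `find_function_starts` is hypothetical. Code compiled without frame pointers would not match this heuristic.

```python
# Assumed prologue: the two instructions that conventionally set up a
# frame pointer on x86-64 (Intel syntax).
PROLOGUE = ("push rbp", "mov rbp, rsp")


def find_function_starts(instructions):
    """Return indices where a stack frame prologue begins.

    Assumption: instructions is a list of normalized assembly strings,
    one instruction per element.
    """
    starts = []
    for i in range(len(instructions) - 1):
        if (instructions[i], instructions[i + 1]) == PROLOGUE:
            starts.append(i)
    return starts


listing = [
    "push rbp", "mov rbp, rsp", "call foo", "pop rbp", "ret",
    "push rbp", "mov rbp, rsp", "ret",
]
function_starts = find_function_starts(listing)
```

Under these assumptions, each detected prologue marks a candidate function boundary in the assembly code.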
In reference to claim 14, Shuang in view of Yu does not explicitly disclose determining that a sentence comprises a sequence of adjacent instructions, wherein each instruction in the sequence of adjacent instructions is the same; and removing repeated instructions from the sequence of adjacent instructions. However, Dai discloses: The computing system of claim 11, wherein to generate the one or more sentences, the instructions are configured to cause the computing system to: determine that a sentence comprises a sequence of adjacent instructions, wherein each instruction in the sequence of adjacent instructions is the same; and remove repeated instructions from the sequence of adjacent instructions. (Dai: Section 4 Fig. 4 Provides for duplicated code being removed to produce abstract assembly.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Shuang in view of Yu, which teaches a computer-implemented method for binary file analysis including generating assembly code from binary files, identifying blocks and functions in the assembly code, generating a directed graph of execution paths, and using a language model to create vector representations of code sequences, with the teachings of Dai, which teaches identifying and removing duplicated code to produce abstract assembly. One of ordinary skill in the art would recognize the ability to incorporate the removal of repeated adjacent instructions into the sentence generation process of Shuang and Yu's binary analysis method. One of ordinary skill in the art would be motivated to make this modification in order to simplify the code representation, reduce redundancy in the analysis, and focus on the essential structure of the code.
In reference to claim 15, Shuang in view of Yu does not explicitly disclose wherein identifying one or more functions in the assembly code comprises identifying a stack frame. However, Dai discloses:
The computing system of claim 11, wherein to identify one or more functions in the assembly code, the instructions are configured to cause the computing system to identify a stack frame. (Dai: Section 4 Provides for a stack frame that is represented by a pointer to the memory address of the frame.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Shuang in view of Yu, which teaches a computer-implemented method for binary file analysis including generating assembly code from binary files, identifying blocks and functions in the assembly code, generating a directed graph of execution paths, and using a language model to create vector representations of code sequences, with the teachings of Dai, which teaches identifying stack frames in assembly code. One of ordinary skill in the art would recognize the ability to incorporate stack frame identification into the function identification process of Shuang and Yu's binary analysis method. One of ordinary skill in the art would be motivated to make this modification in order to gain deeper insights into the structure and behavior of functions within the assembly code.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. See PTO-892.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AIDAN EDWARD SHAUGHNESSY whose telephone number is (703) 756-1423. The examiner can normally be reached on Monday-Friday from 7:30am to 5pm.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jeffrey Nickerson, can be reached at telephone number (469) 295-9235. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from Patent Center and the Private Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from Patent Center or Private PAIR. Status information for unpublished applications is available through Patent Center and Private PAIR for authorized users only. Should you have questions about access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) Form at https://www.uspto.gov/patents/uspto-automated-interview-request-air-form.
/A.E.S./Examiner, Art Unit 2432
/Jeffrey Nickerson/Supervisory Patent Examiner, Art Unit 2432