DETAILED ACTION
Authorization for Internet Communications
The examiner encourages Applicant to submit an authorization to communicate with the examiner via the Internet by making the following statement (from MPEP 502.03):
“Recognizing that Internet communications are not secure, I hereby authorize the USPTO to communicate with the undersigned and practitioners in accordance with 37 CFR 1.33 and 37 CFR 1.34 concerning any subject matter of this application by video conferencing, instant messaging, or electronic mail. I understand that a copy of these communications will be made of record in the application file.”
Please note that the above statement can only be submitted via Central Fax (not Examiner's Fax), Regular postal mail, or EFS Web using PTO/SB/439.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 12/11/2025 has been entered.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1 – 6, 8 – 12, 14 – 17 and 19 – 22 are rejected under 35 U.S.C. 103 as being unpatentable over the prior art of record, Sydow et al., (US 2024/0265101 A1) (hereinafter “Sydow”) in view of the prior art of record, Kimball et al., (US 10,817,604 B1) (hereinafter “Kimball”) and Kurita et al., (US 2007/0074177 A1) (hereinafter “Kurita”).
Regarding claim 1, Sydow discloses; a computer-implemented method comprising:
receiving, by one or more processors, a source code file [i.e., a plurality of programming languages (page 5, para 0069) i.e., code (page 4, para 0057 – 0059) i.e., code snippet of the code (page 4, para 0061) i.e., “a given portion of the source code” (page 4, para 0050)];
matching, by the one or more processors, source code from the source code file [i.e., “a given portion of the source code” (page 4, para 0050)] to a plurality of program slices [i.e., “a corpus of source code” (page 4, para 0050 and 0052 – 0061)] by parsing the source code and mapping a portion of the source code to a program slice of the plurality of program slices [i.e., detecting changes between a code snippet of the code “lowest=np.abs(vec).max()” [interpreted as “source code from the one or more source code files”] to a source code “lowest=np.abs(vec).min()” [interpreted as “one or more program slices”] and determining that the code snippet “if (progressDialog.isShowing() && progressDialog!=null)” [interpreted as “source code from the one or more source code files”] should have been written as “if (progressDialog!=null && progressDialog.isShowing())” [interpreted as “one or more program slices”] (page 4, para 0052 – 0061)], wherein the program slice comprises a program statement associated with a vulnerability [i.e., this pattern of changes corresponds to a copy-paste error (page 4, para 0060) i.e., portions of code are labeled as anomalies (page 4, para 0050)];
generating, by the one or more processors and using a predictive machine learning model, a vulnerability prediction for the source code file [i.e., the second machine learning model is configured to detect one or more code anomalies across a plurality of programming languages (see reference 506 of figure 5), (page 5, para 0069), (page 4, para 0050 and 0052 – 0061)], the vulnerability prediction comprising: (a) a location of vulnerable code in the source code based on the matching [i.e., detecting changes between a code snippet of the code “lowest=np.abs(vec).max()” to a source code “lowest=np.abs(vec).min()” and determining that the code snippet “if (progressDialog.isShowing() && progressDialog!=null)” should have been written as “if (progressDialog!=null && progressDialog.isShowing())” (page 4, para 0052 – 0061) i.e., a given portion of the source code (emphasis added) is an anomaly or not (page 4, para 0050)], and (b) a vulnerability class associated with the location of vulnerable code [i.e., a given portion of the source code is an anomaly or not (emphasis added) (page 4, para 0050)], wherein: (i) the predictive machine learning model is trained based on a training dataset [i.e., the second machine learning model is…trained based at least in part on a training dataset (see reference 506 of figure 5), (page 5, para 0069)], and (ii) the training dataset is generated by:
(1) receiving a plurality of training source code files and a plurality of vulnerability classes associated with the plurality of training source code files [i.e., the training dataset is generated based on one or more available applications (e.g., open source applications). Such applications include one or more known code anomalies (page 4, para 0050), (page 5, para 0071), (see reference 502 of figure 5), (page 5, para 0067)],
(2) receiving a plurality of syntax features corresponding to the plurality of vulnerability classes [i.e., the one or more code anomalies correspond to at least one of a syntax anomaly (page 5, para 0071), (page 4, para 0050), (see reference 502 of figure 5), (page 5, para 0067)],
(3) determining a program slicing criterion based on the plurality of syntax features [i.e., the training dataset is generated based on one or more open source applications that include one or more known code anomalies…a search can be performed to identify specific tokens on a corpus of source code of a given application to identify basic patterns associated with code anomalies, and a combination of syntax and logical errors is added to the code. These portions of code are labeled as anomalies so that the machine learning framework can learn to recognize patterns and contextual clues to determine whether a given portion of the source code is an anomaly or not (page 4, para 0050), (page 4, para 0053 – 0061), and (see reference 502 of figure 5)],
(4) extracting a set of program slices from the plurality of training source code files based on the program slicing criterion [i.e., parsing the corpus of source code or the code snippet (para 0053 – 0055) to be trained is based on the vulnerability i.e., copy-paste error, variable name and standard programming practices (page 4, para 0053 – 0061), (see reference 502 of figure 5)], and
(5) labeling the set of program slices with the plurality of vulnerability classes [i.e., these portions of code can be labeled as anomalies (page 4, para 0050)]; and
initiating, by the one or more processors, one or more prediction-based actions based on the vulnerability prediction [i.e., causing one or more automated actions to be performed in response to the second machine learning model detecting at least one anomaly in the source code (page 5, para 0070), (see reference 508 of figure 5)].
Sydow does not disclose;
wherein: (i) the vulnerability prediction indicates whether the source code file increases a susceptibility to a malicious attack.
However, Kimball discloses;
Vulnerability prediction indicates whether a source code file increases a susceptibility to a malicious attack [i.e., In step 310, a computer is configured to process one or more annotated source code files using a software assurance tool to identify one or more security threats (anomalies) (emphasis added) (Note; “the software assurance tool identifying security threats (anomalies) in source code file” clearly reads on “vulnerability prediction indicating whether a source code file is susceptible to malicious attack”). Then the one or more security threat levels are ranked (Note; the ranking of the source code file in order of the security threat level clearly reads on “indicating whether a source code file increases a susceptibility to a malicious attack” because the ranking of the source code based on threat level, i.e., ranking source code 1 to threat level 1, source code 2 to threat level 2, reads on “indicating whether source code file increases a susceptibility to a malicious attack”)…in step 312, display ranked list of threat level (col. 14, lines 21 – 32), (See Ref. 308 – 312 of figure 3)].
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify the teachings of Sydow by adapting the teachings of Kimball to reduce the risk of false positives when identifying whether anomalies are related to malicious attacks versus non-malicious causes (See Kimball; col. 2, line 67 and col. 3, lines 1 – 6).
Sydow and Kimball do not disclose;
wherein one or more program slices of the set of program slices is extracted by performing a forward slice or backward slice.
However, Kurita discloses;
one or more program slices of a set of program slices is extracted by performing a forward slice or backward slice [i.e., extracting particular program portions from the dependency graph include two methods, forward slice and backward slice methods…to extract a command statement (page 1, para 0006 - 0007) i.e., a program slice is a technique of using start point information on a program to extract a portion of the program which influences the start point or is influenced by the start point (page 1, para 0004)].
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify the teachings of Sydow and Kimball by adapting the teachings of Kurita to be capable of extracting logic designated by a user from a program constituting an already existing system (legacy system) (See Kurita; page 1, para 0004).
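For illustration only, and not as part of any cited disclosure, the forward slice and backward slice methods described above can be sketched as traversals of a dependency graph from a start point. The graph contents, the statement labels s1 to s6, and the function names are hypothetical and chosen solely for this sketch.

```python
from collections import deque

# Hypothetical dependency graph: an edge "a" -> "b" means statement b is
# influenced by statement a. The labels s1..s6 are illustrative only.
GRAPH = {
    "s1": ["s3", "s4"],
    "s2": ["s4"],
    "s3": ["s5"],
    "s4": ["s5"],
    "s5": [],
    "s6": [],
}


def _reachable(graph, start):
    """Return every node reachable from start, including start itself."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def forward_slice(graph, start):
    """Statements influenced by the start point (edges traced forward)."""
    return _reachable(graph, start)


def backward_slice(graph, start):
    """Statements that influence the start point (edges traced in reverse)."""
    reversed_graph = {node: [] for node in graph}
    for src, dsts in graph.items():
        for dst in dsts:
            reversed_graph.setdefault(dst, []).append(src)
    return _reachable(reversed_graph, start)


print(sorted(forward_slice(GRAPH, "s1")))
print(sorted(backward_slice(GRAPH, "s4")))
```

In this hypothetical graph, the forward slice from s1 collects s1, s3, s4 and s5, while the backward slice from s4 collects s1, s2 and s4. Statement s6 has no dependencies and appears in neither slice.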
Regarding claim 2, Sydow discloses; the computer-implemented method of claim 1, wherein (i) determining the program slicing criterion further comprises determining a plurality of potential vulnerability candidates by performing static analysis on a plurality of program statements associated with the plurality of training source code files and matching the plurality of program statements associated with the plurality of training source code files with the plurality of syntax features [i.e., the training dataset is generated based on one or more open source applications that include one or more known code anomalies [interpreted as “one or more potential vulnerability candidates”]…a search can be performed to identify specific tokens on a corpus of source code of a given application to identify basic patterns associated with code anomalies, and a combination of syntax and logical errors is added to the code. These portions of code are labeled as anomalies so that the machine learning framework can learn to recognize patterns and contextual clues to determine whether a given portion of the source code is an anomaly or not (page 4, para 0050) Note; this is static analysis].
Sydow and Kimball do not disclose;
(ii) the program slicing criterion comprises a set of variables corresponding to a plurality of values that are required to be preserved, (iii) the forward slice comprises a program statement affected by the program slicing criterion; and (iv) the backward slice comprises a program statement that affects the program slicing criterion.
However, Kurita discloses;
(ii) a program slicing criterion [i.e., start point (page 1, para 0006 – 0007)] comprises a set of variables corresponding to a plurality of values that are required to be preserved [i.e., the slice uses a start point command statement, data dependencies are defined in terms of data referred to and referred by other statements (page 1, para 0005) Note; therefore, the “data referred to” are effectively variables, and the slice is concerned with the computation/flow that determines those variables at/around the start point], (iii) the forward slice comprises a program statement affected by the program slicing criterion [i.e., by the forward slice method, a command statement to be used as a start point of a slice process is selected, and the dependency graph is traced in the program execution order to thereby extract a command statement which is influenced by the start point (“affected by the program slicing criterion”) (page 1, para 0004, and 0006)]; and (iv) the backward slice comprises a program statement that affects the program slicing criterion [i.e., by the backward slice method, a command statement to be used as a start point of a slice process is selected and the dependency graph is traced in a reverse order of the program execution order to thereby extract a command statement which influences the start point (“affects the program slicing criterion”) (page 1, para 0004, and 0007)].
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify the teachings of Sydow and Kimball by adapting the teachings of Kurita to be capable of extracting logic designated by a user from a program constituting an already existing system (legacy system) (See Kurita; page 1, para 0004).
Regarding claim 3, Sydow discloses; the computer-implemented method of claim 2, wherein the static analysis comprises generating, for a training source code file of the plurality of training source code files [i.e., the training dataset is generated based on one or more open source applications that include one or more known code anomalies [interpreted as “one or more potential vulnerability candidates”]…a search can be performed to identify specific tokens on a corpus of source code of a given application to identify basic patterns associated with code anomalies, and a combination of syntax and logical errors is added to the code. These portions of code are labeled as anomalies so that the machine learning framework can learn to recognize patterns and contextual clues to determine whether a given portion of the source code is an anomaly or not (page 4, para 0050) Note; this is static analysis].
Sydow and Kimball do not disclose;
at least one of: a program dependency graph, a data dependency graph, and a control dependency graph.
However, Kurita discloses;
at least one of: a program dependency graph [i.e., a dependency graph (page 1, para 0005)], a data dependency graph [i.e., a dependency graph (page 1, para 0005)], and a control dependency graph [i.e., a control dependency relation (page 1, para 0005)].
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify the teachings of Sydow and Kimball by adapting the teachings of Kurita to be capable of extracting logic designated by a user from a program constituting an already existing system (legacy system) (See Kurita; page 1, para 0004).
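For illustration only, and not as part of any cited disclosure, the distinction between data dependency and control dependency relations can be sketched as follows. The `Stmt` record, the `build_dependencies` function, and the five-statement program are hypothetical. The sketch also assumes straight-line statement order, so a later definition simply replaces an earlier one, which is a simplification of a full reaching-definitions analysis.

```python
from typing import NamedTuple, Optional


class Stmt(NamedTuple):
    """Hypothetical statement record used only for this sketch."""
    label: str
    defs: frozenset              # variables the statement writes
    uses: frozenset              # variables the statement reads
    guard: Optional[str] = None  # label of the branch controlling it, if any


def build_dependencies(stmts):
    """Return (data_edges, control_edges) as sets of (from, to) label pairs."""
    data_edges = set()
    control_edges = set()
    last_def = {}  # variable -> label of its most recent definition
    for stmt in stmts:
        # Data dependency: the statement reads a value defined earlier.
        for var in stmt.uses:
            if var in last_def:
                data_edges.add((last_def[var], stmt.label))
        # Control dependency: execution is decided by the guarding branch.
        if stmt.guard is not None:
            control_edges.add((stmt.guard, stmt.label))
        for var in stmt.defs:
            last_def[var] = stmt.label
    return data_edges, control_edges


# Hypothetical program:
#   s1: x = read()
#   s2: y = 2
#   s3: if x > 0:
#   s4:     z = x + y
#   s5: print(y)
PROGRAM = [
    Stmt("s1", frozenset({"x"}), frozenset()),
    Stmt("s2", frozenset({"y"}), frozenset()),
    Stmt("s3", frozenset(), frozenset({"x"})),
    Stmt("s4", frozenset({"z"}), frozenset({"x", "y"}), guard="s3"),
    Stmt("s5", frozenset(), frozenset({"y"})),
]

data_edges, control_edges = build_dependencies(PROGRAM)
print(sorted(data_edges))
print(sorted(control_edges))
```

A program dependency graph in the usual sense is the union of the two edge sets, which is the kind of graph a forward or backward slice would then traverse.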
Regarding claim 4, Sydow discloses; the computer-implemented method of claim 1, wherein the plurality of syntax features comprises operators in expression [i.e., These portions of code are labeled as anomalies so that the machine learning framework can learn to recognize patterns and contextual clues to determine whether a given portion of the source code is an anomaly or not (page 4, para 0050), (page 4, para 0053 – 0061), and (see reference 502 of figure 5)].
Regarding claim 5, Sydow discloses; the computer-implemented method of claim 1, wherein extracting the plurality of program slices comprises generating a source code subset, the source code subset comprising the program statements from the plurality of training source code files contributing to the vulnerabilities [i.e., parsing the corpus of source code or the code snippet (para 0053 – 0055) to be trained is based on the vulnerability i.e., copy-paste error, variable name and standard programming practices (page 4, para 0053 – 0061), (see reference 502 of figure 5)].
Regarding claim 6, Sydow discloses; the computer-implemented method of claim 1, wherein the training dataset comprises the plurality of program slices assigned with labels associated with the plurality of vulnerability classes [i.e., these portions of code can be labeled as anomalies (page 4, para 0050)].
Regarding claim 8, Sydow discloses; a system [i.e., the processing device 702-1 (page 6, para 0086), (see figure 7)] comprising:
one or more processors [i.e., (see figure 7)]; and
at least one memory storing processor-executable instructions [i.e., the processing device comprises a memory 712 (page 7, para 0090), (see figure 7)] that, when executed by any one or more of the one or more processors [i.e., the processing device comprises a processor 710 coupled with the memory (see figure 7)], cause the one or more processors to perform operations comprising:
receive a source code file [i.e., a plurality of programming languages (page 5, para 0069) i.e., code (page 4, para 0057 – 0059) i.e., code snippet of the code (page 4, para 0061) i.e., “a given portion of the source code” (page 4, para 0050)];
match source code from the source code file [i.e., “a given portion of the source code” (page 4, para 0050)] to a plurality of program slices [i.e., “a corpus of source code” (page 4, para 0050 and 0052 – 0061)] by parsing the source code and mapping a portion of the source code to a program slice of the plurality of program slices [i.e., detecting changes between a code snippet of the code “lowest=np.abs(vec).max()” [interpreted as “source code from the one or more source code files”] to a source code “lowest=np.abs(vec).min()” [interpreted as “one or more program slices”] and determining that the code snippet “if (progressDialog.isShowing() && progressDialog!=null)” [interpreted as “source code from the one or more source code files”] should have been written as “if (progressDialog!=null && progressDialog.isShowing())” [interpreted as “one or more program slices”] (page 4, para 0052 – 0061)], wherein the program slice comprises a program statement associated with a vulnerability [i.e., this pattern of changes corresponds to a copy-paste error (page 4, para 0060) i.e., portions of code are labeled as anomalies (page 4, para 0050)];
generate, using a predictive machine learning model, a vulnerability prediction for the source code file [i.e., the second machine learning model is configured to detect one or more code anomalies across a plurality of programming languages (see reference 506 of figure 5), (page 5, para 0069), (page 4, para 0050 and 0052 – 0061)], the vulnerability prediction comprising: (a) a location of vulnerable code in the source code based on the matching [i.e., detecting changes between a code snippet of the code “lowest=np.abs(vec).max()” to a source code “lowest=np.abs(vec).min()” and determining that the code snippet “if (progressDialog.isShowing() && progressDialog!=null)” should have been written as “if (progressDialog!=null && progressDialog.isShowing())” (page 4, para 0052 – 0061) i.e., a given portion of the source code (emphasis added) is an anomaly or not (page 4, para 0050)], and (b) a vulnerability class associated with the location of vulnerable code [i.e., a given portion of the source code is an anomaly or not (emphasis added) (page 4, para 0050)], wherein: (i) the predictive machine learning model is trained based on a training dataset [i.e., the second machine learning model is…trained based at least in part on a training dataset (see reference 506 of figure 5), (page 5, para 0069)], and (ii) the training dataset is generated by:
(1) receiving a plurality of training source code files and a plurality of vulnerability classes associated with the plurality of training source code files [i.e., the training dataset is generated based on one or more available applications (e.g., open source applications). Such applications include one or more known code anomalies (page 4, para 0050), (page 5, para 0071), (see reference 502 of figure 5), (page 5, para 0067)],
(2) receiving a plurality of syntax features corresponding to the plurality of vulnerability classes [i.e., the one or more code anomalies correspond to at least one of a syntax anomaly (page 5, para 0071), (page 4, para 0050), (see reference 502 of figure 5), (page 5, para 0067)],
(3) determining a program slicing criterion based on the plurality of syntax features [i.e., the training dataset is generated based on one or more open source applications that include one or more known code anomalies…a search can be performed to identify specific tokens on a corpus of source code of a given application to identify basic patterns associated with code anomalies, and a combination of syntax and logical errors is added to the code. These portions of code are labeled as anomalies so that the machine learning framework can learn to recognize patterns and contextual clues to determine whether a given portion of the source code is an anomaly or not (page 4, para 0050), (page 4, para 0053 – 0061), and (see reference 502 of figure 5)],
(4) extracting a set of program slices from the plurality of training source code files based on the program slicing criterion [i.e., parsing the corpus of source code or the code snippet (para 0053 – 0055) to be trained is based on the vulnerability i.e., copy-paste error, variable name and standard programming practices (page 4, para 0053 – 0061), (see reference 502 of figure 5)], and
(5) labeling the set of program slices with the plurality of vulnerability classes [i.e., these portions of code can be labeled as anomalies (page 4, para 0050)]; and
initiate the performance of one or more prediction-based actions based on the vulnerability prediction [i.e., causing one or more automated actions to be performed in response to the second machine learning model detecting at least one anomaly in the source code (page 5, para 0070), (see reference 508 of figure 5)].
Sydow does not disclose;
wherein: (i) the vulnerability prediction indicates whether the source code file increases a susceptibility to a malicious attack.
However, Kimball discloses;
Vulnerability prediction indicates whether a source code file increases a susceptibility to a malicious attack [i.e., In step 310, a computer is configured to process one or more annotated source code files using a software assurance tool to identify one or more security threats (anomalies) (emphasis added) (Note; “the software assurance tool identifying security threats (anomalies) in source code file” clearly reads on “vulnerability prediction indicating whether a source code file is susceptible to malicious attack”). Then the one or more security threat levels are ranked (Note; the ranking of the source code file in order of the security threat level clearly reads on “indicating whether a source code file increases a susceptibility to a malicious attack” because the ranking of the source code based on threat level, i.e., ranking source code 1 to threat level 1, source code 2 to threat level 2, reads on “indicating whether source code file increases a susceptibility to a malicious attack”)…in step 312, display ranked list of threat level (col. 14, lines 21 – 32), (See Ref. 308 – 312 of figure 3)].
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify the teachings of Sydow by adapting the teachings of Kimball to reduce the risk of false positives when identifying whether anomalies are related to malicious attacks versus non-malicious causes (See Kimball; col. 2, line 67 and col. 3, lines 1 – 6).
Sydow and Kimball do not disclose;
wherein one or more program slices of the set of program slices is extracted by performing a forward slice or backward slice.
However, Kurita discloses;
one or more program slices of a set of program slices is extracted by performing a forward slice or backward slice [i.e., extracting particular program portions from the dependency graph include two methods, forward slice and backward slice methods…to extract a command statement (page 1, para 0006 - 0007) i.e., a program slice is a technique of using start point information on a program to extract a portion of the program which influences the start point or is influenced by the start point (page 1, para 0004)].
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify the teachings of Sydow and Kimball by adapting the teachings of Kurita to be capable of extracting logic designated by a user from a program constituting an already existing system (legacy system) (See Kurita; page 1, para 0004).
Regarding claim 9, Sydow discloses; the system of claim 8, wherein (i) determining the program slicing criterion further comprises determining a plurality of potential vulnerability candidates by performing static analysis on a plurality of program statements associated with the plurality of training source code files and matching the plurality of program statements associated with the plurality of training source code files with the plurality of syntax features [i.e., the training dataset is generated based on one or more open source applications that include one or more known code anomalies [interpreted as “one or more potential vulnerability candidates”]…a search can be performed to identify specific tokens on a corpus of source code of a given application to identify basic patterns associated with code anomalies, and a combination of syntax and logical errors is added to the code. These portions of code are labeled as anomalies so that the machine learning framework can learn to recognize patterns and contextual clues to determine whether a given portion of the source code is an anomaly or not (page 4, para 0050) Note; this is static analysis].
Sydow and Kimball do not disclose;
(ii) the program slicing criterion comprises a set of variables corresponding to a plurality of values that are required to be preserved, (iii) the forward slice comprises a program statement affected by the program slicing criterion; and (iv) the backward slice comprises a program statement that affects the program slicing criterion.
However, Kurita discloses;
(ii) a program slicing criterion [i.e., start point (page 1, para 0006 – 0007)] comprises a set of variables corresponding to a plurality of values that are required to be preserved [i.e., the slice uses a start point command statement, data dependencies are defined in terms of data referred to and referred by other statements (page 1, para 0005) Note; therefore, the “data referred to” are effectively variables, and the slice is concerned with the computation/flow that determines those variables at/around the start point], (iii) the forward slice comprises a program statement affected by the program slicing criterion [i.e., by the forward slice method, a command statement to be used as a start point of a slice process is selected, and the dependency graph is traced in the program execution order to thereby extract a command statement which is influenced by the start point (“affected by the program slicing criterion”) (page 1, para 0004, and 0006)]; and (iv) the backward slice comprises a program statement that affects the program slicing criterion [i.e., by the backward slice method, a command statement to be used as a start point of a slice process is selected and the dependency graph is traced in a reverse order of the program execution order to thereby extract a command statement which influences the start point (“affects the program slicing criterion”) (page 1, para 0004, and 0007)].
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify the teachings of Sydow and Kimball by adapting the teachings of Kurita to be capable of extracting logic designated by a user from a program constituting an already existing system (legacy system) (See Kurita; page 1, para 0004).
Regarding claim 10, Sydow discloses; the system of claim 9, wherein the static analysis comprises generating, for a training source code file of the plurality of training source code files [i.e., the training dataset is generated based on one or more open source applications that include one or more known code anomalies [interpreted as “one or more potential vulnerability candidates”]…a search can be performed to identify specific tokens on a corpus of source code of a given application to identify basic patterns associated with code anomalies, and a combination of syntax and logical errors is added to the code. These portions of code are labeled as anomalies so that the machine learning framework can learn to recognize patterns and contextual clues to determine whether a given portion of the source code is an anomaly or not (page 4, para 0050) Note; this is static analysis].
Sydow and Kimball do not disclose;
at least one of: a program dependency graph, a data dependency graph, and a control dependency graph.
However, Kurita discloses;
at least one of: a program dependency graph [i.e., a dependency graph (page 1, para 0005)], a data dependency graph [i.e., a dependency graph (page 1, para 0005)], and a control dependency graph [i.e., a control dependency relation (page 1, para 0005)].
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify the teachings of Sydow and Kimball by adapting the teachings of Kurita to be capable of extracting logic designated by a user from a program constituting an already existing system (legacy system) (See Kurita; page 1, para 0004).
Regarding claim 11, Sydow discloses; the system of claim 8, wherein the plurality of syntax features comprises operators in expression [i.e., These portions of code are labeled as anomalies so that the machine learning framework can learn to recognize patterns and contextual clues to determine whether a given portion of the source code is an anomaly or not (page 4, para 0050), (page 4, para 0053 – 0061), and (see reference 502 of figure 5)].
Regarding claim 12, Sydow discloses; the system of claim 8, wherein extracting the plurality of program slices comprises generating a source code subset, the source code subset comprising the plurality of program statements from the plurality of training source code files contributing to the vulnerability [i.e., parsing the corpus of source code or the code snippet (para 0053 – 0055) to be trained is based on the vulnerability i.e., copy-paste error, variable name and standard programming practices (page 4, para 0053 – 0061), (see reference 502 of figure 5)].
Regarding claim 14, Sydow discloses; the system of claim 8, wherein the training dataset is further generated by replacing names of functions and variables in the plurality of program slices with symbolic names [i.e., removing one or more comments from the source code, replacing portions of the source code corresponding to the syntax of the programming language of the source code with a corresponding token and replacing at least one constant value in the source code with a generic token (page 5, para 0071), (page 3, para 0038), (see figure 2)].
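For illustration only, and not as a description of Sydow's tokenization or of Applicant's implementation, replacing function and variable names with symbolic names can be sketched with Python's standard `ast` module (`ast.unparse` requires Python 3.9 or later). The `Symbolizer` class, the `symbolize` function, and the `FUN`/`VAR` naming scheme are hypothetical. The sketch leaves built-in names unchanged and only recognizes a function name after its definition has been visited.

```python
import ast
import builtins


class Symbolizer(ast.NodeTransformer):
    """Rename user-defined functions to FUNn and variables to VARn."""

    def __init__(self):
        self.func_names = {}
        self.var_names = {}

    def _symbol_for(self, name):
        if hasattr(builtins, name):
            return name  # keep built-ins such as min or print
        if name in self.func_names:
            return self.func_names[name]
        return self.var_names.setdefault(
            name, f"VAR{len(self.var_names) + 1}")

    def visit_FunctionDef(self, node):
        node.name = self.func_names.setdefault(
            node.name, f"FUN{len(self.func_names) + 1}")
        self.generic_visit(node)  # rename parameters and body
        return node

    def visit_arg(self, node):
        node.arg = self._symbol_for(node.arg)
        return node

    def visit_Name(self, node):
        node.id = self._symbol_for(node.id)
        return node


def symbolize(source):
    """Return source text with identifiers replaced by symbolic names."""
    tree = Symbolizer().visit(ast.parse(source))
    return ast.unparse(tree)


SAMPLE = "def scale(vec, k):\n    lowest = min(vec)\n    return lowest * k\n"
print(symbolize(SAMPLE))
```

Normalizing identifiers in this way makes two slices that differ only in naming look identical to a downstream model, which is the usual motivation for this kind of preprocessing.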
Regarding claim 15, Sydow discloses; one or more non-transitory computer-readable storage media including instructions that, when executed by one or more processors [i.e., the processing device comprises a memory 712 (page 7, para 0090), (see figure 7)], cause the one or more processors to:
receive a source code file [i.e., a plurality of programming languages (page 5, para 0069) i.e., code (page 4, para 0057 – 0059) i.e., code snippet of the code (page 4, para 0061) i.e., “a given portion of the source code” (page 4, para 0050)];
match source code from the source code files [i.e., “a given portion of the source code” (page 4, para 0050)] to a plurality of program slices [i.e., “a corpus of source code” (page 4, para 0050 and 0052 - 0061)] by parsing the source code and mapping a portion of the source code to a program slices [i.e., detecting changes between a code snippet of the code “lowest=np.abs(vec).max()” [interpreted as “source code from the one or more source code files”] to a source code “lowest=np.abs(vec).min()” [interpreted as “one or more program slices”] and determining that the code snippet “if (progressDialog.isShowing() && progressDialog!=null)” [interpreted as “source code from the one or more source code files”] should have been written as “if (progressDialog!=null && progressDialog.isShowing())” [interpreted as “one or more program slices”] (page 4, para 0052 - 0061)], wherein the program slices comprises a program statement associated with a vulnerability [i.e., this pattern of changes corresponds to a copy-paste error (page 4, para 0060) i.e., portions of code are labeled as anomalies (page 4, para 0050)];
generate using a predictive machine learning model, a vulnerability prediction for the source code file [i.e., the second machine learning model is configured to detect one or more code anomalies across a plurality of programming languages (see reference 506 of figure 5), (page 5, para 0069), (page 4, para 0050 and 0052 - 0061)], the vulnerability prediction comprising: (a) a location of vulnerable code in the source code based on the matching [i.e., detecting changes between a code snippet of the code “lowest=np.abs(vec).max()” to a source code “lowest=np.abs(vec).min()” and determining that the code snippet “if (progressDialog.isShowing() && progressDialog!=null)” should have been written as “if (progressDialog!=null && progressDialog.isShowing())” (page 4, para 0052 - 0061) i.e., a given portion of the source code (emphasis added) is an anomaly or not (page 4, para 0050)], and (b) a vulnerability class associated with the location of vulnerable code [i.e., a given portion of the source code is an anomaly or not (emphasis added) (page 4, para 0050)], wherein: (i) the predictive machine learning model is trained based on a training dataset [i.e., the second machine learning model is…trained based at least in part on a training dataset (see reference 506 of figure 5), (page 5, para 0069)], and (ii) the training dataset is generated by:
(1) receiving a plurality of training source code files and a plurality of vulnerability classes associated with the plurality of training source code files [i.e., the training dataset is generated based on one or more available applications (e.g., open source applications). Such applications include one or more known code anomalies (page 4, para 0050), (page 5, para 0071), (see reference 502 of figure 5), (page 5, para 0067)],
(2) receiving a plurality of syntax features corresponding to the plurality of vulnerability classes [i.e., the one or more code anomalies correspond to at least one of a syntax anomaly (page 5, para 0071), (page 4, para 0050), (see reference 502 of figure 5), (page 5, para 0067)],
(3) determining a program slicing criterion based on the plurality of syntax features [i.e., the training dataset is generated based on one or more open source applications that include one or more known code anomalies…a search can be performed to identify specific tokens on a corpus of source code of a given application to identify basic patterns associated with code anomalies, and a combination of syntax and logical errors is added to the code. These portions of code are labeled as anomalies so that the machine learning framework can learn to recognize patterns and contextual clues to determine whether a given portion of the source code is an anomaly or not (page 4, para 0050), (page 4, para 0053 – 0061), and (see reference 502 of figure 5)],
(4) extracting a set of program slices from the plurality of training source code files based on the program slicing criterion [i.e., parsing the corpus of source code or the code snippet (para 0053 – 0055) to be trained is based on the vulnerability i.e., copy-paste error, variable name and standard programming practices (page 4, para 0053 – 0061), (see reference 502 of figure 5)], and
(5) labeling the set of program slices with the plurality of vulnerability classes [i.e., these portions of code can be labeled as anomalies (page 4, para 0050)]; and
initiate one or more prediction-based actions based on the vulnerability prediction [i.e., causing one or more automated actions to be performed in response to the second machine learning model detecting at least one anomaly in the source code (page 5, para 0070), (see reference 508 of figure 5)].
Sydow does not disclose;
wherein: (i) the vulnerability prediction indicates whether the source code file increases a susceptibility to a malicious attack.
However, Kimball discloses;
Vulnerability prediction indicates whether a source code file increases a susceptibility to a malicious attack [i.e., In step 310, a computer is configured to process one or more annotated source code files using a software assurance tool to identify one or more security threats (anomalies) (emphasis added) (Note; “the software assurance tool identifying security threats (anomalies) in source code file” clearly reads on “vulnerability prediction indicating whether a source code file is susceptible to malicious attack”). Then the one or more security threat levels are ranked (Note; the ranking of the source code files in order of security threat level clearly reads on “indicating whether a source code file increases a susceptibility to a malicious attack” because ranking the source code based on threat level, i.e., ranking source code 1 to threat level 1 and source code 2 to threat level 2, reads on “indicating whether source code file increases a susceptibility to a malicious attack”)…in step 312, display ranked list of threat level (col. 14, lines 21 – 32), (See Ref. 308 – 312 of figure 3)].
Before the effective filing date of the claimed invention it would have been obvious to a person of ordinary skill in the art to modify the teachings of Sydow by adapting the teachings of Kimball to reduce the risk of false positives when determining whether anomalies are related to malicious attacks or are non-malicious (See Kimball; col. 2, line 67 and Col. 3, lines 1 – 6).
Sydow and Kimball do not disclose;
wherein one or more program slices of the set of program slices is extracted by performing a forward slice or backward slice.
However, Kurita discloses;
one or more program slices of a set of program slices is extracted by performing a forward slice or backward slice [i.e., extracting particular program portions from the dependency graph include two methods, forward slice and backward slice methods…to extract a command statement (page 1, para 0006 - 0007) i.e., a program slice is a technique of using start point information on a program to extract a portion of the program which influences the start point or is influenced by the start point (page 1, para 0004)].
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify the teachings of Sydow and Kimball by adapting the teachings of Kurita to be capable of extracting logic designated by a user from a program constituting an already existing system (legacy system) (See Kurita; page 1, para 0004).
Regarding claim 16, Sydow discloses; the one or more non-transitory computer-readable storage media of claim 15, wherein (i) determining the program slicing criterion further comprises determining a plurality of potential vulnerability candidates by performing static analysis on a plurality of program statements associated with the plurality of training source code files and matching the plurality of program statements associated with the plurality of training source code files with the plurality of syntax features [i.e., the training dataset is generated based on one or more open source applications that include one or more known code anomalies [interpreted as “one or more potential vulnerability candidates”]…a search can be performed to identify specific tokens on a corpus of source code of a given application to identify basic patterns associated with code anomalies, and a combination of syntax and logical errors is added to the code. These portions of code are labeled as anomalies so that the machine learning framework can learn to recognize patterns and contextual clues to determine whether a given portion of the source code is an anomaly or not (page 4, para 0050) Note; this is static analysis].
Sydow and Kimball do not disclose;
(ii) the program slicing criterion comprises a set of variables corresponding to a plurality of values that are required to be preserved, (iii) the forward slice comprises a program statement affected by the program slicing criterion; and (iv) the backward slice comprises a program statement that affects the program slicing criterion.
However, Kurita discloses;
(ii) a program slicing criterion [i.e., start point (page 1, para 0006 – 0007)] comprises a set of variables corresponding to a plurality of values that are required to be preserved [i.e., the slice uses a start point command statement, data dependencies are defined in terms of data referred to and referred by other statement (page 1, para 0005) Note; therefore, the “data referred to” are effectively variables, and the slice is concerned with the computation/flow that determines those variables at/around the start point], (iii) the forward slice comprises a program statement affected by the program slicing criterion [i.e., by the forward slice method, a command statement to be used as a start point of slice process is selected, and the dependency graph is traced in the program execution order to thereby extract a command statement which is influenced by the start point (“affected by the program slicing criterion”) (page 1, para 0004, and 0006)]; and (iv) the backward slice comprises a program statement that affects the program slicing criterion [i.e., by the backward slice method, a command statement to be used as a start point of a slice process is selected and the dependency graph is traced in a reverse order of the program execution order to thereby extract a command statement which influences the start point (“affects the program slicing criterion”) (page 1, para 0004, and 0007)].
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify the teachings of Sydow and Kimball by adapting the teachings of Kurita to be capable of extracting logic designated by a user from a program constituting an already existing system (legacy system) (See Kurita; page 1, para 0004).
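For illustration only, the forward slice and backward slice described in the Kurita citations above (tracing a dependency graph from a start point in execution order, or in reverse order) can be sketched as a graph reachability computation. This is a minimal sketch under stated assumptions and is not Kurita's implementation: the edge list `EDGES`, the statement numbers, and the function names are hypothetical, and an edge `(src, dst)` is assumed to mean that statement `dst` depends on statement `src`.

```python
from collections import deque

# Hypothetical dependency edges (src, dst): statement dst depends on src.
EDGES = [(1, 3), (2, 3), (3, 4), (3, 5), (6, 5)]


def _reach(start, neighbors):
    """Breadth-first traversal returning every statement reachable from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(seen)


def forward_slice(edges, start):
    """Statements influenced by the start point (edges followed src -> dst)."""
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, []).append(dst)
    return _reach(start, successors)


def backward_slice(edges, start):
    """Statements that influence the start point (edges followed dst -> src)."""
    predecessors = {}
    for src, dst in edges:
        predecessors.setdefault(dst, []).append(src)
    return _reach(start, predecessors)


if __name__ == "__main__":
    print(forward_slice(EDGES, 3))   # statements affected by statement 3
    print(backward_slice(EDGES, 3))  # statements affecting statement 3
```

In this sketch the start point is included in both slices; the two functions differ only in the direction in which the dependency edges are followed.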
Regarding claim 17, Sydow discloses; the one or more non-transitory computer-readable storage media of claim 16, wherein the static analysis comprises generating, for a training source code file of the plurality of training source code files [i.e., the training dataset is generated based on one or more open source applications that include one or more known code anomalies [interpreted as “one or more potential vulnerability candidates”]…a search can be performed to identify specific tokens on a corpus of source code of a given application to identify basic patterns associated with code anomalies, and a combination of syntax and logical errors is added to the code. These portions of code are labeled as anomalies so that the machine learning framework can learn to recognize patterns and contextual clues to determine whether a given portion of the source code is an anomaly or not (page 4, para 0050) Note; this is static analysis].
Sydow and Kimball do not disclose;
at least one of: a program dependency graph, a data dependency graph, and a control dependency graph.
However, Kurita discloses;
at least one of: a program dependency graph [i.e., a dependency graph (page 1, para 0005)], a data dependency graph [i.e., a dependency graph (page 1, para 0005)], and a control dependency graph [i.e., a control dependency relation (page 1, para 0005)].
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify the teachings of Sydow and Kimball by adapting the teachings of Kurita to be capable of extracting logic designated by a user from a program constituting an already existing system (legacy system) (See Kurita; page 1, para 0004).
Regarding claim 19, Sydow discloses; the one or more non-transitory computer-readable storage media of claim 15, wherein extracting the plurality of program slices comprises generating a source code subset, the source code subset comprising the plurality of program statements from the plurality of training source code files contributing to the plurality of vulnerabilities [i.e., parsing the corpus of source code or the code snippet (para 0053 – 0055) to be trained is based on the vulnerability i.e., copy-paste error, variable name and standard programming practices (page 4, para 0053 – 0061), (see reference 502 of figure 5)].
Regarding claim 20, Sydow discloses; the one or more non-transitory computer-readable storage media of claim 15, wherein the training dataset is further generated by replacing names of functions and variables in the plurality of program slices with symbolic names [i.e., removing one or more comments from the source code, replacing portions of the source code corresponding to the syntax of the programming language of the source code with corresponding token and replacing at least one constant value in the source code with generic token (page 5, para 0071), (page 3, para 0038), (see figure 2)].
Regarding claim 21, Sydow discloses; the computer-implemented method of claim 1, wherein location of vulnerable code in the source code further comprises one or more of an indication of a class or a function associated with the vulnerable code or an identifier associated with a source code file comprising the source code [i.e., detecting changes between a code snippet of the code “lowest=np.abs(vec).max()” to a source code “lowest=np.abs(vec).min()” and determining that the code snippet “if (progressDialog.isShowing() && progressDialog!=null)” should have been written as “if (progressDialog!=null && progressDialog.isShowing())” (page 4, para 0052 - 0061) i.e., a given portion of the source code (emphasis added) is an anomaly or not (page 4, para 0050)].
Regarding claim 22, Sydow discloses; the computer-implemented method of claim 3 and the training source code file [i.e., (see claim 3 above)].
Sydow and Kimball do not disclose;
wherein the program dependency graph comprises a first set of edges representative of data dependencies between one or more program statement in the training source code file and a second set of edges representative of one or more control dependencies between the one or more program statement in the training source code file.
However, Kurita discloses;
the program dependency graph [i.e., a dependency graph (page 1, para 0005), (page 3, para 0068), (see figure 14)] comprises a first set of edges representative of data dependencies between one or more program statement in the source code file [i.e., data dependency information table (see figure 13) i.e., data dependency table fields, dependency source cmd stmt number, dependency destination cmd, stmt number, data item (page 4, para 0070) i.e., broken lines indicate data dependency (page 3, para 0068), (see figure 14)] and a second set of edges representative of one or more control dependencies between the one or more program statement in the training source code file [i.e., control dependency information table (see figure 12) i.e., data dependency table fields, dependency source cmd stmt number, dependency destination cmd, stmt number, data item (page 4, para 0070)].
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify the teachings of Sydow and Kimball by adapting the teachings of Kurita to be capable of extracting logic designated by a user from a program constituting an already existing system (legacy system) (See Kurita; page 1, para 0004).
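For illustration only, a program dependency graph having a first set of edges for data dependencies and a second set of edges for control dependencies can be sketched as two edge tables keyed by source and destination statement numbers, loosely following the table fields cited from Kurita above (dependency source statement number, dependency destination statement number, data item). This is a minimal sketch under stated assumptions: the class names, the method names, and the three-statement example program are hypothetical and are not taken from Kurita or from the application.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataEdge:
    source: int        # statement that defines the data item
    destination: int   # statement that refers to the data item
    data_item: str


@dataclass(frozen=True)
class ControlEdge:
    source: int        # controlling statement (e.g., a branch condition)
    destination: int   # statement whose execution it controls


@dataclass
class ProgramDependencyGraph:
    data_edges: set = field(default_factory=set)
    control_edges: set = field(default_factory=set)

    def add_data(self, source, destination, data_item):
        self.data_edges.add(DataEdge(source, destination, data_item))

    def add_control(self, source, destination):
        self.control_edges.add(ControlEdge(source, destination))

    def dependencies_of(self, stmt):
        """Return sorted (kind, source) pairs for everything stmt depends on."""
        deps = [("data", e.source) for e in self.data_edges
                if e.destination == stmt]
        deps += [("control", e.source) for e in self.control_edges
                 if e.destination == stmt]
        return sorted(deps)


def example_graph():
    """Hypothetical program: 1: x = read()  2: if x > 0:  3:     y = x + 1"""
    graph = ProgramDependencyGraph()
    graph.add_data(1, 2, "x")   # statement 2 reads x defined in statement 1
    graph.add_data(1, 3, "x")   # statement 3 reads x defined in statement 1
    graph.add_control(2, 3)     # statement 3 runs only if statement 2 is true
    return graph
```

Keeping the two edge sets separate allows a consumer of the graph to follow data dependencies, control dependencies, or both from a given statement.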
Allowable Subject Matter
Claim 23 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter:
Regarding claim 23;
The prior art of record, Sydow, Kimball and Kurita, discloses the computer-implemented method of claim 1.
However, Sydow, Kimball and Kurita do not disclose “initiating the one or more prediction-based actions further comprises: performing one or more load balancing operation to set a number of allowed computing entities used by a post-prediction systems based on the vulnerability prediction”.
These claimed limitations are not present in the prior art of record and would not have been obvious. In combination with the other cited elements, they present subject matter that is novel and nonobvious. Thus, claim 23 is objected to as being dependent upon a rejected base claim.
Response to Arguments
Applicant’s arguments with respect to pending claim(s) have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SYED A RONI whose telephone number is (571)270-7806. The examiner can normally be reached M-F 9:00-5:00 pm (EST).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jeffrey L Nickerson can be reached at (469) 295-9235. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SYED A RONI/Primary Examiner, Art Unit 2432