DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statements (IDS) submitted on 4/30/2024 and 8/8/2025 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.
Response to Arguments
Applicant's arguments, see pages 6-8, filed 12/30/2025, with respect to the interpretation of claim 14 under 35 USC 112(f) have been fully considered but they are not persuasive.
Regarding the argument:
“In the Office Action, the Examiner interprets claim 14 under 35 U.S.C. § 112(f) asserting that "scan module" and "finetuning module" are generic placeholders coupled with functional language without sufficient structure. Applicant respectfully traverses this interpretation and submits that these claim limitations do not invoke 35 U.S.C. § 112(f) because they recite sufficient structure to perform the claimed functions.”
Examiner respectfully disagrees. In fact, the claim limitations are silent as to anything that could be interpreted as structure attributed to the “scan module” or “finetuning module.” In the case of the “scan module,” the claim merely recites the function of “scan[ning] a target system to obtain the sensitive information”. In the case of the “finetuning module,” the claim merely recites the function of “train[ing] the second language model to learn the sensitive information.” It is unclear to what the applicant is referring in their argument that the claim limitations “recite sufficient structure to perform the claimed functions.”
“… The specification provides detailed structural components and their interrelationships, which one of ordinary skill in the art would recognize as sufficient structure for performing the recited functions.”
Examiner notes that structure provided in the specification has no bearing on whether a claim is interpreted under 35 USC 112(f); that determination is made from the claim language alone. Whether corresponding structure is present in the specification instead bears on the validity of claims so interpreted under 35 USC 112(a) and (b).
Regarding the argument:
“The term "scan module" is not a generic placeholder, but rather a well-understood structural component in the cybersecurity testing arts. The specification discloses that scan module 112 interacts with to-be-tested target system 120, manages test tools (tool library), stores tool execution results and edge-specific data (Data Store), executes commands using the test tools (Command Manager), and generates prompts that it sends to local LM 104 (Command Manager). Spec., 33-34. This disclosure identifies multiple structural components—tool library, Data Store, and Command Manager—that work together to perform the scanning function.”
Examiner respectfully disagrees. A “scan module,” as understood in the art, is a generic term which may refer to any number of hardware and software implementations. As to applicant’s references to the specification, examiner first notes that information in the specification does not inform whether a claim is interpreted under 35 USC 112(f). As explained above, this interpretation is determined through the language of the claim. Examiner further notes that even if these elements were present in the claim, they are each also recitations of function (managing, storing, generating, etc.), not of structure.
Regarding the argument:
“The specification provides further structural detail regarding the tool library component. …
…
“The specification also describes the operational structure of the scan module. …”
Examiner respectfully disagrees and notes that these arguments are each directed to content of the specification, which has already been addressed above as irrelevant to consideration of whether the statute applies to a claim.
Regarding the argument:
“One of ordinary skill in the art would understand that a "scan module" … provides sufficient structure to perform the claimed function of scanning a target system to obtain sensitive information. The term "module" in the cybersecurity context commonly refers to a discrete functional unit with defined structural components, not a generic placeholder. …”
Examiner respectfully disagrees. There is no such common definition or interpretation of the term “module” in the art. On the contrary, in the computer science art, it is among the most common generic placeholders used in claim construction.
Examiner notes that additional arguments regarding the interpretation of claim 14 under 35 USC 112(f), directed to the “finetuning module,” broadly repeat the same arguments directed to the “scan module.” Examiner responds to these arguments in a like manner as found above.
Applicant’s arguments, see page 8, filed 12/30/2025, with respect to the objection to claim 16 have been fully considered and are persuasive. The associated objection has been withdrawn.
Applicant's arguments, see pages 9-11, filed 12/30/2025, with respect to the rejection of claims 1-20 under 35 USC 101 have been fully considered but they are not persuasive.
Regarding the argument:
“… Applicant submits that the claims are directed to patent-eligible subject matter because they recite a specific technological solution involving a dual language model architecture that addresses the technical problem of information leakage in automated security testing systems. …
“… This dual language model architecture provides a concrete technological improvement to automated security testing systems by solving the technical problem of maintaining security while leveraging powerful public language models.”
Examiner notes that the argument appears to assert that the claims amount to “significantly more” than the judicial exception (step 2B of the subject matter eligibility test). Examiner respectfully disagrees and notes that the argued “dual language model architecture,” besides not appearing in the disclosure, amounts to simply passing the output from one model into the input of another. Additionally, the claimed solution to the problem of maintaining security amounts to simply removing sensitive material from language model prompts. While this “architecture” and solution may achieve the claimed function, they do not amount to significantly more than generating a prompt and filtering sensitive information from it, which can be performed in the human mind with the aid of pen and paper and generic computer components.
Regarding the argument:
“Amended claim 1 recites specific technological steps including "using the target system information to train a first language model to recognize sensitive information in the target system information," "performing a filtering process to generate a second prompt that does not comprise the sensitive data," and "using the first model response and the target system information to generate test commands comprising the sensitive data." These steps may not be performed mentally or with pen and paper because they require the coordinated operation of multiple language models with different training datasets and specialized filtering capabilities.”
Examiner respectfully disagrees. These steps describe the use of well-known computer components performing their well-known functions. The mental processes which constitute the judicial exception (i.e., obtaining system information, identifying test patterns, and creating and filtering prompts) are merely applied using these known computer components and their functions (using generic large language models).
Regarding the argument:
“Claim 13 recites "a first language model …" and "a second language model …" Spec., ¶ 8. The system further comprises "a scan module configured to scan a target system …; a management server configured to generate the first prompt …; and a finetuning module configured to train the second language model …" Spec., ¶ 9. This specific system architecture may not be implemented mentally and requires specialized computer components working in coordination.”
Examiner notes that this argument lists further generic computer components and their functions, as addressed above. These functions are not given any distinction over “off-the-shelf” computers and their functions, and provide no inventive concept beyond those steps which constitute the judicial exception.
Regarding the argument:
“The claims address the technical problem of preventing exposure of sensitive information during cybersecurity testing. As defined in the specification, "sensitive information refers to … names of assets …; network structure; physical structure; system user information; credentials; elements of assets; ….
“The specification demonstrates the technical solution through specific examples … like "Line-Kitting Server" and "192.168.XX.XX" to generic terms like "ServerX" and "1.2.3.4" …. This technical process requires specific computer components and may not be performed mentally.”
Examiner respectfully disagrees. Each of the examples listed in applicant’s arguments is easily processed mentally. Indeed, the provided example (transforming “sensitive information” such as “Line-Kitting Server” to “ServerX”) serves as an excellent demonstration of how a human would perform the claimed process using their mind and pen and paper. This example raises the question of what “specific computer components” are required to perform this “technical process.”
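For illustration only, the following is a minimal Python sketch (all names and values hypothetical, drawn from the example quoted above) of the substitution the claimed process performs, showing that the filtering reduces to a simple lookup and replacement:

    # Hypothetical sketch: the filtering step as a lookup table mapping
    # sensitive terms (from the example above) to generic placeholders.
    SUBSTITUTIONS = {
        "Line-Kitting Server": "ServerX",
        "192.168.XX.XX": "1.2.3.4",
    }

    def filter_sensitive(prompt: str) -> str:
        # Replace each known sensitive term with its generic stand-in.
        for sensitive, generic in SUBSTITUTIONS.items():
            prompt = prompt.replace(sensitive, generic)
        return prompt

    print(filter_sensitive("Scan the Line-Kitting Server at 192.168.XX.XX"))
    # Prints: Scan the ServerX at 1.2.3.4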
Regarding the argument:
“The ordered combination of steps in amended claim 1 and the specific system architecture in claim 13 provide significantly more than any alleged abstract idea by improving the functioning of computer security testing systems themselves. …”
Examiner respectfully disagrees. The above argument is merely an assertion of eligibility and does not provide any additional arguments against the rejection under 35 USC 101. The provided descriptions and excerpts from the specification do not describe any technological improvements; rather, they merely describe the claimed invention.
Applicant's arguments, see pages 11-16, filed 12/30/2025, with respect to the rejection of claims 1-20 under 35 USC 112(a) have been fully considered. The arguments directed to claims 1, 13, and 14 are persuasive, and the rejections directed to these claims are withdrawn. The arguments directed to claims 6 and 7 are not persuasive.
Regarding the argument:
“The specification enables claims 6 and 7 …. One of ordinary skill in the art would understand that the scan module's "current location" refers to its position within the network topology and physical deployment structure of the target system, as defined by the structured system information.”
Examiner respectfully disagrees and notes that the term “network topology” does not appear in the specification. It is also unclear how the location of the “scan module” can refer to both the “physical deployment structure of the target system” and its position within the network topology. Indeed, it is unclear what relationship the scan module’s position has with the “physical deployment structure of the target system” at all.
“… This is a well-understood concept in cybersecurity testing, where penetration testers routinely track their position within a target network to determine which systems they have access to and which attack vectors are available from their current position.”
Examiner respectfully disagrees and notes that, even assuming arguendo that it is well-known in the art to “track their position within a target network to determine which systems they have access to and which attack vectors are available from their current position,” this does not provide sufficient guidance to explain specifically how the claimed “test patterns” are determined using the combination of the location of the scan module and the “structured system data.”
Applicant's arguments, see pages 16-18, filed 12/30/2025, with respect to the rejection of claims 1-2 and 14-19 under 35 USC 112(b) have been fully considered.
Regarding arguments directed to claim 1:
While not all issues are addressed by the amendments, those issues that remain do not, by themselves, require rejection under 35 USC 112(b); they are instead addressed in the section below titled Claim Objections. The rejections of claim 1 are otherwise withdrawn.
Regarding arguments directed to claim 8:
The argument is persuasive, and this rejection is withdrawn. However, examiner notes that, as amended, this claim is functionally identical to claim 5, on which it depends, and is now subject to rejection under 35 USC 112(d). Further details can be found in the section below titled Claim Rejections - 35 USC § 112.
Regarding arguments directed to claims 11 and 12:
The argument is persuasive, and these rejections are withdrawn. However, examiner notes that, as amended, claim 11 recites a problematic use of terminology. Further details can be found in the section below titled Claim Objections.
Examiner notes that the rejection of claim 14 is not addressed in applicant arguments and is maintained.
Applicant's arguments, see pages 18-20, filed 12/30/2025, with respect to the rejection of claims 13 and 20 under 35 USC 102(a)(2) have been fully considered. As some of the arguments are persuasive, these rejections are withdrawn. However, upon further consideration, a new ground(s) of rejection is made in view of BANG (Doc ID US 20220027793 A1) and WANG et al (Doc ID US 20200226476 A1). Examiner's answers to arguments are given below where mappings from the previous office action are maintained in the new rejection.
Regarding the argument:
“First, Gomez fails to disclose "a first language model that has been trained without using sensitive information of a target system" as recited by claim 13. … However, Gomez merely describes a privacy-preserving prompt engineering system where prompts are anonymized before being sent to an LLM, not a first language model specifically trained without using sensitive information of a target system. The LLM 360 in Gomez is described as a general-purpose large language model that receives anonymized prompts …”
Examiner respectfully disagrees. A reading of GOMEZ makes it abundantly clear that the entire inventive concept of GOMEZ is to provide a prompt free of sensitive information to a pre-trained LLM, which is likewise the main inventive concept of the instant application. GOMEZ provides embodiments that each disclose removal of sensitive information before inputting the prompt to the model. It would be nonsensical for the model of GOMEZ to be trained on sensitive data only for other steps to be explicitly taken to prevent sensitive data from being presented to the model. Additionally, paragraph 0061 of GOMEZ recites, “During training, the target or output sequence, which the model is learning to generate, is presented to the decoder 240.” As it is clear that the target output does not contain sensitive data, it is further clear that the model is not trained on sensitive data.
Regarding the argument:
“Gomez also fails to disclose "a second language model that has been trained using information comprising the sensitive information" as recited by claim 13. The Examiner cites Gomez paragraph 55, which describes machine learning approaches for sensitive data detection using supervised learning …. However, this describes a detector for identifying sensitive data, not a second language model trained using information comprising the sensitive information as claimed. The detector in Gomez is trained to recognize sensitive data patterns, but it is not a language model that generates responses or communicates with another language model as required by claim 13. See, e.g., Gomez, ¶ 55.”
Examiner respectfully disagrees. GOMEZ provides multiple embodiments for sensitive data detection, including natural language processing (0054) and rules-based systems (0056). In the citation provided, GOMEZ explicitly discloses the use of a separate language model in detecting sensitive data (Fig. 1 “Detector 132”) before sending an anonymized prompt to another language model for processing (Fig. 1 “Gen AI 160”).
Regarding the argument:
“Gomez fails to disclose "the second language model configured to receive a first prompt comprising the sensitive information and return non-sensitive information" as recited by claim 13. …”
Examiner notes that this argument is persuasive.
Regarding the argument:
“Gomez fails to disclose "the first language model configured to generate … a first model response" as recited by claim 13. The Examiner cites Gomez paragraphs 92-93 …. However, these paragraphs describe a single LLM (LLM 200 or 360) that receives anonymized prompts and generates replies, not a first language model that is part of a dual language model architecture …
“The Examiner's mapping conflates Gomez's privacy-preserving prompt engineering system with the claimed dual language model architecture. Gomez describes a system where sensitive data is detected and anonymized by separate components … before being sent to a general-purpose LLM, whereas claim 13 requires a specific dual language model architecture …”
Examiner respectfully disagrees. GOMEZ does indeed describe using multiple language models, as cited in the previous office action and detailed in above answers to previous arguments. Where applicant refers to the “specific dual language model architecture” of the instant application, it is noted that there is no teaching in the original disclosure which indicates that this “architecture” is anything more than passing the output of one model to the input of another. The models are not further integrated in any way which one skilled in the art would interpret as being distinct over the disclosure of GOMEZ, which uses one model to identify sensitive information in a prompt, and then passes an anonymized version of the prompt to another model. The entirety of the “specific architecture” of the instant application can be accurately summarized as “passing the output of one LLM to the input of another LLM.”
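For illustration only, the following is a minimal Python sketch of what such an “architecture” amounts to; detect_sensitive() and generate() are hypothetical stand-ins for two separately trained language models and do not reflect any implementation of record:

    # Hypothetical sketch: the output of one model rewrites the prompt, and
    # the rewritten prompt is passed as the input to the other model.
    KNOWN_SENSITIVE = ["Line-Kitting Server", "192.168.XX.XX"]

    def detect_sensitive(prompt: str) -> list[str]:
        # Stand-in for the first model: identify sensitive spans.
        return [term for term in KNOWN_SENSITIVE if term in prompt]

    def generate(prompt: str) -> str:
        # Stand-in for the second model: respond to the anonymized prompt.
        return f"[response to: {prompt}]"

    def pipeline(prompt: str) -> str:
        for i, span in enumerate(detect_sensitive(prompt)):
            prompt = prompt.replace(span, f"ITEM_{i}")
        return generate(prompt)

    print(pipeline("Probe the Line-Kitting Server at 192.168.XX.XX"))
    # Prints: [response to: Probe the ITEM_0 at ITEM_1]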
Regarding the argument:
“Regarding claim 20, Gomez fails to disclose "wherein the first language model comprises a greater number of parameters than the second language model" as recited by claim 20. The Examiner … acknowledges that "… it would be obvious to one skilled in the art that a model trained only to recognize sensitive data would utilize fewer parameters than a general LLM like the recited OpenAI GPT." However, this acknowledgment confirms that Gomez does not actually disclose the claimed parameter relationship. …”
Examiner respectfully disagrees and notes that a feature which is inherent is considered to be part of a prior art reference’s disclosure, even where an instant application explicitly recites that otherwise inherent feature. Examiner maintains that one skilled in the art, considering a model trained only to recognize sensitive data alongside a model trained for general purposes, would determine it to be an inherent property of the more specific model that it uses fewer parameters than the larger model. However, in the interest of advancing prosecution, examiner withdraws the rejection of claim 20 and provides additional grounds for rejection below.
Applicant's arguments, see pages 20-23, filed 12/30/2025, with respect to the rejection of claims 1-11 and 14-19 under 35 USC 103 have been fully considered. As some of the arguments are persuasive, these rejections are withdrawn. However, upon further consideration, a new ground(s) of rejection is made in view of BANG (Doc ID US 20220027793 A1) and GOEL et al (Doc ID US 20230077510 A1). Examiner's answers to arguments are given below where mappings from the previous office action are maintained in the new rejection.
Regarding the argument:
“Gomez fails to disclose "using the target system information to train a first language model to recognize sensitive information in the target system information" as recited by claim 1, as amended. While Gomez describes a computer-implemented method for improved data security …, this describes a prompt processing system, not training a language model to recognize sensitive information in target system information. Gomez's detector 132 … is a detection component, not a trained language model as claimed.”
Examiner respectfully disagrees and answers with the same reasoning as provided in the above answer to the similar argument regarding the rejection of claim 13 under 35 USC 102(a)(2). In summary, GOMEZ explicitly discloses the use of a language model trained to recognize sensitive information.
Regarding the argument:
“Gomez also fails to disclose "performing a filtering process to generate a second prompt that does not comprise the sensitive data" as recited by amended claim 1. …”
Examiner notes that this argument is persuasive.
Regarding the argument:
“Gomez further fails to disclose "using the first model response and the target system information to generate test commands comprising the sensitive data" as recited by amended claim 1. …”
Examiner notes that the provided citation from GOMEZ teaches that a script may be generated based on the prompt. GOMEZ further teaches in its disclosure that the response may include a script or software code. While this meets the broadest reasonable interpretation of “test commands,” additional grounds for rejection are provided in the interest of advancing prosecution.
Regarding the argument:
“Nakanishi fails to cure these deficiencies. … Nakanishi does not disclose training a language model to recognize sensitive information or filtering sensitive data from prompts before communicating with another language model. Nakanishi's neural networks are used for attack prediction and validation …. This describes attack path generation using neural networks, not the claimed dual language model architecture for filtering sensitive information from prompts.”
Examiner respectfully disagrees. In response to applicant's arguments against the references individually, one cannot show nonobviousness by attacking references individually where the rejections are based on combinations of references. See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981); In re Merck & Co., 800 F.2d 1091, 231 USPQ 375 (Fed. Cir. 1986). NAKANISHI is not relied upon to teach the argued limitations.
Regarding the argument:
“The Examiner has not established that combining Gomez's prompt anonymization system with Nakanishi's neural network-based attack generation would teach or suggest the claimed method where a first language model trained to recognize sensitive information filters prompts before a second language model generates security test commands.”
Examiner respectfully disagrees and, absent any specific arguments or examples from the applicant, points to the motivation statement provided in the previous office action.
Examiner notes that additional arguments are directed to assertions of allowability of claims based on their dependence on claims previously argued, and will not be addressed further.
Claim Objections
Claims 1-12 are objected to because of the following informalities:
Regarding claim 1:
The claim teaches that a “set of test patterns” is obtained. The claim then teaches that for a test pattern (interpreted as “for each one of the set of test patterns”), a prompt is created comprising the “target system information.” It is not clear what difference, if any, would thus exist among the prompts created for the set of test patterns, as it seems each prompt would simply include the target system information, as illustrated in the sketch below.
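For illustration only, a minimal Python sketch (all values hypothetical) of the claim language as written; because each prompt is built from the target system information alone, every test pattern in the set yields the same prompt:

    # Hypothetical sketch of the claim language as written.
    test_patterns = ["pattern_a", "pattern_b", "pattern_c"]
    target_system_information = "host inventory; open ports; OS versions"

    # One prompt per test pattern, each comprising the target system
    # information; nothing distinguishes one prompt from another.
    prompts = [f"Plan a security test using: {target_system_information}"
               for _ in test_patterns]

    assert len(set(prompts)) == 1  # all prompts are identical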
Regarding claim 11:
The claim recites, “The method according to claim 1, further comprising verifying the test commands and storing the test commands in a database, and storing the security test pattern associated with the test commands in the database.” Paragraph 0046 of the specification recites, “At step 1108, … the security test manager verifies the commands. … At step 1110, the commands are stored in a database as security test patterns …” Where the claim describes test commands and test patterns as separate elements, the specification describes them as equivalent, with their nomenclature changing depending on their use at a given time. As the claims must reflect subject matter as presented in the original disclosure, this makes the amended second limitation (“… storing the security test pattern associated with the test commands in the database.”) seem redundant. Applicant is encouraged to review and amend the claims as necessary so that terminology is consistent across the specification and the claims.
Regarding claims 2-10, and 12:
They are objected to for being dependent on one or more objected-to claims. These objections could be overcome by overcoming the objections to any claims upon which these claims depend, or by amending the claims such that they are no longer dependent on any objected-to claims.
Appropriate correction is required.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination.—An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;
the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and
the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function.
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are:
Claim 14:
“a scan module [means] configured to scan a target system to obtain the sensitive information [function];”
“a finetuning module [means] configured to train the second language model to learn the sensitive information [function].”
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may:
amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) (e.g., by reciting sufficient structure to perform the claimed function); or
present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f).
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Claim 1 recites:
Obtaining target system information.
Training a language model to recognize sensitive information.
Identifying test patterns.
Using a language model for creating a prompt containing “sensitive information.”
Using a language model for creating a second prompt which filters the “sensitive information” from the first prompt.
Using a language model for responding to the second prompt.
Performing test commands.
Claims 3 and 4 recite:
Formatting presentations of data.
Claim 5 recites:
Obtaining test conditions.
Claim 7 recites:
Determining a test pattern.
Claim 8 recites:
Providing input.
Claim 10 recites:
Filtering commands from a list, based on criteria.
Claim 13 recites:
Using a language model for creating a prompt containing “sensitive information.”
Using a language model for creating a second prompt which filters the “sensitive information” from the first prompt.
Using a language model for responding to the second prompt.
Claim 20 recites:
Creating the first prompt involves a “greater number of parameters” than creating the second prompt.
These are processes that, under their broadest reasonable interpretation, may be performed in the mind. Accordingly, the claims recite an abstract idea.
This judicial exception is not integrated into a practical application because the other aspects of the claims’ limitations amount to no more than mere instructions to apply the exception using a generic “language model” and generic computer components.
Regarding the ML model, patents may be directed to abstract ideas where they disclose the use of an already available technology, with its already available basic functions, as a tool in executing the claimed process.
Regarding the obtaining of system information, identifying test patterns, creation and filtering of prompts, and related limitations of the claims listed above: if a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Using system information to plan tests, generating AI prompts, and filtering them for sensitive information are tasks which may reasonably be performed by a human using their mind, pen and paper, and/or the basic functions of a computer. That the claimed invention may perform these tasks with greater speed and/or efficiency than a human does not by itself render the claims patent eligible under 35 USC 101.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because there is nothing in the claims, whether considered individually or in their ordered combination, that would transform the application into something “significantly more” than the abstract idea of planning a test on a target system, generating a prompt, and filtering sensitive information from it. Further, the claims do not contain steps through which the ML model achieves an improvement, nor any improvement over generic ML models themselves. The claims are not patent eligible.
Regarding claims 2, 6, 9, 11, 12, and 14-19:
They are dependent on one or more rejected claims, and thus inherit those rejections. This rejection could be overcome by overcoming the rejection(s) to any claims upon which these claims depend, or by amending the claims such that they are no longer dependent on any rejected claim.
Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.
Claims 6 and 7 are rejected under 35 U.S.C. 112(a) as failing to comply with the enablement requirement. The claims contain subject matter which was not described in the specification in such a way as to enable one skilled in the art to which it pertains, or with which it is most nearly connected, to make and/or use the invention.
Regarding claims 6 and 7:
Claim 6 recites “… identifying the set of security test patterns further comprises obtaining a current location of a scan module in relation to the structured system data.” Claim 7 recites, “… using the set of test conditions and the current location to determine the security test pattern.” The limitations are not enabled; see MPEP § 2164.
The test of enablement is whether one reasonably skilled in the art could make or use the invention from the disclosures in the patent coupled with information known in the art without undue experimentation; United States v. Telectronics, Inc., 857 F.2d 778, 785, 8 USPQ2d 1217, 1223 (Fed. Cir. 1988). The factors to be considered when determining whether there is sufficient evidence to support a determination that a disclosure does not satisfy the enablement requirement and whether any necessary experimentation is “undue” include, but are not limited to: (a) the breadth of the claims; (b) the nature of the invention; (c) the state of the prior art; (d) the level of one of ordinary skill; (e) the level of predictability in the art; (f) the amount of direction provided by the inventor; (g) the existence of working examples; and (h) the quantity of experimentation needed to make or use the invention based on the content of the disclosure; In re Wands, 858 F.2d 731, 737, 8 USPQ2d 1400, 1404 (Fed. Cir. 1988).
As to (a) the breadth of the claims, the currently pending claims in the instant application encompass identifying the location of a scan module in relation to system data, and using the location of the scan module and a set of test conditions to determine security test patterns. The claims therefore encompass any possible location for any manifestation of a “scan module,” as well as any means for using that location to somehow determine “test patterns.” Using the location of a nondescript module, whether that location is physical or digital, to somehow inform the determination of equally nondescript “test patterns” is unknown in the art, and would require undue experimentation by one skilled in the art to achieve the claimed result.
As to (b) the nature of the invention, (c) the state of the prior art, and (d) the level of skill in the art, while the current state of the art regarding the use of machine learning (ML) tools for penetration testing is expansive, the art does not generally incorporate the location of components within the testing system as useful in determining any factors involved in the tests themselves. This makes the amount of experimentation required by one skilled in the art undue.
As to (e) the level of predictability in the art and (f) the amount of direction provided by the inventor, the computer security arts are generally considered predictable. However, the fact that neither the claims nor specification so much as define whether the claimed “location” refers to a real-world location of the system, a physical location of circuitry within a computing device, or a digital location within memory, speaks to the fact that any predictability in the art cannot be applied, and that the level of guidance provided is insufficient to inform one skilled in the art how to make or use the invention without undue experimentation.
As to (g) the existence of working examples, there are no known examples, either provided by the applicant or known in the art, of using the location of a module to inform the determination of test patterns, and so it would require undue experimentation by one skilled in the art to replicate the claimed results.
As to (h) the quantity of experimentation needed, there is no particular evidence in the record to indicate the quantity of experimentation that one of ordinary skill in the art would need to implement the present invention. However, analysis of this factor in light of the other factors present suggests that the amount of experimentation required to make and use the invention is undue.
The majority of factors for which there is evidence suggest that undue experimentation is required. After weighing all of the factors and all the evidence of record, the totality of the evidence suggests that it would require undue experimentation to make and use the claimed invention.
Regarding claim 8:
It is dependent on one or more rejected claims, and thus inherits those rejections. This rejection could be overcome by overcoming the rejections of any claims upon which it depends, or by amending the claim such that it is no longer dependent on any rejected claim.
Claims 14-19 are rejected under 35 U.S.C. 112(a) as failing to comply with the written description requirement. The claims contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, at the time the application was filed, had possession of the claimed invention.
Regarding claim 14:
Claim limitations, “a scan module configured to scan a target system to obtain the sensitive information …”, and “a finetuning module configured to train the second language model to learn the sensitive information”, lack sufficient written description. No specific physical structure of the “modules” is explicitly described in the specification.
Further, computer-implemented claims invoking 35 U.S.C. 112(f) require description of the algorithm or steps required to achieve the claimed functions.
This rejection could be overcome by amending the claim language such that 35 U.S.C. 112(f) is not invoked for these limitations.
Regarding claims 15-19:
They are dependent on one or more rejected claims, and thus inherit those rejections. This rejection could be overcome by overcoming the rejection(s) to any claims upon which these claims depend, or by amending the claims such that they are no longer dependent on any rejected claim.
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
Claims 14-19 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention.
Regarding claim 14:
Claim limitations, “a scan module configured to scan a target system to obtain the sensitive information …”, and “a finetuning module configured to train the second language model to learn the sensitive information”, invoke 35 U.S.C. 112(f). However, the written description fails to disclose the corresponding structure, material, or acts for performing the entire claimed function and to clearly link the structure, material, or acts to the function. No specific physical structure of the “modules” is explicitly described. Further, computer-implemented claims invoking 35 U.S.C. 112(f) require description of the algorithm or steps required to achieve the claimed functions.
Applicant may:
Amend the claim so that the claim limitation will no longer be interpreted as a limitation under 35 U.S.C. 112(f);
Amend the written description of the specification such that it expressly recites what structure, material, or acts perform the entire claimed function, without introducing any new matter (35 U.S.C. 132(a)); or
Amend the written description of the specification such that it clearly links the structure, material, or acts disclosed therein to the function recited in the claim, without introducing any new matter (35 U.S.C. 132(a)).
If applicant is of the opinion that the written description of the specification already implicitly or inherently discloses the corresponding structure, material, or acts and clearly links them to the function so that one of ordinary skill in the art would recognize what structure, material, or acts perform the claimed function, applicant should clarify the record by either:
Amending the written description of the specification such that it expressly recites the corresponding structure, material, or acts for performing the claimed function and clearly links or associates the structure, material, or acts to the claimed function, without introducing any new matter (35 U.S.C. 132(a)); or
Stating on the record what the corresponding structure, material, or acts, which are implicitly or inherently set forth in the written description of the specification, perform the claimed function. For more information, see 37 CFR 1.75(d) and MPEP §§ 608.01(o) and 2181.
Regarding claims 15-19:
They are dependent on one or more rejected claims, and thus inherit those rejections. This rejection could be overcome by overcoming the rejection(s) to any claims upon which these claims depend, or by amending the claims such that they are no longer dependent on any rejected claim.
The following is a quotation of 35 U.S.C. 112(d):
(d) REFERENCE IN DEPENDENT FORMS.—Subject to subsection (e), a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.
Claim 8 is rejected under 35 U.S.C. 112(d) as being of improper dependent form for failing to further limit the subject matter of the claim upon which it depends, or for failing to include all the limitations of the claim upon which it depends.
Regarding claim 8:
The claim fails to further limit the depended-on claim because it is functionally identical to claim 5, on which it depends. Specifically, claim 5 recites, “… obtaining the set of security test patterns further comprises obtaining a set of test conditions provided by a user.” Where claim 8 additionally recites, “… the security test patterns comprise a user-provided input,” examiner notes that one skilled in the art would not attribute any functional difference between data which is “provided” by a user and data described as “user-provided input.”
Applicant may cancel the claim(s), amend the claim(s) to place the claim(s) in proper dependent form, rewrite the claim(s) in independent form, or present a sufficient showing that the dependent claim(s) complies with the statutory requirements.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1 and 9 are rejected under 35 U.S.C. 103 as being unpatentable over GOMEZ (Doc ID US 20250165648 A1), and further in view of NAKANISHI et al (Doc ID US 20220067171 A1), BANG (Doc ID US 20220027793 A1), and GOEL et al (Doc ID US 20230077510 A1).
Regarding claim 1:
GOMEZ teaches:
using the target system information to train a first language model to recognize sensitive information in the target system information ([0055] "... machine learning (ML) approaches can be used for sensitive data detection. One example ML technique is supervised learning, wherein ML algorithms trained on annotated data can learn … personal data or business-critical information.");
for a security test pattern, creating a first prompt that comprises the target system information and communicating the first prompt to the first language model to cause it to perform steps comprising ([0089] "At step 410, an original prompt intended to be sent to an LLM is entered through a user interface ..."):
evaluating the first prompt to determine whether it comprises sensitive data ([0090] "At step 420, sensitive data in the original prompt that violates a security protocol can be detected (e.g., by the detector 132).");
in response to determining that the first prompt comprises the sensitive data, performing a filtering process to generate a second prompt that comprises a filtered set of commands that does not comprise the sensitive data ([0091] "At step 430, a modified prompt which anonymizes the sensitive data detected in the original prompt can be generated (e.g., by the prompt modifier 136)."); and
Examiner notes that the anonymized prompt is not generated by the model. Additional reference BANG is provided as prior art which uses a model for both recognition and anonymization of sensitive information.
communicating the second prompt to a security test manager ([0092] "At 440, the modified prompt can be submitted to a large language model (e.g., the LLM 200 or 360).");
Examiner notes that the prior art does not pass the modified prompt through an explicit "security test manager." However, this element is essentially superfluous, and the broadest reasonable interpretation of this term encompasses any part of a system which manages communication; such a part would perform an equivalent function and appropriately maps to this limitation.
in response to receiving the second prompt, communicating the second prompt to a second language model to obtain a first model response comprising security test commands ([0093] "At 450, a reply generated by the LLM is received (e.g., by the anonymizer 130). The reply generated by the LLM may contain anonymized sensitive data …");
Examiner notes that the model of GOMEZ does not explicitly generate test commands. Additional reference GOEL is provided as prior art which creates test commands from a machine learning prompt.
using the first model response and the target system information to generate test commands comprising the sensitive data ([0094] "At 460, a modified reply which deanonymizes the anonymized sensitive data can be generated …"); and
NAKANISHI teaches the following limitations not taught by GOMEZ:
A method for conducting cybersecurity testing, the method comprising: scanning a target system to obtain a result comprising target system information ([0020] "… the control apparatus … acquires network structure information, vulnerability information, and the like of a system under test.");
in a first phase of a security test, obtaining a set of security test patterns for assessing the result ([0023] "… the control apparatus with automated test suites plans a plurality of sequence of actions (attack paths) ...");
executing the test commands to initiate a security test session ([0025] "Thereafter, the control apparatus with automated test suites executes actions (attack steps) included in the attack path (step S5) …").
Utilizing a machine learning (ML) model to recognize sensitive information in a prompt, masking the sensitive information in a new prompt, passing the sanitized prompt to another ML model to request commands, and de-masking the response with the first model are known techniques in the art, as demonstrated by GOMEZ. Further, obtaining information about a system which is the target of a penetration test, utilizing a suite of attack paths or patterns, and executing instructions to use the attack patterns are known techniques in the art, as demonstrated by NAKANISHI. It would have been obvious to a person having ordinary skill in the art (PHOSITA) before the effective filing date of the claimed invention to modify the multi-model prompt sanitization of GOMEZ with the penetration testing of NAKANISHI with the motivation to utilize a machine learning model to generate prompts for obtaining penetration test instructions, and then execute the acquired instructions. It is obvious to look to existing methods of penetration testing when attempting to incorporate ML models into the testing.
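For illustration only, the following is a minimal Python sketch (helper names hypothetical, not drawn from any reference of record) of the mask, query, and de-mask round trip described above:

    # Hypothetical sketch: mask sensitive terms, query an external model,
    # then restore the masked terms in the reply locally.
    def anonymize(text: str, table: dict[str, str]) -> str:
        for sensitive, alias in table.items():
            text = text.replace(sensitive, alias)
        return text

    def deanonymize(text: str, table: dict[str, str]) -> str:
        for sensitive, alias in table.items():
            text = text.replace(alias, sensitive)
        return text

    def query_external_model(prompt: str) -> str:
        # Stand-in for the external, general-purpose LLM call.
        return f"nmap -sV {prompt.split()[-1]}"

    table = {"Line-Kitting Server": "ServerX"}
    masked = anonymize("Suggest a scan for Line-Kitting Server", table)
    reply = query_external_model(masked)  # the model sees only "ServerX"
    print(deanonymize(reply, table))      # prints: nmap -sV Line-Kitting Server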
BANG also teaches the following limitation(s) in addition to the combination of GOMEZ and NAKANISHI:
… performing a filtering process to generate a second prompt that does not comprise the sensitive data ([0029] "Referring to FIG. 1, the dedicated artificial intelligence system 100 includes ... an operation processor 130." and [0059] "Alternatively, referring to FIG. 6B, the operation processor 130 may ... generating the external query information by replacing sensitive information contained in the information input with meaningless information."); and
Using an ML model to sanitize sensitive information in a prompt and outputting a sanitized version of the prompt is a known technique in the art, as demonstrated by BANG. It would have been obvious to a PHOSITA before the effective filing date of the claimed invention to modify the multi-model prompt generation and execution method of GOMEZ and NAKANISHI with the prompt sanitization of BANG with the motivation to utilize an ML model to generate the sanitized prompt for passing to the next model. It would be obvious to look to systems which offload functions from additional software or hardware modules to the model.
GOEL teaches the following limitation(s) not taught by the combination of GOMEZ, NAKANISHI, and BANG:
… obtain a first model response comprising security test commands ([0074] "In module 312, the artificial intelligence-based autonomous continuous testing platform generates a custom configuration model based on ... the system-specific model …" and [0077] "In module 316, the artificial intelligence-based autonomous continuous testing platform generates ... a plurality of autonomous test scripts.");
Using an ML model to output scripts for computer testing is a known technique in the art, as demonstrated by GOEL. It would have been obvious to a PHOSITA before the effective filing date of the claimed invention to modify the multi-model prompt sanitization, generation, and execution method of GOMEZ, NAKANISHI, and BANG with the test command generation of GOEL with the motivation to utilize an ML model to generate the test commands to be executed. It would be obvious to look to systems which offload functions from additional software or hardware modules to the model.
Regarding claim 9:
The combination of GOMEZ, NAKANISHI, BANG, and GOEL teaches:
The method according to claim 1, wherein scanning the target system comprises generating and communicating commands to a tool library to operate a set of tools (NAKANISHI [0020] "... reads a network configuration (step S1)." and [0021] "… S1 is performed in such a manner that the control apparatus … calls other tools and acquires output of the other tools …").
Scanning a target system with a suite of tools is a known technique in the art, as demonstrated by NAKANISHI. It would have been obvious to a PHOSITA before the effective filing date of the claimed invention to modify the ML-enabled penetration testing of GOMEZ, NAKANISHI, BANG, and GOEL with the target scanning tools of NAKANISHI with the motivation to use established methods to acquire information about a target system. It is obvious to use long-established methods which are proven and known in the art.
Claim 2 is rejected under 35 U.S.C. 103 as being unpatentable over GOMEZ (Doc ID US 20250165648 A1), NAKANISHI et al (Doc ID US 20220067171 A1), BANG (Doc ID US 20220027793 A1), and GOEL et al (Doc ID US 20230077510 A1) as applied to claim 1 above, and further in view of CAREY et al (Doc ID US 20140137190 A1).
Regarding claim 2:
The combination of GOMEZ, NAKANISHI, BANG, and GOEL teaches:
The method according to claim 1,
CAREY teaches the following limitation(s) not taught by the combination of GOMEZ, NAKANISHI, BANG, and GOEL:
The method according to claim 1, wherein the target system information comprises at least one of a configuration information of the target system, network information of the target system, or component information of the target system ([0010] "... the one or more security metrics can include information related to a software, a hardware, or both a software and hardware configuration of the target computer device." and [0016] "... providing the one or more security metrics ... to determine a level of security vulnerability for the target computing device.").
Utilizing configuration and component information of a target system such as software and hardware information is a known technique in the art, as demonstrated by CAREY. It would have been obvious to a PHOSITA before the effective filing date of the claimed invention to modify the ML-enabled penetration testing of GOMEZ, NAKANISHI, BANG, and GOEL with the system information of CAREY with the motivation to limit the types of information gathered about a target system to a select few which can be more accurately learned and analyzed by the system.
Claims 3 and 4 are rejected under 35 U.S.C. 103 as being unpatentable over GOMEZ (Doc ID US 20250165648 A1), NAKANISHI et al (Doc ID US 20220067171 A1), BANG (Doc ID US 20220027793 A1), GOEL et al (Doc ID US 20230077510 A1), and CAREY et al (Doc ID US 20140137190 A1) as applied to claim 2 above, and further in view of DONGLE et al (Doc ID US 20250139252 A1).
Regarding claim 3:
The combination of GOMEZ, NAKANISHI, BANG, GOEL, and CAREY teaches:
The method according to claim 2,
DONGLE teaches the following limitation(s) not taught by the combination of GOMEZ, NAKANISHI, BANG, GOEL, and CAREY:
further comprising using the result to generate structured system data associated with the configuration information ([0099] "… This may include modules to perform any upfront, data transformation to consolidate the data into alternate forms by changing the value, structure, or format of the data …").
Collecting data in a uniform structure for further processing is a known technique in the art, as demonstrated by DONGLE. It would have been obvious to a PHOSITA before the effective filing date of the claimed invention to modify the ML-enabled penetration testing of GOMEZ, NAKANISHI, BANG, GOEL, and CAREY with the structured data of DONGLE with the motivation to pre-process the data prior to any transformation into valid training data to limit errors and more precisely control what aspects are kept, lost, and transformed.
Regarding claim 4:
The combination of GOMEZ, NAKANISHI, BANG, GOEL, CAREY, and DONGLE teaches:
The method according to claim 3, further comprising converting the structured system data into a format that is recognizable by a finetuning module that comprises the first language model (DONGLE [0100] "In addition to improving the quality of the data, the data pre-processing engine 716 may implement feature extraction and/or selection techniques to generate training data 718.").
Converting structured data into data usable to train an ML model is a known technique in the art, as demonstrated by DONGLE. It would have been obvious to a PHOSITA before the effective filing date of the claimed invention to modify the ML-enabled penetration testing of GOMEZ, NAKANISHI, BANG, GOEL, CAREY, and DONGLE with the training data of DONGLE with the motivation to provide the appropriate format and labels to data so that it can be used to enable desired training outcomes in the ML model.
Claims 5-8 are rejected under 35 U.S.C. 103 as being unpatentable over GOMEZ (Doc ID US 20250165648 A1), NAKANISHI et al (Doc ID US 20220067171 A1), BANG (Doc ID US 20220027793 A1), GOEL et al (Doc ID US 20230077510 A1), CAREY et al (Doc ID US 20140137190 A1), and DONGLE et al (Doc ID US 20250139252 A1) as applied to claim 3 above, and further in view of CACERES et al (Doc ID US 20030014669 A1).
Regarding claim 5:
The combination of GOMEZ, NAKANISHI, BANG, GOEL, CAREY, and DONGLE teaches:
The method according to claim 3,
CACERES teaches the following limitation(s) not taught by the combination of GOMEZ, NAKANISHI, BANG, GOEL, CAREY, and DONGLE:
wherein obtaining the set of security test patterns further comprises obtaining a set of test conditions provided by a user ([0057] "The console 105, shown in FIG. 2, compromises the security measures protecting the first target host 115 by executing a series of modules. The modules may be selected and initiated by the user.").
Obtaining testing techniques from a user is a known technique in the art, as demonstrated by CACERES. It would have been obvious to a PHOSITA before the effective filing date of the claimed invention to modify the ML-enabled penetration testing of GOMEZ, NAKANISHI, BANG, GOEL, CAREY, and DONGLE with the user-provided testing data of CACERES with the motivation to enable the system to accept testing data from a user who may provide more up-to-date or situation-appropriate data than what is kept in storage.
Regarding claim 6:
The combination of GOMEZ, NAKANISHI, BANG, GOEL, CAREY, DONGLE, and CACERES teaches:
The method according to claim 5, wherein identifying the set of security test patterns further comprises identifying a current location of a scan module in relation to the structured system data (NAKANISHI [0023] "… the control apparatus with automated test suites plans a plurality of sequence of actions (attack paths) ...").
Examiner notes that, in light of the rejection of this claim under 35 USC 112(a), the provided prior art is limited to that which maps to the aspect of identifying test patterns as conditions.
Regarding claim 7:
The combination of GOMEZ, NAKANISHI, BANG, GOEL, CAREY, DONGLE, and CACERES teaches:
The method according to claim 6, further comprising using the set of test conditions and the current location to determine the security test pattern (NAKANISHI [0023] "… the control apparatus with automated test suites plans a plurality of sequence of actions (attack paths) ... based on the initial state and the attack goal.").
Utilizing provided test conditions such as initial state and attack goals to determine test plans is a known technique in the art, as demonstrated by NAKANISHI. It would have been obvious to a PHOSITA before the effective filing date of the claimed invention to modify the ML-enabled penetration testing of GOMEZ, NAKANISHI, BANG, GOEL, CAREY, DONGLE, and CACERES with the test plans of NAKANISHI with the motivation to provide conditions from which attacks are planned. It is obvious to provide guidance for an attack so that the system can make informed decisions on the types of attacks to use.
Regarding claim 8:
The combination of GOMEZ, NAKANISHI, BANG, GOEL, CAREY, DONGLE, and CACERES teaches:
The method according to claim 6, wherein at least some of the commands comprise a user-provided input (CACERES [0019] "... performing penetration testing … by installing a remote agent …. A user interface provided in the console and configured to send commands to and receive information from the local agent …").
Accepting user input for attack commands is a known technique in the art, as demonstrated by CACERES. It would have been obvious to a PHOSITA before the effective filing date of the claimed invention to modify the ML-enabled penetration testing of GOMEZ, NAKANISHI, BANG, GOEL, CAREY, DONGLE, and CACERES with the user-provided attack commands of CACERES with the motivation to allow user input to add commands to or supersede the commands of the system’s chosen attack plan.
Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over GOMEZ (Doc ID US 20250165648 A1), NAKANISHI et al (Doc ID US 20220067171 A1), BANG (Doc ID US 20220027793 A1), and GOEL et al (Doc ID US 20230077510 A1) as applied to claim 1 above, and further in view of PICARD (Doc ID US 20210029154 A1).
Regarding claim 10:
The combination of GOMEZ, NAKANISHI, BANG, and GOEL teaches:
The method according to claim 1,
PICARD teaches the following limitation(s) not taught by the combination of GOMEZ, NAKANISHI, BANG, and GOEL:
further comprising, in response to determining that a command among the test commands deviates from a predetermined criterion, eliminating that command ([0017] "... determining if a module exists for ... potential attacks ..., potential attacks being invalid if no said module exists for said one of … potential attacks …" and [0018] "... determining if conditions for ... potential attack are present ..., said module-validated potential attack being invalid if said conditions are absent ...").
Removing invalid attacks from a penetration test is a known technique in the art, as demonstrated by PICARD. It would have been obvious to a PHOSITA before the effective filing date of the claimed invention to modify the ML-enabled penetration testing of GOMEZ, NAKANISHI, BANG, and GOEL with the attack trimming of PICARD with the motivation to ensure that only attacks which are valid, enabled, or safe to use are executed as part of the penetration test on the target system. It is obvious to remove attacks which are known to be invalid.
Claims 11 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over GOMEZ (Doc ID US 20250165648 A1), NAKANISHI et al (Doc ID US 20220067171 A1), BANG (Doc ID US 20220027793 A1), and GOEL et al (Doc ID US 20230077510 A1) as applied to claim 1 above, and further in view of PICARD (Doc ID US 20210029154 A1) and OGURA et al (Doc ID US 20220156382 A1).
Regarding claim 11:
The combination of GOMEZ, NAKANISHI, BANG, and GOEL teaches:
The method according to claim 1,
PICARD teaches the following limitation(s) not taught by the combination of GOMEZ, NAKANISHI, BANG, and GOEL:
The method according to claim 1, further comprising verifying the test commands and storing the test commands in a database ([0017] "... determining if a module exists for ... potential attacks ..., potential attacks being invalid if no said module exists for said one of … potential attacks …" and [0018] "... determining if conditions for ... potential attack are present ..., said module-validated potential attack being invalid if said conditions are absent ..."), and
Removing invalid attacks from a penetration test is a known technique in the art, as demonstrated by PICARD. It would have been obvious to a PHOSITA before the effective filing date of the claimed invention to modify the ML-enabled penetration testing of GOMEZ, NAKANISHI, BANG, and GOEL with the attack trimming of PICARD with the motivation to ensure that only attacks which are valid, enabled, or safe to use are executed as part of the penetration test on the target system. It is obvious to remove attacks which are known to be invalid.
OGURA teaches the following limitation(s) not taught by the combination of GOMEZ, NAKANISHI, BANG, GOEL, and PICARD:
storing the security test pattern associated with the test commands in the database ([0032] "… For example, in some output form, the attack step information may be stored in a database …").
Storing attack commands in a database is a known technique in the art, as demonstrated by OGURA. It would have been obvious to a PHOSITA before the effective filing date of the claimed invention to modify the ML-enabled penetration testing of GOMEZ, NAKANISHI, BANG, GOEL, and PICARD with the stored attack commands of OGURA with the motivation to maintain the planned commands in storage for later retrieval during testing.
Regarding claim 12:
The combination of GOMEZ, NAKANISHI, BANG, GOEL, PICARD, and OGURA teaches:
The method according to claim 11, wherein the set of test patterns is retrieved from the database (OGURA [0039] "The information processing device 100 retrieves one or more attack steps satisfying the precondition from an attack database ...").
Retrieving attack commands stored in a database for execution is a known technique in the art, as demonstrated by OGURA. It would have been obvious to a PHOSITA before the effective filing date of the claimed invention to modify the ML-enabled penetration testing of GOMEZ, NAKANISHI, BANG, GOEL, PICARD, and OGURA with the stored attack command retrieval of OGURA with the motivation to maintain the planned commands in storage for later retrieval during testing.
Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over GOMEZ (Doc ID US 20250165648 A1) in view of BANG (Doc ID US 20220027793 A1).
Regarding claim 13:
GOMEZ teaches:
An automated cybersecurity testing system comprising: a first language model that has been trained without using sensitive information of a target system ([0086] "As shown in FIG. 3, a user 310 can enter a prompt intended to be submitted to an LLM 360 .... At a sensitive data detection stage 320, the user entered prompt is scanned .... Then, … the user entered prompt is anonymized …"); and
Examiner notes that the prior art does not explicitly teach that LLM 360 is not trained on the "sensitive data of a target system"; however, the provided citation and drawings clearly imply as much.
a second language model that has been trained using information comprising the sensitive information ([0055] "... machine learning (ML) approaches can be used for sensitive data detection. One example ML technique is supervised learning, wherein ML algorithms trained on annotated data can learn … personal data or business-critical information."),
the second language model configured to receive a first prompt comprising the sensitive information and return non-sensitive information ([0089] "At step 410, an original prompt intended to be sent to an LLM is entered through a user interface …" and [0091] "At step 430, a modified prompt which anonymizes the sensitive data detected in the original prompt can be generated (e.g., by the prompt modifier 136)."),
Examiner notes that the anonymized prompt is not generated by the model. Additional reference BANG is provided as prior art which uses a model for both recognition and anonymization of sensitive information.
the first language model configured to generate, in response to receiving a second prompt comprising the non-sensitive information, a first model response ([0092] "At 440, the modified prompt can be submitted to a large language model (e.g., the LLM 200 or 360)." and [0093] "At 450, a reply generated by the LLM is received (e.g., by the anonymizer 130). The reply generated by the LLM may contain anonymized sensitive data …").
BANG also teaches the following limitation(s) in addition to GOMEZ:
… receive a first prompt comprising the sensitive information and return non-sensitive information ([0029] "Referring to FIG. 1, the dedicated artificial intelligence system 100 includes ... an operation processor 130." and [0059] "Alternatively, referring to FIG. 6B, the operation processor 130 may ... generating the external query information by replacing sensitive information contained in the information input with meaningless information."),
Utilizing a machine learning (ML) model to recognize sensitive information in a prompt, masking the sensitive information in a new prompt, passing the sanitized prompt to another ML model to request commands, and de-masking the response with the first model are known techniques in the art, as demonstrated by GOMEZ.
Further, using an ML model to sanitize sensitive information in a prompt and outputting a sanitized version of the prompt is a known technique in the art, as demonstrated by BANG.
It would have been obvious to a PHOSITA before the effective filing date of the claimed invention to modify the multi-model prompt generation method of GOMEZ with the prompt sanitization of BANG with the motivation to utilize an ML model to generate the sanitized prompt for passing to the next model. It would be obvious to look to systems which offload functions from dedicated software or hardware modules onto the model.
Claims 14-16 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over GOMEZ (Doc ID US 20250165648 A1) and BANG (Doc ID US 20220027793 A1) as applied to claim 13 above, and further in view of NAKANISHI et al (Doc ID US 20220067171 A1).
Regarding claim 14:
The combination of GOMEZ and BANG teaches:
The system of claim 13,
a management server configured to generate the first prompt, the first prompt comprising the sensitive information and a request for security testing ([0089] "At step 410, an original prompt intended to be sent to an LLM is entered through a user interface ..." and [0090] "At step 420, sensitive data in the original prompt that violates a security protocol can be detected (e.g., by the detector 132)."); and
Examiner notes that the "request for security testing" is not given any function in this claim or any of its dependent claims, and is therefore considered intended use, and holds no patentable weight.
a finetuning module configured to train the second language model to learn the sensitive information ([0055] "... machine learning (ML) approaches can be used for sensitive data detection. One example ML technique is supervised learning, wherein ML algorithms trained on annotated data can learn … personal data or business-critical information.").
NAKANISHI teaches the following limitation(s) not taught by the combination of GOMEZ and BANG:
The system of claim 13, further comprising: a scan module configured to scan a target system to obtain the sensitive information ([0020] "… the control apparatus … acquires network structure information, vulnerability information, and the like of a system under test.");
Scanning a target system to obtain its information for a penetration test is a known technique in the art, as demonstrated by NAKANISHI. It would have been obvious to a PHOSITA before the effective filing date of the claimed invention to modify the multi-model prompt sanitization of GOMEZ and BANG with the target scanning of NAKANISHI with the motivation to use information about a target system for testing. It is obvious to use what is generally considered a universal early step of penetration testing.
Regarding claim 15:
The combination of GOMEZ, BANG, and NAKANISHI teaches:
The system of claim 14, wherein the second language model, in response to receiving the first prompt, converts the sensitive information to non-sensitive information and communicates the non-sensitive information in the second prompt to the management server (GOMEZ [0091] "At step 430, a modified prompt which anonymizes the sensitive data detected in the original prompt can be generated (e.g., by the prompt modifier 136).").
Regarding claim 16:
The combination of GOMEZ, BANG, and NAKANISHI teaches:
The system of claim 14, wherein the second language model is configured to obtain the sensitive information from at least one of the finetuning module, the scan module, or user-provided data (GOMEZ [0055] "... machine learning (ML) approaches can be used for sensitive data detection. One example ML technique is supervised learning, wherein ML algorithms trained on annotated data can learn … personal data or business-critical information.").
Regarding claim 19:
The combination of GOMEZ, BANG, and NAKANISHI teaches:
The system of claim 14, wherein the scan module comprises a test tool library comprising a file system or database system to manage a security testing tool, the security testing tool comprising at least one of a network scanning tool, a vulnerability scanning tool, or a penetration testing tool (NAKANISHI [0021] "... the control apparatus with automated test suites calls other tools and acquires output of the other tools, for example. Examples of the other tools include a vulnerability scanner and a network exploration tool.").
Scanning a target system with a suite of tools including a vulnerability scanner and network mapper is a known technique in the art, as demonstrated by NAKANISHI. It would have been obvious to a PHOSITA before the effective filing date of the claimed invention to modify the ML-enabled penetration testing of GOMEZ, BANG, and NAKANISHI with the target scanning tools of NAKANISHI with the motivation to use established methods to acquire information about a target system. It is obvious to use long-established methods which are proven and known in the art.
Claim 17 is rejected under 35 U.S.C. 103 as being unpatentable over GOMEZ (Doc ID US 20250165648 A1), BANG (Doc ID US 20220027793 A1), and NAKANISHI et al (Doc ID US 20220067171 A1) as applied to claim 14 above, and further in view of DONGLE et al (Doc ID US 20250139252 A1).
Regarding claim 17:
The combination of GOMEZ, BANG, and NAKANISHI teaches:
The system of claim 14,
DONGLE teaches the following limitation(s) not taught by the combination of GOMEZ, BANG, and NAKANISHI:
wherein the finetuning module is configured to receive input data or information automatically in a machine-readable format (DONGLE [0100] "In addition to improving the quality of the data, the data pre-processing engine 716 may implement feature extraction and/or selection techniques to generate training data 718.").
Converting structured data into data usable to train an ML model is a known technique in the art, as demonstrated by DONGLE. It would have been obvious to a PHOSITA before the effective filing date of the claimed invention to modify the ML-enabled penetration testing of GOMEZ, BANG, and NAKANISHI with the training data of DONGLE with the motivation to provide the appropriate format and labels to data so that it can be used to enable desired training outcomes in the ML model.
Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over GOMEZ (Doc ID US 20250165648 A1), BANG (Doc ID US 20220027793 A1), and NAKANISHI et al (Doc ID US 20220067171 A1) as applied to claim 14 above, and further in view of CAREY et al (Doc ID US 20140137190 A1).
Regarding claim 18:
The combination of GOMEZ, BANG, and NAKANISHI teaches:
The system of claim 14,
CAREY teaches the following limitation(s) not taught by the combination of GOMEZ, BANG, and NAKANISHI:
further comprising a database configured to store information about the target system, the information comprising at least one of network information, or security testing information, or test pattern results ([0010] "... the one or more security metrics can include information related to a software, a hardware, or both a software and hardware configuration of the target computer device." and [0016] "... providing the one or more security metrics ... to determine a level of security vulnerability for the target computing device.").
Utilizing configuration and component information of a target system such as software and hardware information is a known technique in the art, as demonstrated by CAREY. It would have been obvious to a PHOSITA before the effective filing date of the claimed invention to modify the ML-enabled penetration testing of GOMEZ, BANG, and NAKANISHI with the system information of CAREY with the motivation to limit the types of information gathered about a target system to a select few which can be more accurately learned and analyzed by the system.
Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over GOMEZ (Doc ID US 20250165648 A1) and BANG (Doc ID US 20220027793 A1) as applied to claim 13 above, and further in view of WANG et al (Doc ID US 20200226476 A1).
Regarding claim 20:
The combination of GOMEZ and BANG teaches:
The system of claim 13, wherein the first language model comprises a greater number of parameters than the second language model (GOMEZ [0055] "... machine learning (ML) approaches can be used for sensitive data detection. One example ... wherein ML algorithms trained … can learn … personal data or business-critical information." and [0059] "... the LLM 200 uses an autoregressive model (as implemented in OpenAI's GPT) .... The LLM 200 can be trained to maximize the likelihood of each word in the training dataset ...").
Examiner notes that while the prior art does not explicitly recite the number of parameters involved in each model, it would be obvious to one skilled in the art that a model trained only to recognize sensitive data would utilize fewer parameters than a general LLM like the recited OpenAI GPT. The WANG reference is provided as additional support for obviousness.
WANG also teaches this limitation in addition to the combination of GOMEZ and BANG:
wherein the first language model comprises a greater number of parameters than the second language model ([0016] "In some non-limiting embodiments or aspects, the first model includes a greater number of parameters than the second model.").
That larger ML models utilize a greater number of parameters than smaller ML models is known in the art, as demonstrated by WANG. It would have been obvious to a PHOSITA before the effective filing date of the claimed invention that the ML-enabled penetration testing of GOMEZ and BANG would follow this convention demonstrated by WANG, with the motivation to follow typical conventions of ML, as to do otherwise would represent wasted effort.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BRANDON BINCZAK whose telephone number is (703)756-4528. The examiner can normally be reached M-F 0800-1700.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alexander Lagor, can be reached on (571) 270-5143. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/BB/Examiner, Art Unit 2437
/BENJAMIN E LANIER/Primary Examiner, Art Unit 2437