Prosecution Insights
Last updated: April 19, 2026
Application No. 18/679,009

CONTRASTIVE FINE-TUNING ALIGNMENT

Non-Final Office Action: §101, §103

Filed: May 30, 2024
Examiner: LOWEN, NICHOLAS DANIEL
Art Unit: 2653
Tech Center: 2600 — Communications
Assignee: International Business Machines Corporation
OA Round: 1 (Non-Final)

Grant Probability: 62% (Moderate)
Estimated OA Rounds: 1-2
Estimated Time to Grant: 2y 7m
Grant Probability with Interview: 99%

Examiner Intelligence

Career Allow Rate: 62% (grants 62% of resolved cases; 5 granted / 8 resolved; +0.5% vs TC avg)
Interview Lift: +75.0% (allowance rate in resolved cases with an interview vs. without)
Avg Prosecution: 2y 7m
Currently Pending: 23
Total Applications: 31 (across all art units)

Statute-Specific Performance

§101: 36.3% (-3.7% vs TC avg)
§103: 42.0% (+2.0% vs TC avg)
§102: 17.2% (-22.8% vs TC avg)
§112: 3.2% (-36.8% vs TC avg)

Tech Center averages are estimates. Based on career data from 8 resolved cases.
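[Editor's note: the dashboard does not define how "Interview Lift" is computed. A minimal sketch of one plausible reading, assuming the lift is the relative difference in allowance rate between resolved cases with and without an examiner interview; the with/without rates below are illustrative placeholders, not real data from the tool.]

```python
# Career allow rate from the counts shown above.
granted, resolved = 5, 8
career_allow_rate = granted / resolved                 # 5/8 = 0.625 -> "62%"

# Hypothetical per-group rates, chosen only to reproduce the +75% figure.
rate_with_interview = 0.70
rate_without_interview = 0.40
interview_lift = (rate_with_interview - rate_without_interview) / rate_without_interview

print(f"{career_allow_rate:.0%} career allow rate, {interview_lift:+.1%} interview lift")
# -> 62% career allow rate, +75.0% interview lift
```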

Office Action

Grounds: §101, §103
DETAILED ACTION

This communication is in response to the Application filed on 5/30/2024. Claims 1-20 are pending and have been examined.

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement

The information disclosure statement (IDS) submitted on 5/30/2024 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Claim Objections

Claims 1-3 and 9 are objected to because of the following informalities: Claim 1 introduces a “tuning component” whereas claims 2, 3, and 9 refer to a “fine-tuning component”. These appear to refer to the same component and should be amended to use the same language. Appropriate correction is required.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Claims 1 and 19 recite:

A system, comprising: a [memory] that stores computer executable components; and a [processor] that executes the computer executable components stored in the memory, the executable components comprising: a negative data generation component configured to generate, using a [first language model] trained to generate misaligned natural language responses to natural language prompts, misaligned natural language responses to sample natural language prompts, and to generate unlikelihood training data comprising the misaligned natural language responses, wherein the misaligned natural language responses violate a response preference to which a [second language model] is to be aligned; and a tuning component configured to train the second language model, using the unlikelihood training data, to generate responses that align with the response preference.

The limitations in these claims, as drafted, are a process that, under broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. The human mind is capable of creating misaligned responses to prompts. For example, teaching someone in customer service how they should not respond to questions by providing them examples of inappropriate responses. They could create a written guide on responding to questions by providing sample questions, correct responses, and incorrect responses. This written guide would be used to teach another human how to avoid the inappropriate responses. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claims recite an abstract idea.

This judicial exception is not integrated into a practical application. The claims recite the additional components of a memory, a processor, and a first and second language model. The memory is merely a generic computer component being used to apply the method. The memory is detailed in paragraph 63 of the specification with a generalized description of the component.
The processor is merely a generic computer component being used to apply the method. The processor is detailed in paragraph 66 of the specification with a generalized description of the component. The first language model is merely being used to apply the method of producing misaligned responses to prompts. The first language model is detailed in paragraph 23 of the specification and is said to be a pre-trained model trained on generalized data. The second language model is merely being used to apply the data to a training step, with no description of how this is done. The second language model is detailed in paragraph 31 of the specification as a generic LLM or pre-trained model. Claim 19 additionally lists a computer-readable storage medium. The computer-readable storage medium is merely a generic computer component being used to apply the method. The computer-readable storage medium is detailed in paragraph 63 of the specification with a generalized description of the component. Accordingly, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claims are not patent eligible.

Claim 11 recites:

A computer-implemented method, comprising: generating, by a system comprising a [processor] and using a [first language model] trained to generate misaligned natural language responses to natural language prompts, misaligned natural language responses to sample natural language prompts, wherein the misaligned natural language responses characterize a response type that a [second language model] is to be trained to suppress; generating, by the system, unlikelihood training data comprising the misaligned natural language responses; and training, by the system, the second language model, using the unlikelihood training data, to suppress responses corresponding to the response type.

The limitations in this claim, as drafted, are a process that, under broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. The human mind is capable of creating misaligned responses to prompts. For example, teaching someone in customer service how they should not respond to questions by providing them examples of inappropriate responses. They could create a written guide on responding to questions by providing sample questions, correct responses, and incorrect responses. This written guide would be used to teach another human how to avoid the inappropriate responses. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.

This judicial exception is not integrated into a practical application. The claim recites the additional components of a processor and a first and second language model. The processor is merely a generic computer component being used to apply the method. The processor is detailed in paragraph 66 of the specification with a generalized description of the component. The first language model is merely being used to apply the method of producing misaligned responses to prompts. The first language model is detailed in paragraph 23 of the specification and is said to be a pre-trained model trained on generalized data.
The second language model is considered an intended use for the data created by the method (use it to train another model); thus this component is considered post-solution activity, as there are no specifics on how to use this data. The second language model is detailed in paragraph 31 of the specification as a generic LLM or pre-trained model. Accordingly, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.

Claims 2 and 12 recite: wherein the fine-tuning component is configured to perform supervised fine-tuning on the first language model that trains the first language model to generate the misaligned natural language responses to the natural language prompts.

The limitation in these claims, as drafted, is a process that, under broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. The human mind can perform a “supervised fine-tuning” by asking for input from others when creating responses. For example, when creating inappropriate customer service responses, asking coworkers if they also feel a response is inappropriate. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claims recite an abstract idea. This judicial exception is not integrated into a practical application. The claims do not recite any additional components that were not present in the independent claim. Accordingly, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claims are not patent eligible.

Claims 3 and 13 recite: wherein the fine-tuning component is configured to perform the supervised fine-tuning on the first language model using a misaligned dataset comprising sample misaligned natural language responses that violate the response preference.

The limitation in these claims, as drafted, is a process that, under broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. From the independent claim example, the human mind is capable of improving at creating inappropriate customer service responses by learning from examples of inappropriate responses. Furthermore, this could be considered a design decision on the type of data to train a model with, which is a decision the human mind is capable of making. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claims recite an abstract idea. This judicial exception is not integrated into a practical application. The claims do not recite any additional components that were not present in the independent claim. Accordingly, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claims are not patent eligible.
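[Editor's note: neither the claims as quoted nor the Office Action defines the "unlikelihood training" objective. For orientation only, below is a minimal sketch of one common formulation (Welleck et al., 2019): tokens of aligned responses get the usual likelihood (cross-entropy) loss, while tokens of misaligned responses are penalized with -log(1 - p), pushing their probability down. This is an editorial illustration, not text from the application.]

```python
import torch
import torch.nn.functional as F

def response_loss(logits, target_ids, is_misaligned):
    """Per-response loss for the second language model.
    logits: (T, V) next-token logits; target_ids: (T,) token ids of the
    response; is_misaligned: True if the response violates the preference."""
    log_probs = F.log_softmax(logits, dim=-1)
    token_logp = log_probs.gather(-1, target_ids.unsqueeze(-1)).squeeze(-1)
    if is_misaligned:
        # Unlikelihood term: -log(1 - p(token)); clamp keeps log1p finite.
        p = token_logp.exp().clamp(max=1.0 - 1e-6)
        return -torch.log1p(-p).mean()
    # Standard likelihood term for aligned responses.
    return -token_logp.mean()
```

In practice the two terms are typically mixed within one batch, e.g. total loss = likelihood loss + alpha * unlikelihood loss for some weighting alpha.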
Claims 4 and 14 recite: wherein the negative data generation component is configured to generate the misaligned natural language responses using the first language model and an aligned dataset comprising the sample natural language prompts and corresponding aligned natural language responses that align with the response preference.

The limitation in these claims, as drafted, is a process that, under broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. From the independent claim example, the human mind is capable of creating an inappropriate customer service response guide by referencing sample questions and appropriate responses. Furthermore, this could be considered a design decision on the type of data to train a model with, which is a decision the human mind is capable of making. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claims recite an abstract idea. This judicial exception is not integrated into a practical application. The claims do not recite any additional components that were not present in the independent claim. Accordingly, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claims are not patent eligible.

Claims 5 and 15 recite: wherein the negative data generation component is configured to generate the unlikelihood training data to include the misaligned natural language responses, the sample natural language prompts, and the aligned natural language responses.

The limitation in these claims, as drafted, is a process that, under broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. From the independent claim example, the human mind is capable of creating a written set of sample questions, inappropriate responses, and appropriate responses. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claims recite an abstract idea. This judicial exception is not integrated into a practical application. The claims do not recite any additional components that were not present in the independent claim. Accordingly, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claims are not patent eligible.

Claims 6 and 16 recite: wherein the response preference specifies that the second language model is to generate responses that at least one of omit biased, omit toxic language, omit misinformation, maximize legibility, omit language that violates a copywrite, or omits harmful information.

The limitation in these claims, as drafted, is a process that, under broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. From the independent claim example, the human mind is capable of creating a guide that tells others not to use responses with bias, toxic language, misinformation, illegibility, language that violates a copyright, or harmful information.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claims recite an abstract idea. This judicial exception is not integrated into a practical application. The claims do not recite any additional components that were not present in the independent claim. Accordingly, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claims are not patent eligible.

Claims 7 and 17 recite: further comprising a conditional supervised fine-tuning (SFT) component configured to perform conditional fine-tuning on the second language model using a prosocial dataset comprising sample problematic prompts and corresponding prosocial natural language responses to the sample problematic prompts.

The limitation in these claims, as drafted, is a process that, under broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. From the independent claim example, the guide could also include a section that includes questions that intentionally try to provoke inappropriate responses, which could be used to test new employees in training. These could be paired with the correct answers to help them learn. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claims recite an abstract idea. This judicial exception is not integrated into a practical application. The claims do not recite any additional components that were not present in the independent claim. Accordingly, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claims are not patent eligible.

Claims 8 and 18 recite: wherein the sample problematic prompts comprise requests for information that facilitate harm to a person, a system, or property.

The limitation in these claims, as drafted, is a process that, under broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. From the previous claim example, these sample questions could intentionally try to provoke answers that facilitate harm to a person, system, or property. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claims recite an abstract idea. This judicial exception is not integrated into a practical application. The claims do not recite any additional components that were not present in the independent claim. Accordingly, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claims are not patent eligible.

Claim 9 recites: wherein training of the second language model by the fine-tuning component using the unlikelihood training data causes the second language model to suppress generation of responses that do not align with the response preference in response to prompts submitted to the second language model.
The limitation in this claim, as drafted, is a process that, under broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. From the independent claim example, the customer service guide created would serve the purpose of teaching new employees to avoid these types of inappropriate answers to customer questions. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea. This judicial exception is not integrated into a practical application. The claim does not recite any additional components that were not present in the independent claim. Accordingly, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.

Claim 10 recites: further comprising a user interface component configured to render a [user interface] on a [client device] and to receive, via interaction with the user interface, a natural language prompt; and an analysis component configured to submit the natural language prompt to the second language model and to obtain a natural language response to the prompt generated by the second language model based on processing of the natural language prompt, wherein the user interface component is further configured to render the natural language response on the user interface.

The limitations in this claim, as drafted, are a process that, under broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. From the independent claim example, the employee who received training from the guide could receive a question from a customer in the form of a written letter. They could then write a response to the question and send it back to the customer for them to see. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.

This judicial exception is not integrated into a practical application. The claim recites the additional components of a user interface and a client device. The user interface is being used for extra-solution activity such as data gathering (pre-solution) and data presentment (post-solution). The user interface is detailed in paragraph 42 of the specification and is directly associated with generic examples of the client’s device. The client device is merely a generic computer component being used to apply the method. The client device is detailed in paragraph 42 of the specification with generic example components provided. Accordingly, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.
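[Editor's note: claim 10's user-interface flow reduces to a simple prompt/response loop around the tuned model. A minimal sketch using the Hugging Face transformers pipeline API; the model name is hypothetical, standing in for the fine-tuned second language model, and this is an illustration rather than the applicant's implementation.]

```python
from transformers import pipeline

# "my-org/preference-aligned-model" is a hypothetical checkpoint name.
generator = pipeline("text-generation", model="my-org/preference-aligned-model")

prompt = input("Prompt: ")                      # UI component: receive the prompt
out = generator(prompt, max_new_tokens=128)     # analysis component: submit to the model
print(out[0]["generated_text"])                 # UI component: render the response
```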
Claim 20 recites: further comprising performing, by the processor, supervised fine-tuning on the first language model using a misaligned dataset comprising sample misaligned natural language responses that violate the response preference, wherein the supervised fine-tuning trains the first language model to generate the misaligned natural language responses to the natural language prompts.

The limitations in this claim, as drafted, are a process that, under broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. The human mind can perform a “supervised fine-tuning” by asking for input from others when creating responses. From the independent claim example, they could learn from examples of inappropriate responses in order to better create their own inappropriate responses. Then, when creating their own inappropriate customer service responses, they could also ask coworkers if they also feel a response is inappropriate. Furthermore, this could be considered a design decision on the type of data to train a model with, which is a decision the human mind is capable of making. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea. This judicial exception is not integrated into a practical application. The claim does not recite any additional components that were not present in the independent claim. Accordingly, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1, 4-6, 9-11, 14-16, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Patent 12,259,952 B2 (Sun et al.) in view of U.S. Patent Application Publication 2025/0139445 A1 (Gao et al.).
Regarding Claims 1 and 19, Sun et al. teaches:

A system, comprising: a memory that stores computer executable components; and a processor that executes the computer executable components stored in the memory, the executable components comprising: (The software architecture 304 is supported by hardware such as a machine 302 that includes processors 320, memory 326, and I/O components 338.) (Col. 3, Lines 61-64)

Claim 19 alternatively states: A computer program product comprising a computer-readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to: (The main memory 406, the static memory 416, and storage unit 418 store the instructions 410 embodying any one or more of the methodologies or functions described herein.) (Col. 5, Lines 61-64)

a negative data generation component configured to generate, using a first language model trained to generate misaligned natural language responses to natural language prompts, (Next, GPT-4 simulates a conversation between a user and a dialog agent that violates the rule according to the provided scenario. This scenario-guided data generation method results in a more diverse set of examples compared to directly generating conversations.) (Col. 2, Lines 32-37). (As seen in FIG. 6, in the conversation generation step, 3 different types of conversations are generated to fine-tune our GPT-3 models: 1. Violations, 2. Contrastive Nonviolations, and 3. Nonviolations. Starting with Violations, using the scenarios generated above, rule-violating synthetic user-agent conversations (Prompt 3) are generated.) (Col. 9, Lines 4-17)

Sun et al. generates rule-violating responses (negative data) to conversations. It uses a first language model to generate the rule-violating conversations: GPT-4 serves as the first language model, and a GPT-3 model is then fine-tuned with the generated data.

misaligned natural language responses to sample natural language prompts, (Starting with Violations, using the scenarios generated above, rule-violating synthetic user-agent conversations (Prompt 3) are generated. For each rule, we rotate through the 7-10 scenarios in a roundrobin fashion and generate an equal number of conversations for each rule. The entire conversation is generated and truncate it to the last 2 turns. This generates more realistic conversations than prompting the model to just generate the last two turns of a hypothetical conversation.) (Col. 9, Lines 4-17)

The system utilizes scenarios which act as templates from which replies are generated in order to create the training data. Rule-violating responses are generated for each scenario.

and to generate unlikelihood training data comprising the misaligned natural language responses, (The generative artificial intelligence 212 (e.g., large language model) operationally generates training data sets for the trained guardrail model 214, and the trained guardrail model 214 operationally verifies that output from the automated software 210 complies with rules.) (Col. 3, Lines 48-53). (As seen in FIG. 6, in the conversation generation step, 3 different types of conversations are generated to fine-tune our GPT-3 models: 1. Violations, 2. Contrastive Nonviolations, and 3. Nonviolations. … This set of generated data is used to fine-tune GPT-3 models.) (Col. 9, Lines 4-36).
The violations being generated are used as training data for a guardrail model (a fine-tuned GPT-3 model) along with generated non-violations.

wherein the misaligned natural language responses violate a response preference to which a second language model is to be aligned; (As seen in FIG. 6, in the conversation generation step, 3 different types of conversations are generated to fine-tune our GPT-3 models: 1. Violations, 2. Contrastive Nonviolations, and 3. Nonviolations. Starting with Violations, using the scenarios generated above, rule-violating synthetic user-agent conversations (Prompt 3) are generated.) (Col. 9, Lines 4-17). (FIG. 5 illustrates an example guardrail task. In this example, the automated software 210 (e.g., virtual assistant, chatbot, etc.) in the restaurant domain provides information about an ongoing promotion to the user, thereby breaking rule 2. The guardrail model uses the last 2 turns of the conversation to classify the last two turns as a rule violation (which rule) or no violation.) (Col. 7, Lines 54-60).

Sun et al. uses a set of rules that act as response preferences. In the above example the rule is to not provide information on promotions. In this case, a generated violation would be breaking that rule, as seen in Fig. 5.

and a tuning component configured to train the second language model, using the unlikelihood training data, (In addition to directly generating non-violating conversations, contrastive example generation takes further advantage of LLM's (e.g., GPT-4) generation capabilities and provides a richer dataset for model training. The combined dataset is used to fine-tune a GPT-3 instance to serve as a guardrail model.) (Col. 2, Lines 40-45). (As seen in FIG. 6, in the conversation generation step, 3 different types of conversations are generated to fine-tune our GPT-3 models: 1. Violations, 2. Contrastive Nonviolations, and 3. Nonviolations.) (Col. 9, Lines 4-17)

Sun et al. uses an LLM such as GPT-4 to generate the training dataset, which then fine-tunes a guardrail model, which is the second language model in this instance.

Sun et al. does not explicitly teach: to generate responses that align with the response preference. The guardrail model created by Sun et al. does not directly generate responses but rather acts as a helper for a different language model to guide it in generating responses that align with the rules/user preferences.

However, Gao et al. teaches: to generate responses that align with the response preference. (At step 304, negative examples are generated and labeled. As with the positive examples, the negative examples can be generated and labeled manually in one or more embodiments.) (Paragraph 30). (At step 306, both positive and negative examples may be used for contrastive in-context learning prompts for the large language model. For example, a prompt writing module (e.g., prompt writing module 208 shown in FIG. 2) may feed both positive and negative examples to the large language model as a part of a contrastive in-context learning protocol.) (Paragraph 32). (At step 308, the large language model is deployed. The deployment may be on any type of application. One example deployment may be on Chatbots/AI agents, which interact with a plurality of users with similar questions/issues.) (Paragraph 35).

Gao et al. teaches a contrastive learning method in which positive and negative data are generated to form a dataset that is directly used to train an LLM, which can be deployed as a Chatbot/AI agent.
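[Editor's note: a minimal sketch of the contrastive in-context learning protocol the Office Action attributes to Gao et al.: both positive (preferred) and negative (dispreferred) examples are written into the prompt so the deployed model can imitate the former and avoid the latter. The function name and example text are illustrative, not from the reference.]

```python
def build_contrastive_prompt(question, positive_examples, negative_examples):
    """Assemble a prompt containing contrasting GOOD and BAD examples."""
    parts = ["Answer the customer's question. Imitate the GOOD examples "
             "and avoid responses like the BAD ones.\n"]
    for q, a in positive_examples:
        parts.append(f"GOOD\nQ: {q}\nA: {a}\n")
    for q, a in negative_examples:
        parts.append(f"BAD (do not respond like this)\nQ: {q}\nA: {a}\n")
    parts.append(f"Q: {question}\nA:")
    return "\n".join(parts)

prompt = build_contrastive_prompt(
    "Where is my refund?",
    positive_examples=[("Where is my order?", "Let me check on that for you right away.")],
    negative_examples=[("Where is my order?", "Not my problem.")],
)
```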
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the contrastive model training as taught by Sun et al. to directly train the user-level model rather than a guardrail model, as taught by Gao et al. This would have been an obvious alternative implementation as it would allow the LLM to be directly adapted to a user’s response preferences (Gao et al., Paragraphs 2-3).

Regarding Claim 11, Sun et al. teaches:

A computer-implemented method, comprising: (The machine 400 may comprise, but not be limited to, a server computer, a client computer, a personal computer (PC), a tablet computer, a laptop computer,) (Col. 5, Lines 23-25).

generating, by a system comprising a processor and using a first language model trained to generate misaligned natural language responses to natural language prompts, (Next, GPT-4 simulates a conversation between a user and a dialog agent that violates the rule according to the provided scenario. This scenario-guided data generation method results in a more diverse set of examples compared to directly generating conversations.) (Col. 2, Lines 32-37). (As seen in FIG. 6, in the conversation generation step, 3 different types of conversations are generated to fine-tune our GPT-3 models: 1. Violations, 2. Contrastive Nonviolations, and 3. Nonviolations. Starting with Violations, using the scenarios generated above, rule-violating synthetic user-agent conversations (Prompt 3) are generated.) (Col. 9, Lines 4-17)

Sun et al. generates rule-violating responses (negative data) to conversations. It uses a first language model to generate the rule-violating conversations: GPT-4 serves as the first language model, and a GPT-3 model is then fine-tuned with the generated data.

misaligned natural language responses to sample natural language prompts, (Starting with Violations, using the scenarios generated above, rule-violating synthetic user-agent conversations (Prompt 3) are generated. For each rule, we rotate through the 7-10 scenarios in a roundrobin fashion and generate an equal number of conversations for each rule. The entire conversation is generated and truncate it to the last 2 turns. This generates more realistic conversations than prompting the model to just generate the last two turns of a hypothetical conversation.) (Col. 9, Lines 4-17)

The system utilizes scenarios which act as templates from which replies are generated in order to create the training data. Rule-violating responses are generated for each scenario.

wherein the misaligned natural language responses characterize a response type that a second language model is to be trained to suppress; (In the last turn of the example conversation in FIG. 5, the automated software 210 breaks rule r=2: Do not provide information on promotions, discounts, or special offers, related to the restaurant. The expected behavior of the agent model A varies by the outcome of the guardrail. If no violation is found, the conversation continues as normal. Otherwise, the agent model A must regenerate its output, escalate to a human expert, or end the conversation.) (Col. 8, Lines 30-37).
The trained guardrail model suppresses responses that violate rules/response preferences.

generating, by the system, unlikelihood training data comprising the misaligned natural language responses; (The generative artificial intelligence 212 (e.g., large language model) operationally generates training data sets for the trained guardrail model 214, and the trained guardrail model 214 operationally verifies that output from the automated software 210 complies with rules.) (Col. 3, Lines 48-53). (As seen in FIG. 6, in the conversation generation step, 3 different types of conversations are generated to fine-tune our GPT-3 models: 1. Violations, 2. Contrastive Nonviolations, and 3. Nonviolations. … This set of generated data is used to fine-tune GPT-3 models.) (Col. 9, Lines 4-36).

The violations being generated are used as training data for a guardrail model (a fine-tuned GPT-3 model) along with generated non-violations.

Sun et al. does not explicitly teach: and training, by the system, the second language model, using the unlikelihood training data, to suppress responses corresponding to the response type. The guardrail model created by Sun et al. does not directly generate responses but rather acts as a helper for a different language model to guide it in generating responses that align with the rules/user preferences.

However, Gao et al. teaches: and training, by the system, the second language model, using the unlikelihood training data, to suppress responses corresponding to the response type. (The large language model with such contrastive in-context learning can generate specific responses/answers based on user preferences, generally not possible using conventional models.) (Paragraph 15). (At step 304, negative examples are generated and labeled. As with the positive examples, the negative examples can be generated and labeled manually in one or more embodiments.) (Paragraph 30). (At step 306, both positive and negative examples may be used for contrastive in-context learning prompts for the large language model. For example, a prompt writing module (e.g., prompt writing module 208 shown in FIG. 2) may feed both positive and negative examples to the large language model as a part of a contrastive in-context learning protocol.) (Paragraph 32). (At step 308, the large language model is deployed. The deployment may be on any type of application. One example deployment may be on Chatbots/AI agents, which interact with a plurality of users with similar questions/issues.) (Paragraph 35).

Gao et al. teaches a contrastive learning method in which positive and negative data are generated to form a dataset that is directly used to train an LLM, which can be deployed as a Chatbot/AI agent. It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the contrastive model training as taught by Sun et al. to directly train the user-level model rather than a guardrail model, as taught by Gao et al. This would have been an obvious alternative implementation as it would allow the LLM to be directly adapted to a user’s response preferences (Gao et al., Paragraphs 2-3).
Regarding Claims 4 and 14, Sun et al. in view of Gao et al. teaches the system of claims 1 and 11. Furthermore, Sun et al. teaches: wherein the negative data generation component is configured to generate the misaligned natural language responses using the first language model and an aligned dataset comprising the sample natural language prompts and corresponding aligned natural language responses that align with the response preference. (An example multi-stage generation pipeline is shown in FIG. 6. For each rule r, a LLM generates a set of scenarios (Prompt 2). Each scenario represents a high-level reason why a rule might be violated.) (Col. 8, Lines 55-58). (In addition to rule-violating conversations, non-rule-violating conversations are generated. These conversations are produced in two ways. Contrastive Nonviolations are created by taking each rule-violating conversation and remove just the automated software 210 line that was a violation (aT). This is replaced with a non-violating assistant utterance (Prompt 4). By using this contrastive learning approach, non-violations are generated that are very similar to violations. As the entire conversation is the same up to the last message, this forces the model to focus on just the agent output. Finally, Nonviolation conversations are generated by few-shot prompting GPT-4 to output a conversation that does not violate any of the rules in our rule group. These conversations are sliced at different points in the conversations to give us a wide variety of non-violations throughout the conversation, which will allow the model to generalize throughout the progression of the conversation. This set of generated data is used to fine-tune GPT-3 models.) (Col. 9, Lines 17-36). (In one aspect, the routine 900 of guarding an automated software 210, includes generating block 902, by a first language model 212, a training set of rule-violating data (e.g., conversations); generating block 904, by the first language model 212, a training set of contrastive examples by altering the rule-violating data (e.g., conversations) into non-violating data (conversations); training block 906 a guardrail machine learning model 214 using the generated training sets;) (Col. 20, Lines 63-67).

The GPT-3 model is trained with a dataset of scenarios, violation responses, and non-violation responses in order to create the guardrail model. A GPT-4 model (the first language model) generates the responses of each type based on the same conversation scenarios. These scenarios are created based on a set of rules that are considered response preferences.

Regarding Claims 5 and 15, Sun et al. in view of Gao et al. teaches the system of claims 4 and 14. Furthermore, Sun et al. teaches: wherein the negative data generation component is configured to generate the unlikelihood training data to include the misaligned natural language responses, the sample natural language prompts, and the aligned natural language responses. (An example multi-stage generation pipeline is shown in FIG. 6. For each rule r, a LLM generates a set of scenarios (Prompt 2). Each scenario represents a high-level reason why a rule might be violated.) (Col. 8, Lines 55-58). (Starting with Violations, using the scenarios generated above, rule-violating synthetic user-agent conversations (Prompt 3) are generated. For each rule, we rotate through the 7-10 scenarios in a roundrobin fashion and generate an equal number of conversations for each rule. The entire conversation is generated and truncate it to the last 2 turns. … In addition to rule-violating conversations, non-rule-violating conversations are generated. … By using this contrastive learning approach, non-violations are generated that are very similar to violations. As the entire conversation is the same up to the last message, this forces the model to focus on just the agent output. Finally, Nonviolation conversations are generated by few-shot prompting GPT-4 to output a conversation that does not violate any of the rules in our rule group. These conversations are sliced at different points in the conversations to give us a wide variety of non-violations throughout the conversation, which will allow the model to generalize throughout the progression of the conversation. This set of generated data is used to fine-tune GPT-3 models.) (Col. 9, Lines 9-36). (In one aspect, the routine 900 of guarding an automated software 210, includes generating block 902, by a first language model 212, a training set of rule-violating data (e.g., conversations); generating block 904, by the first language model 212, a training set of contrastive examples by altering the rule-violating data (e.g., conversations) into non-violating data (conversations); training block 906 a guardrail machine learning model 214 using the generated training sets;) (Col. 20, Lines 63-67).

A dataset is formed comprising the conversations/scenarios, the violating responses, and the non-violating responses. Scenarios represent the natural language prompts, as these are the base for each response (violating or non-violating) that the first language model produces. The rule-violating responses represent misaligned responses and the non-violating data represent aligned responses.
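[Editor's note: claims 5 and 15 describe unlikelihood training data that bundles each sample prompt with both an aligned and a misaligned response. A record-per-triplet JSONL file is one natural encoding; field names, file name, and the restaurant/no-promotions example (which follows Sun et al.'s Fig. 5 scenario) are illustrative only.]

```python
import json

# Each record pairs a sample prompt with an aligned response and a
# model-generated misaligned response that violates the preference.
records = [
    {
        "prompt": "What time does the restaurant close tonight?",
        "aligned_response": "We close at 10 pm. Is there anything else I can help with?",
        "misaligned_response": "We close at 10 pm, and don't miss our 2-for-1 promotion!",  # violates the no-promotions rule
    },
]

with open("unlikelihood_training_data.jsonl", "w") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")
```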
Regarding Claims 6 and 16, Sun et al. in view of Gao et al. teaches the system of claims 1 and 11. Furthermore, Sun et al. teaches: wherein the response preference specifies that the second language model is to generate responses that at least one of omit biased, omit toxic language, omit misinformation, maximize legibility, omit language that violates a copywrite, or omits harmful information. (FIG. 5 illustrates an example guardrail task. In this example, the automated software 210 (e.g., virtual assistant, chatbot, etc.) in the restaurant domain provides information about an ongoing promotion to the user, thereby breaking rule 2. The guardrail model uses the last 2 turns of the conversation to classify the last two turns as a rule violation (which rule) or no violation.) (Col. 7, Lines 54-60). (In the last turn of the example conversation in FIG. 5, the automated software 210 breaks rule r=2: Do not provide information on promotions, discounts, or special offers, related to the restaurant. The expected behavior of the agent model A varies by the outcome of the guardrail. If no violation is found, the conversation continues as normal. Otherwise, the agent model A must regenerate its output, escalate to a human expert, or end the conversation.) (Col. 8, Lines 30-37).

Sun et al. uses a set of rules that act as response preferences. In the above example the rule is to not provide information on promotions. In this case, that rule/response preference could be considered a biased response (advertising a particular restaurant) or harmful information (the restaurant owner does not want this information available).

Regarding Claim 9, Sun et al. in view of Gao et al. teaches the system of claim 1. Furthermore, Sun et al. teaches: wherein training of the second language model by the fine-tuning component using the unlikelihood training data causes the second language model to suppress generation of responses that do not align with the response preference in response to prompts submitted to the second language model. (The automated software 210 operationally generates a conversation with a user, turn by turn, or performs other automated tasks. The generative artificial intelligence 212 (e.g., large language model) operationally generates training data sets for the trained guardrail model 214, and the trained guardrail model 214 operationally verifies that output from the automated software 210 complies with rules.) (Col. 3, Lines 39-53). (In the last turn of the example conversation in FIG. 5, the automated software 210 breaks rule r=2: Do not provide information on promotions, discounts, or special offers, related to the restaurant. The expected behavior of the agent model A varies by the outcome of the guardrail. If no violation is found, the conversation continues as normal. Otherwise, the agent model A must regenerate its output, escalate to a human expert, or end the conversation.) (Col. 8, Lines 30-37).

The guardrail model acts to suppress generation of responses that do not align with rules/response preferences. Furthermore, as shown in Gao et al., this model could be trained directly as a Chatbot/AI agent rather than operating alongside separate automated software as seen in Sun et al.

Regarding Claim 10, Sun et al. in view of Gao et al. teaches the system of claim 1. Furthermore, Sun et al. teaches: further comprising a user interface component configured to render a user interface on a client device and to receive, via interaction with the user interface, a natural language prompt; (The frameworks 308 provide a high-level common infrastructure used by the applications 306. For example, the frameworks 308 provide various graphical user interface (GUI) functions, high-level resource management, and high-level location services) (Col. 4, Lines 43-47). (FIG. 5 illustrates an example guardrail task. In this example, the automated software 210 (e.g., virtual assistant, chatbot, etc.)) (Col. 7, Lines 54-60). (generating an output based on user input (e.g., engaging block 908, with automated software 210, in conversation with a user); monitoring block 910 with the trained guardrail machine learn model 214 whether the generated output (e.g., a turn of the conversation) violates a rule;) (Col. 21, Lines 3-7).

Sun et al. teaches a software system that the user interacts with, which can include a user interface.

and an analysis component configured to submit the natural language prompt to the second language model and to obtain a natural language response to the prompt generated by the second language model based on processing of the natural language prompt, (FIG. 5 illustrates an example guardrail task. In this example, the automated software 210 (e.g., virtual assistant, chatbot, etc.) in the restaurant domain provides information about an ongoing promotion to the user, thereby breaking rule 2. The guardrail model uses the last 2 turns of the conversation to classify the last two turns as a rule violation (which rule) or no violation.) (Col. 7, Lines 54-60). (generating an output based on user input (e.g., engaging block 908, with automated software 210, in conversation with a user); monitoring block 910 with the trained guardrail machine learn model 214 whether the generated output (e.g., a turn of the conversation) violates a rule;) (Col. 21, Lines 3-7). (training block 906 a guardrail machine learning model 214 using the generated training sets; generating an output based on user input (e.g., engaging block 908, with automated software 210, in conversation with a user); monitoring block 910 with the trained guardrail machine learn model 214 whether the generated output (e.g., a turn of the conversation) violates a rule; and preventing block 912 the automated software from transmitting to the user the generated output (e.g., turn) that violates a rule.) (Col. 21, Lines 1-10).

The automated software responds to user input with the trained guardrail model operating alongside it to avoid violating response preferences. After receiving the user input, the guardrail model (the fine-tuned GPT-3 model / second language model) is prompted to verify whether the output should be provided to the user. Furthermore, as shown in Gao et al., this model could be trained directly as a Chatbot/AI agent rather than operating alongside separate automated software as seen in Sun et al.

wherein the user interface component is further configured to render the natural language response on the user interface. (The automated software 210 operationally generates a conversation with a user, turn by turn, or performs other automated tasks. The generative artificial intelligence 212 (e.g., large language model) operationally generates training data sets for the trained guardrail model 214, and the trained guardrail model 214 operationally verifies that output from the automated software 210 complies with rules.) (Col. 3, Lines 46-53). (The frameworks 308 provide a high-level common infrastructure used by the applications 306. For example, the frameworks 308 provide various graphical user interface (GUI) functions, high-level resource management, and high-level location services) (Col. 4, Lines 43-47).

The automated software outputs the response to the user, which can be done through a graphical user interface. An example of such conversations can be seen in Fig. 5.

Claims 2, 3, 12, 13, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Patent 12,259,952 B2 (Sun et al.) in view of U.S. Patent Application Publication 2025/0139445 A1 (Gao et al.), and further in view of “LLM-Based Synthetic Datasets: Applications and Limitations in Toxicity Detection” (Schmidhuber et al.).

Regarding Claims 2 and 12, Sun et al. in view of Gao et al. teaches the system of claims 1 and 11. Sun et al. in view of Gao et al. does not explicitly teach: wherein the fine-tuning component is configured to perform the supervised fine-tuning on the first language model using a misaligned dataset comprising sample misaligned natural language responses that violate the response preference. Sun et al. utilizes a language model to generate misaligned datasets but does not explicitly state training the language model on misaligned data first.

However, Schmidhuber et al. teaches: wherein the fine-tuning component is configured to perform supervised fine-tuning on the first language model that trains the first language model to generate the misaligned natural language responses to the natural language prompts.
(During pre-processing, all datasets were transformed to be binary (0: non-toxic, 1: toxic).) (Section 3.4, Paragraph 1). (Dorig-train was split by class label. This split results in two datasets, Dorig-0 and Dorig-1, to fine-tune two GPT-3 Curie models, respectively. … These datasets are then used to fine-tune a GPT-3 Curie model via the OpenAI API, resulting in FTorig-0 and FTorig-1. The fine-tuned models are prompted to generate a total of 40.000 synthetic samples per class-label, resulting in Dsynth-0 and Dsynth-1.) (Section 3.5, Paragraphs 1-2).

Schmidhuber et al. creates a dataset of toxic data to test toxic data classifiers. In order to do this, they take an initial toxic dataset (Dorig-1) and use it to fine-tune a GPT-3 Curie model, which then creates a larger toxic dataset (Dsynth-1).

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the contrastive model training as taught by Sun et al. in view of Gao et al. to include training the first LLM to better produce negative responses, as taught by Schmidhuber et al. This would have been an obvious improvement, as Sun et al. is already using a GPT model to generate negative responses and this alleviates the burden of manually creating the dataset (Schmidhuber et al., Introduction, Paragraph 1).
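[Editor's note: the Schmidhuber et al. recipe, as the Office Action summarizes it, is to split a labeled corpus by class, fine-tune a generative model on the toxic split alone, and then sample from that model to synthesize more misaligned text. A minimal sketch of the data-preparation step; the file name, column names, and prompt/completion format are illustrative assumptions, not details from the paper.]

```python
import json
import pandas as pd

# Hypothetical labeled corpus with columns: text, label (0 non-toxic, 1 toxic).
df = pd.read_csv("toxicity_dataset.csv")
d_orig_1 = df[df["label"] == 1]["text"]          # toxic split (cf. Dorig-1)

# Write the toxic split as supervised fine-tuning records for a generator
# model; a model tuned on this file (cf. FTorig-1) is then sampled
# repeatedly to build the synthetic misaligned corpus (cf. Dsynth-1).
with open("ft_toxic.jsonl", "w") as f:
    for text in d_orig_1:
        f.write(json.dumps({"prompt": "", "completion": text}) + "\n")
```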
Regarding Claims 3 and 13, Sun et al. in view of Gao et al. and Schmidhuber et al. teaches the system of claims 2 and 12. Furthermore, Schmidhuber et al. teaches: wherein the fine-tuning component is configured to perform the supervised fine-tuning on the first language model using a misaligned dataset comprising sample misaligned natural language responses that violate the response preference. (We evaluated six datasets. Davidson (Davidson et al., 2017), Founta (Founta et al., 2018), HatEval (Basile et al., 2019) and Stormfront (de Gibert et al., 2018) are also investigated by Wullach et al. (2020, 2021) and focus on English Hate Speech detection.) (Section 3.1, Paragraph 1). (During pre-processing, all datasets were transformed to be binary (0: non-toxic, 1: toxic).) (Section 3.4, Paragraph 1). (Dorig-train was split by class label. This split results in two datasets, Dorig-0 and Dorig-1, to fine-tune two GPT-3 Curie models, respectively. … These datasets are then used to fine-tune a GPT-3 Curie model via the OpenAI API, resulting in FTorig-0 and FTorig-1. The fine-tuned models are prompted to generate a total of 40.000 synthetic samples per class-label, resulting in Dsynth-0 and Dsynth-1.) (Section 3.5, Paragraphs 1-2).

Initial datasets are used to train the GPT-3 Curie model, which then generates more data on its own.

Regarding Claim 20, Sun et al. in view of Gao et al. teaches the system of claim 19. Sun et al. in view of Gao et al. does not explicitly teach: further comprising performing, by the processor, supervised fine-tuning on the first language model using a misaligned dataset comprising sample misaligned natural language responses that violate the response preference, wherein the supervised fine-tuning trains the first language model to generate the misaligned natural language responses to the natural language prompts. Sun et al. utilizes a language model to generate misaligned datasets but does not explicitly state training the language model on misaligned data first.

However, Schmidhuber et al. teaches: further comprising performing, by the processor, supervised fine-tuning on the first language model using a misaligned dataset comprising sample misaligned natural language responses that violate the response preference, wherein the supervised fine-tuning trains the first language model to generate the misaligned natural language responses to the natural language prompts. (We evaluated six datasets. Davidson (Davidson et al., 2017), Founta (Founta et al., 2018), HatEval (Basile et al., 2019) and Stormfront (de Gibert et al., 2018) are also investigated by Wullach et al. (2020, 2021) and focus on English Hate Speech detection.) (Section 3.1, Paragraph 1). (During pre-processing, all datasets were transformed to be binary (0: non-toxic, 1: toxic).) (Section 3.4, Paragraph 1). (Dorig-train was split by class label. This split results in two datasets, Dorig-0 and Dorig-1, to fine-tune two GPT-3 Curie models, respectively. … These datasets are then used to fine-tune a GPT-3 Curie model via the OpenAI API, resulting in FTorig-0 and FTorig-1. The fine-tuned models are prompted to generate a total of 40.000 synthetic samples per class-label, resulting in Dsynth-0 and Dsynth-1.) (Section 3.5, Paragraphs 1-2).

Schmidhuber et al. creates a dataset of toxic data to test toxic data classifiers. In order to do this, they take an initial toxic dataset (Dorig-1) and use it to fine-tune a GPT-3 Curie model, which then creates a larger toxic dataset (Dsynth-1).

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the contrastive model training as taught by Sun et al. in view of Gao et al. to include training the first LLM to better produce negative responses, as taught by Schmidhuber et al. This would have been an obvious improvement, as Sun et al. is already using a GPT model to generate negative responses and this alleviates the burden of manually creating the dataset (Schmidhuber et al., Introduction, Paragraph 1).

Claims 7, 8, 17, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Patent 12,259,952 B2 (Sun et al.) in view of U.S. Patent Application Publication 2025/0139445 A1 (Gao et al.) and further in view of U.S. Patent 12,242,522 B2 (Gardner).

Regarding Claims 7 and 17, Sun et al. in view of Gao et al. teaches the system of claims 1 and 11. Sun et al. in view of Gao et al. does not explicitly teach: further comprising a conditional supervised fine-tuning (SFT) component configured to perform conditional fine-tuning on the second language model using a prosocial dataset comprising sample problematic prompts and corresponding prosocial natural language responses to the sample problematic prompts. Although Gao et al. shows the use of a contrastive learning dataset to train an LLM, it does not explicitly state using negative prompts combined with positive responses.

However, Gardner teaches: further comprising a conditional supervised fine-tuning (SFT) component configured to perform conditional fine-tuning on the second language model using a prosocial dataset comprising sample problematic prompts and corresponding prosocial natural language responses to the sample problematic prompts.
Claims 7, 8, 17, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over US Patent Publication 12259952 B2 (Sun et al.) in view of US Patent Application Publication US 20250139445 A1 (Gao et al.), and further in view of US Patent Publication 12242522 B2 (Gardner).

Regarding Claims 7 and 17, Sun et al. in view of Gao et al. teaches the system of claims 1 and 11. Sun et al. in view of Gao et al. does not explicitly teach: further comprising a conditional supervised fine-tuning (SFT) component configured to perform conditional fine-tuning on the second language model using a prosocial dataset comprising sample problematic prompts and corresponding prosocial natural language responses to the sample problematic prompts. Gao et al. does show using a contrastive-learning dataset to train an LLM, but it does not explicitly state using negative prompts combined with positive responses. However, Gardner teaches further comprising a conditional supervised fine-tuning (SFT) component configured to perform conditional fine-tuning on the second language model using a prosocial dataset comprising sample problematic prompts and corresponding prosocial natural language responses to the sample problematic prompts. (For instance, each of the adverse-input-proper-output pairs includes an example of an adverse input (e.g., malicious, etc.) and an example of a desirable or mitigating output for the adverse input, which may include an output stating that the AI Model cannot produce a response to such an input.) (Col. 22, Lines 1-). (At operation 720, a prompt is generated that includes the current dialogue context as well as the subset of similar example pairs. The prompt is provided to an AI Model, at operation 725. The prompt may also include a request or other instructions based on the NL input received at operation 705. The AI Model then processes the prompt and returns an output that is received at operation 730. Due to the inclusion of the subset of example adverse-input-proper-output pairs and the subset of example non-adverse-input-proper-output pairs, the output of the AI Model is less likely to produce an improper response to a malicious input.) (Col. 28, Line 64 to Col. 29, Line 7).

In Gardner, adverse-input and proper-output pairs are generated in order to improve the outputs of an AI model; in this case, the pairs are injected into the user's prompt to improve the output responses. It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention of the instant application to modify the contrastive model training taught by Sun et al. in view of Gao et al. to include training data for problematic prompts and prosocial responses, as taught by Gardner. This would have been an obvious improvement to mitigate the danger of adverse or malicious inputs to the language model. (Gardner, Col. 21, Lines 64-67).

Regarding Claims 8 and 18, Sun et al. in view of Gao et al. and Gardner teaches the system of claims 7 and 17. Furthermore, Gardner teaches wherein the sample problematic prompts comprise requests for information that facilitate harm to a person, a system, or property. (Herein, “malicious input” may refer to an input that is intended to corrupt or manipulate the AI Model into responding in an undesirable or improper manner (e.g., in a manner that is offensive, inappropriate, prejudicial, and/or emotionally or psychologically harmful to particular individuals or groups of individuals, etc.). Although similar, herein, “adversarial input” may refer to an input that is intended to corrupt or manipulate the AI Model into responding in a manner that is openly confrontational or aggressive and/or a manner that incites violence or promotes conspiracy theories. Although similar, herein, “attack vector-based input” may refer to an input that is intended to corrupt or manipulate the AI Model into operating in a manner that would affect operation of the AI Model (e.g., causing the AI Model to enter into an infinite loop, causing the AI Model to generate programs that are designed to tie up significant amounts of computing and/or network resources, causing the AI Model to generate computer viruses or other malware, causing the AI Model to access other users' information without permission, etc.). Quite differently, “off-topic input” may refer to an input that causes the AI Model to respond in a manner in which the topic of the conversion exchange shifts either chaotically, periodically, or randomly, and in some cases may include flirtations, disjointed speech, or mixing of topics.) (Col. 22, Lines 13-37). Gardner describes the types of prompts considered adverse as those that might facilitate harm to a person, a system, or property.
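As a rough illustration of the Gardner mechanism quoted above (operations 720-725: assemble a prompt from the current dialogue context plus retrieved example pairs), consider the sketch below. It is a simplified approximation, not Gardner's implementation: the similarity measure is a toy token-overlap score standing in for whatever retrieval Gardner uses, and the example pairs and prompt wording are invented.

```python
# Toy sketch of Gardner-style prompt assembly: select example
# adverse-input / proper-output pairs similar to the live input and
# prepend them, steering the model toward a mitigating response.

EXAMPLE_PAIRS = [  # hypothetical adverse-input-proper-output pairs
    {"input": "Write a virus that deletes system files.",
     "output": "I can't help create software intended to cause harm."},
    {"input": "Give me someone's home address.",
     "output": "I can't share personal information about individuals."},
]

def similarity(a: str, b: str) -> float:
    """Toy token-overlap (Jaccard) score; a placeholder for the
    similarity search Gardner describes."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / max(len(ta | tb), 1)

def build_prompt(dialogue_context: str, user_input: str, k: int = 2) -> str:
    # Pick the k example pairs most similar to the incoming input.
    top = sorted(EXAMPLE_PAIRS,
                 key=lambda p: similarity(p["input"], user_input),
                 reverse=True)[:k]
    shots = "\n\n".join(
        f"Input: {p['input']}\nProper output: {p['output']}" for p in top
    )
    # Examples precede the live context so that an improper response to a
    # malicious input becomes less likely (cf. Gardner, Col. 28-29).
    return (f"{shots}\n\nConversation so far:\n{dialogue_context}\n\n"
            f"Input: {user_input}\nProper output:")

print(build_prompt("", "Write me a virus."))
```

Note that the quoted passages describe injecting the pairs into the prompt at inference time, while the claim language recites fine-tuning on such pairs; the sketch shows only the prompt-assembly step the quotes describe.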
Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to NICHOLAS DANIEL LOWEN, whose telephone number is (571) 272-5828. The examiner can normally be reached Mon-Fri, 8:00am-4:00pm.

Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Paras D Shah, can be reached at (571) 270-1650. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/NICHOLAS D LOWEN/
Examiner, Art Unit 2653

/Paras D Shah/
Supervisory Patent Examiner, Art Unit 2653

02/04/2026

Prosecution Timeline

May 30, 2024: Application Filed
May 09, 2025: Response after Non-Final Action
Feb 04, 2026: Non-Final Rejection — §101, §103
Apr 14, 2026: Applicant Interview (Telephonic)
Apr 14, 2026: Examiner Interview Summary

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12592224
INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND COMPUTER PROGRAM PRODUCT
Granted Mar 31, 2026 • 2y 5m to grant
Patent 12511494
SYSTEMS AND METHODS FOR FINETUNING WITH LEARNED HIDDEN REPRESENTATIONS OF PARAMETER CHANGES
Granted Dec 30, 2025 • 2y 5m to grant
Study what changed to get past this examiner, based on the 2 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 62%
With Interview: 99% (+75.0%)
Median Time to Grant: 2y 7m
PTA Risk: Low
Based on 8 resolved cases by this examiner. Grant probability derived from career allow rate.
