Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-13 and 15-21 are pending. Claim 14 has been canceled. Claim 21 is new. This Office Action is responsive to the amendment filed on 02/02/2026, which has been entered in the above-identified application.
Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.
The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.
Claims 1-13 and 15-21 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claims contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
Regarding Claims 1, 8 and 16, the specification fails to recite or suggest “causing a generative model, based on a prompt, to generate a second set of training data”. At best, the specification recites generating synthetic data with a generative model that is similar to the training dataset, but makes no reference to a “prompt”, much less that the generative model generates synthetic data based on a prompt. As such, this limitation is considered new matter and fails to comply with the written description requirement.
Regarding Claim 3, the specification fails to recite or suggest “the prompt includes at least one exemplar from the first set of training data”. At best, the specification recites that “the generative model 224 is pre-trained using the training data 222 to generate 204 synthetic data” [0041], but makes no reference to an “exemplar”, much less that a prompt given to the generative model includes an exemplar. As such, this limitation is considered new matter and fails to comply with the written description requirement.
Regarding Claim 21, the specification fails to recite or suggest, “wherein the prompt includes a seed”. At best, the specification recites that “values can be used as seeds to the generative model 224 to generate 204 the synthetic data 228” [0042], but does not recite the seed in relation to a prompt. As such, this limitation is considered new matter and fails to comply with the written description requirement.
Regarding Claims 2, 4-7, 9-13, 15, and 17-21, they are rejected due to their dependency on a rejected base claim.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-13 and 15-21 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Independent Claims
Step 1 – Claim 1 is drawn to a method, claim 8 is drawn to a non-transitory computer-readable medium, and claim 16 is drawn to a system. Therefore, each of these claims falls under one of the four statutory categories of subject matter (process, machine, manufacture, or composition of matter).
Step 2A Prong 1 – Claims 1, 8 and 16 are directed to a judicially recognized exception of an abstract idea without significantly more. Claims 1, 8 and 16 recite:
Claim 1:
training the event detection model based on the first set of training data and the second set of training data – This limitation is directed towards the abstract idea of a mathematical calculation (see MPEP § 2106.04(a)(2), section I, C). In Paragraph [0035] of the specification, it states “In various embodiments, to train the event detection model 126 Mϴ, the training data 122 batch Bo and the synthetic data batch BG are combined to form an augmented training batch BC. The event detection model 126 Mϴ, in an embodiment, is then updated using the gradient of the loss function LB over BC, leading to the new parameters ϴt for Mϴ. Training the event detection model 126 is illustrated by the following equation: [Equation]”. The broadest reasonable interpretation (BRI) in light of the specification would support that “training the event detection model” would encompass a gradient calculation of a loss function and fall under the mathematical concepts grouping.
determining a reward value based on a performance of the event detection model to detect events based on using a third set of training data and a gradient of a loss function based on the second set of training data – This limitation is directed towards the abstract idea of a mathematical calculation (see MPEP § 2106.04(a)(2), section I, C). In Paragraph [0016] of the specification, it states “various embodiments described in the present disclosure compute the reward for reinforcement learning for training based on an agreement between the gradient of a loss function computed based on the generated samples and the gradient computed of the loss function (e.g. cosine similarity) based on the development data.” BRI in light of the specification would support that “determining a reward value” would encompass a similarity calculation of gradients of a loss function and fall under the mathematical concepts grouping.
updating a parameter of the generative model to reduce loss values associated with the loss function based on the reward value – This limitation is directed towards the abstract idea of a mathematical calculation (see MPEP § 2106.04(a)(2), section I, C). In Paragraph [0048] of the specification, it states “For example, the gradient used to update the parameters of the generative model is computed to include the reward value determined based on the performance of the event detection model.” BRI in light of the specification would support that “updating a parameter of the generative model” would encompass a gradient calculation and fall under the mathematical concepts grouping.
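The mathematical operations the examiner identifies in the three limitations above can be illustrated with a short numerical sketch. This is purely illustrative; all variable names and values are hypothetical and not drawn from the claims or specification:

```python
# Illustrative sketch of the three characterized calculations (hypothetical values).
import numpy as np

def cosine_similarity(g1, g2):
    # Agreement between two gradient vectors, per the description in cited [0016].
    return float(np.dot(g1, g2) / (np.linalg.norm(g1) * np.linalg.norm(g2)))

# "training the event detection model": a gradient step over the combined batch
theta = np.array([1.0, -0.5])           # hypothetical model parameters
grad_combined = np.array([0.2, -0.1])   # hypothetical gradient of the loss over B_C
lr = 0.1
theta_t = theta - lr * grad_combined    # updated parameters ϴt

# "determining a reward value": similarity between the synthetic-batch gradient
# and the development-data gradient
grad_synthetic = np.array([0.2, -0.1])
grad_dev = np.array([0.4, -0.2])        # parallel to grad_synthetic here
reward = cosine_similarity(grad_synthetic, grad_dev)

# "updating a parameter of the generative model": a reward-weighted gradient step
psi = np.array([0.3])
grad_gen = np.array([0.05])
psi_t = psi - lr * reward * grad_gen
```

Each step reduces to arithmetic over vectors, consistent with the mathematical-concepts characterization.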
Claim 8:
training the event detection model based on the first set of labeled sequences and the second set of labeled sequences by at least updating parameters of the event detection model based on a first gradient of a loss function based on the first set of labeled sequences and the second set of labeled sequences to generate an updated event detection model – This limitation is directed towards the abstract idea of a mathematical calculation (see MPEP § 2106.04(a)(2), section I, C). In Paragraph [0035] of the specification, it states “In various embodiments, to train the event detection model 126 Mϴ, the training data 122 batch Bo and the synthetic data batch BG are combined to form an augmented training batch BC. The event detection model 126 Mϴ, in an embodiment, is then updated using the gradient of the loss function LB over BC, leading to the new parameters ϴt for Mϴ. Training the event detection model 126 is illustrated by the following equation: [Equation]”. BRI in light of the specification would support that “training the event detection model” would encompass a gradient calculation of a loss function and fall under the mathematical concepts grouping.
determining a set of reward values corresponding to labeled sequences of the second set of labeled sequences, reward values of the set of reward values determined based on a result of the updated event detection model performance during inferencing and a second gradient of the loss function based on the second set of labeled sequences – This limitation is directed towards the abstract idea of a mathematical calculation (see MPEP § 2106.04(a)(2), section I, C). In Paragraph [0016] of the specification, it states “various embodiments described in the present disclosure compute the reward for reinforcement learning for training based on an agreement between the gradient of a loss function computed based on the generated samples and the gradient computed of the loss function (e.g. cosine similarity) based on the development data.” BRI in light of the specification would support that “determining a set of reward values” would encompass a similarity calculation of gradients of a loss function and fall under the mathematical concepts grouping.
updating parameters of the generative model to reduce loss values associated with the loss function based on the set of reward values – This limitation is directed towards the abstract idea of a mathematical calculation (see MPEP § 2106.04(a)(2), section I, C). In Paragraph [0048] of the specification, it states “For example, the gradient used to update the parameters of the generative model is computed to include the reward value determined based on the performance of the event detection model.” BRI in light of the specification would support that “updating parameters of the generative model” would encompass a gradient calculation and fall under the mathematical concepts grouping.
Claim 16:
training an event detection model based on the training dataset by at least updating parameters of the event detection model to generate an updated event detection model – This limitation is directed towards the abstract idea of a mathematical calculation (see MPEP § 2106.04(a)(2), section I, C). In Paragraph [0035] of the specification, it states “In various embodiments, to train the event detection model 126 Mϴ, the training data 122 batch Bo and the synthetic data batch BG are combined to form an augmented training batch BC. The event detection model 126 Mϴ, in an embodiment, is then updated using the gradient of the loss function LB over BC, leading to the new parameters ϴt for Mϴ. Training the event detection model 126 is illustrated by the following equation: [Equation]”. BRI in light of the specification would support that “training an event detection model” would encompass a gradient calculation of a loss function and fall under the mathematical concepts grouping.
determining a reward value corresponding to the second set of labeled sequences based on a result of the updated event detection model performing inferencing operations on a third set of labeled sequences and a gradient of a loss function based on the second set of labeled sequences – This limitation is directed towards the abstract idea of a mathematical calculation (see MPEP § 2106.04(a)(2), section I, C). In Paragraph [0016] of the specification, it states “various embodiments described in the present disclosure compute the reward for reinforcement learning for training based on an agreement between the gradient of a loss function computed based on the generated samples and the gradient computed of the loss function (e.g. cosine similarity) based on the development data.” BRI in light of the specification would support that “determining a reward value” would encompass a similarity calculation of gradients of a loss function and fall under the mathematical concepts grouping.
updating parameters of the generative model based on the reward value to reduce loss values with the loss function – This limitation is directed towards the abstract idea of a mathematical calculation (see MPEP § 2106.04(a)(2), section I, C). In Paragraph [0048] of the specification, it states “For example, the gradient used to update the parameters of the generative model is computed to include the reward value determined based on the performance of the event detection model.” BRI in light of the specification would support that “updating parameters of the generative model” would encompass a gradient calculation and fall under the mathematical concepts grouping.
Step 2A Prong 2 – The following additional limitations do not integrate the abstract idea into a practical application:
Claim 1:
obtaining a first set of training data for training an event detection model, the first set training data including labeled data – This limitation recites an insignificant extra-solution activity of mere data gathering (see MPEP § 2106.05(g)) and thus, fails to integrate the exception into a practical application.
causing a generative model, based on a prompt, to generate a second set of training data, not including training data from the first set of training data, the second set of training data including labeled synthetic data generated by the generative model – This limitation merely recites the idea of causing a generative model to generate synthetic data using a prompt and fails to recite details of how the generating is accomplished by the model. Reciting the idea of a solution or outcome without detailing how the result is accomplished is equivalent to merely saying "apply" the generative model (see MPEP § 2106.05(f)) and thus, fails to integrate the exception into a practical application.
Claim 8:
A non-transitory computer-readable medium storing executable instructions embodied thereon, that, as a result of being executed by a processing device, cause the processing device to perform operations – This limitation amounts to no more than generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). It recites a generic computer or generic computer components that merely act as a tool on which the method operates.
Claim 16:
a memory component; and a processing device coupled to the memory component, the processing device to perform operations – This limitation amounts to no more than generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). It recites a generic computer or generic computer components that merely act as a tool on which the method operates.
Step 2B – The additional elements identified in Step 2A Prong 2, viewed individually or in combination, do not provide an inventive concept or otherwise amount to significantly more than the abstract idea itself.
Claim 1:
obtaining a first set of training data for training an event detection model, the first set training data including labeled data – This limitation recites an insignificant extra-solution activity of mere data gathering (see MPEP § 2106.05(g)), which is well-understood, routine and conventional activity similar to cases reviewed by the courts involving receiving or transmitting data over a network (see MPEP § 2106.05(d)(II)) and thus, fails to provide significantly more to the judicial exception.
causing a generative model, based on a prompt, to generate a second set of training data, not including training data from the first set of training data, the second set of training data including labeled synthetic data generated by the generative model – This limitation merely recites the idea of causing a generative model to generate data and fails to recite details of how the generating is accomplished. Reciting the idea of a solution or outcome without detailing how the result is accomplished is equivalent to merely saying "apply" the generative model (see MPEP § 2106.05(f)) and thus, fails to provide significantly more to the judicial exception. Similarly, it recites the well-understood, routine, conventional activity of data augmentation (see MPEP § 2106.05(d)) and thus, fails to provide significantly more to the judicial exception.
Claim 8:
A non-transitory computer-readable medium storing executable instructions embodied thereon, that, as a result of being executed by a processing device, cause the processing device to perform operations – This limitation amounts to no more than generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept.
Claim 16:
a memory component; and a processing device coupled to the memory component, the processing device to perform operations – This limitation amounts to no more than generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept.
As such, claims 1, 8 and 16 are not patent eligible.
Dependent Claims
Claims 2-7, 9-13, 15, and 17-21 merely narrow the previously cited abstract idea limitations. For the reasons described above with respect to independent claims 1, 8 and 16, these judicial exceptions are not meaningfully integrated into a practical application, nor do they amount to significantly more than the abstract idea itself. The dependent claims recite limitations similar to those described for the independent claims above and do not provide anything more than mathematical concepts achievable through mathematical computation. Therefore, claims 2-7, 9-13, 15, and 17-21 also recite abstract ideas that are not integrated into a practical application and do not amount to significantly more than the judicial exception, and are rejected under 35 U.S.C. § 101.
Step 1 – Claims 2-7 and 21 are drawn to a method, claims 9-13 and 15 are drawn to a non-transitory computer-readable medium, and claims 17-20 are drawn to a system. Therefore, each of these claims falls under one of the four statutory categories of subject matter (process, machine, manufacture, or composition of matter).
Step 2A Prong 1 – These claims are directed to a judicially recognized exception of an abstract idea without significantly more.
Claim 2:
wherein performance of the event detection model is determined based on the event detection model detecting events within the third set of training data – This limitation is directed towards the abstract idea of a mathematical calculation (see MPEP § 2106.04(a)(2), section I, C). In Paragraph [0033] of the specification, it states “For example, the performance of the event detection model 126 Mϴ on the development training data Odev (e.g., measured by F1 scores and/or loss function values) is used as the reward for the synthetic data BG generated by the generative model 124 to update Mψ with reinforcement learning.” BRI in light of the specification would support that “wherein performance of the event detection model is determined” would encompass an F1 score or loss function calculation and fall under the mathematical concepts grouping.
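For reference, the F1 score mentioned in the cited paragraph is itself a standard mathematical calculation over precision $P$ and recall $R$ (this formula is supplied here for illustration; it is not recited in the application):

```latex
F_1 = \frac{2PR}{P + R} = \frac{2\,\mathrm{TP}}{2\,\mathrm{TP} + \mathrm{FP} + \mathrm{FN}}
```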
Claim 4:
wherein the reward value indicates a similarity between the gradient of the loss function and a second gradient of the loss function based on the third set of training data – This limitation is directed towards the abstract idea of a mathematical calculation (see MPEP § 2106.04(a)(2), section I, C). In Paragraph [0016] of the specification, it states “various embodiments described in the present disclosure compute the reward for reinforcement learning for training based on an agreement between the gradient of a loss function computed based on the generated samples and the gradient computed of the loss function (e.g. cosine similarity) based on the development data.” BRI in light of the specification would support that “wherein the reward value indicates a similarity” would encompass a similarity calculation of gradients and fall under the mathematical concepts grouping.
Claim 5:
wherein the loss function further comprises a cosine similarity – This limitation is directed towards the abstract idea of a mathematical calculation (see MPEP § 2106.04(a)(2), section I, C).
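For reference, the cosine similarity of two vectors $g_1$ and $g_2$ is the standard calculation below (supplied here for illustration; the formula is not recited in the claim):

```latex
\cos(g_1, g_2) = \frac{g_1 \cdot g_2}{\lVert g_1 \rVert \, \lVert g_2 \rVert}
```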
Claim 10:
wherein updating the parameters of the generative model based on the set of reward values further includes determining a third gradient of a second loss function based on the set of reward values and a set of labels of the second set of labeled sequences – This limitation is directed towards the abstract idea of a mathematical calculation (see MPEP § 2106.04(a)(2), section I, C). In Paragraph [0035] of the specification, it states “In various embodiments, to train the event detection model 126 Mϴ, the training data 122 batch Bo and the synthetic data batch BG are combined to form an augmented training batch BC. The event detection model 126 Mϴ, in an embodiment, is then updated using the gradient of the loss function LB over BC, leading to the new parameters ϴt for Mϴ. Training the event detection model 126 is illustrated by the following equation: [Equation]”. BRI in light of the specification would support that “wherein updating the parameters includes determining a third gradient of a second loss function” would encompass a gradient calculation and fall under the mathematical concepts grouping.
Claim 12:
wherein the result indicate performance of the updated event detection model to detect events within the third set of labeled sequences – This limitation is directed towards the abstract idea of a mathematical calculation (see MPEP § 2106.04(a)(2), section I, C). In Paragraph [0033] of the specification, it states “For example, the performance of the event detection model 126 Mϴ on the development training data Odev (e.g., measured by F1 scores and/or loss function values) is used as the reward for the synthetic data BG generated by the generative model 124 to update Mψ with reinforcement learning.” BRI in light of the specification would support that “wherein the result indicates performance” would encompass an F1 score or loss function calculation and fall under the mathematical concepts grouping.
Claim 18:
wherein the result of the updated event detection model includes a second gradient of the loss function based on the third set of labeled sequences – This limitation is directed towards the abstract idea of a mathematical calculation (see MPEP § 2106.04(a)(2), section I, C). In Paragraph [0016] of the specification, it states “various embodiments described in the present disclosure compute the reward for reinforcement learning for training based on an agreement between the gradient of a loss function computed based on the generated samples and the gradient computed of the loss function (e.g. cosine similarity) based on the development data.” BRI in light of the specification would support that “wherein the result includes a second gradient of the loss function” would encompass a gradient calculation and fall under the mathematical concepts grouping.
Claim 19:
wherein the second gradient indicates a performance of the updated event detection model to detect a set of event triggers within the third set of labeled sequences – This limitation is directed towards the abstract idea of a mathematical calculation (see MPEP § 2106.04(a)(2), section I, C). In Paragraph [0033] of the specification, it states “For example, the performance of the event detection model 126 Mϴ on the development training data Odev (e.g., measured by F1 scores and/or loss function values) is used as the reward for the synthetic data BG generated by the generative model 124 to update Mψ with reinforcement learning.” BRI in light of the specification would support that “wherein the second gradient indicates a performance” would encompass an F1 score or loss function calculation and fall under the mathematical concepts grouping.
Claim 20:
wherein the reward value indicates a similarity between the gradient of the loss function based on the second set of labeled sequences and the second gradient of the loss function based on the third set of labeled sequences – This limitation is directed towards the abstract idea of a mathematical calculation (see MPEP § 2106.04(a)(2), section I, C). In Paragraph [0016] of the specification, it states “various embodiments described in the present disclosure compute the reward for reinforcement learning for training based on an agreement between the gradient of a loss function computed based on the generated samples and the gradient computed of the loss function (e.g. cosine similarity) based on the development data.” BRI in light of the specification would support that “wherein the reward value indicates a similarity” would encompass a similarity calculation between gradients and fall under the mathematical concepts grouping.
Step 2A Prong 2 – These limitations do not recite any additional elements which integrate the abstract idea into a practical application.
Claim 3:
wherein the first set of training data is sampled from human labeled training data and the prompt includes at least one exemplar from the first set of training data – This limitation recites the insignificant extra-solution activity of selecting a particular data source or type of data to be manipulated (see MPEP § 2106.05(g)) and thus, fails to integrate the exception into a practical application.
Claim 6:
wherein the method further comprises causing the event detection model to perform an event detection task – This limitation merely recites the idea of causing the model to perform an event detection task and fails to recite details of how the performing is accomplished. Reciting the idea of a solution or outcome without detailing how the result is accomplished is equivalent to merely saying "apply" the event detection model (see MPEP § 2106.05(f)) and thus, fails to integrate the exception into a practical application.
Claim 7:
wherein the event detection model is included in an information extraction pipeline – This limitation amounts to no more than generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). It merely limits the use of the abstract idea to information extraction and thus, fails to integrate the exception into a practical application.
Claim 9:
wherein the result of the updated event detection model is generated based on a third set of labeled sequences – This limitation recites the insignificant extra-solution activity of selecting a particular data source or type of data to be manipulated (see MPEP § 2106.05(g)) and thus, fails to integrate the exception into a practical application.
Claim 10:
labels of the set of labels generated by the generative model and indicate an event trigger within the labeled sequences of the second set of labeled sequences – This limitation amounts to no more than generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). It merely limits the use of the abstract idea to event detection and thus, fails to integrate the exception into a practical application.
Claim 11:
wherein the result of the updated event detection model is generated based on a third set of labeled sequences – This limitation recites the insignificant extra-solution activity of selecting a particular data source or type of data to be manipulated (see MPEP § 2106.05(g)) and thus, fails to integrate the exception into a practical application.
Claim 13:
wherein the first set of labeled sequences are sampled from a set of labeled training data – This limitation recites the insignificant extra-solution activity of selecting a particular data source or type of data to be manipulated (see MPEP § 2106.05(g)) and thus, fails to integrate the exception into a practical application.
Claim 15:
wherein a labeled sequence of the first set of labeled sequences includes a first vector indicating words in the labeled sequence and a second vector indication labels associated with the words – This limitation amounts to no more than generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). It merely limits the use of the abstract idea to vectors of text and thus, fails to integrate the exception into a practical application.
Claim 21:
wherein the prompt includes a seed – This limitation recites the insignificant extra-solution activity of selecting a particular data source or type of data to be manipulated (see MPEP § 2106.05(g)) and thus, fails to integrate the exception into a practical application.
Step 2B – These limitations, as a whole, do not amount to significantly more than the judicial exception.
Claim 3:
wherein the first set of training data is sampled from human labeled training data and the prompt includes at least one exemplar from the first set of training data – This limitation recites the insignificant extra-solution activity of selecting a particular data source or type of data to be manipulated (see MPEP § 2106.05(g)), which is well-understood, routine and conventional activity similar to cases reviewed by the courts involving storing or retrieving information in memory (see MPEP § 2106.05(d)(II)) and thus, fails to provide significantly more to the judicial exception.
Claim 6:
wherein the method further comprises causing the event detection model to perform an event detection task – This limitation merely recites the idea of causing the model to perform an event detection task and fails to recite details of how the performing is accomplished. Reciting the idea of a solution or outcome without detailing how the result is accomplished is equivalent to merely saying "apply" the event detection model (see MPEP § 2106.05(f)) and thus, fails to provide significantly more to the judicial exception.
Claim 7:
wherein the event detection model is included in an information extraction pipeline – This limitation amounts to no more than generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). It merely limits the use of the abstract idea to information extraction and thus, fails to provide significantly more to the judicial exception.
Claim 9:
wherein the result of the updated event detection model is generated based on a third set of labeled sequences – This limitation recites the insignificant extra-solution activity of selecting a particular data source or type of data to be manipulated (see MPEP § 2106.05(g)), which is well-understood, routine and conventional activity similar to cases reviewed by the courts involving storing or retrieving information in memory (see MPEP § 2106.05(d)(II)) and thus, fails to provide significantly more to the judicial exception.
Claim 10:
labels of the set of labels generated by the generative model and indicate an event trigger within the labeled sequences of the second set of labeled sequences – This limitation amounts to no more than generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). It merely limits the use of the abstract idea to event detection and thus, fails to provide significantly more to the judicial exception.
Claim 11:
wherein the result of the updated event detection model is generated based on a third set of labeled sequences – This limitation recites the insignificant extra-solution activity of selecting a particular data source or type of data to be manipulated (see MPEP § 2106.05(g)), which is well-understood, routine and conventional activity similar to cases reviewed by the courts involving storing or retrieving information in memory (see MPEP § 2106.05(d)(II)) and thus, fails to provide significantly more to the judicial exception.
Claim 13:
wherein the first set of labeled sequences are sampled from a set of labeled training data – This limitation recites the insignificant extra-solution activity of selecting a particular data source or type of data to be manipulated (see MPEP § 2106.05(g)), which is well-understood, routine and conventional activity similar to cases reviewed by the courts involving storing or retrieving information in memory (see MPEP § 2106.05(d)(II)) and thus, fails to provide significantly more to the judicial exception.
Claim 15:
wherein a labeled sequence of the first set of labeled sequences includes a first vector indicating words in the labeled sequence and a second vector indication labels associated with the words – This limitation amounts to no more than generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). It merely limits the use of the abstract idea to vectors of text and thus, fails to provide significantly more to the judicial exception.
Claim 21:
wherein the prompt includes a seed – This limitation recites the insignificant extra-solution activity of selecting a particular data source or type of data to be manipulated (see MPEP § 2106.05(g)), which is well-understood, routine and conventional activity similar to cases reviewed by the courts involving storing or retrieving information in memory (see MPEP § 2106.05(d)(II)) and thus, fails to provide significantly more to the judicial exception.
As such, claims 2-7, 9-13, 15, and 17-21 are not patent eligible.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA ) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-4, 6-13 and 15-20 are rejected under 35 U.S.C. 103 as being unpatentable over Hataya et al. (“Meta Approach to Data Augmentation Optimization”, published 06/14/2020), hereinafter Hataya; in view of Liu et al. (“MRCAug: Data Augmentation via Machine Reading Comprehension for Document-Level Event Argument Extraction”, published 10/14/2022), hereinafter Liu; in further view of Yoo et al. (“GPT3Mix: Leveraging Large-scale Language Models for Text Augmentation”, published 11/11/2021), hereinafter Yoo. Hataya and Liu were cited in the previous Office Action.
Regarding Claim 1, Hataya teaches a method comprising:
obtaining a first set of training data for training a model, the first set of training data including labeled data (Hataya: “Let us define a set of input images X and a set of operations S consisting of data augmentation operations such as rotation and color inversion.” [Section 2.1. Designing Data Augmentation Space]);
causing a generative model to generate a second set of training data generated by the generative model (Hataya: “In the AutoAugment family, each image x ∈ X ⊂ [0, 1]D is augmented by an operation O : X → X with a probability of pO ∈ [0, 1] and a magnitude of µO ∈ [0, 1] as illustrated in Figure 1.” [Section 2.1. Designing Data Augmentation Space]);
training the model based on the first set of training data and the second set of training data (Hataya: “the inner process optimizes parameters of a CNN on training data using a given combination of operations” [Section 1. Introduction]; See [Algorithm 1], where “input = policy(train_data[i]); criterion = cnn.train(input)”; “Our proposed method, MADAO, can optimize a CNN and its data augmentation policy simultaneously by gradient descent in an online manner. Namely, the parameters of the CNN θ is updated to minimize the training loss Ltrain (also written as f), and the parameters of the policy ϕ = {p, µ, π} is updated to minimize the validation loss Lval (also written as g).” [Figure 1]);
determining a reward value based on performance of the model to detect events based on a third set of training data and a gradient of a loss function based on the second set of training data (Hataya: “we set 10 % of the original training data aside as validation data DV and report error rates on test data.” [Section 4. Experiments and Results]; “Our proposed method, MADAO, can optimize a CNN and its data augmentation policy simultaneously by gradient descent in an online manner. Namely, the parameters of the CNN θ is updated to minimize the training loss Ltrain (also written as f), and the parameters of the policy ϕ = {p, µ, π} is updated to minimize the validation loss Lval (also written as g).” [Figure 1]; See [Algorithm 1], where “criterion = cnn.val(val_data)”; In light of Paragraph [0033] of the specification, which states “the performance of the event detection model 126 Mϴ on the development training data Odev (e.g., measured by F1 scores and/or loss function values) is used as the reward for the synthetic data BG generated by the generative model 124 to update Mψ with reinforcement learning”, BRI would support that “determining a reward value based on performance” constitutes calculating loss values with a loss function); and
updating a parameter of the generative model to reduce loss values associated with the loss function based on the reward value (Hataya: “the outer process optimizes the combination of operations to maximize the validation performance.” [Section 1. Introduction]; “Our proposed method, MADAO, can optimize a CNN and its data augmentation policy simultaneously by gradient descent in an online manner. Namely, the parameters of the CNN θ is updated to minimize the training loss Ltrain (also written as f), and the parameters of the policy ϕ = {p, µ, π} is updated to minimize the validation loss Lval (also written as g).” [Figure 1]).
However, Hataya fails to expressly disclose an event detection model; and generating a second set of training data, not including training data from the first set of training data, the second set of training data including labeled synthetic data.
In the same field of endeavor, Liu teaches an event detection model (Liu: “we devise two data augmentation regimes via MRC, including an implicit knowledge transfer method, which enables knowledge transfer from other tasks to the document-level [event argument extraction] task, and an explicit data generation method, which can explicitly generate new training examples by treating a pre-trained MRC model as an annotator.” [Abstract]); and
generating a second set of training data, not including training data from the first set of training data, the second set of training data including labeled synthetic data (Liu: “Despite its effectiveness, one disadvantage of implicit knowledge transfer is that it cannot create explicit training data, hence it can only benefit a model in an MRC formulation but not in other formulations [2], [4]. To overcome this issue, we propose another data augmentation approach named as explicit data generation, which can generate new training examples explicitly to enlarge the training set and hence can benefit any model proposed for document-level EAE.” [Section III.C. Explicit Data Generation via MRC]; “Particularly, in our method, we first create: (i) a clean set, which contains all of the labeled data for the document-level EAE task, and (ii) a wild set, which contains the labeled data from other tasks for the implicit knowledge transfer method, or the automatically generated data for the explicit data generation method.” [Section III.D. Noise Filtering via a Self-Training Regime]; “b) A Joint Training Stage: We devise the following loss function to combine the originally labeled data with the automatically generated data for model training: [Equation 5, 6] where δ is a weight used to balance the contributions of two different forms of data.” [Section III.C. Explicit Data Generation via MRC]; In light of Paragraph [0013] of the specification, which states “the generative model generates a batch of synthetic data (e.g., new labeled training data)”, BRI of “synthetic data” would encompass labeled training data that is generated by a generative model).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have incorporated an event detection model; and generating a second set of training data, not including training data from the first set of training data, the second set of training data including labeled synthetic data, as taught by Liu to the method of Hataya because both of these methods are directed towards augmentation of training data by a generative model for joint optimization with a separate neural network. Image classification and event detection are both fields in which a large amount of labeled data is required for model training, but might not exist due to the cost and inadequacy of manual annotation. In making this combination and applying the method of Hataya to the environment of event detection, as well as generating synthetic data for training, it would allow for the mitigation of the data sparsity problem faced by document level event argument extraction (Liu: [Abstract]) by “generat[ing] new training examples explicitly to enlarge the training set and hence can benefit any model proposed for document-level EAE” (Liu: [Section III.C. Explicit Data Generation via MRC]).
Hataya and Liu still fail to expressly disclose generating based on a prompt.
In the same field of endeavor, Yoo teaches generating based on a prompt (Yoo: “We propose GPT3Mix, a method for generating synthetic but hyper-realistic text samples from a mixture of real samples utilizing large-scale language models such as GPT-31. GPT3Mix extracts few sample sentences from the task-specific training data, embed these samples in the prompt, and generates an augmented mixed sentence influenced by the sample sentences.” [Section 1. Introduction]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have incorporated generating based on a prompt, as taught by Yoo, to the method of Hataya and Liu because both of these methods are directed towards performing data augmentation for training an inferencing model. In making this combination and generating training data based on a prompt, it would allow the method of Hataya and Liu to provide “task-specific information to allow the large-scale language models to generalize better about the data distribution” (Yoo: [Section 3. GPT3Mix]).
Regarding Claims 8 and 16, they are non-transitory computer readable medium and system claims that correspond to Claim 1. Therefore, they are rejected for the same reasons as Claim 1 above.
Regarding Claim 2, Hataya, Liu and Yoo teach the method of Claim 1, wherein performance of the event detection model is determined based on the event detection model detecting events within the third set of training data (Hataya: “the outer process optimizes the combination of operations to maximize the validation performance.” [Section 1. Introduction]; “We propose Meta Approach to Data Augmentation Optimization (MADAO), which optimizes CNNs and augmentation policies simultaneously by using gradient based optimization. Here, policies are updated so that they directly increase CNNs’ validation performance.” [Section 1. Introduction]).
Regarding Claim 3, Hataya, Liu and Yoo teach the method of Claim 1, wherein the first set of training data is sampled from human labeled training data and the prompt includes at least one exemplar from the first set of training data (Hataya: “We empirically demonstrate that MADAO learns effective data augmentation policies and achieves performance comparable or even superior to existing methods on benchmark datasets for image classification: CIFAR-10, CIFAR-100, SVHN, and ImageNet, as well as fine-grained datasets.” [Section 1. Introduction]; Yoo: “We propose GPT3Mix, a method for generating synthetic but hyper-realistic text samples from a mixture of real samples utilizing large-scale language models such as GPT-31. GPT3Mix extracts few sample sentences from the task-specific training data, embed these samples in the prompt, and generates an augmented mixed sentence influenced by the sample sentences.” [Section 1. Introduction]).
Regarding Claim 4, Hataya, Liu and Yoo teach the method of Claim 1, wherein the reward value indicates a similarity between the gradient of the loss function and a second gradient of the loss function based on the third set of training data (Hataya: “the parameters of the policy ϕ = {p, µ, π} is updated to minimize the validation loss Lval (also written as g).” [Figure 1]; “Gradient-based optimization of Equation (1) requires ∇ϕg for iterative updating. Since the data augmentation implicitly affects the validation criterion, in other words, data augmentation is not used for validation, we obtain [Equation 3]. Because of the requirement of g, ∇θg can be obtained.” [Section 3.1. Optimizing Policies by Gradient Descent]).
Regarding Claim 6, Hataya, Liu and Yoo teach the method of Claim 1, wherein the method further comprises causing the event detection model to perform an event detection task (Liu: “Document-level event argument extraction (EAE) is such a task requiring a model to extract arguments (i.e., participants) of an event at the document level” [Section I. Introduction]).
Regarding Claim 7, Hataya, Liu and Yoo teach the method of Claim 1, wherein the event detection model is included in an information extraction pipeline (Hataya: “Data augmentation is an effective way to improve the performance of CNN models for image recognition tasks, particularly when its policy is optimized for the target model and dataset.” [Section 1. Introduction]; BRI of information extraction pipeline is that it involves tasks in which information is extracted from data).
Regarding Claim 9, Hataya, Liu and Yoo teach the medium of Claim 8, wherein the result of the updated event detection model is generated based on a third set of labeled sequences (Hataya: “Optimization of data augmentation policy in AutoAugment family methods can be generalized as [Equation 1] that is, optimizing CNNs on training data with policies that minimize validation criteria on validation data.” [Section 2.2. Generalizing AutoAugment Family]).
Regarding Claim 10, Hataya, Liu and Yoo teach the medium of Claim 8, wherein updating the parameters of the generative model based on the set of reward values further includes determining a third gradient of a second loss function based on the set of reward values and a set of labels of the second set of labeled sequences (Hataya: “Our proposed method, MADAO, can optimize a CNN and its data augmentation policy simultaneously by gradient descent in an online manner. Namely, the parameters of the CNN θ is updated to minimize the training loss Ltrain (also written as f), and the parameters of the policy ϕ = {p, µ, π} is updated to minimize the validation loss Lval (also written as g).” [Figure 1]; See [Section 3.2. Approximating Gradients of Policy and Inverse Hessian]),
labels of the set of labels generated by the generative model and indicate an event trigger within the labeled sequences of the second set of labeled sequences (Liu: “we adopt an MRC viewpoint to the document-level EAE task for data augmentation. Let D be a document containing a set of event instances E(D) = {ei}ni=1, each represented by an event trigger.” [Section III. Approach]; “[explicit data generation] uses an MRC model as an annotator to label new training examples for explicitly expanding the training set.” [Figure 3]).
Regarding Claim 11, Hataya, Liu and Yoo teach the medium of Claim 8, wherein the result of the updated event detection model is generated based on a third set of labeled sequences (Hataya: “Optimization of data augmentation policy in AutoAugment family methods can be generalized as [Equation 1] that is, optimizing CNNs on training data with policies that minimize validation criteria on validation data.” [Section 2.2. Generalizing AutoAugment Family]).
Regarding Claim 12, Hataya, Liu and Yoo teach the medium of Claim 11, wherein the result indicate performance of the updated event detection model to detect events within the third set of labeled sequences (Hataya: “Optimization of data augmentation policy in AutoAugment family methods can be generalized as [Equation 1] that is, optimizing CNNs on training data with policies that minimize validation criteria on validation data.” [Section 2.2. Generalizing AutoAugment Family]; Liu: “we adopt an MRC viewpoint to the document-level EAE task for data augmentation. Let D be a document containing a set of event instances E(D) = {ei}ni=1, each represented by an event trigger.” [Section III. Approach]).
Regarding Claim 13, Hataya, Liu and Yoo teach the medium of Claim 8, wherein the first set of labeled sequences are sampled from a set of labeled training data (Hataya: “We empirically demonstrate that MADAO learns effective data augmentation policies and achieves performance comparable or even superior to existing methods on benchmark datasets for image classification: CIFAR-10, CIFAR-100, SVHN, and ImageNet, as well as fine-grained datasets.” [Section 1. Introduction]).
Regarding Claim 15, Hataya, Liu and Yoo teach the medium of Claim 8, wherein a labeled sequence of the first set of labeled sequences includes a first vector indicating words in the labeled sequence and a second vector indication labels associated with the words (Liu: “In particular, we use a pre-trained MRC model as an annotator to label new training examples in unlabeled documents. For example, we may use a question Who is the attacker in the bombarding event? to query each document, and treat those with answers as new training examples annotated with an attacker role. In contrast to implicit knowledge transfer, explicit data generation can produce tangible training examples, which is shown to benefit a wide range of existing models for the task (e.g., those based on sequence labeling).” [Section I. Introduction]).
Regarding Claim 17, Hataya, Liu and Yoo teach the system of Claim 16, wherein the processing device to perform the operations comprising pre-training the generative model based on the annotated dataset (Liu: “we use a pre-trained MRC model as an annotator to label new training examples in unlabeled documents.” [Section I. Introduction]).
Regarding Claim 18, Hataya, Liu and Yoo teach the system of Claim 16, wherein the result of the updated event detection model includes a second gradient of the loss function based on the third set of labeled sequences (Hataya: “Optimization of data augmentation policy in AutoAugment family methods can be generalized as [Equation 1] that is, optimizing CNNs on training data with policies that minimize validation criteria on validation data.” [Section 2.2. Generalizing AutoAugment Family]).
Regarding Claim 19, Hataya, Liu and Yoo teach the system of Claim 18, wherein the second gradient indicates a performance of the updated event detection model to detect a set of event triggers within the third set of labeled sequences (Hataya: “Optimization of data augmentation policy in AutoAugment family methods can be generalized as [Equation 1] that is, optimizing CNNs on training data with policies that minimize validation criteria on validation data.” [Section 2.2. Generalizing AutoAugment Family]; “Gradient-based optimization of Equation (1) requires ∇ϕg for iterative updating. Since the data augmentation implicitly affects the validation criterion, in other words, data augmentation is not used for validation, we obtain [Equation 3]. Because of the requirement of g, ∇θg can be obtained.” [Section 3.1. Optimizing Policies by Gradient Descent]; Liu: “we adopt an MRC viewpoint to the document-level EAE task for data augmentation. Let D be a document containing a set of event instances E(D) = {ei}ni=1, each represented by an event trigger.” [Section III. Approach]).
Regarding Claim 20, Hataya, Liu and Yoo teach the system of Claim 19, wherein the reward value indicates a similarity between the gradient of the loss function based on the second set of labeled sequences and the second gradient of the loss function based on the third set of labeled sequences (Hataya: “the parameters of the policy ϕ = {p, µ, π} is updated to minimize the validation loss Lval (also written as g).” [Figure 1]; “Gradient-based optimization of Equation (1) requires ∇ϕg for iterative updating. Since the data augmentation implicitly affects the validation criterion, in other words, data augmentation is not used for validation, we obtain [Equation 3]. Because of the requirement of g, ∇θg can be obtained.” [Section 3.1. Optimizing Policies by Gradient Descent]).
Regarding Claim 21, Hataya, Liu and Yoo teach the medium of Claim 11, wherein the prompt includes a seed (Yoo: “For each experiment, we perform a class-balanced sub-sample on the training set. We account for statistical variance in our experiments by fixating the sub-samples on 15 different data seeds and repeating the augmentation procedure and downstream classification experiments on all sub-samples. The data seeds were chosen randomly.” [Section 4.1 Experimental Settings]).
Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Hataya in view of Liu and Yoo, as applied to Claim 4, in further view of Zheng et al. (“Deep AutoAugment”, published 03/15/2022), hereinafter Zheng. Zheng was cited in the previous Office Action.
Regarding Claim 5, Hataya, Liu and Yoo teach the method of Claim 4. However, they fail to expressly disclose wherein the loss function further comprises a cosine similarity.
In the same field of endeavor, Zheng teaches wherein the loss function further comprises a cosine similarity (Zheng: “the policy is optimized to maximize the cosine similarity between the gradients of the original and augmented data along the direction with low variance.” [Abstract]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have incorporated wherein the loss function further comprises a cosine similarity, as taught by Zheng, to the method of Hataya, Liu and Yoo because both of these methods are directed towards augmentation of training data by a generative model for joint optimization with a separate neural network through gradient calculations. Cosine similarity and cross entropy are both commonly used as loss functions when updating parameters of neural networks. In making this combination and utilizing a cosine similarity-based loss function, it would allow for “detect[ing] when an auxiliary loss is helpful to the main loss” to optimize the data augmentation policy (Zheng: [Section 2. Related Work]).
Response to Arguments
Examiner acknowledges the Applicant’s amendments to Claims 1-3, 8, 16 and 18, as well as new Claim 21.
Applicant's arguments, filed 02/02/2026, regarding the objection have been fully considered and are persuasive. The objection has been withdrawn.
Applicant's arguments, filed 02/02/2026, traversing the rejection of Claims 1-20 under 35 U.S.C. § 101 have been fully considered but are not persuasive.
Applicant alleges, on pages 1-3 of the Remarks, that the claims are not directed to a mathematical relationship, formula, or calculation in the abstract. Rather, as a whole, the claims recite a concrete, computer-implemented training pipeline that (i) trains a generative model using label-augmented and/or human-labeled sequences, (ii) causes the generative model to generate new, labeled synthetic sequences distinct from the first set, (iii) trains an event-detection model (e.g., using a combined batch of original and synthetic data via computed gradients of a specified loss function), (iv) derives per-sample rewards, and (v) updates the generative model’s parameters. These recited steps are computer-implemented manipulations of model parameters and structured training data to produce improved trained systems—not a claim to a bare mathematical formula or a result in the abstract. Accordingly, since the claim does not recite a judicial exception, it is not directed to a judicial exception and is eligible.
Examiner respectfully disagrees.
As per MPEP § 2106.04, Prong One of the two-prong Alice/Mayo test for patent eligibility directs the Examiner to determine whether a claim recites a judicial exception. If the claim recites a judicial exception, the claim requires further analysis in Prong Two. As noted in the 101 analysis above, the limitations outlining steps for manipulating model parameters, namely “training the event detection model…”, “determining a reward value based on a performance… and a gradient of a loss function…” and “updating a parameter of the generative model to reduce loss values associated with the loss function…” all recite mathematical concepts when interpreted in light of the specification and do not merely involve them. As noted in the specification, “training the event detection model 126 is illustrated by the following equation: gθ ← (1/BC) Σ(S,Y)∈BC ∇θLB(S, Y; θt-1)” [0035], reward values are determined “based on an average of loss (e.g., loss function)” [0014] and represent “the performance of the event detection model 126 Mϴ on the development training data Odev (e.g., measured by F1 scores and/or loss function values)” [0033], and the event detection model is “updated using the gradient of the loss function LB over BC, leading to the new parameters ϴt for Mϴ” [0035]. These limitations are not merely based on mathematical concepts; they are mathematical concepts (e.g., F1 score measurement, loss function values, gradient calculation, updated parameter calculation). Merely reciting that the abstract ideas are implemented by a computer does not preclude the claims from reciting abstract ideas. As such, Examiner asserts that said independent claims recite the abstract idea of a mathematical concept and fulfill the requirements of Prong One of the patent eligibility test.
Applicant alleges, on pages 4- of the Remarks, that the claims recite elements that establish a practical application. Similar to the claims in McRO, the present claims are directed to a specific improvement in computer technology related to user productivity and productivity features. The claims do not merely compute a loss; they use the computed gradients and reward as control signals in a closed-loop, batch-level co-training architecture to update a generative model so that it produces in-domain labeled synthetic data that measurably reduce detector loss on development data. For example, the specification describes the technical improvements of reducing noise, alleviating data scarcity, and eliminating separate noise-filtering by jointly optimizing the generator with the detector using a gradient-agreement reward. The improvement to computer-implemented model training and to the event-detection integrates mathematical tools into a practical application consistent with MPEP § 2106.05(a), § 2106.05(b), and § 2106.05(e).
Examiner respectfully disagrees.
According to MPEP § 2111.01(II), “Though understanding the claim language may be aided by explanations contained in the written description, it is important not to import into a claim limitations that are not part of the claim. For example, a particular embodiment appearing in the written description may not be read into a claim when the claim language is broader than the embodiment.” As such, the present claims do not recite “using the computed gradients and reward as control signals in a closed-loop, batch-level co-training architecture to update a generative model so that it produces in-domain labeled synthetic data that measurably reduce detector loss on development data” nor “jointly optimizing the generator with the detector using a gradient-agreement reward”. Instead, they recite much broader and more general limitations that merely recite the idea of an outcome or solution. For example, the limitation “causing a generative model… to generate… labeled synthetic data…” provides only the result-oriented solution and lacks details as to how the generative model uses a prompt to generate the synthetic data, such that it covers any and all methods of generating labeled synthetic data and does not integrate the judicial exception into a practical application because this type of recitation is equivalent to the words “apply it”, as per MPEP § 2106.05(f). As stated in the Prong One analysis, the limitations regarding the training and parameter manipulation of the generative model recite mathematical concepts when viewed individually. Even when viewed holistically, the combination of limitations still recites generally training an inferencing model with an augmented training dataset and updating the generative model’s parameters based on the inferencing model’s performance on a validation set and a calculated gradient loss to achieve the conventional improvement of reducing loss.
As such, the present claims are inapposite for comparison with McRO, as they merely claim the idea of the purported improvement without reflecting the improvement in the claim language and do not qualify as an improvement to the conventional technology, use of a particular machine, nor meaningful limitations confining the claim to a specific technical implementation that would integrate the claims into a practical application.
Applicant alleges, on pages 5-6, that the present claims, as amended, are directly analogous to those found eligible in Ex Parte Desjardins, in which the PTAB reversed a § 101 rejection for claims directed to estimating a physical property using real-world sensor data and a machine learning model because the claims required the collection and processing of specific sensor data, applied a trained machine learning model, and provided a technical solution to a technical problem. The Board found the claims could not be performed mentally or with pen and paper, and that they were “rooted in computer technology”. The present claims recite a method that (i) pretrains or conditions a generative model on label-sequences—e.g., sentences in which event triggers are explicitly demarcated with special tokens such as “<TRG> … </TRG>”; (ii) causes the generative model to generate new, labeled synthetic sequences that are different from the first set; (iii) trains an event-detection model (e.g., based on a combined batch of original and synthetic data); (iv) computes rewards for the generated data (e.g., a dot product or cosine similarity); and (v) updates parameters of the generative model. Accordingly, the claims are integrated into a practical application that improves the training of machine-learning systems for event detection under Step 2A; they cannot be performed mentally or with pen-and-paper. Even if an abstract concept were implicated, Step 2B is satisfied because the specific combination of (1) label-augmented pretraining of a text generator, (2) generation of new labeled synthetic sequences disjoint from the first set, (3) gradient-agreement-based rewards derived from detector performance on development data, and (4) parameter updates to the generator constitutes an unconventional, technological improvement expressly taught by the as-filed specification.
Examiner respectfully disagrees.
As noted in the Prong Two analysis above and MPEP § 2111.01, it is improper to import claim limitations from the specification such that it changes the scope or interpretation of the claim language. In light of this, the claim as amended does not recite some of the limitations that are allegedly analogous to Ex Parte Desjardins, namely “pretrain[ing] or condition[ing] a generative model on label-sequences” or “label-augmented pretraining of a text generator”. Further, the present claims are inapposite for comparison with Desjardins because, unlike in Desjardins, in which the PTAB found the recited limitations constitute an improvement to how the machine learning model itself operates as opposed to, for example, the identified mathematical calculation, the present claims do not reflect the alleged improvements beyond what is either already well-known in the art or attributed to the abstract idea itself (e.g., general data augmentation providing more training data, updating parameters based on gradient calculation to reduce loss). At best, the claimed invention amounts to the idea of data augmentation to supplement general and well-known mathematical methods of iteratively training a model and thus, Examiner asserts that a prima facie case of patent ineligibility for the independent claims 1, 8 and 16 has been established. Dependent claims 2-7, 9-13, 15, and 17-21 are similarly ineligible for their dependency on an ineligible independent claim, as well as for their own deficiencies outlined in the 35 U.S.C. § 101 rejection above.
Applicant’s arguments, filed 02/02/2026, regarding the rejection of Claims 1-20 under 35 U.S.C. § 103 have been fully considered but are either not persuasive or found moot in light of the new grounds of rejection (see rejection above).
Applicant alleges, on pages 6-8 of the Remarks, that the cited combination of references discloses neither a generative model that produces new, labeled synthetic data distinct from the original training set nor the reward-based update system as claimed. Hataya and Zheng optimize augmentation policies that transform existing labeled examples within the same dataset rather than synthesizing new labeled sequences with a generative model. Liu fails to cure the deficiencies of Hataya and Zheng because the taught “explicit data generation” does not generate synthetic sequences with model-created labels, and there is no reinforcement-learning update of the generative model. Further, none of these references discloses the reward function as claimed—e.g., computing a reward from event-detector performance on a third dataset together with a gradient of a loss based on the synthetic second set, and then updating generative-model parameters in a closed loop specifically to reduce loss; Hataya and Zheng use gradient/objective-based policy updates for image augmentation (not RL rewards tied to an event detector), and Liu employs pre-training, fine-tuning, and self-training noise reduction without any reward-driven parameter updates of a generative model.
Examiner respectfully disagrees.
First, the present application defines synthetic data as “annotated and/or labeled data generated by the generative model” [0004]. The claims do not limit how the generation of the synthetic data is accomplished or what it entails, only that it is done by a generative model, is based on a prompt, creates data distinct from the original training data, and that the data is labeled by the model. Liu not only explicitly recites “generating new training examples”, as opposed to transforming existing data, but the augmented data and original data are treated as separate, distinct sets, and the model generates the labels for the new training data. The generating may be done by training the generative model as an annotator, but because the claim does not limit what can be considered “generating”, Liu is sufficient to teach this claim language. Second, while Hataya does not explicitly use the term “reward”, it employs the same process described in the claims. Specifically, Hataya uses gradient descent to update the parameters of the data augmentation policy so as to reduce a validation loss: the performance of the CNN model on recognition tasks is measured on a validation set, and the gradient of the loss computed on the augmented training dataset is used to update the policy parameters. As such, Examiner asserts that the combination of Hataya, Liu, and the newly cited reference Yoo teaches all of the recited limitations of the independent claims.
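As a rough, purely illustrative analogue of the gradient-based policy update described above (a hypothetical toy example constructed for this sketch, not Hataya's actual algorithm): an augmentation "policy" with a single magnitude parameter is updated by descending an estimate of the gradient of a validation loss measured after training on the augmented data.

```python
import numpy as np

# Toy illustration (hypothetical): an augmentation policy with one magnitude
# parameter `m` is updated by gradient descent on a validation loss, with the
# gradient estimated by a central finite difference.
rng = np.random.default_rng(1)
X_tr, y_tr = rng.normal(size=(32, 3)), rng.normal(size=32)
X_val, y_val = rng.normal(size=(16, 3)), rng.normal(size=16)
noise = rng.normal(size=X_tr.shape)  # fixed noise so the loss is deterministic in m

def val_loss_after_training(m):
    """Train a linear model on data augmented with noise of magnitude m,
    then measure squared-error loss on the held-out validation set."""
    X_aug = X_tr + m * noise                          # augmentation policy
    w, *_ = np.linalg.lstsq(X_aug, y_tr, rcond=None)  # inner training step
    return float(np.mean((X_val @ w - y_val) ** 2))

m, lr, eps = 0.5, 0.05, 1e-2
for _ in range(10):
    # finite-difference estimate of d(validation loss)/d(m)
    g = (val_loss_after_training(m + eps) - val_loss_after_training(m - eps)) / (2 * eps)
    m -= lr * g                                       # update the policy parameter
```

The point of the sketch is structural: the policy parameter is updated using a gradient tied to validation-set performance of the downstream model, which is the mechanism the rejection maps to the claimed reward-driven update.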
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Such et al. (US 20200234144 A1) teaches a generative cooperative network in which a generator model generates training datasets to train a learner model, then the generator is modified based on the evaluation of the performance of the learner model.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MEGAN E HWANG whose telephone number is (703)756-1377. The examiner can normally be reached Monday-Thursday 10:00-7:30 ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jennifer Welch, can be reached at (571) 272-7212. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/M.E.H./Examiner, Art Unit 2143
/JENNIFER N WELCH/Supervisory Patent Examiner, Art Unit 2143