Prosecution Insights
Last updated: April 19, 2026
Application No. 18/645,563

SYSTEM AND METHOD FOR INTELLIGENT EVALUATION OF ARTIFICIAL INTELLIGENCE GENERATED TEXTS

Status: Non-Final OA, §112
Filed: Apr 25, 2024
Examiner: YEN, ERIC L
Art Unit: 2658
Tech Center: 2600 (Communications)
Assignee: Actimize Ltd.
OA Round: 1 (Non-Final)
Grant Probability: 85% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 2y 8m
With Interview: 97%

Examiner Intelligence

Career Allow Rate: 85% (above average; 650 granted / 765 resolved; +23.0% vs TC avg)
Interview Lift: +11.7% (moderate, roughly +12%; with vs. without an interview, among resolved cases with interview)
Typical Timeline: 2y 8m avg prosecution; 11 applications currently pending
Career History: 776 total applications across all art units
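The headline projections above are simple derived ratios. As a sketch of the apparent arithmetic (assuming the grant probability is just the career allow rate and the with-interview figure adds the interview lift on top; that methodology is an inference, not stated by the tool):

```python
granted, resolved = 650, 765            # examiner's career totals
interview_lift = 11.7                   # percentage-point lift with interview

allow_rate = 100 * granted / resolved   # career allow rate, in percent
with_interview = allow_rate + interview_lift

print(f"{allow_rate:.0f}% / {with_interview:.0f}%")  # → 85% / 97%
```

This reproduces the 85% grant probability and 97% with-interview figures shown in the summary cards.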

Statute-Specific Performance

§101: 18.1% (-21.9% vs TC avg)
§103: 29.8% (-10.2% vs TC avg)
§102: 3.5% (-36.5% vs TC avg)
§112: 35.1% (-4.9% vs TC avg)
TC averages are estimates • Based on career data from 765 resolved cases

Office Action

§112
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Interpretation

As per Claim 1 (and similarly claim 8): “the determining to accept or reject the text” in the last line of claim 1 is interpreted as referring to “determining… whether to accept or reject the input text” in lines 6-7 of claim 1 (determining whether to accept or reject determines to do one of accept or reject the input text, and the input text is the only text recited in the claims, and so “the text” logically refers to “an/the input text”).

As per Claim 2 (and similarly claim 9): “the plurality of calculated metrics” in lines 2-3 of claim 2 is interpreted as referring to “a plurality of metrics” which are “calculat[ed]” in line 3 of claim 1 (not to an embodiment of “one or more of the calculated metrics” in line 6 of claim 1 which includes more than one of the calculated metrics, which could also be interpreted as a separate plurality of the calculated metrics when the more than one of the calculated metrics is less than all of the calculated metrics).

As per Claim 6 (and similarly claim 13): “the calculated metrics” in line 2 of claim 6 is interpreted as referring to “a plurality of metrics” which are “calculat[ed]” in line 3 of claim 1 (not to an embodiment of “one or more of the calculated metrics” in line 6 of claim 1 which includes more than one of the calculated metrics, which could also be interpreted as a separate plurality of the calculated metrics when the more than one of the calculated metrics is less than all of the calculated metrics).
“the metrics” in line 3 of claim 6 is interpreted as referring to “a plurality of metrics” which are “calculat[ed]” in line 3 of claim 1 (not to an embodiment of “one or more of the calculated metrics” in line 6 of claim 1 which includes more than one of the calculated metrics, which could also be interpreted as a separate plurality of the calculated metrics when the more than one of the calculated metrics is less than all of the calculated metrics).

Claim Objections

Claim 18 is objected to because of the following informalities:

As per Claim 18: “the plurality of score” in line 1 of claim 18 should be --the plurality of scores-- (grammar and antecedent basis consistency). Appropriate correction is required.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

As per Claim 1 (and similarly claim 8): “the perplexity scores” (necessarily plural) in line 4 of claim 1 lacks antecedent basis when “one or more perplexity scores” in line 4 of claim 1 refers to only one perplexity score.

As per Claim 2 (and similarly claim 9): It is not clear if “one or more of perplexity scores” in line 1 of claim 2 is supposed to refer to:

1. “one or more of” a set of “perplexity scores” which may or may not be “the perplexity scores”/“one or more perplexity scores”-when-“one or more”-refers-to-more-than-one in claim 1 (no amendment is needed if this is the intended interpretation, but a statement that this is the intended interpretation would be helpful);
2. “the one or more perplexity scores” in claim 1; or
3. “one or more of the one or more perplexity scores”.

“wherein one or more of the plurality of calculated metrics comprises a perplexity table” in lines 2-3 of claim 2 seems to suggest that claim 2 is intended to further define limitations of claim 1, but “wherein one or more of perplexity scores describe predicting of one or more individual sentences by the LLM” does not grammatically/linguistically need to further define the perplexity score(s) in claim 1.

As per Claim 15: “the perplexity metrics” in line 4 of claim 15 lacks antecedent basis when “one or more perplexity metrics” in line 4 of claim 15 refers to only one perplexity metric.

“the input text” in lines 4-5 of claim 15 is ambiguous (it can refer to either “an input text” in line 3 of claim 15 or to “an input text” in line 1 of claim 15 [i.e. as claimed, these two recitations of “an input text” do not necessarily need to refer to the same input text, such that it is not clear which “input text” is the one that “the input text” in lines 4-5 of claim 15 refers to when the two recitations refer to different input texts]).
“the computed metrics” (necessarily plural) in line 6 of claim 15 lacks antecedent basis when “one or more perplexity metrics” in line 4 of claim 15 refers to only one perplexity metric (in which case multiple scores are computed, but only one perplexity metric is computed).

“the input text” in the 3rd to last line of claim 15, and in the last line of claim 15, is ambiguous (same issue as discussed above pertaining to “the input text” in lines 4-5 of claim 15).

“the determining whether to accept or reject the input text” in the last 2 lines of claim 15 lacks antecedent basis (lines 6-7 of claim 15 recite “determining… whether to approve or dismiss the input text”, which is nearly synonymous with “the determining whether to accept or reject the input text” in the last 2 lines of claim 15 but which is nevertheless inconsistently worded [and may have slightly different meaning] relative to it).

As per Claim 16: It is not clear if “one or more of perplexity metrics” in line 1 of claim 16 is supposed to refer to:

1. “one or more of” a set of “perplexity metrics” which may or may not be “the perplexity metrics”/“one or more perplexity metrics”-when-“one or more”-refers-to-more-than-one in claim 15 (no amendment is needed if this is the intended interpretation, but a statement that this is the intended interpretation would be helpful);
2. “the one or more perplexity metrics” in claim 15; or
3. “one or more of the one or more perplexity metrics”.

“wherein one or more of the plurality of computed scores comprises a perplexity table” in lines 2-3 of claim 16 seems to suggest that claim 16 is intended to further define limitations of claim 15, but “wherein one or more of perplexity metrics describe generating of one or more individual sentences by the GenAI model” does not grammatically/linguistically need to further define the perplexity metric(s) in claim 15.
As per Claim 18: “the input text” in line 3 of claim 18 and in line 5 of claim 18 is ambiguous (same issue as discussed above pertaining to “the input text” in lines 4-5 of claim 15).

As per Claim 19: “the input text” in line 2 of claim 19 and in line 3 of claim 19 is ambiguous (same issue as discussed above pertaining to “the input text” in lines 4-5 of claim 15).

The dependent claims include the issues of their respective parent claims.

Allowable Subject Matter

Claims 1, 8, and 15 would be allowable if rewritten or amended to overcome the rejection(s) under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), 2nd paragraph, set forth in this Office action. Claims 2-7, 9-14, and 16-20 would be allowable if rewritten to overcome the rejection(s) under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), 2nd paragraph, set forth in this Office action and to include all of the limitations of the base claim and any intervening claims.

The following is a statement of reasons for the indication of allowable subject matter:

As per Claim(s) 1 (and similarly claim[s] 8, and consequently claim[s] 2-7 and 9-14 which depend on claim[s] 1 and 8), the prior art of record does not teach or suggest the combination of all limitations in claim(s) 1, including (i.e. in combination with the remaining limitations in claim[s] 1):

A computerized method of automatically evaluating computer generated content, the method comprising, using a computer processor: calculating a plurality of metrics for an input text, the plurality of metrics comprising one or more perplexity scores, the perplexity scores describing a prediction of the input text by a large language model (LLM); determining, based on one or more of the calculated metrics, whether to accept or reject the input text; and performing an exchange of data between remotely connected computer devices based on the determining to accept or reject the text.
As per Claim(s) 15 (and consequently claim[s] 16-20 which depend on claim[s] 15), the prior art of record does not teach or suggest the combination of all limitations in claim(s) 15, including (i.e. in combination with the remaining limitations in claim[s] 15):

A computerized method of automatically evaluating an input text, the method comprising, using a computer processor: computing a plurality of scores for an input text, the plurality of scores comprising one or more perplexity metrics, the perplexity metrics describing a generation of the input text by a generative artificial intelligence (GenAI) model; determining, based on one or more of the computed metrics, whether to approve or dismiss the input text; and saving an update to the GenAI model based on the determining whether to accept or reject the input text.

2024/0362417 teaches “In some implementations, the methods and systems of the present disclosure use readability metrics that measure and quantify characteristics of textual data. The methods and systems use the readability metrics to generate features that quantify the readability of the input and the LLM output. The methods and systems calculate these features for a training data and train a classifier to predict an associated confidence score of the LLM output. The confidence score can be used as feedback to improve the output quality of the LLM. The methods and systems use the text of the input provided to the LLM as well as the text of the generated LLM output as a source of information to determine the confidence score of the generated LLM output (answers generated by the LLM). The confidence score may be used to identify the quality of the generated LLM output” (paragraph 15) and “A readability model 108 parses the text of the input 12 and the text of the generated LLM output 14 and evaluates readability metrics of the text included in the input 12 and the text included in the LLM output 14.
The readability model 108 creates a feature vector 16 based on the evaluation of the readability metrics. The feature vector 16 includes a plurality of features 18 that quantify the complexity of text included in the input 12 prompt to the LLM 106 and the LLMs output 14. The features 18 include different values that quantify the complexity of the text in the input 12 and the text of the LLM output 14. In some implementations, a high value for the feature 18 indicates that the text is complex and more difficult to read, and a low value for the feature 18 indicates that the text is easier to read. In some implementations, a low value for the feature 18 indicates that the text is complex and more difficult to read, and a high value for the feature 18 indicates that the text is easier to read” (paragraph 20) and “In some implementations, the features 18 include readability metrics that evaluate human readability features of the text included in the input 12 and the text of the LLM output 14. Example human readability metrics include the Gunning Fog Index, the Coleman-Liau Index, and the Automated Readability Index. Example human readability features include sentence length, word length, and syllable count. Table 1 illustrates example human readability features and the corresponding mathematical definitions for the human readability features. Table 1 also includes descriptions of the different human readability feature…The human readability features are used by the readability model 108 to provide a value that quantifies how humans comprehend the text in the input 12 and the LLM output 14” (paragraph 21). 
Paragraph 22 describes readability metrics that include language model evaluation features to evaluate performance of an LLM, and where the language model evaluation features evaluate the LLM output to measure the LLM’s ability to predict and generate text, and where an example of language model evaluation features includes perplexity of the text included in the LLM output which gives an indication of the quality of the LLM’s performance. This reference does not appear to describe where the readability/perplexity features are used to determine whether to accept/deny input text.

11157693 teaches “The predicted original values output by the decoder portion are then analyzed to determine whether the trained language model achieves a threshold perplexity value when compared to the plurality of word sequences (block 610). The language modeling module 112, for instance, compares the predicted original values output by the decoder portion of the trained language model 214 to the corresponding words in the target author corpus 204 to determine whether there are any differences between the predicted original values and words in the target author corpus 204. Differences are assessed against a threshold perplexity value to determine whether the language modeling module 112 has completed training of the trained language model 214. In some implementations, the threshold perplexity value specifies that training should continue until the output of the decoder shows no perplexity over comparison to the ground truth (e.g., the target author corpus 204). In response to determining that the output of the trained language model 214 fails to satisfy the threshold perplexity value, the language modeling module 112 continues training the language model, as indicated by the arrow returning to block 608” (col. 16, lines 18-39).
This reference describes training/updating a language model based on perplexity value, but does not appear to describe where the perplexity value is one of a plurality of values and does not appear to describe where the perplexity value is used to accept/reject input text.

11488579 teaches “FIG. 2 illustrates how a language model may be evaluated using a perplexity measure, according to some embodiments. In addition to generating or predicting output text as described above in FIG. 1, a language model 102 may also be used to characterize an input data string by generating a perplexity output. In information theory, perplexity is a measurement of how well a probabilistic language model 102 predicts a string of sample text. By providing a text string input to different language models, the perplexity measurement can be used to compare and evaluate language models against each other. Generally, a lower perplexity score indicates a better model” (col. 4, lines 50-61) and “In the example of FIG. 2, the language model may be provided with a sample text string 204 of “My mom and I are going in the room.” The sample text string 204 may be an output from a voice/text recognition program that is being tested for its likelihood of natural occurrence. The sample text string 204 may also be a test input to determine how well the language model 102 would predict the text string 204. A perplexity output 206 may be generated that characterizes the likelihood that the language model 102 would generate the sample text string 204” (col. 5, lines 32-42) and “testing the first language model using the output text from the second language model to generate one or more perplexity scores and determining whether the one or more perplexity scores exceed a threshold score.” (claim 1).

This reference does not appear to describe where perplexity scores are used to accept/reject input text.
2021/0097242 teaches “That is, the feature that a perplexity value is high may be interpreted as meaning that an input sentence is not familiar to a corpus constituting a language model. In contrast, the feature that a perplexity value is low may be interpreted as meaning that the performance of a language model for an input sentence is high, and the input sentence is familiar to the corpus constituting the language model” (paragraphs 45-46) and “When a first perplexity value for the first language model and a second perplexity value for the second language model are acquired, the electronic device may determine whether to correct the first sentence to another sentence in the first language based on at least one of the acquired first perplexity value and second perplexity value. Specifically, the electronic device may determine whether to correct the first sentence to another sentence in the first language by comparing the acquired first perplexity value with a predetermined first threshold value, or comparing the second perplexity value with a predetermined second threshold value, or comparing the first perplexity value and the second perplexity value respectively with a predetermined first threshold value and a predetermined second threshold value” (paragraphs 47-48).

Determining not to correct a sentence could be interpreted as determining to accept the sentence, and determining to correct the sentence could be interpreted as determining to reject the sentence in favor of a corrected version of the sentence. The perplexity in this reference does not appear to describe prediction/generation of input text by an LLM/generative-model, and correcting a sentence does not appear to read on “performing an exchange of data between remotely connected computer devices” or “saving an update to the GenAI model”.
2025/0259065 teaches “an LLM steward model, which may configure the LLM steward model to generate LLM validation information indicating classifications of LLM outputs as acceptable, tolerable, or non-acceptable. The computing platform may input, into an LLM, an LLM prompt, which may cause the LLM to generate an LLM output. The computing platform may input the LLM output into the LLM steward model, which may cause the LLM steward model to output the LLM validation information. Based on outputting LLM validation information indicating that the LLM output is acceptable or tolerable, the computing platform may send the LLM output to a user device for presentation. Based on outputting LLM validation information indicating that the LLM output is non-acceptable, the computing platform may update the LLM output to conform with a corresponding subset of the plurality of regimes” (paragraph 3). In this reference, the classification of an LLM output as acceptable/tolerable/non-acceptable does not appear to be based on readability/perplexity/fluency/coherence/etc.

10978056 teaches “Using different types of classification models including grammaticality models, semantic-correctness models, and naturalness models may be an effective solution for addressing the technical challenge of effectively filtering out unacceptable candidate responses generated by the natural-language generation module of the assistant system 140, since these classification models may automatically determine whether a response satisfies grammaticality, semantic correctness, and naturalness, based on which the assistant system 140 may further determine if it is acceptable” (col. 24, lines 29-39).

This reference does not appear to describe where data is exchanged between remote devices or where a model is updated based on determining that candidate responses are unacceptable or not.
Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ERIC YEN whose telephone number is (571)272-4249. The examiner can normally be reached M-F 12:00PM-8:30PM EST.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, RICHEMOND DORVIL, can be reached at (571)272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

EY 11/13/2025
/ERIC YEN/ Primary Examiner, Art Unit 2658
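The allowed combination turns on perplexity-based accept/reject of an input text. A minimal sketch of that idea in Python (the exp-of-mean-negative-log-probability definition of perplexity is standard; the per-sentence scoring, the threshold value, and the function names are illustrative assumptions, not taken from the application, and the token log-probabilities would in practice come from an actual LLM):

```python
import math

def perplexity(token_logprobs):
    # Perplexity = exp(-average log-probability) over a sentence's tokens:
    # low perplexity means the LM predicts the text well.
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def evaluate_text(sentence_logprobs, threshold=50.0):
    # One perplexity score per sentence (the "one or more perplexity
    # scores" of claim 1); accept only if every sentence stays under
    # the (hypothetical) threshold, otherwise reject the input text.
    scores = [perplexity(lp) for lp in sentence_logprobs]
    decision = "accept" if max(scores) <= threshold else "reject"
    return decision, scores
```

Under this sketch, a sentence whose tokens all have log-probability -1.0 scores exp(1) ≈ 2.72 and would be accepted, while uniformly unlikely tokens at -5.0 score exp(5) ≈ 148 and trigger rejection.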

Prosecution Timeline

Apr 25, 2024: Application Filed
Nov 13, 2025: Non-Final Rejection, §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602541: MINIMIZING LARGE LANGUAGE MODEL HALLUCINATIONS IN GENERATED SUMMARIES (granted Apr 14, 2026; 2y 5m to grant)
Patent 12585880: SCALABLE CONSISTENCY ENSEMBLE FOR MACHINE LEARNING MODELS (granted Mar 24, 2026; 2y 5m to grant)
Patent 12585886: CONVERSATION METHODS, APPARATUS, ELECTRONIC DEVICES, STORAGE MEDIA, AND PRODUCTS (granted Mar 24, 2026; 2y 5m to grant)
Patent 12547651: SYSTEMS AND METHOD FOR DYNAMICALLY UPDATING MATERIALITY DISTRIBUTIONS AND CLASSIFICATIONS IN MULTIPLE DIMENSIONS (granted Feb 10, 2026; 2y 5m to grant)
Patent 12524617: SYSTEM AND METHOD FOR VISUAL REPRESENTATION OF DOCUMENT TOPICS (granted Jan 13, 2026; 2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 85% (97% with interview, +11.7% lift)
Median Time to Grant: 2y 8m
PTA Risk: Low
Based on 765 resolved cases by this examiner. Grant probability derived from career allow rate.
