Last updated: April 19, 2026
Application No. 18/788,970
Combined Machine Learning and Large Language Models

Non-Final OA: §101, §103, §112
Filed: Jul 30, 2024
Examiner: NEWAY, SAMUEL G
Art Unit: 2657
Tech Center: 2600 (Communications)
Assignee: Varonis Systems, Inc.
OA Round: 1 (Non-Final)
Interview Optional

+7.6% interview lift. This examiner has a relatively high allow rate; a written response may suffice.
Based on 686 resolved cases, 2023–2026
Examiner Intelligence

NEWAY, SAMUEL G
Career Allow Rate: 75% (above average); 517 granted / 686 resolved; +13.4% vs TC avg
Interview Lift: +7.6% (moderate, about +8%), measured on resolved cases with interview
Avg Prosecution: 3y 0m (typical timeline); 29 currently pending
Total Applications: 715 across all art units (career history)
Statute-Specific Performance

§101: 16.6% (-23.4% vs TC avg)
§103: 34.5% (-5.5% vs TC avg)
§102: 17.1% (-22.9% vs TC avg)
§112: 20.1% (-19.9% vs TC avg)
Black line = Tech Center average estimate • Based on career data from 686 resolved cases
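The Tech Center average that the missing chart drew as a black line can be recovered from the figures above. The following is a minimal sketch, assuming each "vs TC avg" delta is the examiner's rate minus the Tech Center average in percentage points; the page does not state what each percentage measures, and the names `STATUTE_STATS` and `implied_tc_average` are illustrative only.

```python
# Figures copied from the statute-specific panel:
# statute -> (examiner rate in %, delta vs TC avg in percentage points).
STATUTE_STATS = {
    "101": (16.6, -23.4),
    "103": (34.5, -5.5),
    "102": (17.1, -22.9),
    "112": (20.1, -19.9),
}


def implied_tc_average(rate: float, delta: float) -> float:
    """Back out the Tech Center average, assuming delta = rate - tc_average."""
    return round(rate - delta, 1)


implied = {
    statute: implied_tc_average(rate, delta)
    for statute, (rate, delta) in STATUTE_STATS.items()
}

for statute, tc_avg in implied.items():
    print(f"§{statute}: implied TC average {tc_avg}%")
```

Under that assumption all four statutes imply the same baseline, which suggests the page compares every statute against a single Tech Center estimate rather than a per-statute one.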
Office Action

§101 §103 §112
DETAILED ACTION
This is responsive to the application filed 30 July 2024.
Claims 1-20 are pending and considered below.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 11-12, 16 and 19 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 11 recites the limitation "the question" in line 4.  There is insufficient antecedent basis for this limitation in the claim.
Claim 16 recites the limitation "the at least one identifier" in lines 1-2.  There is insufficient antecedent basis for this limitation in the claim. It is believed the claim should depend upon claim 15.
Claim 19 recites the limitation "the question" in line 4.  There is insufficient antecedent basis for this limitation in the claim.
Claim 12 is rejected for depending upon a rejected claim without providing a remedy.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The judicial exception is not integrated into a practical application.
In claims 1 and 14, the limitations 
That is, other than reciting a “computer system for document analysis, comprising: one or more computer readable storage media storing program instructions and one or more processors which, in response to executing the program instructions, are configured to”, a “large language model” and a “decision system” (claim 1),  a “computer-implemented method, comprising the steps of at a computer system comprising one or more computer readable storage media and one or more processors”, a “large language model” and a “decision system”  (claim 14) nothing in the claims precludes the steps from practically being performed in the mind. For example, a person may extract textual data from a document (e.g. a human may read an email and extract the body of the email); provide a scalar indication for each of a plurality of features of the textual data (e.g. a human may count a plurality of features such as spelling errors); and produce an output based on at least the scalar indications (e.g. a human may determine that the email is spam based on the count of spelling errors).
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claims recite an abstract idea. 
This judicial exception is not integrated into a practical application. In particular, the claims recite the additional elements – a “computer system for document analysis, comprising: one or more computer readable storage media storing program instructions and one or more processors which, in response to executing the program instructions, are configured to”, a “large language model” and a “decision system” (claim 1), a “computer-implemented method, comprising the steps of at a computer system comprising one or more computer readable storage media and one or more processors”, a “large language model” and a “decision system” (claim 14), which are recited at a high level of generality (i.e., as generic processors performing generic computer functions) such that they amount to no more than mere instructions to apply the exception using generic computer components.
The claims also recite the additional element “receiving a document”. The claims do not impose any limits on how the document is received. In other words, the claims recite only the idea of a solution or outcome, i.e., the claims fail to recite details of how a solution to a problem is accomplished. These limitations therefore represent extra-solution activity because they are mere nominal or tangential additions to the claims. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claims are therefore directed to an abstract idea.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the additional elements when considered both individually and as an ordered combination do not amount to significantly more than the abstract idea. As stated above, the claims recite the additional limitations of a “computer system for document analysis, comprising: one or more computer readable storage media storing program instructions and one or more processors which, in response to executing the program instructions, are configured to”, a “large language model” and a “decision system” (claim 1),  a “computer-implemented method, comprising the steps of at a computer system comprising one or more computer readable storage media and one or more processors”, a “large language model” and a “decision system”  (claim 14). However, these are recited at a high level of generality and are recited as performing generic computer functions routinely used in computer applications (see Applicant’s specification [0030][0034] and [0036]-[0037]). Generic computer components recited as performing generic computer functions that are well-understood, routine and conventional activities amount to no more than implementing the abstract idea with a computerized system. 
The claims also recite the additional element “receiving a document”. The claims do not impose any limits on how the document is received. In other words, the claims recite only the idea of a solution or outcome, i.e., the claims fail to recite details of how a solution to a problem is accomplished. These limitations represent the extra-solution activity of gathering data, which is a well-understood, routine and conventional activity. Thus, taken alone, the additional elements do not amount to significantly more than the above-identified judicial exception (the abstract idea). Looking at the limitations as an ordered combination adds nothing that is not already present when looking at the elements taken individually. There is no indication that the combination of elements improves the functioning of a computer or improves any other technology. Their collective functions merely provide conventional computer implementation. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claims are not patent eligible.
Moreover, see Recentive Analytics, Inc. v. Fox Corp. (Fed. Cir. April 18, 2025): “Machine learning is a burgeoning and increasingly important field and may lead to patent-eligible improvements in technology. Today, we hold only that patents that do no more than claim the application of generic machine learning to new data environments, without disclosing improvements to the machine learning models to be applied, are patent ineligible under § 101.”
The dependent claims, when analyzed as a whole, are held to be patent ineligible under 35 U.S.C. 101 because the additional recited limitations fail to establish that the claims are not directed to an abstract idea. 
The dependent claims recite:
wherein the decision system comprises a machine learning model;
wherein the decision system comprises a rules-based system;
wherein the one or more processors are further configured to obtain information based on at least one identifier associated with the document;
wherein the at least one identifier is an indication of a person's identity who is associated with the document and the information is obtained from an organisational database;
wherein the obtained information is an indication of how often there are communications with the identified person;
wherein the document is an email and the textual data comprises the text body of the email;
wherein at least one of the scalar indications is an indication of the quantity of content relating to a feature;
wherein at least one of the scalar indications is an indication of the strength of language in relation to a feature;
wherein the at least one feature of the textual data include at least one of the urgency of language used, spelling accuracy, pressure applied to recipient to take certain action, language which appears disingenuous, offers which are “too good to be true”, and attempts to sell products;
wherein the step of requesting a scalar indication comprises requesting a plurality of scalar indications from the large language model for at least one of the features, wherein each of the plurality of scalar indications are requested using a different form of the question;
wherein the decision system utilises the average, minimum or maximum of the plurality of scalar indications for a feature;
wherein the step of requesting a scalar indication comprises requesting the large language model to verify a deliberately false scalar indication to verify confidence in a scalar indication provided by the large language model.
The additional recited limitations further narrow the steps of the independent claims without however providing “a practical application of” or "significantly more than" the underlying “Mental Processes” abstract idea. Therefore, the dependent claims are also not patent eligible.
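For orientation, the claimed flow summarized above (an LLM asked for scalar indications per feature, several phrasings of the question per feature, an average/minimum/maximum aggregation, and a rules-based decision system) can be sketched as follows. This is an illustrative reading of the claim language only, not the applicant's implementation: the prompts, the 0-10 scale, the threshold, and the names `score_features`, `decide` and `fake_llm` are all assumptions, and `fake_llm` is a deterministic stub standing in for a real model call. The deliberately false verification step of the last dependent claim is not covered.

```python
from statistics import mean
from typing import Callable, Dict, List

# Hypothetical stand-in for an LLM call: takes a prompt, returns a score in [0, 10].
LLMScorer = Callable[[str], float]

# Two phrasings ("different forms of the question") per feature; wording is assumed.
QUESTION_FORMS: Dict[str, List[str]] = {
    "urgency": [
        "On a scale of 0-10, how urgent is the language in this email?\n{body}",
        "Rate from 0 to 10 how strongly this email pushes for immediate action.\n{body}",
    ],
    "spelling": [
        "On a scale of 0-10, how poor is the spelling in this email?\n{body}",
        "Rate from 0 to 10 how many spelling mistakes this email contains.\n{body}",
    ],
}

AGGREGATORS = {"average": mean, "minimum": min, "maximum": max}


def score_features(body: str, llm: LLMScorer, how: str = "average") -> Dict[str, float]:
    """Ask every phrasing for every feature, then aggregate per feature."""
    aggregate = AGGREGATORS[how]
    return {
        feature: aggregate([llm(form.format(body=body)) for form in forms])
        for feature, forms in QUESTION_FORMS.items()
    }


def decide(scores: Dict[str, float], threshold: float = 6.0) -> str:
    """Rules-based decision system: flag the email if any feature meets the threshold."""
    return "suspicious" if any(v >= threshold for v in scores.values()) else "benign"


def fake_llm(prompt: str) -> float:
    """Deterministic stub so the sketch runs without a model."""
    score = 8.0 if "URGENT" in prompt else 2.0
    # Pretend the second phrasing elicits a slightly higher answer.
    return score + 1.0 if prompt.startswith("Rate") else score


scores = score_features("URGENT: wire the funds now", fake_llm)
print(scores, decide(scores))
```

Seen this way, the dispute in the rejection is whether asking an LLM for these scalars and aggregating them is more than a generic computer performing what a person could do by reading the email.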

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-9, 11, 14-17 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Schweighauser et al. (US 2019/0222606) in view of Lee et al. ("CatBERT: Context-aware tiny BERT for detecting targeted social engineering emails." arXiv preprint arXiv:2010.03484 (2021)).
Claim 1:
Schweighauser discloses a computer system for document analysis, comprising: 
one or more computer readable storage media storing program instructions and one or more processors which, in response to executing the program instructions ([0038]), are configured to: 
receive a document (“As soon as one or more new/incoming messages or emails have been sent internally by one user within the entity 114 from an email account on the electronic messaging system 116 to another user within the entity 114, the message collection and analysis component 106 of the AI engine 104 is configured to collect such new electronic messages sent”, [0018]); 
extract textual data from the document (“the message collection and analysis component 106 is configured to use the unique communication patterns identified to examine and extract various features or signals from the collected electronic messages”, [0020]); 
requesting an AI model to provide a scalar indication for each of a plurality of features of the textual data (“The fraud detection component 108 is then configured to utilize one or more of the following features and/or criteria that are unique to the email account to make a determination of whether the email account has been compromised (e.g., taken over by an attacker) or not: … Number of embedded links in the email sent by the email account; … Length of the longest URL in the email sent by the email account”, [0021]-[0024], see also “the fraud detection component 108 is configured to compute term frequency-inverse document frequency (TF-IDF) of each word offline”, [0028]); and 
utilise a decision system to produce an output based on at least the scalar indications (“If the fraud detection component 108 determines that the email account has been compromised, it is configured to block (remove, delete, modify) or quarantine electronic messages sent from the compromised email account in real time, and automatically notify the user, intended recipient(s) of the electronic message and/or an administrator of the electronic communication system 116 of the email account takeover attack”, [0034]).
Schweighauser does not explicitly disclose that the AI model is a large language model.
In an analogous art similarly requesting an AI model to analyze email in order to detect fraudulent ones, Lee discloses that the AI model is a large language model (BERT) (“we propose a phishing detection strategy based on transformers [22], leveraging a BERT-derived approach that is trained on a self-supervised cloze task on a public corpus of documents [3] and then optimizing the language model to perform phishing detection”, section 1, paragraph 2).
It would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention to combine the references to yield the predictable result of implementing Schweighauser’s AI model as an LLM because LLMs are the state of the art in AI text processing generating excellent results in fields such as text analysis and classification.
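The Schweighauser passage relied on for the "scalar indication" limitation mentions computing TF-IDF for each word. As background, a minimal sketch of one common textbook variant (term frequency times the log of inverse document frequency) is shown below; Schweighauser's exact formula is not quoted in this action, so the weighting, the function name `tf_idf`, and the toy corpus are assumptions.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(corpus: List[List[str]]) -> List[Dict[str, float]]:
    """Return one {word: score} dict per document.

    tf  = count of the word in the document / document length
    idf = log(number of documents / number of documents containing the word)
    """
    n_docs = len(corpus)
    # Count each word once per document to get its document frequency.
    doc_freq = Counter(word for doc in corpus for word in set(doc))
    scores = []
    for doc in corpus:
        counts = Counter(doc)
        scores.append(
            {
                word: (count / len(doc)) * math.log(n_docs / doc_freq[word])
                for word, count in counts.items()
            }
        )
    return scores


# Toy corpus of tokenized emails (illustrative only).
corpus = [
    "verify your password now".split(),
    "lunch at noon".split(),
    "your invoice is attached".split(),
]
scores = tf_idf(corpus)
print(scores[0])
```

A word that appears in only one message ("password") scores higher than one shared across messages ("your"). The score is a per-word number computed by arithmetic rather than a value returned by a language model in response to a request.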
Claim 2:
Schweighauser in view of Lee discloses a computer system according to claim 1, wherein the decision system comprises a machine learning model (Schweighauser, [0034], note that an AI module is a machine learning model).
Claim 3:
Schweighauser in view of Lee discloses a computer system according to claim 1, wherein the decision system comprises a rules-based system (Schweighauser, [0029], note “If a certain domain has been seen in internal communications often during a short period of time, it is deemed to be legitimate” represents an "if-then" rule).
Claim 4:
Schweighauser in view of Lee discloses a computer system according to claim 1, wherein the one or more processors are further configured to obtain information based on at least one identifier associated with the document (Schweighauser, [0010]).
Claim 5:
Schweighauser in view of Lee discloses a computer system according to claim 4, wherein the at least one identifier is an indication of a person's identity who is associated with the document and the information is obtained from an organisational database (Schweighauser, [0010]).
Claim 6:
Schweighauser in view of Lee discloses a computer system according to claim 5, wherein the obtained information is an indication of how often there are communications with the identified person (Schweighauser, [0029]).
Claim 7:
Schweighauser in view of Lee discloses a computer system according to claim 1, wherein the document is an email and the textual data comprises the text body of the email (Schweighauser, [0018]).
Claim 8:
Schweighauser in view of Lee discloses a computer system according to claim 1, wherein at least one of the scalar indications is an indication of the quantity of content relating to a feature (Schweighauser, [0021]-[0024]).
Claim 9:
Schweighauser in view of Lee discloses a computer system according to claim 1, wherein at least one of the scalar indications is an indication of the strength of language in relation to a feature (Schweighauser, [0024]).
Claim 11:
Schweighauser in view of Lee discloses a computer system according to claim 1, wherein the step of requesting a scalar indication comprises requesting a plurality of scalar indications from the large language model for at least one of the features, wherein each of the plurality of scalar indications are requested using a different form of a question (Schweighauser, “maintain a score for each word wherein the score represents the likelihood of the word to be associated with malicious (phishing) emails. In some embodiments, the fraud detection component 108 is configured to compute term frequency-inverse document frequency (TF-IDF) of each word offline”, [0028], Neystadt, [0015]-[0016]).
Claims 14-17 and 19:
Schweighauser in view of Lee discloses a computer-implemented method, comprising the steps of at a computer system comprising one or more computer readable storage media and one or more processors (Schweighauser, [0010]) for executing the steps performed by the system of claims 1, 4-5, 7 and 11 as shown above.

Claims 1-9, 11, 14-17 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Schweighauser et al. (US 2019/0222606) in view of Neystadt et al. (US 2025/0133111).
Claim 1:
Schweighauser discloses a computer system for document analysis, comprising: 
one or more computer readable storage media storing program instructions and one or more processors which, in response to executing the program instructions ([0038]), are configured to: 
receive a document (“As soon as one or more new/incoming messages or emails have been sent internally by one user within the entity 114 from an email account on the electronic messaging system 116 to another user within the entity 114, the message collection and analysis component 106 of the AI engine 104 is configured to collect such new electronic messages sent”, [0018]); 
extract textual data from the document (“the message collection and analysis component 106 is configured to use the unique communication patterns identified to examine and extract various features or signals from the collected electronic messages”, [0020]); 
requesting an AI model to provide a scalar indication for each of a plurality of features of the textual data (“The fraud detection component 108 is then configured to utilize one or more of the following features and/or criteria that are unique to the email account to make a determination of whether the email account has been compromised (e.g., taken over by an attacker) or not: … Number of embedded links in the email sent by the email account; … Length of the longest URL in the email sent by the email account”, [0021]-[0024], see also “the fraud detection component 108 is configured to compute term frequency-inverse document frequency (TF-IDF) of each word offline”, [0028]); and 
utilise a decision system to produce an output based on at least the scalar indications (“If the fraud detection component 108 determines that the email account has been compromised, it is configured to block (remove, delete, modify) or quarantine electronic messages sent from the compromised email account in real time, and automatically notify the user, intended recipient(s) of the electronic message and/or an administrator of the electronic communication system 116 of the email account takeover attack”, [0034]).
Schweighauser does not explicitly disclose that the AI model is a large language model.
In an analogous art similarly requesting an AI model to analyze email in order to detect fraudulent ones, Neystadt discloses that the AI model is a large language model (“Email Security Agent 112 requests from the LLM 108 (arrow 156) to analyze the email message and to convert the email message, including its content and its meta-data (e.g., header data; recipient name/email address/data; sender name/email address/data; IP address of sender and of relay nodes or relay servers; trace-route data; subject data; information about CC recipients; list of email relays), into embeddings”, [0028]).
It would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention to combine the references to yield the predictable result of implementing Schweighauser’s AI model as an LLM because LLMs are the state of the art in AI text processing generating excellent results in fields such as text analysis and classification.
Claim 2:
Schweighauser in view of Neystadt discloses a computer system according to claim 1, wherein the decision system comprises a machine learning model (Schweighauser, [0034], note that an AI module is a machine learning model, see also Neystadt, [0017] for explicit ML disclosure).
Claim 3:
Schweighauser in view of Neystadt discloses a computer system according to claim 1, wherein the decision system comprises a rules-based system (Schweighauser, [0029], note “If a certain domain has been seen in internal communications often during a short period of time, it is deemed to be legitimate” represents an "if-then" rule, Neystadt, “the system may utilize a set of rules for determining an initial risk score for the evaluated message based on such characteristics”, [0036]).
Claim 4:
Schweighauser in view of Neystadt discloses a computer system according to claim 1, wherein the one or more processors are further configured to obtain information based on at least one identifier associated with the document (Schweighauser, [0010], Neystadt, [0022]).
Claim 5:
Schweighauser in view of Neystadt discloses a computer system according to claim 4, wherein the at least one identifier is an indication of a person's identity who is associated with the document and the information is obtained from an organisational database (Schweighauser, [0010], Neystadt, [0022]).
Claim 6:
Schweighauser in view of Neystadt discloses a computer system according to claim 5, wherein the obtained information is an indication of how often there are communications with the identified person (Schweighauser, [0029], Neystadt, [0036]).
Claim 7:
Schweighauser in view of Neystadt discloses a computer system according to claim 1, wherein the document is an email and the textual data comprises the text body of the email (Schweighauser, [0018], Neystadt, [0018]).
Claim 8:
Schweighauser in view of Neystadt discloses a computer system according to claim 1, wherein at least one of the scalar indications is an indication of the quantity of content relating to a feature (Schweighauser, [0021]-[0024], Neystadt, “the existence or the absence of particular keywords or terms in the scanned message (e.g., “urgent”, or “password”, or “you must pay”)”, [0016]).
Claim 9:
Schweighauser in view of Neystadt discloses a computer system according to claim 1, wherein at least one of the scalar indications is an indication of the strength of language in relation to a feature (Schweighauser, [0024], Neystadt, [0015]).
Claim 11:
Schweighauser in view of Neystadt discloses a computer system according to claim 1, wherein the step of requesting a scalar indication comprises requesting a plurality of scalar indications from the large language model for at least one of the features, wherein each of the plurality of scalar indications are requested using a different form of a question (Schweighauser, “maintain a score for each word wherein the score represents the likelihood of the word to be associated with malicious (phishing) emails. In some embodiments, the fraud detection component 108 is configured to compute term frequency-inverse document frequency (TF-IDF) of each word offline”, [0028], Neystadt, [0015]-[0016]).
Claims 14-17 and 19:
Schweighauser in view of Neystadt discloses a computer-implemented method, comprising the steps of at a computer system comprising one or more computer readable storage media and one or more processors (Schweighauser, [0010]) for executing the steps performed by the system of claims 1, 4-5, 7 and 11 as shown above.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
De La Noval et al. ("Methodologies for email spam classification using large language models." 2023 International Conference on Computational Science and Computational Intelligence (CSCI). IEEE, 2023) discloses email spam detection using LLMs.
Benkovich et al. (US 2021/0021553) discloses a method for spam identification. A spam filter module may receive an email at a client device and may determine a signature of the email. The spam filter module may compare the determined signature with a plurality of spam signatures stored in a database. In response to determining that no match exists between the determined signature and the plurality of spam signatures, the spam filter module may place the email in quarantine. A spam classifier module may extract header information of the email and determine a degree of similarity between known spam emails and the email. In response to determining that the degree of similarity exceeds a threshold, the spam filter module may transfer the email from the quarantine to a spam repository.
Singh et al. (US 2021/0281606) discloses detecting a phishing attack on a computer device can involve scanning one or more email messages, and separating email parts from the one or more email messages, in response to scanning the at least one email message. In addition, the email parts of the at least one email message can be subject to a feature extraction operation. The email features extracted from the email parts can be then analyzed to determine whether or not any of the email features contain suspected phishing content, confirmed phishing content and benign email content.
Islam et al. (US 2025/0111238) discloses an identity management system may obtain a set of data signals via a system log application programming interface (API). The set of data signals may be associated with interactions between a user of a client device and one or more applications associated with the identity management system. The identity management system may then output a set of text strings that include parsed data from the set of data signals to a large language model (LLM) that is affiliated with a multi-modal machine learning model configured to generate risk metrics based on data signals. A first risk metric may then be obtained from the LLM. Further, the identity management system may generate a second risk metric by using a risk combinator of the multi-modal machine learning model. As such, the identity management system may generate a unified risk metric based on the first risk metric and the second risk metric.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SAMUEL G NEWAY whose telephone number is (571)270-1058. The examiner can normally be reached Monday-Friday 9:00am-5:00pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn can be reached at 571-272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/SAMUEL G NEWAY/            Primary Examiner, Art Unit 2657
Prosecution Timeline

Jul 30, 2024
Application Filed
Feb 19, 2026
Non-Final Rejection — §101, §103, §112 (current)
Precedent Cases

Applications granted by this same examiner with similar technology

18/067,086
Patent 12602538
METHOD AND SYSTEM FOR EXEMPLAR LEARNING FOR TEMPLATIZING DOCUMENTS ACROSS DATA SOURCES
2y 5m to grant Granted Apr 14, 2026
18/100,645
Patent 12603177
INTERACTIVE CONVERSATIONAL SYMPTOM CHECKER
2y 5m to grant Granted Apr 14, 2026
18/642,010
Patent 12603092
AUTOMATED ASSISTANT CONTROL OF NON-ASSISTANT APPLICATIONS VIA IDENTIFICATION OF SYNONYMOUS TERM AND/OR SPEECH PROCESSING BIASING
2y 5m to grant Granted Apr 14, 2026
18/146,276
Patent 12596734
PARSE ARBITRATOR FOR ARBITRATING BETWEEN CANDIDATE DESCRIPTIVE PARSES GENERATED FROM DESCRIPTIVE QUERIES
2y 5m to grant Granted Apr 07, 2026
18/385,484
Patent 12596892
MACHINE TRANSLATION SYSTEM FOR ENTERTAINMENT AND MEDIA
2y 5m to grant Granted Apr 07, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.
Prosecution Projections

1-2
Expected OA Rounds
75%
Grant Probability
83%
With Interview (+7.6%)
3y 0m
Median Time to Grant
Low
PTA Risk
Based on 686 resolved cases by this examiner. Grant probability derived from career allow rate.
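The projection figures can be reproduced from the examiner statistics reported earlier on this page. The sketch below assumes the interview lift and the "vs TC avg" delta are additive percentage points on top of the career allow rate; the page says only that grant probability is derived from the career allow rate, so the additive model and the variable names are assumptions.

```python
# Inputs copied from the Examiner Intelligence panel.
GRANTED = 517
RESOLVED = 686
INTERVIEW_LIFT_POINTS = 7.6   # "+7.6% interview lift"
DELTA_VS_TC_POINTS = 13.4     # "+13.4% vs TC avg"

# Career allow rate as a percentage.
allow_rate = 100 * GRANTED / RESOLVED

# Assumption: the lift is added to the allow rate in percentage points.
with_interview = allow_rate + INTERVIEW_LIFT_POINTS

# Assumption: the Tech Center allow rate is the examiner's rate minus the delta.
implied_tc_allow_rate = allow_rate - DELTA_VS_TC_POINTS

print(f"Grant probability: {allow_rate:.1f}%")
print(f"With interview:    {with_interview:.1f}%")
print(f"Implied TC rate:   {implied_tc_allow_rate:.1f}%")
```

Rounded to whole percentages, the first two values match the 75% and 83% shown above, which is consistent with the additive reading.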