Last updated: May 29, 2026

Application No. 18/336,578

METHOD AND APPARATUS WITH TEXT CLASSIFICATION BASED ON SALIENT WORD REPLACEMENT

Non-Final OA §103

Filed

Jun 16, 2023

Priority

Jan 13, 2023 — KR 10-2023-0005370

Examiner

LERNER, MARTIN

Art Unit

2658

Tech Center

2600 — Communications

Assignee

Seoul National University R&DB Foundation

OA Round

3 (Non-Final)

Interview Optional

+13.3% interview lift. An interview has already been conducted in this application's prosecution history. This examiner has a 78% grant rate. Because an interview has already been tried, a written response with narrowed claims, based on precedent claim evolution patterns, is recommended.

Based on 988 resolved cases, 2023–2026

Examiner Intelligence

LERNER, MARTIN

Grants 78% — above average

Career Allowance Rate

771 granted / 988 resolved

+16.0% vs TC avg

Interview Lift: +13.3% (moderate), resolved cases with an interview versus without

Typical timeline

2y 11m

Avg Prosecution

22 currently pending

Career history

1008

Total Applications

across all art units

Statute-Specific Performance

§101: 10.0% (-30.0% vs TC avg)

§103: 74.1% (+34.1% vs TC avg)

§102: 3.5% (-36.5% vs TC avg)

§112: 8.7% (-31.3% vs TC avg)

Tech Center averages are estimates. Based on career data from 988 resolved cases.
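As a quick consistency check on the statute-specific figures, the sketch below backs out the Tech Center baseline implied by each rate and its delta. It assumes each delta is a plain difference in percentage points (examiner rate minus Tech Center average), which the page does not state explicitly.

```python
# Rates and deltas copied from the statute-specific figures above.
# Assumption: delta = examiner rate - Tech Center (TC) average, in percentage points.
rates = {
    "101": (10.0, -30.0),
    "103": (74.1, +34.1),
    "102": (3.5, -36.5),
    "112": (8.7, -31.3),
}

# Implied TC average for each statute, rounded to one decimal place.
implied_tc_avg = {s: round(rate - delta, 1) for s, (rate, delta) in rates.items()}
print(implied_tc_avg)
```

Under that assumption every statute implies the same baseline, which suggests the Tech Center average shown here is a single estimate rather than a per-statute figure.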

Office Action

§103

DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Election/Restrictions
Claim 20 is withdrawn from further consideration pursuant to 37 CFR 1.142(b), as being drawn to a nonelected invention, there being no allowable generic or linking claim.  Applicants timely traversed the restriction (election) requirement in the reply filed on 29 May 2025.

Claim Objections
Claims 1 to 6, 8 to 16, and 18 to 19 are objected to because of the following informalities:
Independent claims 1 and 11 set forth limitations of “obtain the text classification result of the input text” and “obtaining the text classification result of the input text”, where there is no clear antecedent basis for “the text classification result”.  The prior limitations set forth “obtain probability values of inferred labels and classification results” and “obtaining probability values of inferred labels and classification results”, but these limitations only provide a prior recitation of “classification results” and not of “text classification results”.  Generally, any initial recitation of a claim element should be accompanied by an indefinite article of “a” or “an”, and not a definite article of “the” or “said”.
Appropriate correction is required.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1 to 3, 6, 11 to 13, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Singh et al. (U.S. Patent No. 11,972,211) in view of Echauz et al. (U.S. Patent No. 11,552,137).
Concerning independent claims 1 and 11, Singh et al. discloses a system and method for adversarial input generation, comprising:
“one or more processors; a memory comprising instructions configured to cause the one or more processors to:” – computer system 600 may include hardware processor 602 and a machine readable medium 622 on which is stored instructions 624 embodying the inventive techniques or functions (column 10, line 54 to column 11, line 23: Figure 6);
“determine saliencies of the respective words of the input text” – search component 214 may find important tokens in the discrete language units of text corpus 212 that have a likelihood that exceeds a threshold of changing a model decision (column 4, lines 1 to 4: Figure 2); here, important tokens are words that are ‘salient’;
“select some words having relatively high saliency from among the determined saliencies of the respective words” – search component 214 extracts the x tokens that have the highest contribution to the gradient of loss (column 4, lines 28 to 32: Figure 2); tokens above a predetermined rank may be selected by search component 214; other examples of token selection may select the x tokens having the highest term frequency (column 4, lines 54 to 57: Figure 2);
“generate replaced text by replacing the selected words in instances of the input text with other words” – once tokens 216 are generated, replacement selector component 218 may select, for each of tokens 216, one or more replacement tokens; replacement selector component 218 may use an embedding space algorithm to find the closest words to the tokens in an embedding space (column 5, lines 1 to 5: Figure 2).
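The replacement step mapped above (finding the closest words to a selected token in an embedding space) can be sketched as follows. This is an illustrative sketch only, not Singh et al.'s implementation: the function names, the two-dimensional embeddings, and the `min_similarity` cutoff are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def nearest_replacement(token, embeddings, min_similarity=0.5):
    """Closest other word in the embedding space, or None if none clears the cutoff."""
    best, best_sim = None, min_similarity
    for word, vec in embeddings.items():
        if word == token:
            continue
        sim = cosine(embeddings[token], vec)
        if sim >= best_sim:
            best, best_sim = word, sim
    return best

def replace_selected(words, selected, embeddings):
    """Copy of words with each selected word swapped for its nearest neighbour, if any."""
    return [(nearest_replacement(w, embeddings) or w) if w in selected else w for w in words]

# Hypothetical toy embeddings, for illustration only.
embeddings = {"good": [0.9, 0.1], "great": [0.85, 0.2], "bad": [-0.9, 0.1], "movie": [0.0, 1.0]}
replaced = replace_selected(["the", "movie", "was", "good"], {"good"}, embeddings)
print(replaced)
```

A selected word with no sufficiently similar neighbour is left unchanged here, which is one possible design choice among several.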
Concerning independent claims 1 and 11, Singh et al. discloses that adversarial attacks include perturbations in the textual domain with textual perturbation including changes at the character level, word level, and sentence level.  (Column 2, Lines 11 to 26)  Generating adversarial text is used by analysis service 535 to evaluate one or more models for robustness against adversarial text.  (Column 9, Lines 33 to 35)  Singh et al., then, is directed to generating adversarial text by replacing important (‘salient’) words to test a model against adversarial attacks.  However, Singh et al. does not expressly disclose the limitations of “determine whether the input text has been subjected to an adversarial attack” and “responsive to determining that the input text has been subjected to an adversarial attack”, “obtain probability values of inferred labels and classification results of the respective replaced texts as an inference result from the text classification model, which receives the replaced texts as inputs” and “obtain the text classification result of the input text based on the probability values of the inferred labels or the classification result of the respective replaced texts.”  That is, Singh et al. discloses generating replacement text for adversarial attacks to test a model, but does not describe how the model is tested.
Concerning independent claims 1 and 11, Echauz et al. teaches machine learning adversarial campaign mitigation by deploying a classification monitor in the model environment to monitor classification decision outputs in the machine learning model.  (Abstract)  Machine learning models may be used as binary classifiers that classify elements of a given set into two groups and predict into which group each input belongs on the basis of a classification rule or threshold.  (Column 2, Lines 56 to 62)  Adversarial attack campaigns may be determined by measuring the performance of the model by plotting a true positive rate (TPR) against a false positive rate (FPR) at various threshold settings, where a true positive rate may reflect correctly assigned positive classifications and be known as a ‘probability of detection’, and a false positive rate may reflect incorrectly assigned positive classifications and be known as a ‘probability of false alarm’.  (Column 3, Lines 57 to 67)  A machine learning model may receive at least one data set of inputs in order to train the model to determine into which class the inputs should be classified.  The classes may be ‘malicious’ and ‘benign’.  Labels of ‘malicious’ and ‘benign’ are assigned during training to be considered as ground truth.  (Column 5, Lines 25 to 42: Figure 2)  Example inputs 204 and 210 may be represented by ‘x’ marks and ‘o’ marks, where ‘x’ inputs may be determined to be part of a pre-determined class A, and ‘o’ marks may be determined to be part of a pre-determined class B.  Each class may represent a ground truth with ‘o’ inputs representing clean or benign files and ‘x’ inputs representing malicious files.  (Column 6, Lines 7 to 30: Figure 2)  The model may use Bayesian decision rules to classify each input by determining the least probability of misclassification in a specific environment.  (Column 6, Lines 40 to 44: Figure 2)  Figures 3A to 3C illustrate probabilities from 0 to 1 of true positive rate versus false positive rate.  Echauz et al., then, teaches “determine whether the input text has been subjected to an adversarial attack” by providing ground truth inputs marked ‘x’ for training a model that are labeled as ‘malicious’.  These inputs are applied to the model to determine a probability that the ‘malicious’ and ‘benign’ training inputs are correctly classified as true positives or false positives.  This classification of training inputs produces “inferred labels and classification results . . . as an inference result” as compared to ground truth labels of ‘malicious’ or ‘benign’.  That is, a classifier model ‘infers’ based upon ‘probability values’ if training input should be classified with a ‘result’ that the training input is ‘malicious’ or ‘benign’.  An objective is to provide practical and low-cost adversarial mitigation to prevent misclassification of outputs that result from adversarial inputs.  (Column 1, Lines 6 to 16)  It would have been obvious to one having ordinary skill in the art to generate adversarial input by replacing salient words of Singh et al. to perform training of a machine learning model to generate a correct classification result and inference result of inferred labels as compared to ground truth in Echauz et al. for a purpose of preventing misclassification of outputs that result from adversarial inputs.
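The true positive rate versus false positive rate measurement described for Echauz et al. can be illustrated with a short sketch. The labels, scores, and thresholds below are hypothetical and serve only to show how the two rates are computed at various threshold settings; this is not code from the reference.

```python
def roc_points(labels, scores, thresholds):
    """Return (threshold, TPR, FPR) triples; label 1 = 'malicious', 0 = 'benign'."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for th in thresholds:
        # An input is predicted 'malicious' when its score reaches the threshold.
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= th)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= th)
        tpr = tp / positives if positives else 0.0  # 'probability of detection'
        fpr = fp / negatives if negatives else 0.0  # 'probability of false alarm'
        points.append((th, tpr, fpr))
    return points

# Hypothetical ground-truth labels and classifier scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
points = roc_points(labels, scores, [0.25, 0.5, 0.75])
for th, tpr, fpr in points:
    print(f"threshold={th:.2f} TPR={tpr:.2f} FPR={fpr:.2f}")
```

Raising the threshold lowers both rates in this toy data, which is the trade-off a curve of this kind is meant to expose.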

Concerning claims 2 and 12, Singh et al. discloses synonym replacement provides an option to create text that preserves the original semantics of the original text for generating adversarial text for hardening a model against attacks.  (Column 2, Lines 27 to 52)  Even if synonym replacement may change the semantic content of the text, it is a way to generate token replacements with similar words in an embedding space.
Concerning claims 3 and 13, Echauz et al. teaches performing an adversarial classification decision to thwart an adversarial campaign.  (Abstract)  Machine learning models may be used as binary classifiers to classify input as benign or malicious.  (Column 2, Line 56 to Column 3, Line 24)  Here, malicious input may be construed as “an anomaly”, and benign input is “the input text does not indicate an anomaly” so that a binary classifier operates to “obtain the text classification result of the input text from the text classification model receiving the input text and performing the text classification result.”  Broadly, a binary classifier that classifies the text as benign is “the text classification model” that obtains “the text classification result.” 
Concerning claims 6 and 16, Singh et al. discloses that search component 214 may use a gradient method in which the system backpropagates (“a backpropagation algorithm”) the gradient to the embedding layer, and extracts the tokens that have the highest contribution to the gradient of loss with respect to the input layer in order to find the most important tokens (“wherein the saliencies are determined based on backpropagation algorithm”).  (Column 4, Lines 1 to 4 and Column 4, Lines 28 to 32)  Here, these most important tokens are the most ‘salient’ tokens (“the saliencies”).
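To make the gradient-based notion of saliency concrete, the sketch below backpropagates a loss by hand to each token's embedding and ranks tokens by gradient norm. The model (a sigmoid over a sum of tanh units), the weight vector, and the embeddings are hypothetical stand-ins chosen so the chain rule is easy to follow; none of them comes from Singh et al.

```python
import math

def gradient_saliency(token_vectors, w, label):
    """Gradient-norm saliency per token for p = sigmoid(sum_i tanh(w . e_i))."""
    acts = [sum(wi * ei for wi, ei in zip(w, e)) for e in token_vectors]
    s = sum(math.tanh(a) for a in acts)
    p = 1.0 / (1.0 + math.exp(-s))
    dloss_ds = p - label  # derivative of binary cross-entropy w.r.t. the logit s
    w_norm = math.sqrt(sum(wi * wi for wi in w))
    # Chain rule: dLoss/de_i = dloss_ds * (1 - tanh(a_i)^2) * w, so its norm is:
    return [abs(dloss_ds) * (1.0 - math.tanh(a) ** 2) * w_norm for a in acts]

# Hypothetical tokens, embeddings, and weights.
tokens = ["the", "plot", "was", "dull"]
vectors = [[2.0, 0.0], [0.5, 0.0], [0.0, -1.5], [0.1, 0.1]]
saliency = gradient_saliency(vectors, w=[1.0, -1.0], label=0)
ranked = [t for _, t in sorted(zip(saliency, tokens), reverse=True)]
print(ranked)
```

In this toy model a token whose unit is far from saturation receives the largest gradient, so it ranks as the most salient.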

Claims 4, 8 to 10, 14, and 18 to 19 are rejected under 35 U.S.C. 103 as being unpatentable over Singh et al. (U.S. Patent No. 11,972,211) in view of Echauz et al. (U.S. Patent No. 11,552,137) as applied to claims 1 and 11 above, and further in view of Shukla et al. (U.S. Patent Publication 2022/0083898).
Concerning claims 4 and 14, Singh et al. discloses that search component 214 finds important tokens 216 that have a likelihood that exceeds a threshold of changing the model decision.  (Column 4, Lines 1 to 4: Figure 2)  Once tokens 216 are generated, replacement selector component 218 may select one or more replacement tokens using an embedding space algorithm to find the closest words to the tokens in an embedding space.  (Column 5, Lines 1 to 5: Figure 2)  Replacement token candidates with a cosine similarity score to the original tokens that is below a threshold may be removed from consideration.  (Column 6, Lines 14 to 17)  However, Singh et al. does not disclose “obtain a first probability of a label of the input text as output from the text classification model based on receiving the input text”, “obtain a second probability of a label of the input text with one word thereof omitted therefrom based on the text classification model receiving the word-omitted input text as an input”, and “determine the saliency of the one word based on a difference between the first probability and the second probability.”
Concerning claims 4 and 14, Shukla et al. teaches anomalous text detection that determines an importance measure for each word based at least in part on a deviation between an anomaly probability score generated by dropping the word from the training corpus data entry and the anomaly probability score of the noted training corpus data entry, e.g., an importance measure for a word ‘bradbury’ is a measure of deviation between an anomaly probability score of text corpus data entry ‘933 bradbury se ste 22222’ and an anomaly probability score of anomalous data entry ‘933 se ste 22220’, which is 0.93 – 0.90 = 0.03.  (¶[0079]: Figure 8)  Shukla et al., then, teaches limitations of “obtain a first probability of a label of the input text as output from the text classification model based on receiving the input text” for non-anomalous text of a corpus with a probability 0.93, and “obtain a second probability of a label of the input text with one word omitted therefrom based on the text classification model receiving the word-omitted input text as an input” for text omitting the word ‘bradbury’ with a probability 0.90.  Then, a deviation of the two probabilities of 0.93 – 0.90 = 0.03 for an importance of the word ‘bradbury’ is “determine saliency of the one word based on a difference between the first probability and the second probability.”  An objective is to efficiently and reliably perform anomalous text detection.  (¶[0001])  It would have been obvious to one having ordinary skill in the art to obtain probability values of inferred labels of replaced text to perform text classification of anomalies with a classification model as taught by Shukla et al. to generate adversarial input in Singh et al. for a purpose of efficiently and reliably performing anomalous text detection.
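The word-omission measure discussed above (first probability minus second probability) can be sketched as a leave-one-out loop. The classifier here is a hypothetical stand-in: `toy_predict_proba` and its weights are invented purely so that the full text scores 0.93 and the text without ‘bradbury’ scores 0.90, matching the figures quoted from Shukla et al.

```python
def leave_one_out_saliency(words, predict_proba):
    """Saliency of each word = P(label | full text) - P(label | text with the word omitted)."""
    first = predict_proba(words)
    saliencies = {}
    for i, word in enumerate(words):
        omitted = words[:i] + words[i + 1:]
        second = predict_proba(omitted)
        saliencies[word] = first - second  # repeated words would overwrite each other here
    return saliencies

# Hypothetical per-word weights; not taken from any reference.
WEIGHTS = {"933": 0.20, "bradbury": 0.03, "se": 0.10, "ste": 0.05, "22222": 0.05}

def toy_predict_proba(words):
    """Stand-in for a text classification model's label probability."""
    return round(0.5 + sum(WEIGHTS.get(w, 0.0) for w in words), 2)

saliencies = leave_one_out_saliency("933 bradbury se ste 22222".split(), toy_predict_proba)
print({w: round(s, 2) for w, s in saliencies.items()})
```

The model is queried once for the full text and once per omitted word, so the cost grows linearly with the number of words.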

Concerning claims 8 and 18, Shukla et al. teaches that if a word is dropped from more than one input data entry, an importance measure for the word is determined based at least in part on a measure of statistical distribution, e.g., an average, of importance measures calculated for the word on a per-entry basis (¶[0031]); if a word is dropped from three sentences, an importance measure for the word may be determined based at least in part on the average of the importance measures for the word based on the first sentence, the second sentence, and the third sentence of the three sentences (“determine an average probability value of the inferred labels based on the probability values”) (¶[0080]); if a word is dropped from more than one input data entry, an importance measure for the word is determined based at least in part on a measure of statistical distribution, e.g., an average, of importance measures calculated for the word on a per-entry basis (“obtain the text classification result of the input text based on the average probability value”) (¶[0084]).
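The quoted claim language for claims 8 and 18 (averaging the probability values of the inferred labels, then classifying on that average) can be illustrated as below. The lookup-table classifier, the replaced texts, and the 0.5 decision threshold are all hypothetical; the sketch follows the claim wording rather than any cited reference.

```python
def classify_by_average(replaced_texts, predict_proba, threshold=0.5):
    """Average the positive-label probability over all replaced texts, then threshold it."""
    probs = [predict_proba(t) for t in replaced_texts]
    avg = sum(probs) / len(probs)
    return avg, int(avg >= threshold)

# Hypothetical probabilities a classifier might return for three replaced texts.
TOY_PROBS = {
    "the movie was great": 0.8,
    "the movie was fine": 0.6,
    "the film was good": 0.7,
}

avg, label = classify_by_average(list(TOY_PROBS), TOY_PROBS.get)
print(round(avg, 2), label)
```

Averaging over several replaced texts is meant to dampen the effect of any single perturbed word on the final label.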
Concerning claims 9 to 10 and 19, Shukla et al. teaches that a predictive data analysis computer entity may determine that a number of exploration-exploitation keyword extraction iterations should continue until an iteration during which the keyword list has a keyword count that falls below a keyword count threshold (¶[0068]); during a first exploration-exploitation keyword extraction iteration, a per-iteration extracted keyword list of 200 words is extracted, during a second exploration-exploitation keyword extraction iteration, a per-iteration extracted keyword list of 140 words is extracted, and during a third exploration-exploitation keyword extraction iteration, a per-iteration extracted keyword list of 100 words is extracted (“wherein the selecting the some words comprises selecting a preset number of words based on having the highest saliences”)  (¶[0069]); a predictive data analysis computing entity determines that each word whose importance measure satisfies an importance measure threshold condition, e.g., whose importance measure is above an importance measure threshold value, is an extracted keyword for an iteration that should be included in a per-iteration extracted keyword list for an initial exploration-exploitation keyword extraction iteration (“wherein the selected words are selected based on having respective saliences above a threshold”) (¶[0078]); a predictive data analysis computing entity determines that each word whose importance measure satisfies an importance measure threshold condition, e.g., whose importance measure is above an importance measure threshold value, is an extracted keyword for a non-initial exploration-exploitation keyword extraction iteration that should be included in the per-iteration extracted keyword list for a non-initial exploration-exploitation keyword extraction iteration (“wherein the selected words are selected based on having respective saliences above a threshold”) (¶[0084]). 
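The two selection rules quoted for claims 9, 10, and 19 (a preset number of words with the highest saliency, or all words above a threshold) differ only in the stopping condition. A minimal sketch with hypothetical saliency values and function names:

```python
def select_top_k(saliencies, k):
    """Preset-number rule: the k words with the highest saliency."""
    ranked = sorted(saliencies, key=saliencies.get, reverse=True)
    return ranked[:k]

def select_above_threshold(saliencies, threshold):
    """Threshold rule: every word whose saliency exceeds the threshold."""
    return [w for w, s in saliencies.items() if s > threshold]

# Hypothetical saliency values, for illustration only.
saliencies = {"bradbury": 0.03, "se": 0.10, "ste": 0.05, "933": 0.20}
top_two = select_top_k(saliencies, 2)
above = select_above_threshold(saliencies, 0.04)
print(top_two, above)
```

The preset-number rule always returns the same count of words, while the threshold rule may return none or all of them depending on the text.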

Claims 5 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Singh et al. (U.S. Patent No. 11,972,211) in view of Echauz et al. (U.S. Patent No. 11,552,137) as applied to claims 1 and 11 above, and further in view of Xiao et al. (U.S. Patent Publication 2023/0022943).
Concerning claims 5 and 15, Singh et al. discloses determining replacement tokens with a constraint filter that uses BERT (Bidirectional Encoder Representation from Transformers).  (Column 5, Lines 26 to 63)  Conventionally, BERT has an architecture that includes an encoder that ‘compresses’ the text into an embedding space.  However, Singh et al. operates on input text but does not expressly disclose “obtain a compressed version of the input text from an encoder that receives the input text as input”, “obtain a decompressed version of the input text from a decoder that receives the compressed version of the input text as an input”, and “determine whether the input text has been subjected to the adversarial attack based on a reconstruction error which is obtained based on the input text and the decompressed version of the input text.”
Concerning claims 5 and 15, Xiao et al. teaches a method for defending against adversarial samples in an analogous art of input images.  (Abstract)  Defending against adversarial samples includes denoising an input image by an adversarial denoising network to acquire a reconstructed image, calculating an adversarial score of the input image, and determining the input image as an adversarial sample or a benign sample according to a threshold.  (¶[0019] - ¶[0020])  A denoising network may be a denoising autoencoder-decoder network with a down-sampling operation in an encoding phase and an up-sampling operation in a decoding phase.  (¶[0029])  An adversarial score calculated based on the visual reconstruction error and the categorical reconstruction error may be used as a basis for determining whether the image is an adversarial sample or a benign sample (“determine whether the input [text] has been subjected to the adversarial attack based on a reconstruction error which is obtained based on the input [text] and the decompressed version of the input [text]”).  (¶[0040] - ¶[0043])  A denoiser module uses a stacked denoising network formed by combining down-sampling and up-sampling operations.  (¶[0112])  Here, down-sampling by an encoder is to “obtain a compressed version of the input [text] from an encoder that receives the input [text] as an input”, and up-sampling by a decoder is to “obtain a decompressed version of the input [text] from a decoder that receives the compressed version of the input [text] as an input”.  An objective is to increase a defense against an adversarial attack by dealing with unknown types of adversarial samples in the future which might result in limited robustness and expandability.  (¶[0016])  It would have been obvious to one having ordinary skill in the art to determine whether an input is subject to an adversarial attack based on a reconstruction error between a compressed version and a decompressed version of the input as taught by Xiao et al. for adversarial attacks using the text input of Singh et al. for a purpose of increasing a defense by dealing with unknown types of future adversarial samples.
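The reconstruction-error test discussed for claims 5 and 15 can be sketched with a deliberately trivial encoder and decoder. Pair-averaging and value-repetition below are hypothetical stand-ins for a learned down-sampling encoder and up-sampling decoder, the inputs are assumed to be even-length numeric feature vectors, and the threshold is arbitrary; this is not Xiao et al.'s network.

```python
def encode(x):
    """Down-sample: average each adjacent pair (stand-in for a learned encoder)."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]

def decode(z):
    """Up-sample: repeat each value twice (stand-in for a learned decoder)."""
    return [v for v in z for _ in range(2)]

def reconstruction_error(x):
    """Mean squared error between the input and its decompressed version."""
    recon = decode(encode(x))
    return sum((a - b) ** 2 for a, b in zip(x, recon)) / len(x)

def is_adversarial(x, threshold):
    """Flag the input when the reconstruction error exceeds the threshold."""
    return reconstruction_error(x) > threshold

# Hypothetical feature vectors: one the toy autoencoder reproduces exactly, one it cannot.
clean = [1.0, 1.0, 2.0, 2.0]
perturbed = [1.0, 3.0, 2.0, 0.0]
print(reconstruction_error(clean), reconstruction_error(perturbed))
print(is_adversarial(clean, 0.5), is_adversarial(perturbed, 0.5))
```

The idea is that inputs resembling what the autoencoder can represent reconstruct well, while perturbed inputs leave a larger residual.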


Response to Arguments
Applicants’ arguments filed 23 March 2026 are moot in view of new grounds of rejection, as necessitated by amendment.
Applicants provide some significant amendments to the independent claims and present arguments traversing the prior rejection under 35 U.S.C. §103 over Mazor et al. (U.S. Patent No. 11,281,858) in view of Shukla et al. (U.S. Patent Publication 2022/0083898).  Specifically, Applicants amend these independent claims so that they are now directed to whether input text is “subjected to an adversarial attack” and delete limitations directed to whether the input text “indicates an anomaly”.  Applicants argue that Mazor et al. does not disclose determining saliencies responsive to determining that the input text indicates an anomaly or determining that the input text has been subjected to an adversarial attack.  Applicants allege that Mazor et al.’s saliencies are used to identify anomalous text and are not derived responsive to a determination that the input text has been subjected to an adversarial attack.
Generally, Applicants’ arguments are moot in light of new grounds of rejection necessitated by amendment.  Independent claims 1 and 11 are now rejected as being obvious under 35 U.S.C. §103 over Singh et al. (U.S. Patent No. 11,972,211) in view of Echauz et al. (U.S. Patent No. 11,552,137).  The rejection now no longer relies upon Mazor et al. (U.S. Patent No. 11,281,858), but some dependent claims are rejected as being obvious further in view of Shukla et al. (U.S. Patent Publication 2022/0083898).  The rejection now no longer relies upon Casserini et al. (U.S. Patent Publication 2023/0024884), but instead some dependent claims are rejected by substituting Xiao et al. (U.S. Patent Publication 2023/0022943).  All of these new grounds of rejection are necessitated by amendment due to the new limitations directed to “an adversarial attack”.  
Mainly, Singh et al. discloses a basic concept of generating adversarial input for testing a machine learning model to defend against an adversarial attack by expanding adversarial input to generate additional training data with replacement of certain important words in the adversarial text.  Singh et al. does not perform a preliminary determination of whether the input text is adversarial text from an adversarial attack because it is presumed that this input text is already adversarial text for a purpose of generating additional adversarial text by replacement of important words.  However, Echauz et al. teaches a general procedure to provide an adversarial classification decision based on probabilities to determine a malicious attack with a classification algorithm that classifies input samples as malicious or benign to improve a classification model.  Consequently, an input sample that is classified as malicious, i.e., ground truth samples ‘x’, in Echauz et al. can be provided as a preliminary input that is subsequently processed to generate additional adversarial input by replacement of important words in Singh et al.  A rationale for a combination can be premised on KSR International Co. v. Teleflex Inc. (KSR), 550 U.S. 398, 82 USPQ2d 1385 (2007): (A) Combining prior art elements according to known methods to yield predictable results or (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results.  See MPEP §2141.  Here, a preliminary method of classifying text as being malicious in Echauz et al. is a known technique that can be combined in a predictable way with generating adversarial input by replacement of important words in Singh et al.  Alternatively, Singh et al. represents a known technique for generating adversarial input by replacement of important words that is ready for improvement to predictably determine a probability that the input was correctly classified according to its ground truth label in Echauz et al.
Applicants’ amendments necessitate all of the new grounds of rejection.  This Office Action is NON-FINAL.

Conclusion
The prior art made of record and not relied upon is considered pertinent to Applicants’ disclosure.
Lee et al., Stokes, III et al., Kuta et al., and Liu et al. disclose related prior art.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARTIN LERNER whose telephone number is (571) 272-7608.  The examiner can normally be reached Monday-Thursday 8:30 AM-6:00 PM.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil can be reached at (571) 272-7602.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center.  Unpublished application information in Patent Center is available to registered users.  To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov.  Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format.  For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).  If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/MARTIN LERNER/Primary Examiner
Art Unit 2658
May 4, 2026

Prosecution Timeline

Nov 18, 2025

Applicant Interview (Telephonic)

Nov 21, 2025

Final Rejection mailed — §103

Mar 23, 2026

Response after Non-Final Action

Mar 30, 2026

Applicant Interview (Telephonic)

Mar 31, 2026

Examiner Interview Summary

Apr 21, 2026

Request for Continued Examination

Apr 23, 2026

Response after Non-Final Action

May 07, 2026

Non-Final Rejection mailed — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

18/272,516

Patent 12632656

TEXT GENERATION INCLUDING DE-DUPLICATION OF DECODED WORD INFORMATION TO SPLICE TARGET WORD INFORMATION INTO AN INFORMATION SEQUENCE

2y 10m to grant Granted May 19, 2026

17/770,177

Patent 12620404

DEEP SOURCE SEPARATION ARCHITECTURE

4y 0m to grant Granted May 05, 2026

18/365,535

Patent 12596880

DETERMINING CAUSALITY BETWEEN FACTORS FOR TARGET OBJECT BY ANALYZING TEXT

2y 8m to grant Granted Apr 07, 2026

17/882,447

Patent 12586592

METHODS AND APPARATUS FOR GENERATING AUDIO FINGERPRINTS FOR CALLS USING POWER SPECTRAL DENSITY VALUES

3y 7m to grant Granted Mar 24, 2026

18/336,831

Patent 12585680

CONTEXTUAL TITLES BASED ON TEMPORAL PROXIMITY AND SHARED TOPICS OF RELATED COMMUNICATION ITEMS WITH SENSITIVITY POLICY

2y 9m to grant Granted Mar 24, 2026

Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

3-4

Expected OA Rounds

78%

Grant Probability

91%

With Interview (+13.3%)

2y 11m (~0m remaining)

Median Time to Grant

High

PTA Risk

Based on 988 resolved cases by this examiner. Grant probability derived from career allowance rate.
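The projection figures can be reproduced from the career counts reported earlier on this page. The sketch assumes the interview lift is applied additively, in percentage points, to the career allowance rate; the page does not spell out its method, so treat this as one plausible reading.

```python
# Career counts from the examiner summary above.
granted, resolved = 771, 988

# Career allowance rate, used on this page as the grant probability.
allowance_rate = granted / resolved

# Assumption: the +13.3% interview lift is additive in percentage points.
interview_lift = 0.133
with_interview = allowance_rate + interview_lift

print(f"Grant probability: {allowance_rate:.0%}")
print(f"With interview:    {with_interview:.0%}")
```

Both rounded values agree with the 78% and 91% shown in the projections.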