DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1-20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Hogan et al. (U.S. 2022/0084636, hereinafter Hogan).
As to Claim 1, Hogan teaches a method of determining presence of an analyte in a sample, comprising the steps of:
a) obtaining mass spectrometry (MS) data from a sample (Hogan (¶0073 line 1-4), “After processing, the samples were analyzed by LC/Q-TOF for metabolite discovery”);
b) extracting, by a computer, features from the MS data (Hogan (¶0075 line 5-8), “Untargeted metabolomics identified a total of 3,366 ion features. Of these, 48 ion features were removed since they showed "zero" values for all samples tested, leaving 3,318 ion features for analysis”);
c) inputting, by a computer, the features extracted in step b) into a trained prediction model, wherein the prediction model is trained to predict presence of an analyte in said sample (Hogan (¶0096 line 1-5), “As noted, ion features showing zero values through all samples tested were removed from the dataset. The remaining dataset was partitioned without normalization into a training set used to develop machine learning models”); and
d) generating an output, wherein the output comprises prediction of the presence of the analyte in said sample (Hogan (¶0056 line 7-10), “uses this learned model on new inputs (the metabolic profiles of new samples) to make predictions of new outputs (biomarker identification in new samples)”).
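For context only, the claimed steps a)-d) describe a feature-extraction-plus-classifier pipeline. A minimal sketch follows; the data, features, and model are placeholders and are not drawn from the application or from Hogan:

```python
import numpy as np

# Purely illustrative sketch of claimed steps a)-d). The feature extraction
# and the "trained" model below are hypothetical stand-ins.
def extract_features(ms_data):
    # step b): toy statistical features per spectrum (peak max and peak area)
    return np.stack([ms_data.max(axis=1), ms_data.sum(axis=1)], axis=1)

def trained_model(features):
    # step c): placeholder classifier that thresholds total peak area
    return (features[:, 1] > 10.0).astype(int)

# step a): hypothetical MS data, 5 spectra of 100 intensity values each
ms_data = np.abs(np.random.default_rng(2).normal(size=(5, 100)))
features = extract_features(ms_data)
output = trained_model(features)  # step d): 1 = analyte predicted present
```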
As to Claim 2, which depends from Claim 1, Hogan teaches wherein the features comprise statistical features (Hogan (¶0050 line 6-8), “typically features of interest are filtered after data acquisition applying different statistical methods followed by their identification”) and morphological features (Hogan (¶0055 line 1-6), “the presently contemplated methods and systems provide not only the feature importance, but also the direction of the difference (relative abundance of the differentiating compound). Furthermore, these methods and systems provide the necessary infrastructure to automate potential biomarker identification”).
As to Claim 3, which depends from Claim 2, Hogan teaches wherein the statistical features comprise Peak_Max, Peak_Area, Peak_Ratio, and/or Peak_Shift (Hogan (¶0090 last 4 lines), “Data were directly exported from Progenesis for machine learning analysis using peak area filters of 0; 5,000; 10,000 and 20,000 relative abundance values”).
As to Claim 4, which depends from Claim 2, Hogan teaches wherein the morphological features comprise: updown-difference, similarity, jaggedness, modality, symmetry, and/or FWHM (Hogan (¶0055 line 1-6), “the presently contemplated methods and systems provide not only the feature importance, but also the direction of the difference (relative abundance of the differentiating compound). Furthermore, these methods and systems provide the necessary infrastructure to automate potential biomarker identification”).
As to Claim 5, which depends from Claim 4, Hogan teaches wherein the morphological features are extracted using normalized MS data (Hogan (¶0050 last 5 lines), “Metabolomics platforms generate a large amount of data that is also complex, therefore highlighting the need for appropriate data processing tools that allow the uniform and normalized preparation of chromatographic and spectral data for data analysis”).
As to Claim 6, which depends from Claim 1, Hogan teaches wherein in step d) the output further comprises feature importance (Hogan (¶0055 line 1-6), “the presently contemplated methods and systems provide not only the feature importance, but also the direction of the difference (relative abundance of the differentiating compound). Furthermore, these methods and systems provide the necessary infrastructure to automate potential biomarker identification”).
As to Claim 7, which depends from Claim 6, Hogan teaches wherein the feature importance is obtained by calculating a Shapley Additive exPlanation (SHAP) value for each extracted feature (Hogan (¶0060 line 1-3), “The Shapley Additive exPlanations (SHAP) method was often used to quantify an impact of features on the models”), and sorting the features by the SHAP value (Hogan (¶0061 line 4-5), “The top k features with highest overall importance to the machine learning models were used”).
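For reference, the compute-SHAP-then-sort procedure recited in Claim 7 can be illustrated on a toy linear model, for which (assuming independent features) the exact SHAP value of feature i on a sample x is w[i] * (x[i] - mean of column i). The data and weights below are hypothetical and do not reflect Hogan's models:

```python
import numpy as np

# Hypothetical data: 100 samples, 4 extracted features, and a linear model.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
w = np.array([0.5, -2.0, 0.1, 1.0])  # hypothetical model weights

# Exact SHAP values for a linear model with independent features:
# one value per feature per sample, shape (100, 4).
shap_values = w * (X - X.mean(axis=0))

# Global importance: mean absolute SHAP value per feature, sorted descending.
importance = np.abs(shap_values).mean(axis=0)
ranking = np.argsort(importance)[::-1]
print(ranking)  # feature indices ordered by overall importance
```

A useful sanity check on this construction is the "local accuracy" property: per sample, the SHAP values sum to the model output minus its mean.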
As to Claim 8, Hogan teaches a method of building a machine learning pipeline, comprising the steps of:
a) extracting features from mass spectrometry (MS) or liquid-chromatography mass spectrometry (LC-MS) data regarding presence of an analyte (Hogan (¶0075 line 5-8), “Untargeted metabolomics identified a total of 3,366 ion features. Of these, 48 ion features were removed since they showed "zero" values for all samples tested, leaving 3,318 ion features for analysis”);
b) constructing, by one or more computing devices that implement a machine learning program, two or more machine learning models using an active learning workflow (Hogan (¶0097 line 1-5), “All models were developed on the training set, and their final performance reported on the holdout test set and/or the prospective cohort. Within the training set, cross-validation was used to develop the models to avoid overfitting to the training set”);
c) optimizing, by the one or more computing devices, the machine learning model (Hogan (¶0097 line 1-5), “All models were developed on the training set, and their final performance reported on the holdout test set and/or the prospective cohort. Within the training set, cross-validation was used to develop the models to avoid overfitting to the training set”); and
d) selecting, by the one or more computing devices, a best model (Hogan (¶0097 last 7 lines), “grid search was used to find the best set of hyperparameters for model training; the same hyperparameter settings were used across all k folds. The resulting k models (one from each fold) were used to make k sets of predictions on the test set, which were then averaged using a simple mean to make the final prediction for each sample in the test set”); wherein the features in step a) comprise statistical (Hogan (¶0050 line 6-8), “typically features of interest are filtered after data acquisition applying different statistical methods followed by their identification”) and morphological features (Hogan (¶0055 line 1-6), “the presently contemplated methods and systems provide not only the feature importance, but also the direction of the difference (relative abundance of the differentiating compound). Furthermore, these methods and systems provide the necessary infrastructure to automate potential biomarker identification”).
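For reference, the scheme quoted from Hogan ¶0097 (k-fold cross-validation within the training set, grid search over hyperparameters, and a simple mean of the k fold-models' predictions on the test set) can be sketched as follows. The classifier and data are hypothetical placeholders, not Hogan's actual models:

```python
import numpy as np

# Hypothetical 1-D two-class data; the "model" is a toy threshold classifier.
rng = np.random.default_rng(1)
X_train = rng.normal(loc=np.repeat([0.0, 1.0], 40), size=80)
y_train = np.repeat([0, 1], 40)
X_test = rng.normal(loc=np.repeat([0.0, 1.0], 10), size=20)

def fit_predict(x_tr, y_tr, x_te, shift):
    # "Training": place a threshold midway between class means, offset by
    # the hyperparameter `shift`; predict class 1 above the threshold.
    t = (x_tr[y_tr == 0].mean() + x_tr[y_tr == 1].mean()) / 2 + shift
    return (x_te > t).astype(float)

k = 4
folds = np.array_split(rng.permutation(len(X_train)), k)

# Grid search: pick the hyperparameter with the best mean cross-validated
# accuracy; the same setting is used across all k folds.
best_shift, best_acc = None, -1.0
for shift in (-0.5, 0.0, 0.5):
    accs = []
    for i in range(k):
        val = folds[i]
        trn = np.concatenate([folds[j] for j in range(k) if j != i])
        pred = fit_predict(X_train[trn], y_train[trn], X_train[val], shift)
        accs.append((pred == y_train[val]).mean())
    if np.mean(accs) > best_acc:
        best_shift, best_acc = shift, float(np.mean(accs))

# One model per fold, then average the k prediction sets on the test
# samples with a simple mean, as in the quoted passage.
fold_preds = []
for i in range(k):
    trn = np.concatenate([folds[j] for j in range(k) if j != i])
    fold_preds.append(fit_predict(X_train[trn], y_train[trn], X_test, best_shift))
final_scores = np.mean(fold_preds, axis=0)  # one averaged score per test sample
```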
As to Claim 9, which depends from Claim 8, Hogan teaches wherein the active learning workflow comprises at least one of: (i) label balancing, and (ii) even score distribution (Hogan (¶0097 last 7 lines), “the resulting k models (one from each fold) were used to make k sets of predictions on the test set, which were then averaged using a simple mean to make the final prediction for each sample in the test set”).
As to Claim 10, which depends from Claim 9, Hogan teaches wherein the label balancing comprises randomly providing positive rate of training dataset (Hogan (¶0097 line 5-8), “the training dataset was randomly partitioned into k=4 equal sized subsamples consisting of an approximately equal percentage of each class”).
As to Claim 11, which depends from Claim 9, Hogan teaches wherein the even score distribution evaluates at least one of the following: accuracy, sensitivity, specificity, area under curve (AUC), and F1 (Hogan (¶0101 line 1-4), “The primary measure of model performance was the area under the receiver operating characteristic curve (AUC), which illustrates the diagnostic discriminative performance of the models”).
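For reference, the AUC statistic cited in the quoted passage equals the probability that a randomly chosen positive sample outscores a randomly chosen negative one (ties counted as half). A minimal sketch with hypothetical scores:

```python
# AUC as a rank statistic: fraction of (positive, negative) pairs in which
# the positive sample receives the higher score, counting ties as 0.5.
def auc(labels, scores):
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# A perfect ranking gives AUC = 1.0; chance-level ranking gives 0.5.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```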
As to Claim 12, which depends from Claim 8, Hogan teaches wherein the features comprise statistical features and morphological features, and wherein the morphological features are extracted using normalized MS data (Hogan (¶0055 line 1-6), “the presently contemplated methods and systems provide not only the feature importance, but also the direction of the difference (relative abundance of the differentiating compound). Furthermore, these methods and systems provide the necessary infrastructure to automate potential biomarker identification”).
As to Claim 13, which depends from Claim 8, Hogan teaches wherein the machine learning model in step c) comprises training set optimization (Hogan (¶0075 line 5-8), “Untargeted metabolomics identified a total of 3,366 ion features. Of these, 48 ion features were removed since they showed "zero" values for all samples tested, leaving 3,318 ion features for analysis”).
As to Claim 14, Hogan teaches a system, comprising:
a) at least one processor (Hogan (¶0025 line 2), processors);
b) a memory, storing program instructions that when executed by the at least one processor cause the at least one processor to perform a machine learning pipeline, the machine learning pipeline (Hogan (¶0025 line 3), storage medium) is configured to perform at least one of the following modes:
i) training mode:
(A) receive mass spectrometry data of a sample (Hogan (¶0073 line 1-4), “After processing, the samples were analyzed by LC/Q-TOF for metabolite discovery”);
(B) extract at least one feature from the mass spectrometry data, wherein the at least one feature is a statistical feature (Hogan (¶0050 line 6-8), “typically features of interest are filtered after data acquisition applying different statistical methods followed by their identification”) and/or a morphological feature (Hogan (¶0055 line 1-6), “the presently contemplated methods and systems provide not only the feature importance, but also the direction of the difference (relative abundance of the differentiating compound). Furthermore, these methods and systems provide the necessary infrastructure to automate potential biomarker identification”);
(C) optimize training dataset by active learning strategy (Hogan (¶0097 line 1-5), “All models were developed on the training set, and their final performance reported on the holdout test set and/or the prospective cohort. Within the training set, cross-validation was used to develop the models to avoid overfitting to the training set”); and
(D) select a best prediction model (Hogan (¶0097 last 7 lines), “grid search was used to find the best set of hyperparameters for model training; the same hyperparameter settings were used across all k folds. The resulting k models (one from each fold) were used to make k sets of predictions on the test set, which were then averaged using a simple mean to make the final prediction for each sample in the test set”);
ii) prediction mode:
(A) receive mass spectrometry data of a sample (Hogan (¶0056 line 7-10), “uses this learned model on new inputs (the metabolic profiles of new samples) to make predictions of new outputs (biomarker identification in new samples)”);
(B) extract at least one feature from the mass spectrometry data (Hogan (¶0094 line 4-11), “Machine learning is a class of techniques that uses data to learn a model that maps an input (the metabolic profile of a sample; includes mass-to-charge ratio (m/z) and retention time for each sample) to its associated output (the influenza infection outcome of the sample) and uses this learned model on new inputs (the metabolic profiles of new samples) to make predictions of new outputs (the influenza outcomes of new samples)”), wherein the at least one feature is a statistical feature (Hogan (¶0050 line 6-8), “typically features of interest are filtered after data acquisition applying different statistical methods followed by their identification”) and/or a morphological feature (Hogan (¶0055 line 1-6), “the presently contemplated methods and systems provide not only the feature importance, but also the direction of the difference (relative abundance of the differentiating compound). Furthermore, these methods and systems provide the necessary infrastructure to automate potential biomarker identification”); and
(C) generate an output of determining whether an analyte is present in the sample (Hogan (¶0056 line 7-10), “uses this learned model on new inputs (the metabolic profiles of new samples) to make predictions of new outputs (biomarker identification in new samples)”).
As to Claims 15-20, the claims are rejected for the same reasons as Claims 2-7, respectively.
As to Claim 21, the claim is rejected for the same reasons as Claim 14.
As to Claim 22, the claim is rejected for the same reasons as Claims 15-17.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Farkas et al. (U.S. 2019/0293620) teaches a method of training a spectral analysis machine learning model.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NHAT HUY T NGUYEN whose telephone number is (571)270-7333. The examiner can normally be reached M-F: 12:00-8:00 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Viker Lamardo can be reached at 571-270-5871. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/NHAT HUY T NGUYEN/Primary Examiner, Art Unit 2147