Prosecution Insights
Last updated: April 19, 2026
Application No. 18/140,210

Feature Recommendations for Machine Learning Models Using Trained Feature Prediction Model

Non-Final OA: §101, §103, §112
Filed
Apr 27, 2023
Examiner
WU, NICHOLAS S
Art Unit
2148
Tech Center
2100 — Computer Architecture & Software
Assignee
Maplebear Inc. (dba Instacart)
OA Round
1 (Non-Final)
Grant Probability: 47% (Moderate)
OA Rounds: 1-2
To Grant: 3y 9m
With Interview: 90%

Examiner Intelligence

Career Allow Rate: 47% (grants 47% of resolved cases; 18 granted / 38 resolved; -7.6% vs TC avg)
Interview Lift: +43.1% (strong lift; resolved cases with an interview vs. without)
Typical Timeline: 3y 9m avg prosecution; 44 currently pending
Career History: 82 total applications, across all art units

Statute-Specific Performance

§101: 26.7% (-13.3% vs TC avg)
§103: 52.6% (+12.6% vs TC avg)
§102: 3.1% (-36.9% vs TC avg)
§112: 17.4% (-22.6% vs TC avg)
Deltas are measured against a Tech Center average estimate • Based on career data from 38 resolved cases
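The figures above can be cross-checked with simple arithmetic. The sketch below (using only the counts and percentages displayed on this page, nothing from PAIR) re-derives the career allow rate and recovers a Tech Center average from a statute row:

```python
# Re-derive the dashboard figures from the displayed counts.
granted, resolved = 18, 38
career_allow_rate = granted / resolved
print(f"{career_allow_rate:.1%}")  # -> 47.4%, displayed as 47%

# Each statute row shows the examiner's allowance rate and its delta vs
# the Tech Center average, so the TC average is recoverable as
# rate - delta. For the §103 row:
sec_103_rate, sec_103_delta = 52.6, 12.6
tc_avg_103 = sec_103_rate - sec_103_delta  # ~40.0% TC average for §103
```

The same rate-minus-delta arithmetic applies to the §101, §102, and §112 rows.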

Office Action

Rejections: §101, §103, §112
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 112: Indefiniteness

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 5 and 14 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.

Regarding claim 5, the claim recites the limitation wherein identifying the one or more candidate features comprises ranking values in the output vector; selecting a threshold number of top values in the output vector; and identifying the one or more candidate features corresponding to the identified top values in the output vector. The term “the output vector” lacks antecedent basis, so there is insufficient antecedent basis for this limitation in the claim. For purposes of examination, the output vector is interpreted as a group of values.

Regarding claim 14, the claim is similar to claim 5 and is rejected under the same rationales.

Claim Rejections - 35 USC § 101

35 U.S.C.
101 reads as follows:

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Regarding claim 1, in Step 1 of the 101 analysis set forth in MPEP 2106, the claim recites “A method.” A method is one of the four statutory categories of invention.

In Step 2A, Prong 1 of the 101 analysis set forth in MPEP 2106, the examiner has determined that the following limitations recite a process that, under the broadest reasonable interpretation, covers a mental process or mathematical concept but for the recitation of generic computer components:

…predict a probability that each of the plurality of features is to be selected as an input feature for the new ML model; (i.e., the broadest reasonable interpretation includes a step of evaluation and judgement that could be performed mentally or with pen and paper, such as guessing that features important in the past will be selected, which is a mental process of evaluation/judgement (MPEP 2106)).

identifying, based on an output probability score of the feature prediction model, one or more candidate features in the plurality of features; (i.e., the broadest reasonable interpretation includes a step of observation, evaluation, and judgement that could be performed mentally or with pen and paper, such as selecting features based on the highest scores, which is a mental process of observation/evaluation/judgement (MPEP 2106)).
selecting at least one candidate feature from the one or more candidate features; (i.e., the broadest reasonable interpretation includes a step of observation, evaluation, and judgement that could be performed mentally or with pen and paper, such as selecting the feature with the highest score, which is a mental process of observation/evaluation/judgement (MPEP 2106)).

The claim limitations, under their broadest reasonable interpretation, cover activities classified under Mental processes: concepts performed in the human mind (including observation, evaluation, judgement, or opinion) (see MPEP 2106.04(a)(2), subsection (III)), or Mathematical concepts: mathematical relationships, mathematical formulas or equations, or mathematical calculations (see MPEP 2106.04(a)(2), subsection (I)). Accordingly, the claim recites an abstract idea.

In Step 2A, Prong 2 of the 101 analysis set forth in MPEP 2106, the examiner has determined that the following additional elements do not integrate this judicial exception into a practical application:

implemented at a computer system comprising a processor and a computer-readable medium, (i.e., the generic computer components recited in this limitation merely add the words “apply it”, or an equivalent, or mere instructions to implement an abstract idea on a computer, or merely use a computer as a tool to perform an abstract idea (MPEP 2106.05(f))).

the method comprising: receiving information about a new machine learning (ML) model to be trained, the information comprising metadata about the new ML model; (i.e., the broadest reasonable interpretation of receiving a data instance is mere data gathering, which is an insignificant extra-solution activity (MPEP 2106.05(g))).
applying a trained feature prediction model to the information about the new ML model and metadata about a plurality of features that were used to train a plurality of existing ML models, wherein the feature prediction model is trained to… (i.e., the generic computer components recited in this limitation merely add the words “apply it”, or an equivalent, or mere instructions to implement an abstract idea on a computer, or merely use a computer as a tool to perform an abstract idea (MPEP 2106.05(f))).

presenting in a user interface a suggestion to use the one or more candidate features with the new ML model; (i.e., the broadest reasonable interpretation of presenting information is mere data outputting, which is an insignificant extra-solution activity (MPEP 2106.05(g))).

and causing the new ML model to be trained using a set of input features, the set of input features including the selected candidate feature. (i.e., the generic computer components recited in this limitation merely add the words “apply it”, or an equivalent, or mere instructions to implement an abstract idea on a computer, or merely use a computer as a tool to perform an abstract idea (MPEP 2106.05(f))).

Since the claim does not contain any other additional elements that amount to integration into a practical application, the claim is directed to an abstract idea.

In Step 2B of the 101 analysis set forth in the 2019 PEG, the examiner has determined that the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception: limitations (V) and (VII), under the broadest reasonable interpretation, recite steps of mere data gathering/outputting, which have been recognized by the courts as well-understood, routine, and conventional functions.
Specifically, the courts have recognized computer functions directed to mere data gathering/outputting as well-understood, routine, and conventional functions when they are claimed in a merely generic manner or as insignificant extra-solution activity, considering evidence in view of Berkheimer v. HP, Inc., 881 F.3d 1360, 1368, 125 USPQ2d 1649, 1654 (Fed. Cir. 2018); see USPTO Berkheimer Memorandum (April 2018).

The examiner uses Berkheimer Option 2, a citation to one or more of the court decisions discussed in MPEP 2106.05(d)(II) as noting the well-understood, routine, and conventional nature of the additional elements: receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information); TLI Communications LLC v. AV Auto. LLC, 823 F.3d 607, 610, 118 USPQ2d 1744, 1745 (Fed. Cir. 2016) (using a telephone for image transmission); OIP Techs., Inc. v. Amazon.com, Inc., 788 F.3d 1359, 1363, 115 USPQ2d 1090, 1093 (Fed. Cir. 2015) (sending messages over a network); buySAFE, Inc. v. Google, Inc., 765 F.3d 1350, 1355, 112 USPQ2d 1093, 1096 (Fed. Cir. 2014) (computer receives and sends information over a network). See MPEP 2106.05(d)(II).

Further, limitation (IV), under the broadest reasonable interpretation, merely recites steps that apply generic computer components to perform a judicial exception, which represents merely adding the words “apply it”, or an equivalent, and is not indicative of an inventive concept (MPEP 2106.05(f)). Similarly, limitation (VI), under the broadest reasonable interpretation, merely recites steps that apply a generic machine learning model to output a prediction, which represents merely adding the words “apply it”, or an equivalent, and is not indicative of an inventive concept (MPEP 2106.05(f)).
Similarly, limitation (VIII), under the broadest reasonable interpretation, merely recites steps that apply training of a generic machine learning model, which represents merely adding the words “apply it”, or an equivalent, and is not indicative of an inventive concept (MPEP 2106.05(f)).

Considering the additional elements individually and in combination, and the claim as a whole, the additional elements do not provide significantly more than the abstract idea. Therefore, the claim is not patent eligible.

Regarding claim 2, it is dependent upon claim 1 and fails to resolve the deficiencies identified above by integrating the judicial exception into a practical application or introducing significantly more than the judicial exception. For example, claim 2 recites wherein the feature prediction model includes a deep neural network trained using a training dataset containing metadata about a plurality of ML models, and metadata about a plurality of features that are used to train the plurality of ML models. Under the broadest reasonable interpretation, the limitations merely recite steps that apply a generic ensemble learning method to train a model, which represents merely adding the words “apply it”, or an equivalent, and is not indicative of an inventive concept (MPEP 2106.05(f)). Therefore, claim 2 does not solve the deficiencies of claim 1.

Regarding claim 3, it is dependent upon claim 2 and fails to resolve the deficiencies identified above by integrating the judicial exception into a practical application or introducing significantly more than the judicial exception. For example, claim 3 recites wherein each of the plurality of ML models is labeled by a binary vector with a size equal to a total number of the plurality of features, each binary vector representing whether each of the plurality of features is used with the ML model.
Under the broadest reasonable interpretation, the limitations recite assigning a vector to each model based on the number of features, which is a step of observation, evaluation, and judgement that can be performed mentally or with pen and paper. The steps of observation, evaluation, and judgement are mental processes. Therefore, claim 3 does not solve the deficiencies of claim 2.

Regarding claim 4, it is dependent upon claim 2 and fails to resolve the deficiencies identified above by integrating the judicial exception into a practical application or introducing significantly more than the judicial exception. For example, claim 4 recites wherein an output of the feature prediction model includes a probability vector with a size equal to a total number of the plurality of features, each probability vector representing a probability of each of the plurality of features to be used with the new ML model. Under the broadest reasonable interpretation, the limitations recite steps of mere data outputting, which have been recognized by the courts as well-understood, routine, and conventional functions. Specifically, the courts have recognized computer functions directed to mere data outputting as well-understood, routine, and conventional functions when they are claimed in a merely generic manner or as insignificant extra-solution activity (MPEP 2106.05(g)). Therefore, claim 4 does not solve the deficiencies of claim 2.

Regarding claim 5, it is dependent upon claim 4 and fails to resolve the deficiencies identified above by integrating the judicial exception into a practical application or introducing significantly more than the judicial exception. For example, claim 5 recites wherein identifying the one or more candidate features comprises ranking values in the output vector; selecting a threshold number of top values in the output vector; and identifying the one or more candidate features corresponding to the identified top values in the output vector.
Under the broadest reasonable interpretation, the limitations recite ranking, selecting, and identifying candidate features, which are steps of observation, evaluation, and judgement that can be performed mentally or with pen and paper. The steps of observation, evaluation, and judgement are mental processes. Therefore, claim 5 does not solve the deficiencies of claim 4.

Regarding claim 6, it is dependent upon claim 1 and fails to resolve the deficiencies identified above by integrating the judicial exception into a practical application or introducing significantly more than the judicial exception. For example, claim 6 recites further comprising building an index for nearest neighbors and using approximated nearest neighbor search to find top-k features as the one or more candidate features. Under the broadest reasonable interpretation, the limitations recite finding the top features by considering neighboring features, which is a step of observation, evaluation, and judgement that can be performed mentally or with pen and paper. The steps of observation, evaluation, and judgement are mental processes. Therefore, claim 6 does not solve the deficiencies of claim 1.

Regarding claim 7, it is dependent upon claim 1 and fails to resolve the deficiencies identified above by integrating the judicial exception into a practical application or introducing significantly more than the judicial exception. For example, claim 7 recites wherein the feature prediction model is a two-tower model, including a feature tower and a model tower. Under the broadest reasonable interpretation, the limitations merely recite steps that apply a generic cascaded model method, which represents merely adding the words “apply it”, or an equivalent, and is not indicative of an inventive concept (MPEP 2106.05(f)).
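Claims 5 and 6 recite ranking values in an output vector and keeping a threshold number of top values. Purely as an illustration of that limitation (the feature names and probabilities below are invented, not taken from the application), the ranking-and-selection step can be sketched as:

```python
# Sketch of claim 5's limitation: rank the values in an output
# probability vector, keep a threshold number (k) of top values, and
# map them back to candidate feature names.

def top_k_candidates(feature_names, output_vector, k):
    # Rank indices by probability, highest first.
    ranked = sorted(range(len(output_vector)),
                    key=lambda i: output_vector[i], reverse=True)
    # Identify the candidate features for the top-k values.
    return [feature_names[i] for i in ranked[:k]]

names = ["order_count", "avg_basket_size", "days_since_signup", "region"]
probs = [0.91, 0.15, 0.78, 0.40]
print(top_k_candidates(names, probs, 2))  # -> ['order_count', 'days_since_signup']
```

Claim 6's variant would replace the exact sort with an approximated nearest neighbor index; the exact top-k sort shown here is the simplest case.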
Claim 7 also recites the feature tower is configured to receive metadata about a feature as input to output a feature embedding, and the model tower is configured to receive the metadata about the new ML model to output a model embedding. Under the broadest reasonable interpretation, the limitations recite steps of mere data gathering, which have been recognized by the courts as well-understood, routine, and conventional functions. Specifically, the courts have recognized computer functions directed to mere data gathering as well-understood, routine, and conventional functions when they are claimed in a merely generic manner or as insignificant extra-solution activity (MPEP 2106.05(g)). Therefore, claim 7 does not solve the deficiencies of claim 1.

Regarding claim 8, it is dependent upon claim 7 and fails to resolve the deficiencies identified above by integrating the judicial exception into a practical application or introducing significantly more than the judicial exception. For example, claim 8 recites wherein the two-tower model includes an output layer that takes, as input, the feature embedding generated by the feature tower and the model embedding generated by the model tower to output a probability score for a feature-model pair, indicating a probability that the feature is to be selected for training the new ML model. Under the broadest reasonable interpretation, the limitations recite steps of mere data outputting, which have been recognized by the courts as well-understood, routine, and conventional functions. Specifically, the courts have recognized computer functions directed to mere data outputting as well-understood, routine, and conventional functions when they are claimed in a merely generic manner or as insignificant extra-solution activity (MPEP 2106.05(g)). Therefore, claim 8 does not solve the deficiencies of claim 7.
Regarding claim 9, it is dependent upon claim 7 and fails to resolve the deficiencies identified above by integrating the judicial exception into a practical application or introducing significantly more than the judicial exception. For example, claim 9 recites wherein each of the feature tower or model tower includes a sentence transformer configured to receive, as input, a sentence generated based on the feature metadata or the model metadata to output the feature embedding or the model embedding. Under the broadest reasonable interpretation, the limitations merely recite steps that apply a generic transformer model, which represents merely adding the words “apply it”, or an equivalent, and is not indicative of an inventive concept (MPEP 2106.05(f)). Therefore, claim 9 does not solve the deficiencies of claim 7.

Regarding claim 10, in Step 1 of the 101 analysis set forth in MPEP 2106, the claim recites “A non-transitory computer-readable medium,” which is interpreted as an article of manufacture. An article of manufacture is one of the four statutory categories of invention. For the Step 2A/2B analyses, since claim 10 is similar to claim 1, it is rejected under the same rationales as claim 1. The additional limitation below fails to resolve the deficiencies identified above by integrating the judicial exception into a practical application or introducing significantly more than the judicial exception:

A non-transitory computer-readable medium having instructions encoded thereon that, when executed by a processor, cause the processor to: (i.e., the generic computer components recited in this limitation merely add the words “apply it”, or an equivalent, or mere instructions to implement an abstract idea on a computer, or merely use a computer as a tool to perform an abstract idea (MPEP 2106.05(f))).
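Claims 7-9 describe a two-tower model: a feature tower and a model tower each produce an embedding, and an output layer combines the two embeddings into a probability score for the feature-model pair. The toy sketch below is illustrative only: the "towers" are stand-in deterministic encoders rather than trained sentence transformers, and every name in it is hypothetical.

```python
import math

def embed(text, dim=8):
    # Stand-in for a tower (claim 9 would use a sentence transformer):
    # map metadata text to a fixed-size vector deterministically.
    return [((hash((text, i)) % 1000) / 1000.0) for i in range(dim)]

def pair_score(feature_metadata, model_metadata):
    f_emb = embed(feature_metadata)   # feature tower output
    m_emb = embed(model_metadata)     # model tower output
    # Output layer (claim 8): combine the embeddings into a single
    # logit via a dot product, then squash to a probability.
    logit = sum(f * m for f, m in zip(f_emb, m_emb))
    return 1.0 / (1.0 + math.exp(-logit))

score = pair_score("feature: avg_basket_size", "model: churn predictor")
```

A trained two-tower model would learn the tower weights so that high-scoring pairs correspond to features actually reused in similar models; the structure of the computation is what the sketch shows.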
Considering the additional elements individually and in combination, and the claim as a whole, the additional elements do not provide significantly more than the abstract idea. Therefore, the claim is not patent eligible.

Regarding claims 11-18, the claims are similar to claims 2-9 and are rejected under the same rationales.

Regarding claim 19, in Step 1 of the 101 analysis set forth in MPEP 2106, the claim recites “A computer system, comprising: a processor;”. The claim recites a system comprising hardware components, which is interpreted as a machine. A machine is one of the four statutory categories of invention. For the Step 2A/2B analyses, since claim 19 is similar to claim 1, it is rejected under the same rationales as claim 1. The additional limitation below fails to resolve the deficiencies identified above by integrating the judicial exception into a practical application or introducing significantly more than the judicial exception:

A computer system, comprising: a processor; and a non-transitory computer-readable medium having instructions encoded thereon that, when executed by the processor, cause the processor to: (i.e., the generic computer components recited in this limitation merely add the words “apply it”, or an equivalent, or mere instructions to implement an abstract idea on a computer, or merely use a computer as a tool to perform an abstract idea (MPEP 2106.05(f))).

Considering the additional elements individually and in combination, and the claim as a whole, the additional elements do not provide significantly more than the abstract idea. Therefore, the claim is not patent eligible.

Regarding claim 20, the claim is similar to claim 2 and is rejected under the same rationales.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C.
103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made. Claims 1, 10, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Fix, et al., US Pre-Grant Publication 2023/0099502A1 (“Fix”) in view of Whitney, et al., US Pre-Grant Publication 2022/0327401A1 (“Whitney”). Regarding claim 1, Fix discloses: A method, implemented at a computer system comprising a processor and a computer-readable medium, the method comprising: (Fix, ⁋5, “In another example, a device may include a processing system including at least one processor and a non-transitory computer-readable medium storing instructions which, when executed by the processing system [A method, implemented at a computer system comprising a processor and a computer-readable medium, the method comprising:]”). receiving information about a new machine learning (ML) model to be trained, the information comprising metadata about the new ML model; (Fix, ⁋28, “For instance, in one example, the human user may provide as input to the platform a set of metadata describing the new machine learning model. 
The metadata may describe one or more parameters of the new machine learning model, such as the features the new machine learning model will take as input, the target (e.g., prediction output) of the new machine learning model, one or more performance criteria (e.g., speed, deployment, quality, metrics for evaluation, etc.), source and/or target domain, and/or other parameters [receiving information about a new machine learning (ML) model to be trained, the information comprising metadata about the new ML model;].”). …the information about the new ML model and metadata about a plurality of features that were used to train a plurality of existing ML models, (Fix, ⁋39, “if the event was triggered by the human user performing training of the new machine learning model (e.g., providing training data, providing target performance metrics or criteria for the machine learning output, etc.) […the information about the new ML model], then the processing system may suggest actions or data that can be used to tune and improve the new machine learning model. In one example, a feature store or repository may store a plurality of features extracted from a plurality of existing machine learning models [and metadata about a plurality of features that were used to train a plurality of existing ML models,].”). …predict a probability that each of the plurality of features is to be selected as an input feature for the new ML model; (Fix, ⁋41, “the processing system may rank any portions of the existing data which are suggested for reuse, e.g., in order of most suitable for reuse to least suitable for reuse, or vice versa. In one example, the rankings of the data may be based on some assessed relevance to the inputs that triggered the event (e.g., strongest metadata match) […predict a probability that each of the plurality of features is to be selected as an input feature for the new ML model;].”). 
identifying, based on an output probability score…, one or more candidate features in the plurality of features; (Fix, ⁋42, “where the suggestions include features to be added to a set of training data, the suggested features may be ranked according to the magnitudes of the effects that addition of the suggested features is expected to have on the performance of the new machine learning model”). presenting in a user interface a suggestion to use the one or more candidate features with the new ML model; (Fix, ⁋43, “In step 212, the processing system may receive a user feedback in response to the suggestion. In one example, the user feedback may comprise active feedback, such as the selection by the human user of a suggested existing machine learning model or feature of an existing machine learning model for reuse in the new machine learning model [presenting in a user interface a suggestion to use the one or more candidate features with the new ML model;].”). selecting at least one candidate feature from the one or more candidate features; and causing the new ML model to be trained using a set of input features, the set of input features including the selected candidate feature. (Fix, ⁋48, “during the building of new machine learning models, in order to identify existing machine learning models and features that may share metadata properties with the new machine learning models. The existing machine learning models and features [selecting at least one candidate feature from the one or more candidate features;] may, in turn, be reused in the building of the new machine learning models, making for a process that is not only more efficient, but also leverages knowledge of what has worked (or not worked) in the past [and causing the new ML model to be trained using a set of input features, the set of input features including the selected candidate feature.].”). 
While Fix teaches a system for recommending features for a new model by matching the new model’s information to pre-existing features from prior machine learning models, Fix does not explicitly teach:

applying a trained feature prediction model to the information about the new ML model and metadata about a plurality of features

wherein the feature prediction model is trained to predict a probability that each of the plurality of features is to be selected

probability score of the feature prediction model

Whitney teaches:

applying a trained feature prediction model to the information about the new ML model and metadata about a plurality of features (Whitney, ⁋37, “In some embodiments, the recommendation engine 102 can itself be configured as a machine learning model. The recommendation engine 102 can be trained to use recommendation engine data 106, including, e.g., inputs 113 and feature data 105, to determine recommendations 114 that have relatively higher probability of use by the user 110 [applying a trained feature prediction model to the information about the new ML model and metadata about a plurality of features].”).

wherein the feature prediction model is trained to predict a probability that each of the plurality of features is to be selected (Whitney, ⁋37, “In some embodiments, the recommendation engine 102 can itself be configured as a machine learning model. The recommendation engine 102 can be trained to use recommendation engine data 106, including, e.g., inputs 113 and feature data 105, to determine recommendations 114 that have relatively higher probability of use by the user 110 [wherein the feature prediction model is trained to predict a probability that each of the plurality of features is to be selected].”).

probability score of the feature prediction model (Whitney, ⁋37, “In some embodiments, the recommendation engine 102 can itself be configured as a machine learning model.
The recommendation engine 102 can be trained to use recommendation engine data 106, including, e.g., inputs 113 and feature data 105, to determine recommendations 114 that have relatively higher probability of use by the user 110 [probability score of the feature prediction model].”). Fix and Whitney are both in the same field of endeavor (i.e. feature selection). It would have been obvious for a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Fix and Whitney to teach the above limitation(s). The motivation for doing so is that having a recommender engine find reusable features reduces the computational overhead of creating a new model (cf. Whitney, ⁋2, “Feature engineering is even more difficult when a machine learning model is intended for real-time production, in which case the machine learning model may be designed to gather its features and return scores in fractions of a second. Feature reuse can streamline feature engineering, and so feature reuse can be crucial to the business success of a mature data science organization.”). Regarding claim 10, the claim is similar to claim 1 and rejected under the same rationales. Fix further teaches the additional limitations A non-transitory computer-readable medium having instructions encoded thereon that, when executed by a processor, cause the processor to: (Fix, ⁋4, “a non-transitory computer-readable medium may store instructions which, when executed by a processing system including at least one processor, cause the processing system to perform operations [A non-transitory computer-readable medium having instructions encoded thereon that, when executed by a processor, cause the processor to:].”). Regarding claim 19, the claim is similar to claim 1 and rejected under the same rationales. 
Fix further teaches the additional limitations A computer system, comprising: a processor; and a non-transitory computer-readable medium having instructions encoded thereon that, when executed by the processor, cause the processor to: (Fix, ⁋5, “In another example, a device may include a processing system including at least one processor and a non-transitory computer-readable medium storing instructions which, when executed by the processing system [A computer system, comprising: a processor; and a non-transitory computer-readable medium having instructions encoded thereon that, when executed by the processor, cause the processor to:]”). Claims 2-5, 11-14, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Fix, et al., US Pre-Grant Publication 2023/0099502A1 (“Fix”) in view of Whitney, et al., US Pre-Grant Publication 2022/0327401A1 (“Whitney”) and further in view of Oracle, Non-Patent Literature “What Is Deep Learning?” (“Oracle”). Regarding claim 2, Fix in view of Whitney teaches the method of claim 1. Fix further teaches containing metadata about a plurality of ML models, and metadata about a plurality of features that are used to train the plurality of ML models. (Fix, ⁋28, “The metadata may describe one or more parameters of the new machine learning model, such as the features the new machine learning model will take as input, the target (e.g., prediction output) of the new machine learning model, one or more performance criteria (e.g., speed, deployment, quality, metrics for evaluation, etc.), source and/or target domain, and/or other parameters [containing metadata about a plurality of ML models,].”, and Fix, ⁋34, “when metadata describing the input of the existing machine learning model is similar to metadata describing the input of the new machine learning model”). 
While the combination teaches a feature prediction machine learning model, the combination does not explicitly teach wherein the feature prediction model includes a deep neural network trained using a training dataset. Oracle teaches wherein the feature prediction model includes a deep neural network trained using a training dataset (Oracle, pg. 2, “Deep neural networks, which are behind deep learning algorithms, have several hidden layers between the input and output nodes—which means that they are able to accomplish more complex data classifications. A deep learning algorithm must be trained with large sets of data, and the more data it receives, the more accurate it will be [wherein the feature prediction model includes a deep neural network trained using a training dataset]”). Fix, in view of Whitney, and Oracle are all in the same field of endeavor (i.e., machine learning). It would have been obvious for a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Fix, in view of Whitney, and Oracle to teach the above limitation(s). The motivation for doing so is that using a deep learning model improves the efficiency of feature engineering by finding hidden insights in raw data (cf. Oracle, pg. 3, “One major benefit of deep learning is that its neural networks are used to reveal hidden insights and relationships from data that were previously not visible…Feature engineering: A deep learning algorithm can save time because it does not require humans to extract features manually from raw data.”).

Regarding claim 3, Fix in view of Whitney and Oracle teaches the method of claim 2. Fix further teaches wherein each of the plurality of ML models is labeled by a binary vector with a size equal to a total number of the plurality of features, each binary vector representing whether each of the plurality of features is used with the ML model. 
(Fix, ⁋36, “the similarity between metadata may be assigned a score, and an existing machine learning model may be identified as a candidate for reuse in building the new machine learning model when the score at least meets a predefined threshold score. For instance, each parameter by which similarity may be measured (e.g., input, target, users, datasets, etc.) may be assigned an individual score, and the individual scores may be combined to form a final score [wherein each of the plurality of ML models is labeled by a binary vector with a size equal to a total number of the plurality of features,]. In one example, an individual score may comprise a binary score (e.g., zero for no match, one for match) [each binary vector representing whether each of the plurality of features is used with the ML model.].”).

Regarding claim 4, Fix in view of Whitney and Oracle teaches the method of claim 2. Fix further teaches wherein an output of the feature prediction model includes a probability vector with a size equal to a total number of the plurality of features, each probability vector representing a probability of each of the plurality of features to be used with the new ML model. (Fix, ⁋36, “the similarity between metadata may be assigned a score, and an existing machine learning model may be identified as a candidate for reuse in building the new machine learning model when the score at least meets a predefined threshold score. For instance, each parameter by which similarity may be measured (e.g., input, target, users, datasets, etc.) may be assigned an individual score, and the individual scores may be combined to form a final score [wherein an output of the feature prediction model includes a probability vector with a size equal to a total number of the plurality of features,]. In one example, an individual score may comprise a binary score (e.g., zero for no match, one for match). 
In another example, an individual score may comprise a value that falls on a scale (e.g., zero for no match, fifty for a semantic or conceptual match, one hundred for an identical match) [each probability vector representing a probability of each of the plurality of features to be used with the new ML model.].”).

Regarding claim 5, Fix in view of Whitney and Oracle teaches the method of claim 4. Fix further teaches wherein identifying the one or more candidate features comprises ranking values in the output vector; selecting a threshold number of top values in the output vector; and identifying the one or more candidate features corresponding to the identified top values in the output vector. (Fix, ⁋41, “In one example, the processing system may rank any portions of the existing data which are suggested for reuse, e.g., in order of most suitable for reuse to least suitable for reuse, or vice versa [wherein identifying the one or more candidate features comprises ranking values in the output vector;]. In one example, the rankings of the data may be based on some assessed relevance to the inputs that triggered the event (e.g., strongest metadata match). In another example, the rankings of the data may be based on reuse of the data by other users (e.g., existing machine learning models or features that are most frequently reused by other users, that are most highly rated by other users, etc.). In another example, the rankings of the data may be based on some target performance metric [selecting a threshold number of top values in the output vector; and identifying the one or more candidate features corresponding to the identified top values in the output vector.] for the new machine learning model and/or a performance metric of an existing machine learning model from which the existing data is extracted (e.g., error rate, accuracy, etc.).”).

Regarding claims 11-14, the claims are similar to claims 2-5 and rejected under the same rationales. 
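The ranking step recited in claims 4-5 (a probability vector sized to the full feature set, ranked to pick a threshold number of top values as candidate features) can be sketched in a few lines of Python. The feature names and probabilities below are hypothetical illustrations, not data from the application; this is a minimal sketch of the claimed selection logic, not the applicant's implementation.

```python
def top_k_candidate_features(feature_names, probability_vector, k):
    """Rank values in the output probability vector and return the k
    features with the highest predicted probability of reuse."""
    ranked = sorted(
        zip(feature_names, probability_vector),
        key=lambda pair: pair[1],
        reverse=True,  # highest probability first
    )
    return [name for name, _ in ranked[:k]]

# Hypothetical feature names and model-output probabilities.
features = ["user_age", "order_count", "avg_basket_size", "last_login_days"]
probs = [0.12, 0.87, 0.65, 0.30]
print(top_k_candidate_features(features, probs, 2))
# prints ['order_count', 'avg_basket_size']
```

Note that Fix's cited ⁋41 ranks reusable data by suitability rather than by a model's probability output, which is part of why the examiner leans on the claim-5 "output vector" interpretation stated in the §112 rejection.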
Regarding claim 20, the claim is similar to claim 2 and rejected under the same rationales.

Claims 6 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Fix, et al., US Pre-Grant Publication 2023/0099502A1 (“Fix”) in view of Whitney, et al., US Pre-Grant Publication 2022/0327401A1 (“Whitney”) and further in view of Fu, et al., Non-Patent Literature “Fast Approximate Nearest Neighbor Search With The Navigating Spreading-out Graph” (“Fu”).

Regarding claim 6, Fix in view of Whitney teaches the method of claim 1. However, the combination does not explicitly teach further comprising building an index for nearest neighbors and using approximated nearest neighbor search to find top-k features as the one or more candidate features. Fu teaches further comprising building an index for nearest neighbors and using approximated nearest neighbor search to find top-k features as the one or more candidate features. (Fu, pg. 8 col. 1, “For each node, we generate a candidate neighbor set and select neighbors for it from the candidate sets. This can be achieved by the following steps. For a given node p, (1) we treat it as a query and perform Algorithm 1 starting from the Navigating Node on the prebuilt kNN graph [further comprising building an index for nearest neighbors]. (2) During the search, each visited node q (i.e., the distance between p and q is calculated) will be added to the candidate set (the distance is also recorded). (3) Select at most m neighbors for p from the candidate set with the edge selection strategy of MRNG [using approximated nearest neighbor search to find top-k features as the one or more candidate features.].”). Fix, in view of Whitney, and Fu are all in the same field of endeavor (i.e., data processing). It would have been obvious for a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Fix, in view of Whitney, and Fu to teach the above limitation(s). 
The motivation for doing so is that using an approximate nearest neighbor search improves the search time to find related data (cf. Fu, abstract, “A scalable ANNS algorithm should be both memory efficient and fast… In this paper, to further improve the search-efficiency and scalability of graph-based methods, we start by introducing four aspects: (1) ensuring the connectivity of the graph; (2) lowering the average out-degree of the graph for fast traversal; (3) shortening the search path; and (4) reducing the index size.”).

Regarding claim 15, the claim is similar to claim 6 and rejected under the same rationales.

Allowable Subject Matter

Claims 7-9 and 16-18 would be allowable if rewritten or amended to overcome the rejection(s) under 35 U.S.C. 101 set forth in this Office action and to include all of the limitations of the base claim and any intervening claims. The following is a statement of reasons for the indication of allowable subject matter:

Regarding claims 7-9, below are the closest cited references, each of which discloses various aspects of claim 7:

Li, et al., “IntTower: The Next Generation of Two-Tower Model for Pre-Ranking System” discloses a two-tower neural network model that ranks the relevance between item and user feature spaces for recommendations. While Li teaches a two-tower neural network model, Li does not explicitly teach a tower dedicated to model embeddings as claimed in claim 7.

Hata, et al., US20240310172A1 discloses using a Siamese neural network to measure similarities between features by using a contrastive approach. The Siamese model uses two identical sub-networks using convolution and pooling layers to measure similarity between two features in a feature space. While Hata teaches a two-tower neural network in the form of a Siamese neural network, Hata does not explicitly teach a dedicated model embeddings tower as claimed in claim 7. 
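The index-then-search pattern cited above for claims 6 and 15 (build an index for nearest neighbors, then run an approximate search to retrieve top-k candidate features) can be illustrated with a toy locality-sensitive-hashing index. This is a deliberately simplified stand-in for Fu's graph-based NSG method, and every feature name and embedding vector below is a hypothetical example, not material from the cited references.

```python
import random

random.seed(0)  # deterministic hyperplanes for this toy example

def lsh_key(vec, planes):
    # One hash bit per random hyperplane: the sign of the dot product.
    return tuple(sum(v * p for v, p in zip(vec, plane)) >= 0 for plane in planes)

def build_index(embeddings, planes):
    # Bucket feature embeddings by hash key ("building an index for nearest neighbors").
    index = {}
    for name, vec in embeddings.items():
        index.setdefault(lsh_key(vec, planes), []).append(name)
    return index

def ann_top_k(query, k, index, embeddings, planes):
    # Approximate search: only the query's own bucket is scanned,
    # then candidates inside it are ranked by exact squared distance.
    bucket = index.get(lsh_key(query, planes), [])
    def dist(name):
        return sum((q - e) ** 2 for q, e in zip(query, embeddings[name]))
    return sorted(bucket, key=dist)[:k]

# Hypothetical 2-D feature embeddings.
embeddings = {
    "order_count": [0.9, 0.1],
    "basket_size": [0.8, 0.2],
    "session_gap": [-0.7, 0.6],
}
planes = [[random.gauss(0, 1) for _ in range(2)] for _ in range(4)]
index = build_index(embeddings, planes)
print(ann_top_k([0.9, 0.1], 2, index, embeddings, planes))
```

The trade-off the rejection's motivation statement points at is visible even here: only one bucket is scanned per query, so lookups are fast, at the cost of possibly missing near neighbors that hashed into other buckets.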
While the above prior art references disclose the aforementioned concepts, none of them, individually or in reasonable combination, discloses all the limitations in the manner recited in claim 7. Specifically, the claim requires that the two-tower neural network has one tower dedicated to model embeddings and the other tower dedicated to feature embeddings. While the references cited above mention aspects of two-tower models that have dedicated feature towers, they do not recite a dedicated model embedding tower. Therefore, claim 7 is allowable over the prior art and the dependent claims 8 and 9 are also considered allowable over the prior art at least by virtue of their dependence.

Regarding claims 16-18, the claims are similar to claims 7-9 and are allowable over the art for the same rationales.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to NICHOLAS S WU whose telephone number is (571)270-0939. The examiner can normally be reached Monday - Friday 8:00 am - 4:00 pm EST. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michelle Bechtold, can be reached at 571-431-0762. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. 
Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/N.S.W./Examiner, Art Unit 2148
/MICHELLE T BECHTOLD/Supervisory Patent Examiner, Art Unit 2148

Prosecution Timeline

Apr 27, 2023
Application Filed
Jan 23, 2026
Non-Final Rejection — §101, §103, §112
Apr 01, 2026
Interview Requested
Apr 13, 2026
Applicant Interview (Telephonic)
Apr 13, 2026
Examiner Interview Summary

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12488244
APPARATUS AND METHOD FOR DATA GENERATION FOR USER ENGAGEMENT
2y 5m to grant Granted Dec 02, 2025
Patent 12423576
METHOD AND APPARATUS FOR UPDATING PARAMETER OF MULTI-TASK MODEL, AND STORAGE MEDIUM
2y 5m to grant Granted Sep 23, 2025
Patent 12361280
METHOD AND DEVICE FOR TRAINING A MACHINE LEARNING ROUTINE FOR CONTROLLING A TECHNICAL SYSTEM
2y 5m to grant Granted Jul 15, 2025
Patent 12354017
ALIGNING KNOWLEDGE GRAPHS USING SUBGRAPH TYPING
2y 5m to grant Granted Jul 08, 2025
Patent 12333425
HYBRID GRAPH NEURAL NETWORK
2y 5m to grant Granted Jun 17, 2025
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

1-2
Expected OA Rounds
47%
Grant Probability
90%
With Interview (+43.1%)
3y 9m
Median Time to Grant
Low
PTA Risk
Based on 38 resolved cases by this examiner. Grant probability derived from career allow rate.
