Prosecution Insights
Last updated: April 19, 2026
Application No. 18/204,108

MACHINE LEARNING MODEL SEARCH USING META DATA

Non-Final OA — §112

Filed: May 31, 2023
Examiner: LU, HWEI-MIN
Art Unit: 2142
Tech Center: 2100 — Computer Architecture & Software
Assignee: Datarobot Inc.
OA Round: 1 (Non-Final)

Grant Probability: 62% (Moderate)
Predicted OA Rounds: 1-2
Estimated Time to Grant: 3y 1m
Grant Probability With Interview: 99%

Examiner Intelligence

Career Allow Rate: 62% — grants 62% of resolved cases (134 granted / 217 resolved; +6.8% vs TC avg)
Interview Lift: +39.5% — strong lift in resolved cases with an interview vs without
Typical Timeline: 3y 1m average prosecution; 37 applications currently pending
Career History: 254 total applications across all art units
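
The headline figures above are simple ratios over the examiner's resolved cases. Below is a minimal Python sketch of the arithmetic; the overall counts (134 / 217) are from the report, but the interview/no-interview split is invented for illustration, since the report does not publish those underlying counts:

```python
# Hypothetical reconstruction of the examiner metrics above. Overall counts
# come from the report; the interview split is an assumption chosen to land
# near the reported +39.5-point lift.

granted, resolved = 134, 217
allow_rate = granted / resolved  # ~0.618 -> reported as 62%

with_interview = {"granted": 54, "resolved": 60}       # hypothetical split
without_interview = {"granted": 80, "resolved": 157}   # hypothetical split

rate_with = with_interview["granted"] / with_interview["resolved"]           # 0.900
rate_without = without_interview["granted"] / without_interview["resolved"]  # ~0.510
lift = rate_with - rate_without  # ~+39 points

print(f"career allow rate: {allow_rate:.1%}")
print(f"interview lift: {lift:+.1%}")
```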

Statute-Specific Performance

§101: 11.2% (-28.8% vs TC avg)
§103: 43.8% (+3.8% vs TC avg)
§102: 9.4% (-30.6% vs TC avg)
§112: 33.0% (-7.0% vs TC avg)

Deltas are measured against a Tech Center average estimate • Based on career data from 217 resolved cases
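
Each delta above is the examiner's per-statute figure minus a Tech Center average estimate. A small sketch of that subtraction; the TC averages here are back-solved from the reported deltas (each works out to 40.0%) and are illustrative, not published values:

```python
# Per-statute delta vs. Tech Center average. TC averages are back-solved
# from the deltas reported above; they are assumptions, not published data.

examiner_rate = {"§101": 0.112, "§103": 0.438, "§102": 0.094, "§112": 0.330}
tc_average = {"§101": 0.400, "§103": 0.400, "§102": 0.400, "§112": 0.400}

for statute, rate in examiner_rate.items():
    delta = rate - tc_average[statute]
    print(f"{statute}: {rate:.1%} ({delta:+.1%} vs TC avg)")
```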

Office Action — §112
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. This Office action is responsive to the following communication(s): the original application filed on 05/31/2023; said application claims a priority filing date of 06/03/2022. Claims 1-20 are pending. Claims 1, 12, and 19 are independent.

Specification

The disclosure is objected to because of the following informalities: in ¶ [0062], "... search the historic machine learning projects with the blueprints 125, the models 1210, the scores 130, and the blueprint or model 145 ..." appears to be "... search the historic machine learning projects with the blueprints 125, the models 120, the scores 130, and the blueprint or model 145 ...". Appropriate correction is required.

The use of the term "Wi-Fi" in ¶¶ [0033] and [0079], which is a trade name or a mark used in commerce, has been noted in this application. The term should be accompanied by the generic terminology; furthermore, the term should be capitalized wherever it appears or, where appropriate, include a proper symbol indicating use in commerce such as ™, SM, or ® following the term. Although the use of trade names and marks used in commerce (i.e., trademarks, service marks, certification marks, and collective marks) is permissible in patent applications, the proprietary nature of the marks should be respected and every effort made to prevent their use in any manner which might adversely affect their validity as commercial marks.

Claim Objections

Claims 1-2, 8-9, 12-13, and 19-20 are objected to because of the following informalities: in Claim 1, line 8; Claim 12, line 9; and Claim 19, line 8, "… responsive to execution of the search with the list of features …" appears to be "… responsive to the execution of the requested search with the list of features …"; in Claim 1, lines 11-12; Claim 12, lines 12-13; and Claim 19, lines 11-12, "… a plurality of models trained by machine learning to determine a target …" appears to be "… a plurality of models trained by the machine learning to determine a target …"; in Claim 1, line 13 and Claim 19, line 13, "… train, via machine learning, the model of the blueprint …" appears to be "… train, via the machine learning, the model of the blueprint …"; in Claim 12, line 14, "… training, by the data processing system via machine learning, the model of the blueprint …" appears to be "… training, by the data processing system via the machine learning, the model of the blueprint …"; in Claim 2, lines 2-3; Claim 13, lines 2-3; and Claim 20, lines 2-3, "… the plurality of models trained by machine learning to determine the target …" appears to be "… the plurality of models trained by the machine learning to determine the target …"; in Claim 8, lines 7-8, "… train the second model of the second blueprint by machine learning to determine the target …" appears to be "… train the second model of the second blueprint by the machine learning to determine the target …"; and in Claim 9, line 3, "… the graphical user interface comprising a button to execute the search …" appears to be "… a button to execute the requested search …". Appropriate correction is required.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.

Claims 1, 12, and 19 recite the limitation "… identify(ing) … a list of features with which to execute the requested search … responsive to execution of the search with the list of features … a plurality of models trained by machine learning to determine a target based on a list of features …" in lines 6-12, 6-13, and 6-12, respectively, which renders these claims indefinite because it is unclear whether the third instance of "a list of features" is the same as or different from the first two instances of "a list of features". Clarification is required. Claims 2-11, 13-18, and 20 are rejected for fully incorporating the deficiency of their respective base claims.

Claims 2, 13, and 20 recite the limitation "… the project comprising: the plurality of blueprints, the plurality of blueprints including the plurality of models trained by machine learning to determine the target …" in lines 1-3, which renders these claims indefinite because "… from a client device, a request to search for one or more blueprints including one or more models to add to a project … a plurality of projects established via input from a plurality of client devices different from the client device, the plurality of projects including a plurality of blueprints, the plurality of blueprints including a plurality of models trained by machine learning to determine a target …" is also recited in their respective base claim and it is unclear (1) whether "one or more blueprints" and "one or more models" added to the project recited in their respective base claim are the same as or different from "the plurality of blueprints" and "the plurality of models" included in the project recited here; and (2) whether "the plurality of blueprints" and "the plurality of models" are comprised in "the project" (established in "a client device") as recited here or included in "a plurality of projects" (established from "a plurality of client devices" different from the client device) recited in their respective base claim. Clarification is required.

Claims 2, 13, and 20 recite the limitation "... determine the target based on the list of features to determine the target … display an indication of the list of features … receive … a user input that identifies the list of features" in lines 3-9, 3-11, and 3-9, respectively, which renders these claims indefinite because "… identify(ing) … a list of features with which to execute the requested search … responsive to execution of the search with the list of features … a plurality of models trained by machine learning to determine a target based on a list of features …" is also recited in their respective base claim, and it is unclear which instance of "a list of features" recited in their respective base claim is referred to by the three instances of "the list of features" recited here when the third instance of "a list of features" is different from the first two instances of "a list of features" recited in their respective base claim. Clarification is required.

Claims 5 and 16 recite the limitation "… a plurality of performance levels of the plurality of blueprints including the plurality of models of the project and …" in lines 2-3, which renders these claims indefinite because "… from a client device, a request to search for one or more blueprints including one or more models to add to a project … a plurality of projects established via input from a plurality of client devices different from the client device, the plurality of projects including a plurality of blueprints, the plurality of blueprints including a plurality of models trained by machine learning to determine a target …" is also recited in their respective base claim and it is unclear (1) whether "one or more blueprints" and "one or more models" added to the project recited in their respective base claim are the same as or different from "the plurality of blueprints" and "the plurality of models" of the project recited here; and (2) whether "the plurality of blueprints" and "the plurality of models" belong to "the project" (established in "a client device") as recited here or are included in "a plurality of projects" (established from "a plurality of client devices" different from the client device) recited in their respective base claim. Clarification is required.

Claim 8 recites the limitation "… search the plurality of projects based on the list of features …" in line 4, which renders the claim indefinite because "… identify … a list of features with which to execute the requested search … responsive to execution of the search with the list of features … a plurality of models trained by machine learning to determine a target based on a list of features …" is also recited in its base claim, and it is unclear which instance of "a list of features" recited in its base claim is referred to by "the list of features" recited here when the third instance of "a list of features" is different from the first two instances of "a list of features" recited in its base claim. Clarification is required.

Claim 11 recites the limitation "… search the plurality of projects with characteristics of at least one of the list of features …" in lines 2-3, which renders the claim indefinite because "… identify … a list of features with which to execute the requested search … responsive to execution of the search with the list of features … a plurality of models trained by machine learning to determine a target based on a list of features …" is also recited in its base claim, and it is unclear which instance of "a list of features" recited in its base claim is referred to by "the list of features" recited here when the third instance of "a list of features" is different from the first two instances of "a list of features" recited in its base claim. Clarification is required.

Allowable Subject Matter

Claims 1-20 would be allowable if rewritten or amended to overcome the rejection(s) under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), 2nd paragraph, set forth in this Office action.
The following is a statement of reasons for the indication of allowable subject matter: In regard to independent Claims 1, 12, and 19, the prior art of record, either singularly or in combination, does not teach or suggest the combination of claimed elements including "a system, comprising: a data processing system comprising one or more processors, coupled with memory, to: receive, via a graphical user interface from a client device, a request to search for one or more blueprints including one or more models to add to a project configured to deploy the one or more models trained via machine learning; identify, based on a selection received from the client device via the graphical user interface, a list of features with which to execute the requested search; provide, responsive to execution of the search with the list of features, a blueprint comprising a model selected from a plurality of projects established via input from a plurality of client devices different from the client device, the plurality of projects including a plurality of blueprints, the plurality of blueprints including a plurality of models trained by machine learning to determine a target based on a list of features; train, via machine learning, the model of the blueprint to determine the target and add the blueprint including the trained model to the project; and generate data causing the graphical user interface to display an indication of the blueprint including the trained model", "a method, comprising: receiving, by a data processing system comprising one or more processors coupled with memory, via a graphical user interface from a client device, a request to search for one or more blueprints including one or more models to add to a project configured to deploy the one or more models trained via machine learning; identifying, by the data processing system, based on a selection received from the client device via the graphical user interface, a list of features with which to execute the requested search; providing, by the data processing system, responsive to execution of the search with the list of features, a blueprint comprising a model selected from a plurality of projects established via input from a plurality of client devices different from the client device, the plurality of projects including a plurality of blueprints, the plurality of blueprints including a plurality of models trained by machine learning to determine a target based on a list of features; training, by the data processing system via machine learning, the model of the blueprint to determine the target and add the blueprint including the trained model to the project; and generating, by the data processing system, data causing the graphical user interface to display an indication of the blueprint including the trained model", or "receive, via a graphical user interface from a client device, a request to search for one or more blueprints including one or more models to add to a project configured to deploy the one or more models trained via machine learning; identify, based on a selection received from the client device via the graphical user interface, a list of features with which to execute the requested search; provide, responsive to execution of the search with the list of features, a blueprint comprising a model selected from a plurality of projects established via input from a plurality of client devices different from the client device, the plurality of projects including a plurality of blueprints, the plurality of blueprints including a plurality of models trained by machine learning to determine a target based on a list of features; train, via machine learning, the model of the blueprint to determine the target and add the blueprint including the trained model to the project; and generate data causing the graphical user interface to display an indication of the blueprint including the trained model" when interpreted as a whole.

Gur et al. (US 2021/0019665 A1, pub. date: 01/21/2021) discloses in ABSTRACT and ¶ [0004] that (1) implement a machine learning framework that operates to register a plurality of machine learning algorithms used to train machine learning models to perform related tasks, and to index the machine learning algorithms to generate and store a machine learning algorithm metadata model for each machine learning algorithm; (2) the machine learning framework receives a user specification of an analytics pipeline task for which a machine learning model is to be trained, and converts the user specification to machine learning algorithm search criteria used to search the index to identify matching machine learning algorithms having a corresponding machine learning algorithm metadata model that matches the machine learning algorithm search criteria; and (3) the machine learning framework operates to output, via the user interface, information describing the at least one matching machine learning algorithm.

Gur further discloses in ¶¶ [0014]-[0023] that (1) the ML model is essentially a function of elements including the machine learning algorithm(s), configuration settings of the machine learning algorithm(s), data samples (or training data), features identified by the ML model, and the labels (or outputs) generated by the ML model; (2) a ML algorithm is the algorithm used to train a ML model whereas the ML model is the computer logic that captures the results of training on collections of data and persists the training for use on newly received data; (3) the ML pipeline may include any of a number of different types of ML algorithms; (4) examples of ML algorithms may include deep learning, perceptual clustering, scale-space time series order-preserving clustering, kernel density shape-based classification, modified shape context-based spatial pyramid kernels, predictive space-aggregated regression (PSAR), multiple kernel completion (MKC) learning, multi-layer random forests, structured output DNN, sequential path learning, multi-layer future fusion random forests, ambiguous random forests, conditional random fields semantic predictors, nuclear-norm-constrained MKL, scandent trees, multi-atlas learning-based fusion, multi-atlas affine kernel-based learning, etc.; (5) operations in training a ML model using such ML algorithms include the training operations themselves, prediction/classification operations, cross-validation operations, and clustering operations; (6) an analytics pipeline is a computer based processing pipeline comprising one or more stages of processing logic, e.g., pre-processing, analytics processing, and output processing, where one or more of the stages may use one or more trained ML models that operate on data that flows through the pipeline; (7) provide a mechanism for maintaining a repository of ML algorithms and for searching and retrieving ML algorithms for training ML models that are to be used as part of ML analytics pipelines to perform operations of an analytics task; (8) the "collections" of data upon which a ML model operates, or upon which it is trained using a ML algorithm, are logical organizations of data, e.g., images, regions, pixels, atlases, text documents, exam questions, etc.; (9) collections of data may have associated metadata specifying various characteristics of the collections, including labels associated with the content of the collection of data, where the labels describe the content of the collection of data in some way, e.g., the metadata may specify a type of medical image, the type of technology utilized to generate the medical image, whether or not the medical image contains an anomaly, a source of the medical image, the domain and/or modality of the medical image, etc.; (10) the reusability of ML algorithms and training data used in training ML models becomes more attractive as the volume of ML algorithms increases; i.e., it would be desirable to have a mechanism for managing ML algorithms and providing a search engine capability for finding ML algorithms that provide functionality needed for performing a desired training of a ML model to perform a desired task; (11) moreover, it would be beneficial to have an automated tool for integrating a previously defined ML algorithm into a new implementation of a ML model for addressing a task without requiring manual configuring of the ML algorithm for the new task or ML model; (12) provide a ML framework comprising one or more universal ML application programming interfaces (APIs) and one or more databases of ML algorithms; (13) the ML framework abstracts the details of the ML algorithms through the one or more universal ML APIs and provides an engine to automatically handle all connectivity and format details of ML algorithms so that individual algorithm developers become users through the one or more ML APIs; (14) the ML framework enables analytics developers to experiment with different ML algorithms and keep track of results obtained through such experimentation so as to understand which ML algorithms are best suited for the particular task at hand; (15) the set of universal ML APIs enables common machine learning tasks, e.g., training, testing, prediction, and exposes standard ways to input features for ML algorithms; (16) the framework utilizes an object-relational model to persist information about ML algorithms in a ML algorithm metadata model represented in a relational database turned into an index that represents these instances of the ML algorithm metadata model; (17) the ML algorithm metadata model itself includes a description of the aggregate collections, a description of features generated from collections, a description of the machine learning model, and provenance information about the ML algorithm; (18) an ML model is the output of the ML training algorithm; (19) in general, machine learning algorithm development, for purposes of clarification, comprises a training stage and an inference stage: (a) in the training stage, data and labels are given to the ML algorithm to produce an ML model; and (b) in the inference stage, the ML model and unlabeled data are taken by the inference algorithm to produce a label for the data as output; and (20) it would be beneficial to have a mechanism that promotes and makes available ML algorithms in a manner that assists developers in generating trained ML models for their analytics pipelines.

Gur also discloses in ¶¶ [0036]-[0050] with FIGS.
1A-1B that (1) the framework 100 includes a data collections application programming interface (API) 110, a training API 112, prediction API 114, cross-validation (testing) API 116, clustering API 118, trained ML model registration logic 120, a ML algorithm search criteria generation engine 122, and a ML algorithm search engine 124; (2) the framework 100 maintains a machine learning (ML) repository 130 and ML index 132, wherein the ML repository 130 stores ML algorithms, used to train ML models, which are indexed in the ML index 132; (3) the stored ML algorithms are stored for selection by developers (users) when a new trained machine learning model is needed to perform a task, such as a task that is part of an analytics pipeline; (4) the ML index 132 provides indexes of the stored ML algorithms in the ML repository 130 such that the indexes may be searched based on developer (user) specified criteria for the task to be performed, and corresponding matching ML algorithms may be retrieved and utilized to train a ML model to perform the task; (5) the trained ML model may then be registered, by the trained ML model registration logic 120, in a ML model repository 140, indexed by a ML model index 142, for use in analytics pipelines to perform corresponding tasks; (6) the framework 100 may provide a user interface through which the framework 100 obtains a user specification of a task for which a ML model is to be trained; (7) the user's specified task is translated by the ML algorithm search criteria generation engine of the ML framework 100 to a set of training criteria for training a ML model where these training criteria are used as search criteria by the ML algorithm search engine 124 for use in searching the ML index 132; (8) the criteria are used as search terms for matching against ML algorithm metadata models stored in the ML index 132 and which represent registered ML algorithms; (9) the ML algorithm search engine 124 performs a search of the ML index 132 based on these search terms in order to find one or more corresponding ML algorithms that have metadata descriptions matching the search criteria; (10) the matching ML algorithms represent ML algorithms that may be used to train an ML model for the specified task; (11) the developer can query the framework 100 to see if there are any ML models that already exist that can solve the various parts of the problem in developing the analytics pipeline; (12) the search criteria will reflect the modality (X-ray), the viewpoints (AP, PA), the technical quality and target findings the developer is interested in when building the analytics pipeline; (13) the ML algorithm search engine 124 returns a list of suitable ML models generated by several developers for one or more of the tasks in the proposed analytics pipeline as search results; (14) the developer can then discover their APIs in the framework 100 and compose the analytics pipeline to achieve the task at hand; (15) for instances where ML models do not already exist for the particular task, ML algorithms that may be used to train a new ML model to perform the corresponding task may be identified via a search of the ML algorithm index, with subsequent retrieval and utilization of the ML algorithm to train a new ML model to perform the required task; (16) a user may specify a task to be performed through a user interface 150 of the framework 100, which may provide portions of the user interface where the user may specify the functions to be performed, e.g., medical image analysis, patient 
medical record textual analysis, etc., the type of data upon which the task is to be performed, e.g., particular type of medical image to be processed, type of text to be processed, the type of output desired, e.g., output labels for classifications of the medical image, output labels for classifying the patient medical information extracted from the medical records of the patient, etc.; (17) these task characteristics are converted to corresponding search criteria for searching the model index 142 to determine if there are existing trained ML models registered in the trained model repository 140 that satisfy the search criteria of the task, such as by performing natural language processing of the task characteristics to extract keywords indicative of search criteria, using mapping data structures to map task characteristics to search criteria, using the task characteristics themselves as search criteria, or the like; (18) a task may comprise multiple different functions that need to be performed where each function may have its own corresponding search criteria obtained from the definition of the task and thus, will have its own set of trained ML models which may be found to perform that function of the overall task; (19) if there is not an existing ML model that handles the required task identified via the search of the trained model index 142, then a search of ML algorithms which may be used to train a ML model may be performed using the ML index 132 and corresponding repository of ML algorithms 130; i.e., a similar search of the ML index 132 is performed based on the search criteria to identify ML algorithms whose resulting ML model will perform the desired task; (20) an identification of these matching ML algorithms may then be returned to the user via a user interface 150 for selection and execution of the ML algorithms to train a ML model to perform the desired task, which may then be registered and indexed in the trained model repository 140 and trained model index 142; (21) the ML index 132 represents an ML algorithm registered in the ML repository 130 as a ML metadata model comprising a description of the aggregate collections upon which the ML model trained by the ML algorithm operates, a description of features generated from collections by the ML model trained by the ML algorithm, a description of the ML model that is trained by the ML algorithm, and provenance information about the ML algorithm; (22) a similar metadata structure is generated for trained ML models themselves in the trained model index 142 for trained ML models stored in the trained model repository 140; e.g., the ML metadata model may store information such as the location of the ML model file in storage, the imaging modality and specialty the ML model serves, the ML algorithm used to train the ML model, various parameters or settings associated with the ML model, an identification of the training data collection used to train the ML model, the types of labels produced by the ML model, as well as provenance information that includes information on the person(s) that created the ML model and the features that person used to train the ML model; (23) this allows tracking numerous ML models and ML algorithms and using them for deployment of ML models in analytics pipelines and/or training of ML models for deployment in analytics pipelines; (24) the conversion of the task specification input by the user (human developer) converts the task specification to corresponding types of information as is used to define the ML algorithm metadata model, e.g., converting task characteristics specified by the user to corresponding ones of search criteria specifying data collections for which the task is to be performed, search criteria of the features to be generated as output by the trained ML model from the data collections, a description of the type of ML model that is to be used, and provenance information about the ML algorithm that the user wishes to utilize; (25) the ML algorithm search engine 124 searches the ML index 132 based on the search criteria obtained from the task characteristics generated by the analytics pipeline creation engine 122; (26) ML algorithm metadata models found as matching the search criteria based on the task characteristics, using the ML index 132, may be presented to the user as recommendations for use in training ML models for use in performing the specified task, or a corresponding portion of the task; (27) these found ML algorithms may be presented to the user in a ranked listing based on a degree of matching of the ML algorithm's metadata with the search criteria; (28) threshold values may be established for determining a threshold degree of matching between ML algorithm metadata model and search criteria in order to determine that there is a match; (29) a highest ranking matching ML algorithm may be automatically selected for training a ML model to perform the specified task, or a corresponding portion of the task; (30) a user selection of a ML algorithm to use to train a ML model to perform a specified task may be received via the user interface 150 to initiate a training operation to utilize the selected ML algorithm for training a corresponding ML model, after which the ML model may be registered in the trained model repository 140 and indexed in the trained model index 142, as well as deployed as part of an analytics pipeline; (31) the ML framework 100 may utilize the universal APIs 110-118 to train a ML model using a selected ML algorithm that is selected using the mechanism described above; (32) the data collections API 110 provides computer logic for forming logical collections of data samples and associating features with data samples and collections, where the "features" are feature vectors used to train the ML model; (33) the data collections API 110 further provides computer logic for associating output labels of a trained ML model with the data collections and persisting the logical data collections in the ML index 132; (34) the training API 112 provides computer logic for producing trained ML models using the selected ML algorithm(s) to train the ML model; (35) the training API 112 further provides computer logic for describing the trained ML models and associating them with data collections used for training the ML models; (36) the prediction API 114 provides computer logic for classifying new data instances based on a previously trained ML model and allows searching of prior trained ML models based on various attributes; (37) the cross-validation (testing) API 116 provides computer logic for enabling selection of datasets for testing a trained ML model, supporting n-fold cross validation, computing the confusion matrix, and enabling persistence of the trained ML model and performance models, where "persistence" as the term is used herein means storing the ML model and indexing its metadata; (38) the clustering API 118 provides computer logic for performing a ML technique that takes a group of feature vectors and the number of clusters and groups the feature
vectors into clusters based on a vector similarity measure; (39) in response to a user requesting a ML model to be trained for a specified task, the ML algorithm search criteria generation engine 122 translates the user input specifying the task definition, as received via the user interface 150, into search criteria that are used by the ML algorithm search engine 124 to search the ML index 132 to find one or more matching ML algorithms that are able to be used to train an ML model to perform the task, or a portion of the task; (40) the ML algorithm information for the matching ML algorithms may be output to the user via the user interface 150 as ML algorithm recommendations, and the user may select an ML algorithm to utilize as well as provide the ML algorithm parameters for configuring the ML algorithm to train the ML model to the user's specifications; (41) the ML framework 100 may select the ML algorithm automatically, such as by selecting a highest ranking matching ML algorithm for use in training the ML model and utilizing a default set of ML algorithm parameters; (42) based on the user's selection, or automated selection, of the ML algorithm, the universal APIs 110-118 are utilized along with the selected ML algorithm to train an ML model, test (cross-validate) the trained model, and provide performance information regarding the training/testing of the ML model; e.g., once trained using the raw labeled data (training data), unlabeled data may be input to the trained ML model to generate labeled data and a corresponding confusion matrix which describes the performance of the trained ML model. This information may be used to present the results and performance information for the trained ML model to a user via the user interface 150; (43) the operation shown in FIG. 1B outlines the operation of the ML framework 100 after the selection of a ML algorithm is obtained via the ML framework 100; (44) the selected ML algorithm is retrieved from the ML repository 130 (step 160), the ML APIs for the selected ML algorithm are retrieved (step 162); (45) the operation then transforms features using the ML APIs for the selected ML algorithm (step 164); (46) the selected ML algorithm is then invoked (step 166) to train the ML model which is then stored in the trained model repository (step 168); and (47) in addition, the framework 100 utilizes the prediction and cross-validation (testing) APIs 114 and 116 to evaluate the training of the ML model (steps 170 and 172). Gur further teaches in ¶¶ [0051]-[0052] with FIG. 
2 that (1) the operation starts by receiving a request to configure a machine learning model, such as a neural network or the like, to perform a task, such as a task that may be implemented as part of an analytics pipeline (step 210); (2) the definition of the task may further specify the data upon which the task is to be performed or trained; (3) the task characteristics are translated to search criteria (step 220) and a search of machine learning algorithm metadata models stored in a machine learning index is performed based on the search criteria (step 230); (4) the search results are obtained and presented to a user that submitted the request for selection of a machine learning algorithm for use in training a machine learning model to perform the task (step 240); (5) the user may specify parameters for the machine learning algorithm (step 250); (6) the universal APIs of the machine learning framework are then utilized to perform the training and testing of the trained machine learning model (step 260); (7) a data collections API is used to generate data collections from the training data, the training API is used to perform the training of the machine learning model given the data collections, the selected machine learning algorithm and its parameters, and the training data (raw labeled data); (8) the cross-validation (testing) API and prediction API are used to evaluate the training of the trained machine learning model and present performance information to the user via the user interface; and (9) the trained model is registered with a trained model repository and corresponding trained model index for use in analytics pipelines (step 270).

Aftab et al. (US 2020/0193221 A1, pub. date: 06/18/2020) discloses in ABSTRACT and ¶¶ [0002]-[0004] that (1) designing, creating, and deploying composite machine learning applications in cloud environments; (2) present a design studio canvas upon which a user can design a composite machine learning application from at least one of a plurality of building blocks stored in a design studio catalog; (3) receive input to design, on the design studio canvas, a visual representation of the composite machine learning application; (4) save the visual representation of the composite machine learning application; (5) in response to saving the visual representation of the composite machine learning application, can generate a composition dump file that includes a graph structure of the composite machine learning application; (6) the plurality of building blocks stored in the design studio catalog can include a plurality of machine learning models; (7) the machine learning models can be onboarded to the design studio catalog by machine learning modelers; (8) the plurality of building blocks also can include one or more data collection functions; (9) the plurality of building blocks further include one or more data transformation functions; (10) validate the composition dump file based upon one or more validation rules; (11) upon successful validation, generate, from the composition dump file, a blueprint file for the composite machine learning application, and can store the blueprint file in a repository; and (12) deploy, based upon the blueprint file, the composite machine learning application on one or more target cloud environments.
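
Both the claimed feature-based blueprint search and Gur's framework turn on matching user-supplied criteria against stored metadata records. The Python sketch below illustrates that kind of criteria-to-metadata matching in minimal form; the field names and the scoring rule are invented for illustration and are not taken from Gur or the application:

```python
# A minimal sketch of metadata-driven algorithm/model search, loosely in the
# style Gur describes (task criteria matched against an indexed metadata
# model). All field names here are hypothetical.

from dataclasses import dataclass, field

@dataclass
class AlgorithmMetadata:
    name: str
    modality: str                     # e.g. "X-ray"
    features: set = field(default_factory=set)
    labels: set = field(default_factory=set)

def search(index, criteria):
    """Rank indexed algorithms by how well their metadata match the criteria."""
    def score(meta):
        s = int(meta.modality == criteria.get("modality"))
        s += len(meta.features & criteria.get("features", set()))
        s += len(meta.labels & criteria.get("labels", set()))
        return s
    return sorted((m for m in index if score(m) > 0), key=score, reverse=True)

index = [
    AlgorithmMetadata("multi-layer random forest", "X-ray",
                      {"view_AP", "view_PA"}, {"anomaly"}),
    AlgorithmMetadata("kernel density classifier", "MRI",
                      {"volume"}, {"lesion"}),
]

# Task characteristics translated into search criteria, then matched.
results = search(index, {"modality": "X-ray", "features": {"view_AP"}})
print([m.name for m in results])  # ['multi-layer random forest']
```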
Aftab further discloses in ¶¶ [0024]-[0031] that (1) developing domain specific machine learning applications using basic building blocks and composing the building blocks together based upon the concept of requirements exposed by one component and capabilities offered by another; (2) the composition and deployment of machine learning applications on many targets, such as, e.g., OPENSTACK cloud, AT&T Integrated Cloud ("AIC"), MICROSOFT AZURE cloud, and other cloud platforms; (3) define the hooks for the basic building blocks based upon whether the basic building blocks can either be connected together or not in an intuitive, graphical user interface-based machine learning design studio; (4) once the hooks have been defined, the building blocks can be ingested by a composition tool; (5) describe a machine learning model-driven automated composition process of developing machine learning applications; (6) uniquely, the model-driven automated composition process uses the metadata in a machine learning model and does not rely on the user to dictate the composition of building blocks in the design studio; (7) provide the ability to compose models developed in different programming languages and/or different machine learning toolkits; (8) the building blocks (e.g., machine learning models) are wrapped in protocol buffer (i.e., Protobuf) model runners that enable the building blocks to be programming language and machine learning toolkit agnostic; (9) in this manner, the machine learning models can communicate with each other irrespective of the programming language in which they were developed and/or the machine learning toolkit (e.g., Scikit Learn, Tensor Flow, or H2O) used to build and train the machine learning models; (10) provide support for split and join capabilities; (11) the design studio allows users not only to compose building blocks as a linear cascaded composition of heterogeneous machine learning models, but also provides the flexibility to compose directed acyclic graphs ("DAG") based upon composite solutions where an output port can fan out into multiple outgoing links that feed other machine learning models and an input port can support a multiple fan-in capability to allow multiple machine learning models to feed their output into an input port of a machine learning model; (12) along with the capability to compose DAGs, the design studio supports corresponding split and join semantics; (13) various split and join semantics provide one-to-many and many-to-one connectivity semantics; (14) provide validation, blueprint generation, and deployment; (15) the design studio enables a validation to be performed on the composite solution before submitting the solution for cloud deployment; (16) the design studio creates a blueprint of the validated composite solution; (17) this blueprint is used by a deployer to deploy the composite solution in the target cloud; (18) the metadata and operations described in the machine learning model and in the blueprint are interpreted by a cloud orchestrator to deploy the composite application in the target cloud; (19) describe independent building blocks to be chained together using a model connector; (20) although each building block is unaware of any other building blocks to which they might be connected at runtime, the concept of a model connector introduced herein enables communication between building blocks at run time; (21) solve at least the problem of composing a machine learning application out of pre-defined building blocks and the subsequent problem of deploying the composite machine learning application on a target cloud environment; and (22) introduce the concept of composition based upon metadata generated by an on-boarding mechanism associated with the design studio.

Aftab also discloses in ¶¶ [0033]-[0043] with FIG. 1 that (1) a machine learning design studio ("design studio") 102 provides a visual/graphical application composition experience through which users, such as machine learning application experts/designers, can visually design composite machine learning applications 104 on a machine learning design studio canvas ("canvas") 106 from building blocks 108 stored in a machine learning design studio catalog ("catalog") 110; (2) the design studio 102 enables the composition of the building blocks 108 into complete analytic applications (i.e., the composite machine learning applications 104) useful for a given purpose, such as, e.g., some kind of predictive analysis or to produce a recommendation; (3) the canvas 106 provides a graphical user interface ("GUI") design environment through which users can drag, drop, and visually compose graphical representations of the building blocks 108 into the composite machine learning applications 104; (4) the canvas 106 also provides visual cues to guide users as to which of the building blocks 108 can be connected together; (5) the illustrated building blocks 108 stored in the catalog 110 include data collection/ingestion functions 112 (e.g., data brokers), data transformation functions 114 (e.g., split, join, merge, filter, clean, normalize, and label functions), and machine learning models 116 (e.g., models that implement various algorithms, such as prediction, regression, classification, and the like); (6) the building blocks 108 can be developed in different programming languages, such as Python, R, Java, and the like, and developed/trained in different machine learning toolkits, such as Scikit Learn, Tensor Flow, H2O, and the like; (7) the building blocks 108 are converted into microservices with well-defined application programming interfaces ("APIs"); (8) in support of language heterogeneity, all communication between the machine learning models 116 is accomplished using protocol buffer (Protobuf) formatted messages; (9) each machine learning model 116 is wrapped with a model runner that converts an outgoing message into Protobuf format and each incoming Protobuf message is converted to the native language specific format; (10) the use of model runners allows building blocks 108 developed in different programming languages to communicate with each other; (11) the basis of model-driven machine learning application composition is defining hooks for composition; (12) each of the building blocks 108 has an associated Protobuf file; (13) a Protobuf file describes the set of operations (e.g., services) supported by a specific building block 108, and the messages that are consumed and produced by each operation; (14) each of the building blocks 108 is uploaded to the catalog 110 along with its Protobuf file; (15) each message is specified by a message signature, wherein the message signatures of input and output messages consumed and produced by the building blocks 108 are used in the definition of hooks for composition — that is, requirements and capabilities of the building blocks 108; (16) each of the building blocks 108 is represented by its Topology and Orchestration Specification for Cloud Applications ("TOSCA") model; (17) in this manner, the building blocks 108
can be used to compose the composite ML applications 104 within the design studio 102 and to facilitate deployment on target cloud environments 118; (18) in the TOSCA Model, each resource (i.e., one of the building blocks 108) exposes certain requirements and offers certain capabilities, wherein these requirements and capabilities form the basis for composition and ensure only compatible hooks (i.e., ports, interfaces) are chained together; (19) the composition engine 120 is a backend for composition graphs created by the design studio 102 in the canvas 106; (20) the composition engine 120 generates composition dump ("CDUMP") files 122 for the composite machine learning applications 104, validates the CDUMP files 122, and generates blueprint files 124; (21) the CDUMP files 122 are simple graph structures consisting of arrays of nodes, relations, inputs, and outputs; (22) the composition engine 120 writes these graph structures as JavaScript Object Notation ("JSON") objects 126 that can be read back into the design studio 102 to recreate the in-memory graph representations on the canvas 106; (23) in response to save requests from the design studio 102, the composition engine 120 stores the CDUMP file 122 of the active design studio project in a repository 128; (24) when a user requests to open the composite machine learning application 104 in the design studio 102, the composition engine 120 retrieves the CDUMP file 122 from the repository 128, and the UI layer of the design studio 102 interprets the CDUMP file 122 for presentation on the canvas 106; (25) the blueprint file 124 represents a deployment model of the composite machine learning application 104 that was designed and assembled in the canvas 106; (26) the blueprint file 124 (i.e., deployment model) identifies the components (i.e., the building blocks 108) of the composite machine learning application 104, identifies the location from where docker images 130 of the building blocks 108 can be downloaded for deployment in the target cloud environment 118, and identifies the connectivity relationship between the components; (27) the building blocks 108 are standard microservices that expose standard representational state transfer ("REST")-based interfaces; (28) the building blocks 108 each consumes an input message and produces an output message; (29) the building blocks 108 are not aware of their environment; that is, the building blocks 108 do not know to which other building blocks they might be connected during run time; (30) at design time the design studio 102 captures this connectivity information in the blueprint file 124; (31) the connectivity information identifies the sequence in which the building blocks 108 need to be invoked; (32) the composition engine 120 contains and maintains in-memory graph representations that respond to editing operations performed in the design studio 102 on the canvas 106 to perform editing operations, such as, e.g., adding nodes and links, deleting nodes and links, modifying node and link properties; (33) the composition engine 120 exposes composition engine APIs 132A-132N for the UI layer of the design studio 102 to call for performing all user-requested actions in the UI layer, such as, e.g., (a) retrieving all of the building blocks 108 and the composite machine learning application 104 from the repository 128 into the UI layer; (b) adding, deleting, or modifying nodes and/or links; (c) saving the composite machine learning application 104; (d) validating the composite machine learning application 104; and (e) retrieving the composite machine learning applications 104; (34) a blueprint deployer 134 retrieves the blueprint file 124 of the composite machine learning application 104 from the repository 128; (35) the blueprint deployer 134 retrieves the docker images 130 of the building blocks 108 from the URLs specified in the blueprint file 124; (36) the blueprint deployer 134 utilizes target cloud APIs 136A-136N to create, based upon the docker images 130, docker containers ("containers") 138A-138E on virtual machines 140A-140B in the target cloud environment 118, and assigns IP addresses and ports to the containers 138A-138E; (37) the blueprint deployer 134 provides model chaining information to a run time model connector 142 based upon the connectivity information in the blueprint file 124; (38) the blueprint deployer 134 then starts the containers 138A-138E; (39) the blueprint deployer 134 creates a docker information file 144 that contains the associations between the building blocks 108 of the composite machine learning application 104 and the IP addresses and ports of the containers 138A-138E; (40) execution of the composite machine learning application 104 is facilitated by the run time model connector 142; (41) the run time model connector 142 enables communication between the building blocks 108 of the composite machine learning application 104; (42) the blueprint file 124 (produced by the composition engine 120) and the docker information file 144 (produced by the blueprint deployer 134) are fed to the run time model connector 142, which interprets the connectivity information provided in the blueprint file 124, assigns IP addresses and ports to the building blocks 108, and feeds the output of one building block 108 to the input of the next building block 108; and (43) the building block(s) 108 of the composite machine learning application 104 that is/are responsible for performing the data collection/ingestion function(s) 112 can point to one or more data sources 146 from which to collect/ingest data for the composite machine learning application 104 during run time.

Aftab further teaches in ¶¶ [0051]-[0054] with FIG.
5 that (1) a machine learning design studio GUI ("design studio GUI") 500 shows a graphical representation of the catalog 110 from which users can select graphical representations of the building blocks 108 — that is, the data collection/ingestion functions 112, the data transformation functions 114, and the machine learning models 116 — to design the composite machine learning application 104 on the canvas 106; (2) the design studio GUI 500 also shows a validation console 502, a properties box ("properties") 504, a matching models box ("matching models") 506, a My Composite Machine Learning Applications box 508, a probe checkbox 510, a validate option 512, a save option 514, and a deploy option 516; (3) the properties box 504 provides a view of the properties of the building blocks 108, the operations exposed thereby (via ports), and the details of the message signatures associated therewith; (4) when a user clicks on an input or output port of a given building block 108, all the machine learning models 116 that are compatible with that port and can be connected to that port are displayed in the matching models box 506; (5) the user can then drag visual representations of the machine learning models 116 that are compatible into the canvas 106 for composition; (6) the composite machine learning applications 104 created by the user but not yet made public are shown in the My Composite Machine Learning Applications box 508; (7) the user can drag and drop them from this box into the canvas 106 and update as needed; (8) the design studio GUI 500 allows the user to insert a probe capability between a pair of ports; (9) when the probe checkbox 510 is checked, at run time, the run time model connector 142 will forward any message flowing between a pair of ports to a probe, where it can be visualized by the user; (10) the save option 514 allows the user to save the current design shown on the canvas 106; (11) selection of the save option 514 prompts the composition engine 120 to create the CDUMP file 122 for the current design and to store the CDUMP file 122 in the repository 128; (12) once the composite machine learning application 104 is saved, the user can click on the validate option 512, which prompts the composition engine 120 to execute a set of validation rules to validate the composite machine learning application 104; (13) if the composite machine learning application 104 is successfully validated, the composition engine 120 creates the blueprint file 124 for the composite machine learning application 104 and stores the blueprint file 124 in the repository 128 for later use by the blueprint deployer 134; (14) all validation-related errors and/or success messages and other information can be presented to the user in the validation console 502; (15) the deploy option 516 remains greyed out and gets activated only if the validation was successful; (16) when clicked, the user can be directed to a deployment interface to initiate deployment of the composite machine learning application 104; (17) the design studio 102 lets the user not only compose the building blocks 108 as a linear cascaded composition of heterogeneous machine learning models, but also provides the flexibility to compose DAG-based composite solutions where an output port of one model might fan out into multiple outgoing links feeding other models and an input port that supports multiple fan-in capability to allow multiple models to feed their outputs into an input port of the model; (18) along with this capability, the design studio 102 supports corresponding split and join (collation) semantics that are used to provide one-to-many and many-to-one connectivity between models; (19) the use of DAG topology by the design studio 102 operates under the assumption that each model in the composite machine learning application 104 consumes one message (i.e., an input message 302) and produces one message (i.e., an output message 310); and (20) the design studio 102 also follows REST-based communication standards to maintain a single request to single response communication style.

Aftab also teaches in ¶¶ [0063]-[0065] with FIGS. 1, 5, and 7 that (1) the method 700 begins and proceeds to operation 702, where the design studio 102 ingests the building blocks 108 after onboarding; (2) from operation 702, the method 700 proceeds to operation 704, where the design studio 102 stores the building blocks in the catalog 110; (3) from operation 704, the method 700 proceeds to operation 706, where the design studio 102 presents the canvas 106 upon which the user can visually design the composite machine learning application 104 from visual representations of the building blocks 108; (4) from operation 706, the method 700 proceeds to operation 708, where the design studio 102 receives input from the user to design, on the canvas 106, a visual representation of the composite machine learning application 104 from the building blocks 108 available in the catalog 110; (5) from operation 708, the method 700 proceeds to operation 710, where the design studio 102 receives a request to save (e.g., via the save option 514 shown in the design studio GUI 500; see FIG. 5) the composite machine learning application 104; (6) in response to the save request received at operation 710, the composition engine 120 (i.e., the backend processing portion of the design studio 102) generates, at operation 712, the CDUMP file 122 for the composite machine learning application 104; (7) from operation 712, the method 700 proceeds to operation 714, where the composition engine 120 validates the CDUMP file 122 based upon one or more validation rules; (8) results of the validation operation can be presented in the validation console 502 shown in FIG. 5; (9) the method 700 ends if the composition engine 120 is unable to validate the CDUMP file 122; (10) from operation 714, the method 700 proceeds to operation 716, where the composition engine 120 generates the blueprint file 124 for the composite machine learning application 104 and stores the blueprint file in the repository 128; (11) from operation 716, the method 700 proceeds to operation 718, where the blueprint deployer 134 uses the blueprint file 124 to deploy the composite machine learning application 104 in the target cloud environment 118; (12) from operation 718, the method 700 proceeds to operation 720, where the run time model connector 142, at run time, enables communication between the building blocks 108 of the composite machine learning application 104 based upon the docker information file 144 provided by the blueprint deployer 134; and (13) from operation 720, the method 700 proceeds to operation 722, where the method 700 ends.

Aftab further discloses in ¶ [0066] with FIGS.
Aftab further discloses in ¶ [0066] with FIGS. 1 and 8 that (1) the method 800 begins with a request (not shown) to deploy received from the user by the design studio 102 and proceeds to operation 802, where the blueprint deployer 134 retrieves the blueprint file 124 from the repository 128; (2) from operation 802, the method 800 proceeds to operation 804, where the blueprint deployer 134 retrieves the docker images 130 of the building blocks 108 from the URLs specified in the blueprint file 124; (3) from operation 804, the method 800 proceeds to operation 806, where the blueprint deployer 134 creates the containers 138 from the docker images 130, and assigns IP addresses and ports to the containers 138; (4) from operation 806, the method 800 proceeds to operation 808, where the blueprint deployer 134 chains the containers 138 together, via the run time model connector 142, based upon the connectivity information provided in the blueprint file 124; (5) from operation 808, the method 800 proceeds to operation 810, where the blueprint deployer 134 starts the containers 138 in the virtual machines 140 deployed in the target cloud environment 118; (6) from operation 810, the method 800 proceeds to operation 812, where the blueprint deployer 134 creates the docker information file 144 that contains associations between the building blocks 108 and the assigned containers' IP address(es) and port(s); and (7) from operation 812, the method 800 proceeds to operation 814, where the method 800 ends. Aftab also discloses in ¶ [0067] with FIGS. 1 and 9 that (1) the method 900 begins and proceeds to operation 902, where the run time model connector 142 receives the blueprint file 124 and the docker information file 144 from the blueprint deployer 134; (2) from operation 902, the method 900 proceeds to operation 904, where the run time model connector 142 interprets the connectivity information provided in the blueprint file 124; (3) from operation 904, the method 900 proceeds to operation 906, where the run time model connector 142 assigns IP address(es) and port(s) to the building blocks 108 in accordance with the docker information file 144; (4) from operation 906, the method 900 proceeds to operation 908, where the run time model connector 142 executes the composite machine learning application 104 such that the output of previous machine learning models 116 is fed into the input of next machine learning models 116 in a sequence as dictated by the blueprint file 124; and (5) from operation 908, the method 900 proceeds to operation 910, where the method 900 ends.
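Methods 800 and 900 amount to resolving container addresses from the blueprint and then wiring model endpoints together at run time. Below is a minimal sketch with invented helpers and placeholder addresses; it illustrates only the bookkeeping (image URLs, address assignment, and the docker-information associations), not any real Docker API.

# Minimal sketch (invented helpers, not Aftab's code or a real Docker API):
# resolve images from the blueprint, assign addresses, and record the
# block -> (ip, port) associations in a docker-information structure.
import itertools

blueprint = {
    "images": {"model_a": "registry.example/model_a:1",
               "model_b": "registry.example/model_b:1"},
    "connectivity": [("model_a", "model_b")],
}

def deploy(blueprint):
    ports = itertools.count(9000)
    docker_info = {}
    for block, url in blueprint["images"].items():
        # A real deployer would pull the image and start a container here;
        # this sketch just assigns a placeholder address per building block.
        docker_info[block] = ("10.0.0.1", next(ports))
    return docker_info

def run_time_connect(blueprint, docker_info, request):
    # Forward each model's output to the next model's address, in the
    # sequence dictated by the blueprint's connectivity information.
    message = request
    for src, dst in blueprint["connectivity"]:
        src_addr, dst_addr = docker_info[src], docker_info[dst]
        print(f"{src}@{src_addr} -> {dst}@{dst_addr}")
        message = {"payload": message, "via": src}   # stand-in for an HTTP hop
    return message

info = deploy(blueprint)
print(run_time_connect(blueprint, info, {"payload": "input"}))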
Aftab further teaches in ¶¶ [0075]-[0083] with FIG. 11 that (1) the machine learning system 1100 includes one or more machine learning models 116, wherein the machine learning model(s) 116 can be created by the machine learning system 1100 based upon one or more machine learning algorithms 1102; (2) the machine learning algorithm(s) 1102 can be any existing, well-known algorithm, any proprietary algorithms, or any future machine learning algorithm; (3) some example machine learning algorithms 1102 include, but are not limited to, gradient descent, linear regression, logistic regression, linear discriminant analysis, classification tree, regression tree, Naive Bayes, K-nearest neighbor, learning vector quantization, support vector machines, and the like; (4) the machine learning system 1100 can control the creation of the machine learning models 116 via one or more training parameters; (5) the training parameters are selected by one or more users, such as the modelers that onboard their machine learning models 116 into the catalog 110; (6) alternatively, the training parameters are automatically selected based upon data provided in one or more training data sets 1104; (7) the training parameters can include, e.g., a learning rate, a model size, a number of training passes, data shuffling, regularization, and/or other training parameters; (8) the machine learning algorithm 1102 can update the weights for every data example included in the training data set 1104; (9) the size of an update is controlled by the learning rate; (10) a learning rate that is too high might prevent the machine learning algorithm 1102 from converging to the optimal weights; (11) a learning rate that is too low might result in the machine learning algorithm 1102 requiring multiple training passes to converge to the optimal weights; (12) the model size is regulated by the number of input features ("features") 1106 in the training data set 1104; (13) a greater number of features 1106 yields a greater number of possible patterns that can be determined from the training data set 1104; (14) the model size should be selected to balance the resources (e.g., compute, memory, storage, etc.)
needed for training and the predictive power of the resultant machine learning model 116; (15) the number of training passes indicates the number of training passes that the machine learning algorithm 1102 makes over the training data set 1104 during the training process; (16) the number of training passes can be adjusted based, e.g., on the size of the training data set 1104, with larger training data sets being exposed to fewer training passes in consideration of time and/or resource utilization; (17) the effectiveness of the resultant machine learning model 116 can be increased by multiple training passes; (18) data shuffling is a training parameter designed to prevent the machine learning algorithm 1102 from reaching false optimal weights due to the order in which data contained in the training data set 1104 is processed; e.g., data provided in rows and columns might be analyzed first row, second row, third row, etc., and thus an optimal weight might be obtained well before a full range of data has been considered; (19) by data shuffling, the data contained in the training data set 1104 can be analyzed more thoroughly, mitigating bias in the resultant machine learning model 116; (20) regularization is a training parameter that helps to prevent the machine learning model 116 from memorizing training data from the training data set 1104, i.e., a situation in which the machine learning model 116 fits the training data set 1104 but the predictive performance of the machine learning model 116 is not acceptable; (21) regularization helps the machine learning system 1100 avoid this overfitting/memorization problem by adjusting extreme weight values of the features 1106; e.g., a feature that has a small weight value relative to the weight values of the other features in the training data set 1104 can be adjusted to zero; (22) the machine learning system 1100 can determine model accuracy after training by using one or more evaluation data sets 1108 containing the same features 1106' as the features 1106 in the training data set 1104; (23) this also prevents the machine learning model 116 from simply memorizing the data contained in the training data set 1104; (24) the number of evaluation passes made by the machine learning system 1100 can be regulated by a target model accuracy that, when reached, ends the evaluation process, at which point the machine learning model 116 is considered ready for deployment; (25) after deployment, the machine learning model 116 can perform prediction 1112 with an input data set 1110 having the same features 1106" as the features 1106 in the training data set 1104 and the features 1106' of the evaluation data set 1108; (26) the results of the prediction 1112 are included in an output data set 1114 consisting of predicted data; and (27) the machine learning model 116 can perform other operations, such as regression, classification, and others.
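The training parameters enumerated above (learning rate, number of passes, shuffling, regularization) map directly onto a basic gradient-descent loop. The sketch below is a generic illustration, not the reference's system; the toy data set and the L2-style penalty are assumptions chosen to make the roles of the parameters concrete.

# Generic illustration (not Aftab's system) of the training parameters above:
# the learning rate scales each weight update, num_passes bounds the epochs,
# shuffling reorders examples, and an L2 penalty regularizes extreme weights.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                 # toy training data set
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=200)

def train(X, y, learning_rate=0.05, num_passes=20, shuffle=True, l2=0.01):
    w = np.zeros(X.shape[1])
    for _ in range(num_passes):               # number of training passes
        order = rng.permutation(len(X)) if shuffle else np.arange(len(X))
        for i in order:                       # update weights per example
            grad = (X[i] @ w - y[i]) * X[i] + l2 * w
            w -= learning_rate * grad         # step size set by learning rate
    return w

print(train(X, y).round(2))                   # approaches [2.0, -1.0, 0.5]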
Wang et al. (US 2022/0051049 A1, filed on 08/11/2020) discloses in ABSTRACT and ¶¶ [0004]-[0006] that (1) automatically select a machine learning model pipeline using a meta-learning machine learning model; (2) receive ground truth data and pipeline preference metadata; (3) determine a group of pipelines appropriate for the ground truth data; (4) each of the pipelines includes an algorithm and at least one pipeline includes an associated data preprocessing routine; (5) generate a target quantity of hyperparameter sets for each of the pipelines; (6) apply the preprocessing routines to the ground truth data to generate sets of preprocessed ground truth data for each pipeline; (7) rank the performance of each hyperparameter set for the group of pipelines to establish a preferred set of hyperparameters for each of the pipelines; (8) apply a sentence embedding algorithm to select favored data features for scoring; (9) apply each of the pipelines with the associated preferred set of hyperparameters to score the favored data features of an appropriately preprocessed set of ground truth data and rank the pipeline performance accordingly; and (10) select a candidate pipeline in accordance with the pipeline performance ranking. Wang further discloses in ¶¶ [0019]-[0033] with FIGS. 1-4 that (1) a source of Ground Truth Data (GTD) 106 provides data useful for training and validating the models to be selected; (2) a source of pipeline preference metadata (PPM) 108 provides desired attributes for the pipelines to be selected; (3) the PPM 108 may be provided by a user and can include a variety of pipeline selection criteria, including (a) constraints on the number of pipelines to be selected; (b) maximum or minimum selection run time; (c) pipeline stability; (d) maximum and minimum model training time and desired model accuracy threshold; and (e) forced pipelines and features that must be selected; (4) a source of hyperparameter metadata 110 provides information about hyperparameter values (not shown) to be assigned to the algorithms selected; (5) the hyperparameter metadata 110 can indicate which hyperparameters are known by those skilled in the art to be acceptable for each of the algorithms available for selection; (6) the hyperparameter metadata 110 may also include a target quantity of hyperparameter sets to be generated and ranked for each pipeline selected; (7) receive algorithm/data type matching metadata 112 that indicates which of several available algorithms are appropriate for modeling various types of data; (8) receive algorithm appropriate preprocessing routines metadata 114 which indicates which of several available data preprocessing routines are suitable for treating raw data for use with algorithms selected; (9) a Pipeline Generation Module (PGM) 116 uses the algorithm/data-type matching metadata 112 and the algorithm-appropriate preprocessing routines metadata 114 to generate multiple pipelines in accordance with the pipeline preference metadata 108; (10) the PGM 116 may also accept input from a user to guide pipeline generation; (11) a Data Pre-processing Module (DPM) 118 applies each of the preprocessing routines identified as appropriate for the algorithms in the pipelines generated by the PGM; (12) a Hyperparameter Generation Module (HGM) 120 generates a targeted quantity of hyperparameter sets for the algorithms associated with each of the pipelines generated by the PGM 116; (13) a Hyperparameter Optimizing Module (HOM) 122 identifies a preferred hyperparameter set for the
algorithms in each pipeline; (14) an Assembled Pipeline Comparison Module (APCM) 124 executes each of the pipelines generated by the PGM, using the favored hyperparameter sets identified for each algorithm by the HOM 122; (15) a Data Processing Optimization Module (DPOM) 126 uses feature engineering to determine the most revealing data attributes; (16) a Pipeline Validation User Interface (PVUI) 128 allows a user to examine pipeline execution results to correct or remove selected pipelines and otherwise give input regarding pipeline performance to increase result interpretability and user confidence; (17) an Ensemble Assembly Module (EAM) 130 combines multiple pipelines into a cooperative bundle; (18) an Ensemble Pipeline Application Module 132 applies the pipelines in the ensemble to provided data 106, which can indicate whether multiple pipelines provide results that agree; (19) send data analysis results to a user display, recording device, or other output device 134 for acceptance and application by a user; (20) at block 202, receive Ground Truth Data 106, which is deemed to be accurate, and this data is used to train the pipeline models selected by the server computer; (21) a portion (e.g., 80%) of the GTD 106 is used as pipeline training data, and the remainder (e.g., 20%) of the data is reserved as holdout data for validation of the pipelines selected; (22) at block 204, receive PPM 108, which includes preference information (e.g., from a user or other guiding source selected by one of ordinary skill in this field) that gives parameters for the PGM 116; (23) the PPM 108 may include information that instructs the server computer 102 regarding how many pipelines to target for assembly, desired testing, modeling, and training run time ranges, desired performance (e.g., accuracy, stability, or other value selected by one of ordinary skill in this field) thresholds, certain required pipeline arrangements, features to include, or an order to stop or pause pipeline generation to allow for pipeline inspection; (24) at block 206, receive hyperparameter metadata which, in addition to target hyperparameter set quantities, may include values appropriate, e.g., for each of the algorithms included in pipelines generated by the PGM 116 of the server computer 102; (25) the hyperparameter metadata 110 may also include information about which hyperparameters are most likely to produce desired results (e.g., accuracy, computation time, consistency, and other desirable attributes known to those of skill in this art) when used with the associated pipeline algorithms; (26) at block 208, receive algorithm/data-type matching metadata 112; (27) at block 210, receive algorithm-appropriate preprocessing routine metadata 114, which indicates which pre-processing routines are best suited for the various algorithms which may be selected, wherein this preprocessing routine metadata 114 is applied, along with the algorithm/data-type matching metadata 112, by the PGM 116 in block 212 to assemble a set of pipelines that meets the characteristics set forth in the PPM 108 (e.g., a targeted number of pipelines, data-type matching algorithms, and appropriate preprocessing routines); (28) make, via the PGM 116 at block 212, a set of pipelines 402 that meet the criteria indicated by the PPM 108; (29) pipeline generation occurs iteratively, in conjunction with decision block 214, with the server computer 102 iteratively deciding after generating each pipeline 402 whether more pipelines are needed (e.g., whether the pipeline target quantity has been met or a user has
indicated that a current set of pipelines is deemed sufficient); (30) at block 216, the DPM 118 modifies GTD 106 as necessary by applying the preprocessing routines 406 selected for each algorithm 404 associated with the pipelines 402; (31) generate, via the HGM 120 at block 218, unique sets of hyperparameters for the algorithm associated with each pipeline 402; (32) the hyperparameter set quantity and values are chosen in accordance with the hyperparameter metadata 110, wherein these hyperparameter sets represent alternate, viable options for algorithm testing as known in this field and are passed on for downstream pipeline optimization; (33) the hyperparameter metadata 110 may also include a selection algorithm that indicates which of the available hyperparameter values are most likely to achieve performance matching preselected performance criteria; (34) when present, the HGM 120 may use such a selection algorithm to choose hyperparameter values statistically likely to generate pipelines 402 that exceed related performance thresholds; (35) the server computer 102, via the HOM 122 at block 220, iteratively runs a training portion of the preprocessed GTD 106 through each of the pipelines 402 with the hyperparameter sets generated by the HGM 120; (36) the HOM 122 assesses performance of each pipeline 402 iteratively, comparing performance for each of the associated hyperparameter sets; (37) the HOM 122 determines favored hyperparameter sets for each pipeline 402; (38) the server computer 102, via the APCM 124 at block 222, executes each assembled pipeline with the top hyperparameter sets identified by the HOM 122 and ranks the pipelines (e.g., according to measured performance); (39) the server computer 102, via the DPOM 126 in block 224, determines which features (including sentence length, number of unique words, total number of verbs, and total number of nouns and pronouns, and other attributes identified by one skilled in this field) to track when applying the selected pipelines 402 and generates a provisional list of assessment features; (40) the DPOM 126 iteratively runs the pipelines 402, each with favored hyperparameter values, and progressively removes one assessment feature from the provisional list being tracked until performance regarding a selected performance metric undergoes a meaningful step change; (41) the DPOM 126 will reintroduce the attribute most recently removed from the provisional feature list for the pipeline being measured and formalize that list as the group of most-telling attributes for the given pipeline 402 as tested; (42) the DPOM 126 progressively identifies a group of most-telling attributes for each pipeline 402; (43) with the DPOM 126, the server computer 102 selects groups of data features to consider which strike a balance between pipeline performance and data processing time, by reducing the number of features considered; (44) the server computer 102 presents to a user for feedback, via the PVUI 128 at block 226, results of applying the pipelines 402 generated by the PGM 116, having top hyperparameter sets identified by the HOM 122 and considering most-telling attribute groups, to a remaining hold-out portion of GTD 106 processed according to the routines 406, as ranked by the DPOM 126; (45) the group of pipelines 402 for which results are provided is called a list of candidate pipelines, and the PVUI 128 allows a user to assess and interactively select and modify the pipelines 402 on this list; (46) pipeline performance details are included to provide a high degree of interpretability (e.g., including showing raw GTD to allow users to identify (a) when such data is possibly mislabeled, to forgive apparently-poor pipeline performance, and which data attributes were graded; (b) what various pipelines provided as results and times when certain pipelines agree; (c) highlighted key terms that reveal potential oversights in a given model; and (d) other pipeline aspects selected by one skilled in this field to establish user trust for the selected pipelines); (47) this degree of interpretability allows a user to selectively remove or choose certain pipelines from the candidate pipeline list; (48) the PVUI 128 may request user input before a target quantity of pipelines 402 is generated, allowing a user to indicate satisfaction with a given list of pipelines, even if additional pipelines could be generated; (49) the server computer 102, via the PVUI 128 at block 226, selects (possibly with user input) a final group of pipelines 402 from the candidate list (which may remain unchanged) and passes the final group of pipelines on for further processing; (50) the server computer 102, via the Ensemble Assembly Module 130 at block 228, collects the final group of pipelines 402 into a cooperative group that will collectively assess data provided; (51) the server computer 102, at block 230, applies the ensemble or group of pipelines 402 to user data and generates results; and (52) the server computer 102, at block 232, provides results (e.g., through a display, recording device, or some other arrangement selected by one skilled in this field) for further storage or use.
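Wang's HGM/HOM/APCM loop is, at its core, hyperparameter-set generation plus scoring and ranking. The following toy sketch illustrates that flow under assumed names and a random stand-in scoring function; it is not Wang's implementation, and the pipelines and grid below are invented.

# Toy sketch (invented data, not Wang's implementation): generate candidate
# hyperparameter sets per pipeline, score each, keep the preferred set, then
# rank the pipelines by their best scores, as in the HGM/HOM/APCM flow above.
import itertools
import random

random.seed(0)

# Hypothetical pipelines pairing a preprocessing routine with an algorithm.
pipelines = {
    "p1": {"preprocess": "normalize", "algorithm": "logreg"},
    "p2": {"preprocess": "tfidf", "algorithm": "tree"},
}
hp_grid = {"lr": [0.01, 0.1], "depth": [2, 4]}

def score(pipeline, hp):
    # Stand-in for training on preprocessed GTD and measuring held-out accuracy.
    return random.random()

def hyperparameter_sets(grid, target_quantity=4):
    # HGM role: generate up to target_quantity candidate hyperparameter sets.
    combos = [dict(zip(grid, values))
              for values in itertools.product(*grid.values())]
    return combos[:target_quantity]

ranking = []
for name, pipeline in pipelines.items():
    # HOM role: compare candidate hyperparameter sets, keep the preferred one.
    best_hp, best = max(((hp, score(pipeline, hp))
                         for hp in hyperparameter_sets(hp_grid)),
                        key=lambda pair: pair[1])
    ranking.append((best, name, best_hp))

# APCM role: rank the pipelines by performance with preferred hyperparameters.
for perf, name, hp in sorted(ranking, key=lambda t: t[0], reverse=True):
    print(f"{name}: score={perf:.2f} with {hp}")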
COLLOMOSSE (US 2021/0397942 A1, pub. date: 12/23/2021) discloses in ABSTRACT and ¶¶ [0004]-[0006] that (1) learning structural similarity of user experience (UX) designs using machine learning; (2) generating a representation of a layout of a graphical user interface (GUI), the layout including a plurality of control components, each control component including a control type, geometric features, and relationship features to at least one other control component; (3) generating a search embedding for the representation of the layout using a neural network; (4) querying a repository of layouts in embedding space using the search embedding to obtain a plurality of layouts based on similarity to the layout of the GUI in the embedding space; (5) use machine learning techniques to encode user experience layouts into search embeddings and use the search embeddings to identify layouts that are structurally similar to one another; (6) generate a graph representation of a UX layout, wherein the graph representation may include nodes and edges, where each node corresponds to a different component of the user interface (e.g., buttons, icons, sliders, text, images, etc.)
and each edge corresponds to how a given pair of nodes are related to one another (e.g., relative distance, aspect ratio, orientation, nesting, etc.); (7) this encodes the structure of the layout, and not merely the look and feel of the layout, into a graph representation, wherein this graph representation can then be processed by a layout encoder, such as a graph convolutional network (GCN), to generate a search embedding for the layout; (8) the layout encoder may be trained as part of a hybrid layout encoder-layout decoder architecture, where the layout encoder (e.g., a GCN) encodes an input layout into a search embedding, and the layout decoder (e.g., a convolutional neural network) decodes the search embedding into a raster representation; (9) the raster representation may have a number of channels equal to the number of different classes of components of a UX layout that the layout encoder is configured to encode; (10) the resulting raster representation can be compared to a ground truth raster representation of the input layout and the GCN-CNN network can be trained based on the difference between the raster representations; (11) once trained, the layout encoder can be used independently of the layout decoder to encode layouts into search embeddings; and (12) similar layouts may be identified by determining a distance between their search embeddings in the embedding space. COLLOMOSSE further discloses in ¶¶ [0023]-[0024] that (1) a layout management system that uses machine learning to learn structural similarities between UX layouts which can be used to more accurately identify similar UX layouts; (2) layout is fundamental to UX design, where arrangements of user interface components form the blueprints for interactive applications; (3) repositories of UX layouts are openly shared online in creative portfolio websites, e.g., Behance.net, etc., embodying the best practices and creativity of thousands of UX design professionals; (4) the ability to easily search these repositories offers an opportunity to discover and re-use layouts, democratizing access to this wealth of design expertise; (5) searching repositories of past UX designs is also useful for individual creatives or design houses to recall similar prior work both for efficiency and/or for consistency of style; (6) generate a layout representation that encodes these structural properties; (7) the layout representation may then be input to a neural network which maps the layout representation to a search embedding; and (8) this search embedding can then be compared to the search embeddings generated for other layouts to identify layouts that are structurally similar.
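The graph representation COLLOMOSSE describes (a one-hot class vector plus geometry per node, pairwise relationship features per directed edge) can be made concrete in a few lines. The component list, class set, and feature choices below are invented for illustration; this is not the reference's data model.

# Sketch (invented data, not COLLOMOSSE's code) of a UX layout as a graph:
# each node carries a one-hot class vector plus geometry, and each directed
# edge carries relationship features between a pair of components.
CLASSES = ["button", "icon", "slider", "text", "image"]

def one_hot(cls):
    return [1.0 if c == cls else 0.0 for c in CLASSES]

# Hypothetical components: (class, center_x, center_y, width, height).
components = [("button", 0.5, 0.9, 0.2, 0.1),
              ("text",   0.5, 0.5, 0.6, 0.2)]

nodes = [{"semantic": one_hot(cls),
          "geometric": [cx, cy, w, h]}
         for cls, cx, cy, w, h in components]

def edge_features(a, b):
    (_, ax, ay, aw, ah), (_, bx, by, bw, bh) = a, b
    distance = ((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5
    aspect_ratio = (aw / ah) / (bw / bh)
    # Nesting: does a's bounding box fall entirely inside b's?
    nested = (abs(ax - bx) + aw / 2 <= bw / 2 and
              abs(ay - by) + ah / 2 <= bh / 2)
    return {"distance": distance, "aspect_ratio": aspect_ratio,
            "nested": nested}

# Directed graph: two edges per component pair, as in the reference.
edges = {(i, j): edge_features(components[i], components[j])
         for i in range(len(components))
         for j in range(len(components)) if i != j}
print(edges[(0, 1)])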
COLLOMOSSE also discloses in ¶¶ [0025]-[0038] with FIGS. 1-2 that (1) the UX layout 100 can be a labeled or annotated image of all or a portion of a UX layout; e.g., an image of a UX layout may be labeled or annotated with bounding boxes around each component represented in the image, and each bounding box may be associated with a class label or annotation that identifies the type of component (e.g., text button, icon, slider, etc.); (2) the input UX layout may be automatically annotated by passing the UX layout 100 through a trained layout detector; (3) the layout detector may be a machine learning model trained to identify UX components in an image and which outputs bounding box and class labels or annotations for each identified UX component; (4) a layout representation generator 103 transforms the UX layout 100 into a layout representation 104; (5) the UX layout can be represented as a spatial graph, where each node of the graph corresponds to a different component and each edge represents the relationships between components; (6) by encoding the geometric features and relationship features of the components of the UX layout 100, the graph representation provides more details about the UX layout than a mere image; (7) each node of the graph, corresponding to a different component of the UX layout 100, can include a semantic property and a geometric property of the component; (8) the semantic property may define the class to which the component belongs; (9) this may be represented by a one-hot vector which denotes the component class; (10) e.g., a one-hot vector may be of a length equal to the number of different classes of UX components the layout management system is configured to recognize; (11) the geometric features may include the spatial location of the component in the UX; e.g., the geometric features of a component ci may include the height and width of the component and the centroid of the component; (12) the nodes of the layout representation 104 may be connected by edges, where each edge includes relationship features between the components corresponding to the connected nodes; e.g., the relationship features can include a relative distance between the components, aspect ratios, orientation, and whether one component is nested within another component; (13) the spatial graph layout representation may be a directed graph, and as such there are two edges connecting each pair of nodes; (14) by encoding geometric and relationship features into the graph, fine-grain arrangements of components can be retrieved which would be lost in raster representations; (15) with the UX layout 100 encoded into layout representation 104, the layout representation can be input to a neural network 106, which may include a machine-learning model that can be tuned (e.g., trained) based on training input to approximate unknown functions; (16) where the layout representation 104 is a graph, the neural network 106 may be a graph convolutional network (GCN); (17) GCNs are specialized convolutional neural networks (CNN) designed to analyze non-Euclidean data for deep learning, e.g.,
social graphs, communication, traffic networks, etc.; (18) the neural network 106 can be trained as a hybrid GCN-CNN, where the GCN encodes the input training layouts into a high dimensional latent representation, referred to herein as a search embedding, and, during training, the CNN decodes the search embedding into a raster representation of the input UX layout 100; (19) the neural network can be trained end-to-end based on a reconstruction loss function between the output raster representation of the search embedding and a ground truth raster representation of the training input layout; (20) once the neural network 106 has been trained, it can be used by layout management system 102 to search for layouts in a layout repository similar to an arbitrary input UX layout 100; (21) the neural network encodes the input layout into a search embedding 108, and this search embedding 108 can then be compared to search embeddings for layouts in a layout repository in the embedding space to identify similar layouts; (22) the layout management system 102 may include a query manager 110 which receives the search embedding 108 generated for UX layout 100; (23) the query manager can use the search embedding 108 to search layouts 114 stored in layout repository 112 that are structurally similar to UX layout 100; (24) the query manager can compare the search embedding 108 to search embeddings 116 that have been generated for the layouts 114; e.g., a distance metric, such as the L1 or L2 distance metrics, may be used to identify layouts from the layout repository that are "close" to the input layout in the high-dimensional embedding space; (25) the layout repository or repositories being searched may include layout repository 112 maintained by layout management system 102 and/or may include external layout repository or repositories 118 (e.g., accessible over one or more networks, such as the Internet); (26) the layout management system 102 can return a set of structurally similar layouts 120 that are close to the input layout in the high-dimensional embedding space; (27) this may be a ranked list of layouts from the repository, ranked in descending or ascending order of "closeness" in the embedding space, and/or may include those layouts which are within a threshold distance of the input layout in the embedding space; (28) locations of components to add to the UX layout during design, or changes in location, size, etc. of existing components in a UX layout, may be recommended based on similar layouts identified in the layout repository; (29) enable a designer to leverage the creative knowledge embodied in existing layout designs when designing a new UX layout, enhancing the productivity of the designer; and (30) such layout comparisons can be used by a design firm, company, or other entity to ensure that their UX layouts across different products and/or platforms provide a coherent, consistent user experience and that new UX layouts can be developed more quickly than under previous systems.
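Items (21) through (27) reduce to a nearest-neighbor query over stored embeddings. A minimal NumPy sketch follows, with random stand-in embeddings; the L2 metric and the threshold value are assumptions consistent with the L1/L2 options and threshold-distance behavior named above.

# Minimal sketch (stand-in data, not COLLOMOSSE's code): rank repository
# layouts by L2 distance to a query search embedding and keep those within
# a threshold, mirroring items (21)-(27) above.
import numpy as np

rng = np.random.default_rng(1)
repository = rng.normal(size=(1000, 128))   # stored layout search embeddings
query = rng.normal(size=128)                # embedding of the input UX layout

def similar_layouts(query, repository, threshold=16.0, k=5):
    dists = np.linalg.norm(repository - query, axis=1)   # L2 distance metric
    order = np.argsort(dists)                # ascending "closeness" ranking
    return [(int(i), float(dists[i])) for i in order[:k]
            if dists[i] <= threshold]

print(similar_layouts(query, repository))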
COLLOMOSSE further teaches in ¶¶ [0082]-[0086] with FIG. 11 that (1) the method 1100 includes an act 1102 of generating a representation of a layout of a graphical user interface (GUI), the layout including a plurality of control components, each control component including a component class, geometric features, and relationship features to at least one other control component; (2) generating a representation of the layout of the GUI may further include generating a graph representation of the layout, the graph representation including a plurality of nodes corresponding to the plurality of control components and at least one edge connecting the plurality of nodes; (3) each node includes a semantic feature corresponding to a component class, and a geometric feature corresponding to dimensions of a corresponding control component; (4) the at least one edge includes relationship features including at least one of a relative distance, orientation, aspect ratio, or component nesting between a pair of control components; (5) the method 1100 also includes an act 1104 of generating a search embedding for the representation of the layout using a neural network; (6) generating the search embedding may further include processing the plurality of nodes of the graph representation by a first one or more layers of a graph convolutional network (GCN) to generate a plurality of node embeddings, and processing the at least one edge of the graph representation by a second one or more layers of the GCN to generate a plurality of relationship embeddings; (7) the method may further include the acts of determining a weighted average of the plurality of node embeddings using a first self-attention module, determining a weighted average of the plurality of relationship embeddings using a second self-attention module, and generating the search embedding based on the weighted average of the plurality of node embeddings and the weighted average of the plurality of relationship embeddings; (8) the method 1100 also includes an act 1106 of querying a repository of layouts in embedding space using the search embedding to obtain a plurality of layouts based on similarity to the layout of the GUI in the embedding space; (9) querying the repository of layouts in embedding space may further include determining a distance between the search embedding and each layout in the repository of layouts in the embedding space using a distance metric, and returning the plurality of layouts based at least on the distance between the search embedding and the plurality of layouts; and (10) returning the plurality of layouts may further include returning a layout recommendation based at least on the plurality of layouts, wherein the layout recommendation includes at least one of a control component or a control component geometry to be changed in the layout of the GUI. COLLOMOSSE also teaches in ¶¶ [0088]-[0092] with FIG.
12 that (1) the method 1200 includes an act 1202 of receiving, by a machine-learning backed service, a request to identify one or more similar graphical user interfaces (GUIs) based on a GUI layout; (2) the method 1200 also includes an act 1204 of identifying the one or more similar GUIs based on the at least a portion of the GUI layout; (3) identifying the one or more similar GUIs may include generating a representation of the GUI layout, the GUI layout including a plurality of control components, each control component including a component class, geometric features, and relationship features to at least one other control component, generating a search embedding for the representation of the GUI layout using a neural network, and querying a repository of layouts in embedding space using the search embedding to obtain a plurality of layouts based on similarity to the GUI layout in the embedding space; (4) the method may further include an act of generating a graph representation of the GUI layout, the graph representation including a plurality of nodes corresponding to the plurality of control components and at least one edge connecting the plurality of nodes; (5) each node includes a semantic feature corresponding to a component class, and a geometric feature corresponding to dimensions of a corresponding control component; (6) the at least one edge includes relationship features including at least one of a relative distance, orientation, aspect ratio, or component nesting between a pair of control components; (7) generating the search embedding may further include processing the plurality of nodes of the graph representation by a first one or more layers of a graph convolutional network (GCN) to generate a plurality of node embeddings, and processing the at least one edge of the graph representation by a second one or more layers of the GCN to generate a plurality of relationship embeddings; (8) generating the search embedding may further include determining a weighted average of the plurality of node embeddings using a first self-attention module, determining a weighted average of the plurality of relationship embeddings using a second self-attention module, and generating the search embedding based on the weighted average of the plurality of node embeddings and the weighted average of the plurality of relationship embeddings; (9) the method 1200 also includes an act 1206 of returning the one or more similar GUIs; (10) the one or more similar GUIs may be sent to the client computing device which originated the request, to be displayed on the client computing device or other computing device; and (11) the one or more similar GUIs may be returned as recommendations of changes to be made to one or more components of the GUI layout (e.g., size, location, component class, etc.).
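The self-attention pooling recited in methods 1100 and 1200, a weighted average of node embeddings and of relationship embeddings combined into one search embedding, can be sketched directly. The shapes, random embeddings, and dot-product scoring vectors below are assumptions; this is not the reference's trained network.

# Sketch (assumed shapes and scoring, not COLLOMOSSE's network): pool node
# and relationship embeddings with softmax attention weights and combine the
# two weighted averages into a single search embedding.
import numpy as np

rng = np.random.default_rng(2)
node_emb = rng.normal(size=(6, 32))    # one embedding per GUI component
rel_emb = rng.normal(size=(30, 32))    # one embedding per directed edge

def attention_average(embeddings, score_vec):
    scores = embeddings @ score_vec               # one scalar per embedding
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                      # softmax attention weights
    return weights @ embeddings                   # weighted average

node_query = rng.normal(size=32)   # stand-in for learned attention parameters
rel_query = rng.normal(size=32)

search_embedding = np.concatenate([
    attention_average(node_emb, node_query),      # first self-attention module
    attention_average(rel_emb, rel_query),        # second self-attention module
])
print(search_embedding.shape)      # (64,) combined layout search embedding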
Pokorny et al. (US 2021/0056146 A1, pub. date: 02/25/2021) discloses in ABSTRACT and ¶¶ [0006]-[0008] that (1) a searchable database of software features for software projects can be automatically built; (2) this can involve analyzing descriptive information about a software project to determine software features of the software project; (3) then a feature vector for the software project can be generated based on the software features of the software project; (4) the feature vector can be stored in a database having multiple feature vectors for multiple software projects; (5) the multiple feature vectors can be easily and quickly searched in response to search queries; (6) a software feature is any functional characteristic of a software project, such as a framework relied on by the software project, a dependency (e.g., a code library or plugin) relied on by the software project, a programming language of the software project, other systems or software with which the software project is designed to interface, an executable format of the software project, types of algorithms implemented by the software project, etc.; (7) obtain descriptive information about a software project from one or more sources that may be internal or external to the software project, which can include source code for the software project, configuration files or readme files provided with the software project, a description of the software project, or any combination of these; and (8) a feature vector is a data structure containing elements, where each element is a numerical value indicating whether a particular software feature corresponding to the element is present or not present in a related software project. Pokorny further discloses in ¶¶ [0010]-[0026] with FIG. 1 that (1) obtain descriptive information 104 about a software project 122, such as a software application or package, which can include (a) feedback, reviews, questions, or comments about the software project; (b) source code for the software project; (c) configuration files or readme files provided with the software project; (d) keywords characterizing the software project; and (e) a description of the software project; or any combination of these; (2) the computing device 102 can obtain the descriptive information 104 from one or more sources 106 (e.g., over the Internet), which can include one or more websites, such as discussion forums, repositories, review websites, or any combination of these; (3) the source(s) 106 can also include parts of the software project 122, such as readme files, configuration files, or source code files; (4) after obtaining the descriptive information 104 for the software project 122, the computing device 102 can parse the descriptive information 104 to determine software features 110 of the software project 122; (5) the software features 110 can be any functional characteristics of the software project 122 relating to how the software project 122 works or operates; e.g., the software features 110 can include tasks and functions that the software project 122 is configured to perform, frameworks and dependencies relied on by the software project 122, operating systems and operating environments for the software project 122, or any combination of these; (6) the computing device 102 can determine the software features 110 based on the descriptive information 104 using any number and combination of techniques; (7) the computing device 102 can determine the software features 110 by parsing the keywords from the website content and using at least some of those keywords as the
software features 110; (8) the computing device 102 can apply a count technique to the descriptive information 104 to determine the software features 110, wherein the count technique can involve counting how many times a particular textual term occurs in the descriptive information 104 and storing that count value; (9) the computing device 102 can then determine which of the textual terms have counts exceeding a predefined threshold (e.g., 30), wherein those textual terms may indicate particularly important software features, so at least some of those textual terms may be designated as the software features 110; (10) a portion of the textual terms may be filtered out (e.g., using a predefined filter list) as irrelevant, for example, because they are articles, prepositions, or otherwise relatively common textual terms that provide little value, to improve accuracy; (11) the computing device 102 can apply a machine-learning model 112 to the descriptive information 104 to determine the software features 110; (12) the computing device 102 can additionally or alternatively apply other techniques, such as term frequency-inverse document frequency ("TF-IDF"), to determine the software features 110 of the software project 122, wherein TF-IDF may involve generating a numerical statistic that reflects how important a word is to a document in a corpus; (13) the TF-IDF value increases proportionally to the number of times a word appears in the document and is offset by the number of documents in the corpus that contain the word, which helps to adjust for the fact that some words appear more frequently in general; (14) having determined the software features 110 of the software project 122, the computing device 102 can next determine a feature vector 114 for the software project 122; (15) the computing device 102 can determine the feature vector 114 for the software project 122 by first generating a default feature vector in which all the elements have default values (e.g., zeros); (16) if the computing device 102 determined that the software project 122 has a particular software feature, the computing device 102 can then modify the corresponding element's value in the default feature vector to so indicate; (17) after generating the feature vector 114 for the software project 122, the computing device 102 can store the feature vector 114 in a database 116; (18) the computing device 102 can repeat the above process for any number and combination of software projects to automatically build (e.g., construct) the database 116; (19) each entry can include a relationship between a software project and its feature vector; (20) these entries in the database 116 may be searchable and comparable to perform various computing tasks; (21) the computing device 102 can receive one or more search queries 118 from a client device 120; (22) a search query 118 can be a request to identify one or more software projects in the database 116 that have a particular set of software features; (23) the search process can involve the computing device 102 generating a feature mask 124 based on the particular set of software features associated with a search query 118, wherein a feature mask 124 is a data structure containing elements with values defining criteria for a search; (24) after determining the feature mask 124, the computing device 102 can apply the feature mask 124 to some or all of the feature vectors in the database 116 to determine search results, which may consist of a subset of feature vectors having the particular set of software features requested in the search query 118; (25) applying the feature mask 124 to a feature vector in the database 116 may involve comparing the feature mask 124 to the feature vector or performing mathematical operations using the feature vector and the feature mask 124; (26) applying the feature mask 124 to the feature vector in the database 116 may involve performing a bitwise AND operation between the feature mask 124 and the feature vector; (27) applying the feature mask 124 to the feature vector in the database 116 may involve determining a distance (e.g., Euclidean distance) between the feature mask 124 and the feature vector, where the computing device 102 may return as search results only those feature vectors having distances that are below a predefined threshold; and (28) the search query 118 may be a request to identify a closest match to a software project 122.
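Pokorny's feature-mask search, a bitwise AND against stored feature vectors or a distance comparison under a threshold, is easy to make concrete. The feature vocabulary and projects below are invented for illustration; only the two matching strategies come from the passage above.

# Sketch (invented data, not Pokorny's code): match a feature mask against
# stored feature vectors by bitwise AND, with a Euclidean-distance fallback.
import math

FEATURES = ["python", "flask", "postgres", "docker", "ml"]

def to_vector(feature_names):
    return [1 if f in feature_names else 0 for f in FEATURES]

database = {                                   # project -> feature vector
    "proj_a": to_vector({"python", "flask", "postgres"}),
    "proj_b": to_vector({"python", "docker", "ml"}),
}

def mask_search(required_features):
    mask = to_vector(required_features)
    # Bitwise AND: a project matches when every masked feature is present.
    return [name for name, vec in database.items()
            if all(m & v for m, v in zip(mask, vec) if m)]

def nearest(query_features, threshold=2.0):
    # Distance variant: keep vectors within a predefined threshold of the mask.
    mask = to_vector(query_features)
    dists = {name: math.dist(mask, vec) for name, vec in database.items()}
    return {n: d for n, d in dists.items() if d < threshold}

print(mask_search({"python", "docker"}))       # -> ['proj_b']
print(nearest({"python", "flask"}))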
Pokorny also discloses in ¶¶ [0032]-[0035] with FIGS. 1 and 3 that (1) in block 302, analyze descriptive information 104 about a software project 122 to determine software features 110 of the software project 122; (2) in block 304, generate a feature vector 114 for the software project 122 based on the software features 110 of the software project 122; (3) in block 306, store the feature vector 114 in a database 116 having a plurality of feature vectors for a plurality of software projects; and (4) the blocks 302-306 can be repeated for as many software projects as desired to automatically construct a database 116 of feature vectors, which can be easily searched and compared. Walters et al. (US 10,628,434 B1, date of patent: 04/21/2020) discloses in ABSTRACT and Col. 1, line 39 – Col. 3, line 13 that (1) systems and methods for indexing and clustering machine learned models; (2) systems and methods for searching indexed machine learned models and receiving suggested models based on the clustering of the same; (3) indexing and mapping models by hyperparameters as well as searching indexed models; (4) indexing and mapping models by hyperparameters and cluster characteristics; (5) receiving a plurality of models configured to transform a base representation of data into a structured representation of the data; (6) using one or more templates, generalizing one or more of the models into one or more corresponding neural network architectures; (7) mapping hyperparameters of the one or more generalized models; (8) indexing the one or more generalized models by the hyperparameters; (9) clustering the one or more indexed models using the hyperparameters (or the one or more templates); (10) storing the generalized models with the received models, the index, and the clusters such that the index and the clusters are searchable, i.e., enabling searching for models using the clustered index; (11) searching indexed models by hyperparameters and cluster characteristics; (12) receiving a query requesting at least one model (or models with at least one hyperparameter within at least one range) and executing the query against an index of generalized machine learning models (or an indexed and clustered database of models); (13) the index may include hyperparameters of the models in the database, and the clusters may be based on the hyperparameters; (14) retrieving, in response to the executed query, results including at least one model (or the requested models), and retrieving at least one additional model in a same cluster as the at least one model (or the requested models or neighboring clusters); (15) the cluster may be based on
at least one of structural similarities or hyperparameter similarities; (16) returning the at least one model and the at least one additional model as a response to the query; (17) suggesting similar models in response to model selection; (18) receiving, from a user, an identifier associated with at least one model; (19) retrieving, from a database of models clustered by at least one of structural similarities or hyperparameter similarities, the at least one model; (20) using the clusters in the database, identifying one or more additional models within a same cluster as the at least one model or within a neighboring cluster; (21) returning the at least one model and the one or more additional models to the user; (22) displaying the results by displaying the requested models on one region of a graphical user interface and the one or more additional models on another region of the graphical user interface; (23) retrieving, in response to executing the query, results including the at least one model, as well as one or more additional models having one or more associated degrees of belonging to the same cluster as the at least one model within a threshold; and (24) displaying the results by displaying the at least one model and the one or more additional models together with the one or more associated degrees of belonging on a graphical user interface. Walters further discloses in Col. 4, line 7 – Col. 6, line 38 with FIG. 1 that (1) automatically index and cluster machine learned data models, even if those models are of different types; (2) using generalized representations of the models, which may comprise neural networks, produce indices using comparable hyperparameters and cluster the models using those hyperparameters; (3) use the clusters to suggest models to users that are related to queries from the users; (4) support indexing and clustering of data models, searching and retrieval of data models, optimized choice of parameters for machine learning, and imposition of rules on indexed and clustered data models; (5) computing resources 101 can be configured to (a) index and cluster data models; (b) run applications for generating data models; (c) receive models for training from model optimizer 107, model storage 109, or another component of system 100; and (d) index and cluster received models, e.g., using hyperparameters; (6) dataset generator 103 can be configured to (a) generate data; (b) provide data to computing resources 101, database 105, another component of system 100, or another system; (c) receive data from database 105 or another component of system 100; (d) receive data models from model storage 109 or another component of system 100; (e) generate synthetic data by identifying and replacing sensitive information in data received from database 105 or interface 113; and (f) generate synthetic data using a data model without reliance on input data; (7) database 105 can be configured to store indexed and clustered models for use by system 100; e.g., store models associated with generalized representations of those models (e.g., neural network architectures stored in TensorFlow or other standardized formats); (8) model optimizer 107 can be configured to (a) optimize hyperparameters of indexed and clustered data models for system 100, e.g., adjust one or more hyperparameters of received models and/or corresponding generalized representations of those models before or during indexing and clustering by computing resources 101; (b) optimize models based on instructions received from a user
through interface 113 or another system, e.g., receive input of one or more thresholds, one or more loss functions, and/or one or more limits on a number of interactions and apply the input for optimizing a received model (or corresponding generalized representation such as a neural network) on computing resources 101; (c) apply one or more templates to a data model retrieved from model storage 109 and apply the templates to generate a generalized representation of the retrieved model (e.g., a neural network); (d) select model training parameters for the generalized representation based on model performance feedback received from computing resources 101; and (e) provide trained generalized representations to model storage 109 for storing in association with corresponding models; (9) model storage 109 can be configured to (a) store data models and indexing and cluster information associated therewith; and (b) provide information regarding available data models to a user or another system using interface 113, wherein the information can include model information, such as the type and/or purpose of the model and any measures of classification error; (10) model curator 111 can be configured to (a) cluster models and/or generalized representations of the models, e.g., cluster models based on comparisons of hyperparameters of the models themselves and/or of their generalized representations; and (b) cluster models based on similarities (in hyperparameters or other structural variables, such as activation functions, number of weights, or the like) in structure of the models themselves and/or of their generalized representations; (11) interface 113 can be configured to (a) manage interactions between system 100 and other systems using network 115; (b) publish data received from other components of system 100 (e.g., dataset generator 103, computing resources 101, database 105, or the like); (c) provide results from indexed and clustered models in model storage 109 in response to a query received via interface 113; (d) provide data or instructions received from other systems to components of system 100, e.g., receive instructions for retrieving data models (e.g., according to a query of indexed hyperparameters, clustered structural similarities, or the like) from another system and provide this information to model optimizer 107; and (e) receive data including sensitive portions from another system (e.g., in a file, a message in a publication and subscription framework, a network socket, or the like) and provide that data to dataset generator 103 or database 105; and (12) network 115 can enable communication between components of system 100. Walters also discloses in Col. 6, line 39 – Col. 9, line 15 with FIG.
2 that (1) efficiently searching and indexing (as well as clustering) data models, wherein these data models may parse unstructured data to generate structured data; (2) transformations 210 may receive one or more data models from databases 202, wherein the one or more models may include (a) one or more linear regressions, neural networks, or the like that parse unstructured data into structured data; (b) one or more linear regressions, neural networks, or the like that predict an outcome based on input or predict a full input based on partial inputs (e.g., recommending "Capital One" in response to "Cap"); (c) statistical algorithms, e.g., regression models that estimate the relationships among input and output variables; (d) a convolutional neural network model; (e) a deep fully connected neural network; (f) a recurrent neural network; and (g) Random Forests composed of a combination of decision tree predictors; (3) transformations 210 may transform each received model into a generalized representation; (4) transformations 210 may include one or more blueprints for converting a decision tree to an equivalent neural network structure, a blueprint for converting a Bayesian classifier to an equivalent neural network structure, or the like; (5) a grid search may map hyperparameters of each received model (e.g., a decision tree, a Bayesian classifier, or the like) to an equivalent neural network structure; (6) one or more hyperparameters may be added; e.g., the LASSO algorithm may add a regularization hyperparameter to a model comprising an ordinary least squares regression such that the model is suitable for a grid search; (7) re-train the generalized representation before application of hyperparameter indexer 212 and/or before application of clusterer 214; e.g., use training data initially used to train each received model to then train the generalized representation output by transformations 210; (8) hyperparameter indexer 212 may index parameters of the generalized representations from transformations 210 (e.g., a regularization hyperparameter, a number of coefficients associated with nodes of a neural network, a number of layers of a neural network, a number of leaves in a decision tree, or a learning rate, or the like); (9) clusterer 214 may apply one or more thresholds to one or more hyperparameters in order to classify the generalized representations into one or more clusters; (10) clusterer 214 may apply hierarchical clustering, centroid-based clustering, distribution-based clustering, density-based clustering, or the like to the one or more hyperparameters; (11) clusterer 214 may perform fuzzy clustering such that each generalized representation has an associated score (such as 3 out of 5, 22.5 out of 100, a letter grade such as 'A' or 'C,' or the like) indicating a degree of belongingness in each cluster; (12) compare structural similarities of the generalized representations to perform the clustering; (13) cluster the generalized representations based on a type of activation function used in one or more nodes of a neural network such that one cluster is for polynomials (with possible sub-clusters based on order), another cluster is for logistic functions, or the like; and (14) clusterer 214 may cluster the generalized representations based on ranges of one or more weights associated with one or more nodes of the neural network.
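Walters' hyperparameter indexer and clusterer can be pictured as a lookup table plus threshold bucketing, with an optional fuzzy degree-of-belonging score. The models, threshold values, and membership formula below are invented; only the index-then-cluster structure and the fuzzy-score idea come from the passage above.

# Sketch (invented data, not Walters' code): index generalized models by
# hyperparameters, bucket them into clusters with a threshold, and attach a
# fuzzy degree-of-belonging score to each cluster assignment.
models = {   # generalized representations with comparable hyperparameters
    "m1": {"layers": 3, "learning_rate": 0.01},
    "m2": {"layers": 8, "learning_rate": 0.02},
    "m3": {"layers": 9, "learning_rate": 0.30},
}

# Relational index: hyperparameter -> value -> model ids.
index = {}
for mid, hp in models.items():
    for name, value in hp.items():
        index.setdefault(name, {}).setdefault(value, []).append(mid)

def cluster(models, threshold=5):
    # One threshold on one hyperparameter splits models into two clusters.
    clusters = {"shallow": [], "deep": []}
    for mid, hp in models.items():
        clusters["deep" if hp["layers"] > threshold else "shallow"].append(mid)
    return clusters

def fuzzy_membership(hp, threshold=5, spread=4):
    # Degree of belonging to "deep", between 0 and 1, instead of a hard label.
    return max(0.0, min(1.0, (hp["layers"] - threshold) / spread + 0.5))

print(cluster(models))               # {'shallow': ['m1'], 'deep': ['m2', 'm3']}
print({m: round(fuzzy_membership(hp), 2) for m, hp in models.items()})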
Walters further teaches in Col. 10, line 42 – Col. 11, line 54 with FIG. 6 that (1) at step 602, receive a plurality of models, wherein the plurality of models may comprise machine learned models; (2) at step 604, using one or more templates, generalize one or more of the models into one or more corresponding neural network architectures; e.g., the one or more templates may comprise mappings of model types to corresponding neural network architectures; (3) train the generalized model using the same training and/or testing data used to train the received model; (4) at step 606, map hyperparameters of the one or more generalized models, wherein the hyperparameters may comprise at least one of a regularization hyperparameter, a number of coefficients associated with nodes of a neural network, a number of layers of a neural network, a number of leaves in a decision tree, a learning rate, or the like; (5) the hyperparameters may be directly extracted from a representation of the generalized model (e.g., a number of layers, a number of nodes, or the like may be extracted from a TensorFlow file describing a neural network); (6) the hyperparameters may be determined during a training of the generalized model; e.g., determine a learning rate during training of the generalized model; (7) at step 608, index the one or more generalized models by the hyperparameters; e.g., generate a relational index such that generalized representations are retrievable using the hyperparameters; (8) generate a graphical index such that each generalized representation is a node and is connected, via an edge, to one or more nodes representing the hyperparameters; (9) cluster the indexed models using the hyperparameters and/or the one or more templates; e.g., applying one or more thresholds to one or more of the hyperparameters to generate one or more clusters; (10) the clustering may comprise at least one of hierarchical clustering, centroid-based clustering, distribution-based clustering, or density-based clustering; (11) the clustering may comprise fuzzy clustering such that each generalized model has a score associated with a degree of belonging in each cluster generated by the clustering; and (12) at step 610, enable searching for models using the clustered index. Walters also teaches in Col. 11, line 55 – Col. 12, line 53 with FIG.
Walters also teaches in Col. 11, line 55 – Col. 12, line 53 with FIG. 7 that (1) at step 702, receive a query requesting at least one model, wherein the query may request models with at least one hyperparameter within at least one range; (2) at step 704, execute the query against an index of generalized machine learning models, wherein the index may include hyperparameters of the models; (3) the database of models may be clustered, and the clusters may be based on the hyperparameters; (4) the clusters may be based on structural similarities of the generalized machine learning models; (5) at step 706, retrieve, in response to the executed query and from a database including the indexed generalized machine learning models, results including at least one model; e.g., where the query requests models with at least one hyperparameter within at least one range, retrieve results including the requested models; (6) at step 708, retrieve at least one additional model in a same cluster as the at least one model; (7) the cluster may be based on at least one of structural similarities or hyperparameter similarities; (8) retrieve one or more additional models in the same cluster as the requested models or neighboring clusters; and (9) at step 710, return the at least one model and the at least one additional model as a response to the query; e.g., display the results by displaying the requested models on one region of a graphical user interface and the one or more additional models on another region of the graphical user interface. Walters further discloses in Col. 12, line 54 – Col. 13, line 39 with FIG. 8 that (1) at step 802, receive, from a user, an identifier associated with at least one model; (2) at step 804, retrieve, from a database of models clustered by at least one of structural similarities or hyperparameter similarities, the at least one model; e.g., execute the query against an index of clustered machine learning models; (3) at step 806, using the clusters in the database, identify one or more additional models within a same cluster as the at least one model or within a neighboring cluster; and (4) at step 808, return the at least one model and the one or more additional models to the user; e.g., display the results by displaying the at least one model and the one or more additional models together with the one or more associated degrees of belonging on a graphical user interface.
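For illustration only: a sketch of the FIG. 7 query flow, where a hyperparameter-range query returns direct hits plus additional models drawn from the same clusters. The in-memory data model is hypothetical.

```python
# Hypothetical sketch: range query over an indexed model store, augmented with
# same-cluster neighbors, per steps 702-710 above.
models = [
    {"id": "gm-1", "learning_rate": 0.01, "cluster": "A"},
    {"id": "gm-2", "learning_rate": 0.05, "cluster": "A"},
    {"id": "gm-3", "learning_rate": 0.5,  "cluster": "B"},
]

def query(lo, hi):
    hits = [m for m in models if lo <= m["learning_rate"] <= hi]
    hit_ids = {m["id"] for m in hits}
    hit_clusters = {m["cluster"] for m in hits}
    # Additional models: same cluster as a hit, but not already returned.
    neighbors = [m for m in models
                 if m["cluster"] in hit_clusters and m["id"] not in hit_ids]
    return hits, neighbors

hits, extra = query(0.0, 0.02)
print([m["id"] for m in hits], [m["id"] for m in extra])  # ['gm-1'] ['gm-2']
```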
BLISS et al. (US 2021/0157551 A1, pub. date: 05/27/2021) discloses in ABSTRACT and ¶¶ [0003]-[0014] that (1) creating solution design blueprints, wherein a solution design blueprint is a machine readable data structure that includes a conceptual design model for an application framework; (2) a user interface is configured to receive a plain language textual request from a user that describes a desired application or solution to a problem; (3) artificial intelligence is leveraged to fit the textual request to semantic data models and map elements of the textual request to components of a design library; (4) the resulting solution design blueprint can be presented to a user, and the user interface can be used to provide feedback related to the solution design blueprint that can be utilized to update machine learning algorithms and/or neural networks; (5) the solution design blueprint can be converted to an application framework that is provided to the user in an integrated development environment of the user interface; (6) generating solution design blueprints; (7) a user interface configured to receive a textual request from a user and an ingestion engine is configured to process the textual request by one or more machine learning algorithms to generate a solution design blueprint; (8) at least one machine learning algorithm is configured to parse the textual request based on semantic data models; (9) at least one additional machine learning algorithm is configured to map parsed components of the textual request to components in a design library which includes at least one of relational tables and unstructured data; (10) the one or more machine learning algorithms includes at least one of a recurrent neural network (RNN), a convolutional neural network (CNN), or a self-organizing feature map (SOFM); (11) an intelligent design generator is configured to reject or accept the solution design blueprint and, if the solution design blueprint is rejected, then select a second solution design blueprint generated by the ingestion engine in response to the textual request, or, if the solution design blueprint is accepted, then present the solution design blueprint to the user in the user interface; (12) the user interface is configured to receive feedback from the user for the solution design blueprint; (13) at least one machine learning algorithm can be trained based on the feedback; (14) the intelligent design generator is further configured to consume the solution design blueprint to generate an application framework; (15) at least one user queue that monitors a work flow of a user in generating an application from the application framework; (16) at least one machine learning algorithm can be adjusted based on an analysis of the work flow; (17) the user interface is configured to utilize predictive analytics to suggest completions for the textual request while the user is typing; (18) receiving a textual request from a user interface, parsing the textual request based on semantic data models to generate a parsed request, generating a solution design blueprint based on a design library, and presenting the solution design blueprint to a user in the user interface; (19) the parsing and generating can be performed, at least in part, through execution of at least one machine learning algorithm; and (20) receiving a plurality of solution design blueprints for the textual request, ranking each of the plurality of solution design blueprints, selecting a particular solution design blueprint based on the ranking, and presenting the particular solution design blueprint to the user in the user interface.
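For illustration only: the core BLISS ingestion step, reduced to a toy mapping from tokens of a plain-language request to components in a design library. The library contents and names are hypothetical, not from the reference.

```python
# Hypothetical sketch: mapping parsed tokens of a textual request to components
# in a design library to form a rudimentary solution design blueprint.
DESIGN_LIBRARY = {
    "database": {"component": "PostgresAdapter", "type": "storage"},
    "dashboard": {"component": "ChartPanel", "type": "ui"},
}

def map_request(parsed_tokens):
    components = [DESIGN_LIBRARY[t] for t in parsed_tokens if t in DESIGN_LIBRARY]
    return {"request": parsed_tokens, "components": components}

print(map_request(["build", "dashboard", "over", "database"]))
```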
BLISS further discloses in ¶¶ [0028]-[0044] with FIG. 1 that (1) artificial intelligence and deep learning neural networks can be utilized to assist in the creation of solution design blueprints, which can take the form of a conceptual design model (e.g., a sketch) for an application framework that defines components and integration points; (2) a textual request created by a user is parsed using machine learning algorithms and/or neural networks to fit the elements of the textual request to one or more semantic data models, which helps to establish a deeper understanding of the components included in the application framework and a series of relationships or interconnection points between components; (3) the solution design blueprints are generated based, at least in part, on an existing design library such that existing applications or components can be integrated into the conceptual design model for new applications, as appropriate; (4) feedback related to the generated solution design blueprints can be collected and utilized to update the machine learning algorithms and/or neural networks in order to create better solution design blueprints as more and more textual requests are submitted by developers; (5) the user interface 110 is configured to receive a textual request from a user, which is forwarded to the ingestion engine 120; (6) the textual request is a plain English description of the desired solution design blueprint for a new application, which can include a description of what a software architect desires for a project; e.g., the textual request can list the information sources or other resources available within a system, describe one or more tools (e.g., processes, services, applications, etc.)
that can be used to process the information or resources, list the desired output of the application, and list any desired structure or connection between existing or new components; (7) the format of the textual request is not rigid and can include any description deemed appropriate by the software architect for the identified problem; (8) the ingestion engine 120 is configured to process the textual request by one or more machine learning algorithms to generate a solution design blueprint; (9) the processing can include parsing the textual request based on one or more semantic data models 122 which can identify certain types of speech or relationships between words in the language of the textual request; (10) a semantic data model 122 can also infer structure associated with certain words; (11) the ingestion engine 120 uses one or more machine learning algorithms 152 and/or neural networks 154 in the AI library 150 to parse the textual request to fit the elements of the textual request into one or more of the semantic data models 122; e.g., a machine learning algorithm 152 can be trained to process a sentence in the English language and reduce the sentence to the basic objects in the sentence paired with their corresponding parts of speech; (12) the neural networks 154 can include at least one of a recurrent neural network (RNN), a convolutional neural network (CNN), or a self-organizing feature map (SOFM); (13) the ML algorithms 152 can utilize one or more of the neural networks 154; e.g., a particular ML algorithm 152 can employ a CNN and an RNN to process different portions of the input; (14) the ingestion engine 120 uses one or more machine learning algorithms 152 and/or neural networks 154 in the AI library 150 to map parsed components of the textual request to components in a design library 124, wherein the design library 124 includes information related to assets or components accessible through one or more networks; (15) the design library 124 can also include information about services that can be utilized by an application; (16) the design library 124 can include relational tables, such as databases with structured content, and unstructured data, such as documentation for applications or services, images that depict a basic structure of a software component, or the like; (17) the ingestion engine 120 generates one or more solution design blueprints, which are passed to the intelligent design generator 130; (18) a solution design blueprint refers to a machine-readable representation of a description for a software framework; i.e., a solution design blueprint is an intermediate representation for a software framework that is somewhere between a plain English description of the desired solution for a stated problem and a source code definition of a software framework that can be used to develop source code for a desired application that is designed to address the stated problem; (19) the solution design blueprint is not required to be written in a particular programming language, but should be easily interpretable to be translated from a format of the solution design blueprint into a software framework in a particular programming language; e.g., the solution design blueprint can have a format of a structured document such as an eXtensible Markup Language (XML) document or a JavaScript Object Notation (JSON) document, which includes elements that can easily be translated into a software framework including one or more software modules comprising classes, methods, object definitions, a user interface, or the like;
(20) the intelligent design generator 130 is configured to reject or accept a solution design blueprint; (21) the intelligent design generator can implement logic for analyzing a solution design blueprint based on a heuristic algorithm; (22) a score for the solution design blueprint can be compared against a threshold value to determine whether the solution design blueprint is accepted or rejected; (23) an example of the heuristic algorithm can include calculation of a score based on a complexity (e.g., number of functions, number of parameters per function, number of independent variables, etc.) of the solution design blueprint; (24) if the solution design blueprint is rejected, then the intelligent design generator 130 is configured to select a second solution design blueprint generated by the ingestion engine 120 in response to the textual request; (25) the intelligent design generator 130 is configured to transmit a request to the ingestion engine 120 that causes the ingestion engine 120 to generate a new solution design blueprint; (26) the new solution design blueprint is generated using a different ML algorithm 152 and/or a different neural network 154; (27) if the solution design blueprint is accepted, then the intelligent design generator 130 is configured to present the solution design blueprint to the user in the user interface 110; (28) presenting the solution design blueprint includes displaying a representation of the solution design blueprint on a display device within the user interface 110; (29) the user interface 110 is configured to receive feedback from the user for the solution design blueprint; e.g., a prompt can request the user's feedback as to whether the solution design blueprint is accepted or rejected by the user; (30) since the user's feedback may be subjective, the feedback can be used to train at least one machine learning algorithm 152 or neural network 154; e.g., when a user rejects a provided solution design blueprint, the rejection can be used to adjust parameters of a machine learning algorithm 152 and/or a neural network 154 in order to change the result of the next solution design blueprint using the machine learning algorithm 152 and/or the neural network 154; (31) the user interface 110 can be configured to allow a user to generate a software framework based on the solution design blueprint; (32) the tasks performed by the user within the user interface 110 can be analyzed to provide the feedback on which the machine learning algorithm 152 and/or the neural network 154 is trained; (33) in this manner, the feedback used to train the AI components in the AI library 150 is not subjective, but objective based on how extensively the software framework was modified to meet the design goal to develop the application; and (34) the intelligent design generator 130 is configured to create the software framework based on the solution design blueprint.
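For illustration only: the reference says a blueprint can be a structured JSON or XML document with elements that translate into software modules. A toy JSON-shaped blueprint, serialized from Python; the element names are hypothetical.

```python
# Hypothetical sketch: a solution design blueprint as a structured JSON document,
# language-independent but easily translated into a software framework.
import json

blueprint = {
    "modules": [
        {"class": "IngestService", "methods": ["load", "validate"]},
        {"class": "ReportView", "methods": ["render"]},
    ],
    "integration_points": [{"from": "IngestService", "to": "ReportView"}],
}
print(json.dumps(blueprint, indent=2))
```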
BLISS also discloses in ¶¶ [0045]-[0052] with FIGS. 1 and 2A-B that (1) at step 202, a textual request is received from a user interface 110; (2) at step 204, the textual request is parsed based on semantic data models 122 to generate a parsed request; (3) the textual request is processed by at least one machine learning algorithm 152 and/or at least one neural network 154 in order to generate the parsed request that is fit to one or more of the semantic data models 122; (4) at step 206, a solution design blueprint is generated based on a design library 124; (5) the parsed request is processed by at least one additional machine learning algorithm 152 and/or at least one additional neural network 154 in order to generate the solution design blueprint; (6) the machine learning algorithm 152 can consume the design library 124 in order to map components of the parsed request to existing components in the design library 124; (7) at step 208, the solution design blueprint is presented to a user in the user interface 110; (8) at step 252, feedback for the solution design blueprint is received from a user; (9) the feedback can include an indication of whether the solution design blueprint is accepted or rejected, an indication of changes made to the solution design blueprint, an indication of whether the solution design blueprint was implemented or discarded, or the like; (10) at step 254, at least one machine learning algorithm 152 is trained based on the feedback; (11) parameters for one or more machine learning algorithms are adjusted using, e.g., back propagation with gradient descent or other types of training techniques based on the feedback; and (12) the at least one machine learning algorithm 152 includes an adversarial neural network, and the feedback and solution design blueprint are used as a training sample to train the adversarial neural network.
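For illustration only: step 254 describes adjusting parameters by gradient descent on user feedback. A toy one-parameter version (accepted = 1, rejected = 0); the loss and update rule are assumptions, not the reference's implementation.

```python
# Hypothetical sketch: one gradient-descent step on squared error between a
# weighted blueprint score and binary user feedback, per step 254 above.
def train_on_feedback(weight, score, accepted, lr=0.1):
    prediction = weight * score
    grad = 2 * (prediction - accepted) * score  # d/dw of (w*score - accepted)^2
    return weight - lr * grad

w = 0.5
for score, accepted in [(0.8, 1), (0.3, 0), (0.9, 1)]:
    w = train_on_feedback(w, score, accepted)
print(round(w, 3))  # weight drifts toward predicting acceptance correctly
```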
BLISS further teaches in ¶¶ [0053]-[] with FIGS. 3A-B that (1) a textual request 302 is received from a user interface 110 by the ingestion engine 120, wherein the textual request is processed by a first machine learning algorithm 152-1, which also ingests the semantic data model(s) 122; (2) the first machine learning algorithm 152-1 operates to parse the elements of the textual request to fit the semantic data model(s) 122 in order to generate the parsed request 304; (3) the parsed request 304 is then processed by a second machine learning algorithm 152-2, which also ingests the design library 124, wherein the second machine learning algorithm 152-2 operates to map zero or more elements in the parsed request to corresponding elements included in the design library 124; (4) the transformation of the textual request 302 into a parsed request 304 that fits one or more of the semantic data models 122 where certain elements of the parsed request 304 are then replaced by existing elements included in the design library 124 results in a solution design blueprint 306; (5) the structure depicted in FIG. 3A is configured to produce one candidate solution design blueprint; (6) however, given the parsed request 304, there are potentially innumerable options for mapping certain plain English elements to existing elements in the design library 124, especially where the design library 124 includes various alternatives for performing the same or similar functions; (7) the resulting selection can depend on the structure of the second machine learning algorithm 152-2, as some algorithms may be better at producing results in certain conditions compared to other algorithms; (8) if a solution design blueprint is rejected by a user, then the ingestion engine 120 may need to be re-run using slightly different inputs (e.g., textual request, parsed request, etc.) such as by randomly changing certain terms in the textual request to different terms with similar semantic meaning or by randomly changing a subset of semantic data models 122 or a subset of the design library 124 ingested by the machine learning algorithms 152 in order to produce a different solution design blueprint for the same original textual request; (9) the parameters associated with the machine learning algorithm 152, such as the weights for a particular neural network 154 executed as part of the machine learning algorithm 152, can be adjusted before processing the textual request a second time using the same semantic data models 122 and/or design library 124; (10) process the parsed request 304 by two or more different machine learning algorithms 152 in parallel paths to generate two or more candidate solution design blueprints 306 at the same, or substantially similar, time; (11) each of the three machine learning algorithms 152 ingests the design library 124 and generates a separate and distinct solution design blueprint 306-1, 306-2, and 306-3, respectively; and (12) the textual request 302 can also be processed by multiple instances of different machine learning algorithms 152 to generate multiple different parsed requests that are each processed by one or more additional machine learning algorithms 152 to generate a plurality of candidate solution design blueprints 306.
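For illustration only: item (8) describes re-running the ingestion engine on randomly perturbed inputs (e.g., swapping terms for near-synonyms) to obtain alternative candidate blueprints. A toy perturbation; the synonym table is hypothetical.

```python
# Hypothetical sketch: generating variant parsed requests by random synonym
# substitution, each variant feeding a separate candidate blueprint.
import random

SYNONYMS = {"dashboard": ["dashboard", "report view"],
            "database": ["database", "data store"]}

def perturb(tokens, rng):
    return [rng.choice(SYNONYMS.get(t, [t])) for t in tokens]

rng = random.Random(0)
request = ["dashboard", "over", "database"]
candidates = [perturb(request, rng) for _ in range(3)]
print(candidates)  # three variant requests -> three candidate blueprints
```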
BLISS also teaches in ¶¶ [0059]-[0071] with FIGS. 4A-C that (1) the intelligent design generator 130 receives a candidate solution design blueprint 306 from the ingestion engine 120; (2) a machine learning algorithm 152-5 processes the solution design blueprint 306 to generate a score 402; (3) selection logic 410 then determines, based at least in part on the score 402, whether to transmit the candidate solution design blueprint 306 to the UI 110 or to discard the solution design blueprint 306 as unacceptable and request a second, alternative solution design blueprint 306 from the ingestion engine 120; (4) the machine learning algorithm 152-5 is a classifier algorithm that classifies the solution design blueprint 306 as either acceptable or not acceptable; (5) the machine learning algorithm 152-5 calculates a heuristic value based on different characteristics of the solution design blueprint 306, one or more components of which can be the result of processing the solution design blueprint 306 by a corresponding neural network 154; e.g., the heuristic value can be calculated by summing individual scores for different characteristics such as length, number of components, average number of parameters per method, or the like, wherein the heuristic value can then be mapped to a score 402; (6) the selection logic 410 is configured to compare the score 402 to a threshold value to determine whether the candidate solution design blueprint 306 is accepted or rejected; (7) the threshold value can be dynamically configured to match an expected acceptance rate; (8) the operation of the intelligent design generator 130 is described in FIG. 4A as processing a single solution design blueprint 306 at a time; (9) alternatively, the intelligent design generator 130 can be configured to process a plurality of solution design blueprints 306 at once to select a preferred solution design blueprint 306 from the plurality of alternatives as the candidate solution design blueprint 306; (10) the machine learning algorithm 152-6 receives three solution design blueprints 306-1, 306-2, 306-3 from the ingestion engine 120; (11) the machine learning algorithm 152-6 processes the three solution design blueprints 306-1, 306-2, 306-3 to generate rankings 404; (12) the selection logic 410 is then configured to transmit the solution design blueprint 306 assigned the highest ranking (e.g., a ranking of 1) to the UI 110; (13) the machine learning algorithm 152-6 generates rankings as independent heuristic values for each of the plurality of solution design blueprints 306-1, 306-2, 306-3; (14) the selection logic 410 is then configured to compare the plurality of heuristic values to select the solution design blueprint 306 with the best heuristic value (e.g., the highest heuristic value), where the selected solution design blueprint 306 is transmitted to the UI 110; (15) at step 452, a plurality of solution design blueprints are received in response to a textual request; (16) at step 454, each of the plurality of solution design blueprints 306 is ranked; (17) a machine learning algorithm 152-6 is trained to rank the plurality of solution design blueprints 306 based on features extracted from the solution design blueprints 306; (18) at step 456, a particular solution design blueprint 306 is selected based on the ranks; (19) selection logic 410 analyzes the ranks generated for each of the plurality of solution design blueprints 306 to select an optimal solution design blueprint 306 to present to a user; and (20) at step 458, the particular solution design blueprint 306 is presented to the user in a user interface 110.
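For illustration only: the heuristic scoring plus threshold/ranking selection described above, as a toy function summing per-characteristic scores. The weights and characteristics are hypothetical.

```python
# Hypothetical sketch: score candidate blueprints by summed characteristic
# scores, rank them, and apply a threshold, per the selection logic above.
def heuristic_score(bp):
    n_components = len(bp["components"])
    avg_params = sum(bp["params_per_method"]) / max(len(bp["params_per_method"]), 1)
    return n_components * 1.0 - avg_params * 0.5  # fewer params/method scores higher

candidates = [
    {"id": "bp-1", "components": ["a", "b"], "params_per_method": [2, 4]},
    {"id": "bp-2", "components": ["a", "b", "c"], "params_per_method": [1, 1, 2]},
]
THRESHOLD = 0.5
ranked = sorted(candidates, key=heuristic_score, reverse=True)
best = ranked[0]
print(best["id"], heuristic_score(best) >= THRESHOLD)  # bp-2 True
```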
BLISS further discloses in ¶¶ [0072]-[0074] with FIG. 5 that (1) the intelligent design generator 130 is configured to translate or convert the solution design blueprint 306 into an application framework 502; (2) while the solution design blueprint 306 is programming language independent, the framework generator 510 is configured to convert the syntax of the solution design blueprint into usable source code that represents an application framework 502 in a particular programming language; (3) the framework generator 510 can utilize one or more machine learning algorithms 152-8 that are trained to convert the solution design blueprint 306 into the application framework 502; (4) a different machine learning algorithm 152-8 can be trained for each of a plurality of different programming languages such that the resulting application framework 502 can be tailored to a particular programming language desired by the developer; (5) the intelligent design generator 130 is configured to generate an application framework 502 and present the application framework 502 to the user in a user interface 110; and (6) the application framework 502 represents the intermediate solution design blueprint 306, and the user provides feedback related to the solution design blueprint 306 by interacting with, or explicitly providing feedback for, the application framework 502. BLISS also discloses in ¶¶ [0075]-[0076] with FIG. 6 that (1) while the user is typing a textual request, the predictive analytics module 610 can provide suggested completions within the textual request based on at least one of the design library 124 or the AI library 150; (2) the predictive analytics module 610 can analyze a portion of a sentence and predict one or more elements defined in the design library 124 that may complete the sentence based on, e.g., past textual requests generated by one or more users; (3) the predictive analytics module 610 utilizes components of the AI library 150 in order to generate suggested completions; and (4) the AI library 150 can be utilized to generate a database that associates certain words with suggested completions that follow those words.
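For illustration only: the word-to-completions database described in item (4), built here as a simple bigram table over past requests. The sample requests are invented.

```python
# Hypothetical sketch: a word -> suggested-completions database built from
# past textual requests, for suggesting completions while the user types.
from collections import defaultdict

past_requests = ["build a dashboard over the sales database",
                 "build a report over the sales database"]

completions = defaultdict(set)
for req in past_requests:
    words = req.split()
    for prev, nxt in zip(words, words[1:]):
        completions[prev].add(nxt)

print(sorted(completions["a"]))      # ['dashboard', 'report']
print(sorted(completions["sales"]))  # ['database']
```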
BLISS further teaches in ¶¶ [0077]-[0082] with FIGS. 7A-B that (1) the AI library 150 can be utilized to generate a database that associates certain words with suggested completions that follow those words; (2) the user queues 710 are configured to track changes to the source code of the application framework 502 in order to train one or more components of the AI library 150; (3) a machine learning algorithm 152 can be trained based on the feedback tracked by the user queues 710 to help change the output of future solution design blueprints 306 to more closely match the resulting source code used in the application 702; (4) logic is configured to analyze the user queues 710 to adjust the parameters of a neural network 154; (5) the modifications in the user queues 710 are related to a loss value that reflects changes in the desired output of the neural network 154; (6) the loss value can be estimated based on the changes in the user queues 710 and used to adjust the parameters of the neural network 154, which is a form of supervised learning; (7) at step 752, a solution design blueprint is consumed to generate an application framework 502; (8) at step 754, a work flow of the user is collected while the user develops an application using the application framework 502; (9) at step 756, at least one machine learning algorithm is adjusted based on an analysis of the work flow; (10) logic is configured to analyze the modifications included in the work flow to determine an optimum application framework that is ultimately utilized for development of the application 702; (11) a difference between the optimum application framework and the application framework produced by the intelligent design generator 130 can be utilized to adjust the at least one machine learning algorithm; and (12) the machine learning algorithm can utilize a neural network and the difference is used to modify the parameters of the neural network.
Yuan et al. (US 11,599,813 B1, filed on 09/26/2019) discloses in ABSTRACT and Col. 2, line 21 – Col. 3, line 27 that (1) interactive workflow generation for machine learning lifecycle management; (2) determines one or more prompts associated with use of a machine learning model; (3) input representing one or more responses to the one or more prompts is received; (4) the one or more responses are provided via a user interface; (5) determine one or more workflows associated with the machine learning model; (6) the workflow(s) are determined based at least in part on the one or more responses; (7) the workflow(s) comprise a plurality of tasks associated with use of the machine learning model at a plurality of stages of a lifecycle of the model; (8) one or more computing resources are determined, and at least a portion of the workflow(s) is performed using the one or more computing resources; (9) an inefficient approach often led to a loss of productivity as well as potential quality problems with machine learning models due to a lack of enforcement for tasks such as versioning and approval; (10) manage an end-to-end machine learning model development lifecycle (MDLC) for rapid design, training, and productionization of high-quality machine learning models; (11) implement workflows or blueprints for various stages of a lifecycle of a machine learning model, such as data sourcing, quality monitoring, feature engineering, model training, back-testing, evaluation and promotion, deployment (e.g., to produce inferences), and performance monitoring; (12) by building and implementing workflows for an end-to-end machine learning lifecycle, the machine learning management system may accelerate the delivery of machine learning solutions while also improving the quality of machine learning models with versioning, auditing, and approval mechanisms; (13) implement workflows using a set of workflow templates along with a step library that represents machine learning tasks; (14) workflows may be determined dynamically based (at least in part) on interactions between the machine learning management system and users (e.g., machine learning scientists, developers, and so on) via a user interface; (15) the user interface may display or present a set of prompts or questions, and the user's answers to those questions may determine the presentation of additional prompts as well as the selection of one or more workflow templates and configuration of one or more workflows from the selected template(s); (16) the machine learning management system may determine and/or provision computing resources for performing a particular workflow; (17) generate a resource template that describes resources and their architecture, wherein the resource template may be merged into a continuous deployment pipeline so that the workflow can be performed using the provisioned resources in a multi-tenant provider network; (18) workflows may be implemented using orchestration of various services of the provider network; (19) models may be maintained in a model registry, and changes to workflows or steps in the step library may be automatically deployed to existing workflows; and (20) achieving certain technical advantages, including some or all of the following: (a) reducing the consumption of computing resources by accelerating processes for development, deployment, and use of machine learning models; (b) reducing the consumption of computing resources for development, deployment, and use of machine learning models by provisioning resources on demand and not requiring developers to maintain dedicated fleets of resources; (c) improving the quality of machine learning models by permitting concurrent testing of different versions of a model as sourced from a model registry; (d) improving the quality of machine learning models using versioning, auditing, and approval mechanisms; and (e) improving the use of storage, network, and computational resources by re-use of workflow steps for different workflows and different users; and so on.
Yuan further discloses in Col. 10, line 35 – Col. 12, line 4 with FIG. 2 that (1) one or more prompts may be presented to a user via a user interface, wherein the prompt(s) may be determined by an interactive workflow builder of a machine learning management system and sent over a network to a computing device operated by the user; (2) the prompt(s) may represent questions associated with the use of a machine learning model; (3) the prompt(s) may solicit the selection of a machine learning process such as model training, batch inference, or real-time inference, e.g., by presenting a list of such processes; (4) one or more responses to the prompt(s) may be received via the user interface, wherein the response(s) may be entered by a user into a computing device via appropriate input and sent over a network to the interactive workflow builder; (5) the response(s) may represent answers to the questions associated with the use of a machine learning model; (6) the response(s) may include a selection of a machine learning process such as model training, batch inference, or real-time inference; (7) a user may select multiple workflow templates to represent an end-to-end machine learning lifecycle, such as one workflow template for ad-hoc training, one workflow template for scheduled production batch inferencing, and another workflow template for model performance monitoring; (8) the interactive workflow builder may determine, select, or otherwise configure one or more workflows based (at least in part) on the response(s); e.g., to begin building a workflow for model training, the interactive workflow builder may select a model training workflow template from a library of workflow templates; (9) a workflow may include a set of steps or other ordered elements representing tasks to be performed in association with a machine learning model; e.g., workflows or their steps may relate to data sourcing, quality monitoring, feature engineering, model training, back-testing, evaluation and promotion, deployment (e.g., to produce inferences), and performance monitoring; (10) the interactive workflow builder may determine a customized workflow for the user based (at least in part) on the user's responses to the prompts and based (at least in part) on a selected workflow template; (11) the one or more workflows may define workflow steps at a plurality of stages of the lifecycle of the machine learning model, such as data sourcing, data quality monitoring, feature engineering, model development, model training and evaluation, model deployment and inference, model monitoring, and so on; (12) the interactive workflow builder may determine whether the workflow(s) are complete; (13) a workflow may be deemed complete when the interactive workflow builder has no additional questions to ask the user concerning the workflow being generated; (14) if one or more of the workflows are not complete, the method may continue with the operations shown in 200, 210, and 220; (15) the interactive workflow builder may determine additional prompt(s) to solicit configuration information for customizing a workflow template for the specific needs of the user;
(16) the user interface may solicit input representing a training input dataset, references for a data collection step, data quality rules for a quality monitoring step, a specific model training algorithm for a training step, model approval rules for an approval step, locations at which to store outputs of various steps, a specific version of a model to be used for inference, and so on; (17) determine and/or provision one or more computing resources to perform the generated workflow(s); (18) in determining the resources, the machine learning management system may select or determine resource types, resource numbers, and resource configurations; (19) the machine learning management system may provision the resources from one or more resource pools of a multi-tenant provider network, which may include selecting the resource, reserving the resource for use by a particular account, configuring the resource to perform the desired task(s), and so on; (20) the resource pools may include compute instances, storage resources, and other resources usable to perform machine learning tasks; (21) to provision resources, the machine learning management system may interact with a resource manager of the provider network to select and reserve particular resources for use in performing workflows (or portions thereof) on behalf of particular users; (22) the steps of the workflow(s) may be performed using the selected computing resource(s); (23) the machine learning management system may generate a resource template that describes resources and their architecture for a particular workflow, wherein the resource template may be provided to a cloud-based resource management service offered by the provider network; (24) the resource template may be merged into a continuous deployment pipeline so that the workflow can be performed using the provisioned resources in the multi-tenant provider network; and (25) workflows may be implemented using orchestration of various services of the provider network; e.g., services in the provider network that implement machine learning tasks may include virtualized compute services that offer virtual compute instances, virtualized storage services that offer virtual storage resources, virtualized graphics processing services that offer virtual graphics processing units (GPUs), serverless computation services that execute code on behalf of clients, batch computing services that run batch computing jobs, machine learning endpoints in a machine learning framework, and so on.
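For illustration only: Yuan's interactive workflow builder selects and customizes a workflow template from prompt responses. A toy version with an invented template library; the step names and response keys are hypothetical.

```python
# Hypothetical sketch: building a workflow from prompt responses by selecting
# a template and inserting response-driven steps, per FIG. 2 above.
WORKFLOW_TEMPLATES = {
    "model training": ["source data", "engineer features", "train", "evaluate", "approve"],
    "batch inference": ["source data", "load model version", "run inference", "store output"],
}

def build_workflow(responses):
    steps = list(WORKFLOW_TEMPLATES[responses["process"]])
    if responses.get("quality_monitoring"):
        steps.insert(1, "monitor data quality")  # customize per the response
    return steps

print(build_workflow({"process": "model training", "quality_monitoring": True}))
```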
However, the closest prior art of record, as discussed above, singly or in combination, does not teach or suggest at least the following features: "receive, via a graphical user interface from a client device, a request to search for one or more blueprints including one or more models to add to a project configured to deploy the one or more models trained via machine learning; identify, based on a selection received from the client device via the graphical user interface, a list of features with which to execute the requested search; provide, responsive to execution of the search with the list of features, a blueprint comprising a model selected from a plurality of projects established via input from a plurality of client devices different from the client device, the plurality of projects including a plurality of blueprints, the plurality of blueprints including a plurality of models trained by machine learning to determine a target based on a list of features; train, via machine learning, the model of the blueprint to determine the target and add the blueprint including the trained model to the project; and generate data causing the graphical user interface to display an indication of the blueprint including the trained model" when combined with all other limitations of the claim as a whole. Conclusion Any inquiry concerning this communication or earlier communications from the examiner should be directed to HWEI-MIN LU whose telephone number is (313) 446-4913. The examiner can normally be reached Mon - Fri: 9:00 AM - 6:00 PM EST. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Mariela D. Reyes, can be reached at (571) 270-1006. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /HWEI-MIN LU/ Primary Examiner, Art Unit 2142

Prosecution Timeline

May 31, 2023
Application Filed
Feb 12, 2026
Non-Final Rejection — §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602578
LIGHT SOURCE COLOR COORDINATE ESTIMATION SYSTEM AND DEEP LEARNING METHOD THEREOF
2y 5m to grant · Granted Apr 14, 2026
Patent 12596954
MACHINE LEARNING FOR MANAGEMENT OF POSITIONING TECHNIQUES AND RADIO FREQUENCY USAGE
2y 5m to grant · Granted Apr 07, 2026
Patent 12591770
PREDICTING A STATE OF A COMPUTER-CONTROLLED ENTITY
2y 5m to grant · Granted Mar 31, 2026
Patent 12579466
DYNAMIC USER-INTERFACE COMPARISON BETWEEN MACHINE LEARNING OUTPUT AND TRAINING DATA
2y 5m to grant · Granted Mar 17, 2026
Patent 12561222
REDUCING BIAS IN MACHINE LEARNING MODELS UTILIZING A FAIRNESS DEVIATION CONSTRAINT AND DECISION MATRIX
2y 5m to grant · Granted Feb 24, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

1-2
Expected OA Rounds
62%
Grant Probability
99%
With Interview (+39.5%)
3y 1m
Median Time to Grant
Low
PTA Risk
Based on 217 resolved cases by this examiner. Grant probability derived from career allow rate.
