Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Status of the Application
The following is a Final Office Action.
In response to the Examiner's communication of 11/5/2025, Applicant filed a response on 2/2/2026, amending claims 1, 3-5, and 7-20.
Claims 1-20 are pending in this application and have been examined.
Response to Amendment
Applicant's amendments to claims 1, 3-5, and 7-20 are sufficient to overcome the claim objections set forth in the previous action. The claim objections are withdrawn.
Applicant's amendments to claims 1, 3-5, and 7-20 are not sufficient to overcome the 35 USC 101 rejections set forth in the previous action.
Applicant's amendments to claims 1, 3-5, and 7-20 are not sufficient to overcome the prior art rejections set forth in the previous action.
Response to Arguments – 35 USC § 101
Applicant’s arguments with respect to the rejections have been fully considered, but they are not persuasive.
Applicant submits, “…the independent claims recite a training technique for a machine learning model that improves the processing time of a developmental process. See e.g., Specification, as filed, [0020]-[0021] and [0127]. As such, the machine learning techniques recited by the claims conform with at least one of the examples provided by the Advance notice of change to the MPEP in light of Ex Parte Desjardins Memorandum, December 5, 2025 (hereinafter Desjardins Memorandum).…As amended, claim 1 recites a process for generating a predictive metric data object and training a machine learning model to optimize this predictive metric data object that improves the performance of a developmental process. The human mind cannot practically (i) receive input data objects, (ii) generate input data object parameters, (iii) generate variance and weighting metrics, or (iv) train or store a machine learning model using the predictive metric data object as a loss function. Accordingly, no element of claim 1, as amended, under its broadest reasonable interpretation may be considered a mental process…Additionally, although mathematical elements may be referenced by the claims, the claims are not directed to a mathematical concept. They are directed to a process for training a machine learning model to optimize a loss function and enhance the efficiency of a developmental process…Claim 1 recites a machine learning training technique that adjusts the parameters of a machine learning model to improve a computer component (i.e., the processing speeds thereof), which has been affirmatively recognized as an improvement sufficient to integrate a judicial limitation into a practical application. 
Thus, even if claim 1 were directed to an abstract idea-which, Applicant submits, it is not- the claim recites a combination of additional elements that improves a technical field such that the claim as a whole integrates any alleged abstract idea into a practical application that is patent eligible under 35 U.S.C. § 101…As amended, claim 1 recites a process for training and storing a machine learning model to optimize a loss function of a predictive metric data object in order to evaluate and improve the performance of a developmental process. This is an improvement to the training of a machine learning model that also improves the efficiency of a developmental process. Thus, claim 1 recites an improvement in machine learning training that improves the performance of a computer - not an underlying abstract idea….” The Examiner respectfully disagrees.
While Applicant’s amendments further prosecution, unlike Desjardins, the claims and the argued elements are directed to “…a process for generating a predictive metric data…to optimize this predictive metric data object that improves the performance of a developmental process…to optimize a loss function and enhance the efficiency of a developmental process…to evaluate and improve the performance of a developmental process…,” which is a problem directed to a mental process (i.e., a human training and using mathematical modeling with mathematical weighting, mathematical variance, mathematical prediction, and a mathematical loss function to evaluate and develop more efficient mental methods and mental processes) and to mathematical concepts (i.e., a human training and using mathematical modeling with mathematical weighting, mathematical variance, mathematical prediction, and a mathematical loss function to evaluate and develop more efficient mental methods and mental processes), as established in Step 2A, Prong 1. This problem does not specifically arise in the realm of computer technology; rather, this problem existed and was addressed long before the advent of computers. Thus, the claims do not recite a technical improvement to a technical problem. Additionally, pursuant to the broadest reasonable interpretation, as an ordered combination, each of the additional elements is a computing element recited at a high level of generality implementing the abstract idea, and thus, the elements amount to no more than applying the abstract idea with generic computer components, i.e., a computer and machine learning. Further, these additional elements generally link the abstract idea to a technical environment, namely the environment of a computer, performing extra-solution activities. Therefore, as a whole, the additional elements do not integrate the abstract idea into a practical application under Step 2A, Prong 2, or amount to significantly more under Step 2B.
Even novel and newly discovered judicial exceptions are still exceptions, despite their novelty. July 2015 Update, p. 3; see SAP America Inc. v. Investpic, LLC, No. 2017-2081, slip op. at 2 (Fed. Cir. May 15, 2018).
Simply reciting specific limitations that narrow the abstract idea does not make an abstract idea non-abstract. 79 Fed. Reg. 74631; buySAFE, Inc. v. Google, Inc., 765 F.3d 1350, 1355 (Fed. Cir. 2014); see SAP America at p. 12. As discussed in SAP America, no matter how much of an advance the claims recite, when “the advance lies entirely in the realm of abstract ideas, with no plausibly alleged innovation in the non-abstract application realm,” “[a]n advance of that nature is ineligible for patenting.” Id. at p. 3.
Use of a computer or other machinery in its ordinary capacity for economic or other tasks (e.g., to receive, store, or transmit data) or simply adding a general purpose computer or computer components after the fact to an abstract idea (e.g., a fundamental economic practice or mathematical equation) does not integrate a judicial exception into a practical application or provide significantly more. See Affinity Labs v. DirecTV, 838 F.3d 1253, 1262, 120 USPQ2d 1201, 1207 (Fed. Cir. 2016) (cellular telephone); TLI Communications LLC v. AV Automotive, LLC, 823 F.3d 607, 613, 118 USPQ2d 1744, 1748 (Fed. Cir. 2016) (computer server and telephone unit). Similarly, “claiming the improved speed or efficiency inherent with applying the abstract idea on a computer” does not integrate a judicial exception into a practical application or provide an inventive concept. Intellectual Ventures I LLC v. Capital One Bank (USA), 792 F.3d 1363, 1367, 115 USPQ2d 1636, 1639 (Fed. Cir. 2015).
Response to Arguments – Prior Art
Applicant’s arguments with respect to the rejections have been fully considered, but they are not persuasive.
Applicant submits, “…The cited references do not disclose, teach, or suggest "receiving a plurality of input data objects associated with a developmental process, wherein an input data object of the plurality of input data objects comprises a plurality of input data object parameters, and the plurality of input data object parameters comprises a contextual attribute and a predictive metric attribute" or "training, using the predictive metric data object, a machine learning model to generate a processing optimization action to reduce a processing time associated with the developmental process; and storing the machine learning model in association with the developmental process."…The cited references do not disclose, teach, or suggest "receiving a plurality of input data objects associated with a developmental process, wherein an input data object of the plurality of input data objects comprises a plurality of input data object parameters, and the plurality of input data object parameters comprises a contextual attribute and a predictive metric attribute." The references do not teach consolidated values with specific contextual and predictive attributes in this way….the cited references do not disclose, teach, or suggest "training, using the predictive metric data object, a machine learning model to generate a processing optimization action to reduce a processing time associated with the developmental process; and storing the machine learning model in association with the developmental process." The references do not train and store a machine learning model in this way, especially wherein the optimization of the predictive metric data object involves optimizing a loss function….” The Examiner respectfully disagrees.
The Examiner respectfully notes that the claims do not recite or require “optimizing a loss function.”
Under the broadest reasonable interpretation, Schoedl teaches:
receiving, by one or more processors, a plurality of input data objects associated with a developmental process, wherein an input data object of the plurality of input data objects comprises a plurality of input data object parameters, and the plurality of input data object parameters comprises a contextual attribute and a predictive metric attribute;(in at least [0063] the prediction logic may be used for predicting whether a particular drug D1 having a target molecule T1 will be approved by the FDA for treating disease D and may in addition be used for prediction whether a particular drug D2 having a target molecule T2 will be approved by the FDA for treating the disease D. Thus, the input data for the two predictions may differ because the names of the drug targets T1, T2 differ. The mobile device displays a prediction list on the display of the mobile device. Each list item represents one of the received prediction results and comprises at least a thumbnail-analog scale icon graphically representing said prediction result. [0132] receive some input data 969, e.g. a specification of the name (i.e. contextual attribute) of one or more target molecules of the drug of interest (i.e. predictive metric attribute). The prediction unit 955 can then analyze the currently available literature for identifying documents or document abstracts mentioning the names of the one or more target molecules as well as the name of the disease to be treated and analyzing meta-data associated with the identified documents. For example, the predictor can analyze the author names, publication date, cross references to other documents, the names of diseases, metabolites, genes or drugs mentioned in the documents for extracting a plurality of literature-based features for the one or more target molecules provided as input.)
…
training, by the one or more processors and using the predictive metric data object, a machine learning model to generate a processing optimization action to reduce a processing time associated with the developmental process; and (in at least [0051] The method comprises sending the first prediction results selectively to the mobile devices of the users to which the prediction tasks for which the first prediction results were generated are assigned. In response to each re-training of the machine learning logic, the machine learning logic automatically performs each of the prediction tasks a further time, thereby respectively using the updated version of the biomedical model for generating a second prediction result. Then, the method comprises sending the second prediction results or a notification of their computation selectively to the mobile devices of the users to which the prediction tasks for which the first prediction results were generated are assigned. [0064] the thumbnail analog scale icons may provide a user with an intuitive impression of the prediction result and the quality of the model used for the prediction. The user is enabled to manage technical task, such as comparing and interpreting a plurality of biomedical prediction results generated by one or many models or model versions, in a more efficient and accurate manner. [0065] allow identifying the most appropriate target and, in general, to compare and evaluate different use case and data input scenarios to find the best solution for a particular biomedical task, e.g. the task of identifying a drug and/or a drug target. In this case, the same model is used for performing the predictions. [0074] repeatedly performing, by the program logic, the generation of the prediction result for the biomedical prediction task, thereby using repeatedly updated versions of the biomedical model. 
The method further comprises visualizing the change of the accuracy of the repeatedly updated biomedical model in the form of a moving image of the analog scale icon, wherein the size of the first and second sub-range indicators, the direction of the pointer and/or the size of the variance bar, if any, vary in the moving image over time. [0131] FIG. 2A depicts the generation and use of a predictive biomedical model according to an embodiment of the invention. For example, the model 958 can be an implicit model of an artificial neural network that was learned implicitly by the network 956 in a training phase. The machine learning logic 956 can be a prediction logic having been trained to predict, based on an analysis of biomedical literature, whether a particular drug will be accepted by the FDA as a treatment for a particular disease or not. For example, the network 956 can initially be trained on a large literature corpus such as the MEDLINE literature database used as training data 966. During the training of the machine learning logic on the training data 966, the model 958 is explicitly or implicitly learned. The functionality of learning a model 958 from training data 966 is illustrated as model generation unit 957, although the model generation process may be an implicit part of the machine learning logic 956 that has not been explicitly specified by a human programmer. The MLL can be implemented using a large variety of programming techniques and/or readily available machine learning tools, libraries and modules. In some embodiments, the logic for training the model and for applying the trained model on some new input data can be implemented in different program modules. In some other embodiments, the biological model is an integral part of the program logic that is trained and/or that performs the prediction, so it may not be possible to separate the biological model from the program logic that generates or uses it. 
For example, the model may be based on a neural network architecture configured to receive input data and features of a particular type and whose weights in the different network layers have been adapted during the training phase thus that the trained neural network architecture is able to perform a prediction based on new input data that corresponds to the structure and type of data used in the training phase. [0132] Once the model 958 has been generated, the machine learning logic 956 can use the model 958 to solve a particular prediction problem. For example, a model-based prediction unit 955 can receive some input data 969, e.g. a specification of the name of one or more target molecules of the drug of interest. The prediction unit 955 can then analyze the currently available literature for identifying documents or document abstracts mentioning the names of the one or more target molecules as well as the name of the disease to be treated and analyzing meta-data associated with the identified documents. For example, the predictor can analyze the author names, publication date, cross references to other documents, the names of diseases, metabolites, genes or drugs mentioned in the documents for extracting a plurality of literature-based features for the one or more target molecules provided as input. The feature extraction can be a data analysis step that is explicitly or implicitly specified in the code of the prediction logic 956. The extracted features are then used as input for the model 958 which generates a prediction whether the drug whose targets were provided as input 969 will be approved by the FDA in the future as a treatment for a particular disease or not. The feature extraction can also be performed in the training phase for extracting features from the training data that are actually fed into the model to be trained. 
[0134] If the prediction result is that the FDA will with 100% likelihood approve the drug, the prediction score (that may be optionally normalized) is, for example, 1. If the prediction result is that the FDA will with 100% likelihood reject approval of the drug, the prediction score (that may be optionally normalized) is, for example, −1. Typically, the prediction score will have a numerical number greater than the minimum value of the scale (greater than −1) and smaller than the maximum value of the scale (smaller than +1).)
storing, by the one or more processors, the machine learning model in association with the developmental process. (in at least [0136] the model generation based on the training data is performed fully automatically, e.g. within a computer implemented model generation and update framework. For example, the training data 966 can be updated and supplemented with additional data on a regular basis, e.g. once a week or once a month. This may be highly advantageous in biomedical domains where the amount of available data is rapidly increasing. This is the case for example for biomedical literature data. The model generation and update framework is preferably configured such that whenever the training data 966 is supplemented with additional training data or is modified by removing or replacing some parts of the training data, the machine learning logic 956 is automatically re-trained on the updated version of the training data 966. Thereby, also an updated version of the biomedical model 958 is automatically generated. If the updated version of the model is used for computing the same prediction on the same inputs data 969 a further time, the prediction result will differ from the previously generated prediction results, because the model has integrated additional, new knowledge that may have an impact on the outcome of a prediction.)
Claim Rejections – 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter.
Claim 1 (similarly claims 13 and 17) recites, “A … method for implementing an automatic data processing scheme for evaluating robust data sets to optimize procedure efficiency, the … method comprising:
receiving, by …, a plurality of input data objects associated with a developmental process, wherein an input data object of the plurality of input data objects comprises a plurality of input data object parameters, and the plurality of input data object parameters comprises a contextual attribute and a predictive metric attribute;
generating, …, a weighting-based input data object parameter for an input data object cohort comprising a subset of the plurality of input data objects associated with a predictive entity, wherein the predictive entity describes a common contextual attribute of the plurality of input data objects;
generating, …, a variance-based input data object cohort parameter for the input data object cohort;
generating, …, a predictive variance metric for an input data object of the input data object cohort based at least in part on the variance-based input data object cohort parameter;
generating, …, a predictive weighting metric for the input data object based at least in part on the weighting-based input data object parameter;
generating, …, a predictive metric data object that represents an indication of efficiency of the developmental process based at least in part on the predictive variance metric and the predictive weighting metric for the input data object;
training, by the … and using the predictive metric data object, a … model to generate a processing optimization action to reduce a processing time associated with the developmental process; and
storing, by the …, the … model in association with the developmental process.”
Analyzing under Step 2A, Prong 1:
The limitations regarding, …receiving, by …, a plurality of input data objects associated with a developmental process, wherein an input data object of the plurality of input data objects comprises a plurality of input data object parameters, and the plurality of input data object parameters comprises a contextual attribute and a predictive metric attribute; generating, …, a weighting-based input data object parameter for an input data object cohort comprising a subset of the plurality of input data objects associated with a predictive entity, wherein the predictive entity describes a common contextual attribute of the plurality of input data objects; generating, …, a variance-based input data object cohort parameter for the input data object cohort; generating, …, a predictive variance metric for an input data object of the input data object cohort based at least in part on the variance-based input data object cohort parameter; generating, …, a predictive weighting metric for the input data object based at least in part on the weighting-based input data object parameter; generating, …, a predictive metric data object that represents an indication of efficiency of the developmental process based at least in part on the predictive variance metric and the predictive weighting metric for the input data object; training, by the … and using the predictive metric data object, a … model to generate a processing optimization action to reduce a processing time associated with the developmental process; and storing, by the …, the … model in association with the developmental process…, under the broadest reasonable interpretation, can include a human using their mind and using pen and paper to perform the identified limitations; therefore, the claims are directed to a mental process.
Further, …receiving, by …, a plurality of input data objects associated with a developmental process, wherein an input data object of the plurality of input data objects comprises a plurality of input data object parameters, and the plurality of input data object parameters comprises a contextual attribute and a predictive metric attribute; generating, …, a weighting-based input data object parameter for an input data object cohort comprising a subset of the plurality of input data objects associated with a predictive entity, wherein the predictive entity describes a common contextual attribute of the plurality of input data objects; generating, …, a variance-based input data object cohort parameter for the input data object cohort; generating, …, a predictive variance metric for an input data object of the input data object cohort based at least in part on the variance-based input data object cohort parameter; generating, …, a predictive weighting metric for the input data object based at least in part on the weighting-based input data object parameter; generating, …, a predictive metric data object that represents an indication of efficiency of the developmental process based at least in part on the predictive variance metric and the predictive weighting metric for the input data object; training, by the … and using the predictive metric data object, a … model to generate a processing optimization action to reduce a processing time associated with the developmental process; and storing, by the …, the … model in association with the developmental process…, are mathematical concepts.
Accordingly, the claims are directed to a mental process and mathematical concepts, and thus, the claims are directed to an abstract idea under the first prong of Step 2A.
Analyzing under Step 2A, Prong 2:
This judicial exception is not integrated into a practical application under the second prong of Step 2A.
In particular, the claims recite the additional elements beyond the recited abstract idea identified under Step 2A, Prong 1, such as:
Claims 1, 13, 17: “computer-implemented,” “by one or more processors,” “A system comprising: one or more processors; and one or more memories storing processor-executable instructions that, when executed by the one or more processors, cause the one or more processors to,” “One or more non-transitory computer-readable media storing processor-executable instructions that, when executed by one or more processors, cause the one or more processors to,” and “machine learning.”
Pursuant to the broadest reasonable interpretation, as an ordered combination, each of these additional elements is a computing element recited at a high level of generality implementing the abstract idea, and thus, the elements amount to no more than applying the abstract idea with generic computer components. Further, these additional elements generally link the abstract idea to a technical environment, namely the environment of a computer.
Additionally, with respect to “…receiving…generating…input…storing…” and “…providing…,” these elements do not add meaningful limitations to integrate the abstract idea into a practical application because they are extra-solution activity, i.e., pre- and post-solution activity: data gathering (“…receiving…generating…input…storing…”) and data output (“…providing…”).
Analyzing under Step 2B:
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under Step 2B.
As noted above, the aforementioned additional elements beyond the recited abstract idea are not sufficient to amount to significantly more than the recited abstract idea because, as an ordered combination, the additional elements are no more than mere instructions to implement the idea using generic computer components (i.e., “apply it”).
Additionally, as an ordered combination, the additional elements append the recited abstract idea to well-understood, routine, and conventional activities in the field, as individually evinced by Applicant’s own disclosure, as required by the Berkheimer Memo, in at least:
[0014] FIG. 2 provides an example predictive data analysis computing entity 106 in accordance with some embodiments discussed herein. In general, the terms computing entity, computer, entity, device, system, and/or similar words used herein interchangeably may refer to, for example, one or more computers, computing entities, desktops, mobile phones, tablets, notebooks, laptops, distributed systems, kiosks, input terminals, servers or server networks, blades, gateways, switches, processing devices, processing entities, set-top boxes, relays, routers, network access points, base stations, the like, and/or any combination of devices or entities adapted to perform the functions, steps/operations, and/or processes described herein. Such functions, steps/operations, and/or processes may include, for example, transmitting, receiving, operating on, processing, displaying, storing, determining, creating/generating, monitoring, evaluating, comparing, and/or similar terms used herein interchangeably. In one embodiment, these functions, steps/operations, and/or processes may be performed on data, content, information, and/or similar terms used herein interchangeably.
[0015] The predictive data analysis computing entity 106 may include a network interface 208 for communicating with various computing entities, such as by communicating data, content, information, and/or similar terms used herein interchangeably that may be transmitted, received, operated on, processed, displayed, stored, and/or the like.
[0016] In one embodiment, the predictive data analysis computing entity 106 may include or be in communication with a processing element 202 (also referred to as processors, processing circuitry, and/or similar terms used herein interchangeably) that communicate with other elements within the predictive data analysis computing entity 106 via a bus, for example. As will be understood, the processing element 202 may be embodied in a number of different ways including, for example, as at least one processor/processing apparatus, one or more processors/processing apparatuses, and/or the like.
[0017] For example, the processing element 202 may be embodied as one or more complex programmable logic devices (CPLDs), microprocessors, multi-core processors, coprocessing entities, application-specific instruction-set processors (ASIPs), microcontrollers, and/or controllers. Further, the processing element 202 may be embodied as one or more other processing devices or circuitry. The term circuitry may refer to an entirely hardware embodiment or a combination of hardware and computer program products. Thus, the processing element 202 may be embodied as integrated circuits, application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), programmable logic arrays (PLAs), hardware accelerators, other circuitry, and/or the like.
[0018] As will therefore be understood, the processing element 202 may be configured for a particular use or configured to execute instructions stored in one or more memory elements including, for example, one or more volatile memories 206 and/or non-volatile memories 204. As such, whether configured by hardware or computer program products, or by a combination thereof, the processing element 202 may be capable of performing steps or operations according to embodiments of the present disclosure when configured accordingly. The processing element 202, for example in combination with the one or more volatile memories 206 and/or non-volatile memories 204, may be capable of implementing one or more computer-implemented methods described herein. In some embodiments, the predictive data analysis computing entity 106 may include a computing apparatus, the processing element 202 may include at least one processor of the computing apparatus, and the one or more volatile memories 206 and/or non-volatile memories 204 may include at least one memory including program code. The at least one memory and the program code may be configured to, upon execution by the at least one processor, cause the computing apparatus to perform one or more steps/operations described herein.
[0019] The non-volatile memories 204 (also referred to as non-volatile storage, memory, memory storage, memory circuitry, media, and/or similar terms used herein interchangeably) may include at least one non-volatile memory device 204, including but not limited to hard disks, ROM, PROM, EPROM, EEPROM, flash memory, MMCs, SD memory cards, Memory Sticks, CBRAM, PRAM, FeRAM, NVRAM, MRAM, RRAM, SONOS, FJG RAM, Millipede memory, racetrack memory, and/or the like.
[0020] As will be recognized, the non-volatile memories 204 may store databases, database instances, database management systems, data, applications, programs, program modules, scripts, source code, object code, byte code, compiled code, interpreted code, machine code, executable instructions, and/or the like. The term database, database instance, database management system, and/or similar terms used herein interchangeably may refer to a collection of records or data that is stored in a computer-readable storage medium using one or more database models, such as a hierarchical database model, network model, relational model, entity-relationship model, object model, document model, semantic model, graph model, and/or the like.
[0021] The one or more volatile memories (also referred to as volatile storage, memory, memory storage, memory circuitry, media, and/or similar terms used herein interchangeably) may include at least one volatile memory device 206, including but not limited to RAM, DRAM, SRAM, FPM DRAM, EDO DRAM, SDRAM, DDR SDRAM, DDR2 SDRAM, DDR3 SDRAM, RDRAM, TTRAM, T-RAM, Z-RAM, RIMM, DIMM, SIMM, VRAM, cache memory, register memory, and/or the like.
[0022] As will be recognized, the volatile memories 206 may be used to store at least portions of the databases, database instances, database management systems, data, applications, programs, program modules, scripts, source code, object code, byte code, compiled code, interpreted code, machine code, executable instructions, and/or the like being executed by, for example, the processing element 202. Thus, the databases, database instances, database management systems, data, applications, programs, program modules, scripts, source code, object code, byte code, compiled code, interpreted code, machine code, executable instructions, and/or the like may be used to control certain embodiments of the operation of the predictive data analysis computing entity 106 with the assistance of the processing element 202.
[0023] As indicated, in one embodiment, the predictive data analysis computing entity 106 may also include the network interface 208 for communicating with various computing entities, such as by communicating data, content, information, and/or the like that may be transmitted, received, operated on, processed, displayed, stored, and/or the like. Such communication may be executed using a wired data transmission protocol, such as fiber distributed data interface (FDDI), digital subscriber line (DSL), Ethernet, asynchronous transfer mode (ATM), frame relay, data over cable service interface specification (DOCSIS), or any other wired transmission protocol. Similarly, the predictive data analysis computing entity 106 may be configured to communicate via wireless client communication networks using any of a variety of protocols, such as general packet radio service (GPRS), Universal Mobile Telecommunications System (UMTS), Code Division Multiple Access 2000 (CDMA2000), CDMA2000 1X (1xRTT), Wideband Code Division Multiple Access (WCDMA), Global System for Mobile Communications (GSM), Enhanced Data rates for GSM Evolution (EDGE), Time Division-Synchronous Code Division Multiple Access (TD-SCDMA), Long Term Evolution (LTE), Evolved Universal Terrestrial Radio Access Network (E-UTRAN), Evolution-Data Optimized (EVDO), High Speed Packet Access (HSPA), High-Speed Downlink Packet Access (HSDPA), IEEE 802.11 (Wi-Fi), Wi-Fi Direct, 802.16 (WiMAX), ultra-wideband (UWB), infrared (IR) protocols, near field communication (NFC) protocols, Wibree, Bluetooth protocols, wireless universal serial bus (USB) protocols, and/or any other wireless protocol.
[0024] FIG. 3 provides an example external computing entity 102A in accordance with some embodiments discussed herein. In general, the terms device, system, computing entity, entity, and/or similar words used herein interchangeably may refer to, for example, one or more computers, computing entities, desktops, mobile phones, tablets, phablets, notebooks, laptops, distributed systems, kiosks, input terminals, servers or server networks, blades, gateways, switches, processing devices, processing entities, set-top boxes, relays, routers, network access points, base stations, the like, and/or any combination of devices or entities adapted to perform the functions, steps/operations, and/or processes described herein. The external computing entities 102A-N may be operated by various parties. As shown in FIG. 3, the external computing entity 102A may include an antenna 312, a transmitter 304 (e.g., radio), a receiver 306 (e.g., radio), and/or an external entity processing element 308 (e.g., CPLDs, microprocessors, multi-core processors, coprocessing entities, ASIPs, microcontrollers, and/or controllers) that provides signals to and receives signals from the transmitter 304 and the receiver 306, correspondingly. As will be understood, the external entity processing element 308 may be embodied in a number of different ways including, for example, as at least one processor/processing apparatus, one or more processors/processing apparatuses, and/or the like as described herein with reference to the processing element 202.
[0025] The signals provided to and received from the transmitter 304 and the receiver 306, correspondingly, may include signaling information/data in accordance with air interface standards of applicable wireless systems. In this regard, the external computing entity 102A may be capable of operating with one or more air interface standards, communication protocols, modulation types, and access types. More particularly, the external computing entity 102A may operate in accordance with any of a number of wireless communication standards and protocols, such as those described above with regard to the predictive data analysis computing entity 106. In a particular embodiment, the external computing entity 102A may operate in accordance with multiple wireless communication standards and protocols, such as UMTS, CDMA2000, 1xRTT, WCDMA, GSM, EDGE, TD-SCDMA, LTE, E-UTRAN, EVDO, HSPA, HSDPA, Wi-Fi, Wi-Fi Direct, WiMAX, UWB, IR, NFC, Bluetooth, USB, and/or the like. Similarly, the external computing entity 102A may operate in accordance with multiple wired communication standards and protocols, such as those described above with regard to the predictive data analysis computing entity 106 via an external entity network interface 320.
[0026] Via these communication standards and protocols, the external computing entity 102A may communicate with various other entities using means such as Unstructured Supplementary Service Data (USSD), Short Message Service (SMS), Multimedia Messaging Service (MMS), Dual-Tone Multi-Frequency Signaling (DTMF), and/or Subscriber Identity Module Dialer (SIM dialer). The external computing entity 102A may also download changes, add-ons, and updates, for instance, to its firmware, software (e.g., including executable instructions, applications, program modules), operating system, and/or the like.
[0027] According to one embodiment, the external computing entity 102A may include location determining embodiments, devices, modules, functionalities, and/or the like. For example, the external computing entity 102A may include outdoor positioning embodiments, such as a location module adapted to acquire, for example, latitude, longitude, altitude, geocode, course, direction, heading, speed, universal time (UTC), date, and/or various other information/data. In one embodiment, the location module may acquire data such as ephemeris data, by identifying the number of satellites in view and the relative positions of those satellites (e.g., using global positioning systems (GPS)). The satellites may be a variety of different satellites, including Low Earth Orbit (LEO) satellite systems, Department of Defense (DOD) satellite systems, the European Union Galileo positioning systems, the Chinese Compass navigation systems, Indian Regional Navigational satellite systems, and/or the like. This data may be collected using a variety of coordinate systems, such as the Decimal Degrees (DD); Degrees, Minutes, Seconds (DMS); Universal Transverse Mercator (UTM); Universal Polar Stereographic (UPS) coordinate systems; and/or the like. Alternatively, the location information/data may be determined by triangulating a position of the external computing entity 102A in connection with a variety of other systems, including cellular towers, Wi-Fi access points, and/or the like. Similarly, the external computing entity 102A may include indoor positioning embodiments, such as a location module adapted to acquire, for example, latitude, longitude, altitude, geocode, course, direction, heading, speed, time, date, and/or various other information/data. Some of the indoor systems may use various position or location technologies including RFID tags, indoor beacons or transmitters, Wi-Fi access points, cellular towers, nearby computing devices (e.g., smartphones, laptops) and/or the like. 
For instance, such technologies may include iBeacons, Gimbal proximity beacons, Bluetooth Low Energy (BLE) transmitters, NFC transmitters, and/or the like. These indoor positioning embodiments may be used in a variety of settings to determine the location of someone or something to within inches or centimeters.
[0028] The external computing entity 102A may include a user interface 316 (e.g., a display, speaker, and/or the like) that may be coupled to the external entity processing element 308. In addition, or alternatively, the external computing entity 102A may include a user input interface 318 (e.g., keypad, touch screen, microphone, and/or the like) coupled to the external entity processing element 308.
[0029] For example, the user interface 316 may be a user application, browser, and/or similar words used herein interchangeably executing on and/or accessible via the external computing entity 102A to interact with and/or cause the display, announcement, and/or the like of information/data to a user. The user input interface 318 may comprise any of a number of input devices or interfaces allowing the external computing entity 102A to receive data including, as examples, a keypad (hard or soft), a touch display, voice/speech interfaces, motion interfaces, and/or any other input device. In embodiments including a keypad, the keypad may include (or cause display of) the conventional numeric (0-9) and related keys (#, *, and/or the like), and other keys used for operating the external computing entity 102A and may include a full set of alphabetic keys or set of keys that may be activated to provide a full set of alphanumeric keys. In addition to providing input, the user input interface 318 may be used, for example, to activate or deactivate certain functions, such as screen savers, sleep modes, and/or the like.
[0030] The external computing entity 102A may also include one or more external entity non-volatile memories 322 and/or one or more external entity volatile memories 324, which may be embedded within and/or may be removable from the external computing entity 102A. As will be understood, the external entity non-volatile memories 322 and/or the external entity volatile memories 324 may be embodied in a number of different ways including, for example, as described herein with reference to the non-volatile memories 204 and/or the volatile memories 206.
[0042] As described below, various embodiments of the present invention leverage robust data processing techniques to make important technical contributions to data- and data-processing-intensive developmental processes.
[0043] FIG. 4 provides a flowchart diagram of an example process 402 for an automatic data processing scheme for evaluating robust data sets to optimize procedure efficiency in accordance with some embodiments discussed herein. The flowchart diagram depicts an automatic data processing scheme for generating insights for a developmental process based at least in part on a plurality of input data objects associated with the developmental process. The automatic data processing scheme may be implemented by one or more computing device(s) and/or system(s) described herein. For example, the predictive data analysis computing entity 106 may utilize the automatic data processing scheme to overcome the various limitations with conventional data modeling, processing, and evaluative techniques.
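Purely as an illustrative sketch, and not as a representation of the claimed implementation or of the cited art, the general shape of such a scheme (grouping input data objects into cohorts by a common contextual attribute, computing a variance-based cohort parameter, aggregating into a predictive metric data object, and using that metric as a loss to update a model parameter) could be modeled as follows. Every name (InputDataObject, build_cohorts, the use of population variance, the mean-of-variances aggregation, and the single-weight gradient step) is a hypothetical assumption chosen for clarity:

```python
# Hypothetical sketch of an automatic data processing scheme of the kind
# described for process 402. All class/function names and the specific
# variance/loss arithmetic are illustrative assumptions only.
from dataclasses import dataclass
from statistics import mean, pvariance


@dataclass
class InputDataObject:
    contextual_attribute: str        # e.g., an identifier shared by a cohort
    predictive_metric_attribute: float


def build_cohorts(objects):
    """Group input data objects by their common contextual attribute."""
    cohorts = {}
    for obj in objects:
        cohorts.setdefault(obj.contextual_attribute, []).append(obj)
    return cohorts


def cohort_variance(cohort):
    """A variance-based cohort parameter: population variance of the
    predictive metric attributes within the cohort."""
    return pvariance([o.predictive_metric_attribute for o in cohort])


def predictive_metric_data_object(cohorts):
    """Aggregate per-cohort variances into one scalar; lower aggregate
    variance is treated here as indicating a more efficient process."""
    return mean(cohort_variance(c) for c in cohorts.values())


def train_step(weight, cohorts, lr=0.1):
    """One illustrative gradient-descent step treating the (weight-scaled)
    predictive metric data object as the loss to be minimized."""
    metric = predictive_metric_data_object(cohorts)
    loss = weight * metric
    grad = metric                    # d(loss)/d(weight)
    return weight - lr * grad, loss
```

In this toy formulation the "training" reduces to a single-parameter gradient step; a real embodiment would presumably backpropagate the metric-derived loss through a full model, but the data flow (objects, cohorts, variance parameter, aggregate metric, parameter update) is the point of the sketch.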
Furthermore, as an ordered combination, these elements amount to generic computer components receiving or transmitting data over a network, performing repetitive calculations, electronic record keeping, and storing and retrieving information in memory, which, as held by the courts, are well-understood, routine, and conventional. See MPEP 2106.05(d).
Moreover, the remaining elements of dependent claims do not transform the recited abstract idea into a patent eligible invention because these remaining elements merely recite further abstract limitations that provide nothing more than simply a narrowing of the abstract idea recited in the independent claims.
Looking at these limitations as an ordered combination adds nothing additional that is sufficient to amount to significantly more than the recited abstract idea because they simply provide instructions to use a generic arrangement of generic computer components to “apply” the recited abstract idea, perform insignificant extra-solution activity, and generally link the abstract idea to a technical environment. Thus, the elements of the claims, considered both individually and as an ordered combination, are not sufficient to ensure that the claim as a whole amounts to significantly more than the abstract idea itself. Since there are no limitations in these claims that transform the exception into a patent eligible application such that these claims amount to significantly more than the exception itself, claims 1-20 are rejected under 35 U.S.C. 101 as being directed to non-statutory subject matter.
Claim Rejections – 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
Determining the scope and contents of the prior art.
Ascertaining the differences between the prior art and the claims at issue.
Resolving the level of ordinary skill in the pertinent art.
Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Patent Publication No. US 2021/0366618 A1 to Schoedl et al. (hereinafter referred to as “Schoedl”) in view of U.S. Patent Publication No. US 2022/0207425 A1 to Nakae et al. (hereinafter referred to as “Nakae”).
As per Claim 1, Schoedl teaches: (Currently Amended) A computer-implemented method for implementing an automatic data processing scheme for evaluating robust data sets to optimize procedure efficiency, the computer-implemented method comprising: ([0171]-[0190])
receiving, by one or more processors, a plurality of input data objects associated with a developmental process, wherein an input data object of the plurality of input data objects comprises a plurality of input data object parameters, and the plurality of input data object parameters comprises a contextual attribute and a predictive metric attribute; (in at least [0063] the prediction logic may be used for predicting whether a particular drug D1 having a target molecule T1 will be approved by the FDA for treating disease D and may in addition be used for prediction whether a particular drug D2 having a target molecule T2 will be approved by the FDA for treating the disease D. Thus, the input data for the two predictions may differ because the names of the drug targets T1, T2 differ. The mobile device displays a prediction list on the display of the mobile device. Each list item represents one of the received prediction results and comprises at least a thumbnail-analog scale icon graphically representing said prediction result. [0132] receive some input data 969, e.g. a specification of the name (i.e. contextual attribute) of one or more target molecules of the drug of interest (i.e. predictive metric attribute). The prediction unit 955 can then analyze the currently available literature for identifying documents or document abstracts mentioning the names of the one or more target molecules as well as the name of the disease to be treated and analyzing meta-data associated with the identified documents. For example, the predictor can analyze the author names, publication date, cross references to other documents, the names of diseases, metabolites, genes or drugs mentioned in the documents for extracting a plurality of literature-based features for the one or more target molecules provided as input.)
generating, by the one or more processors, a …-based input data object parameter for an input data object cohort comprising a subset of the plurality of input data objects associated with a predictive entity, wherein the predictive entity describes a common contextual attribute of the plurality of input data objects; (in at least [0119] The biomedical model can be an explicitly specified model, e.g. a manually, semi-automatically or automatically specified model. Alternatively, the model can be an implicitly specified model that is generated during the training phase of a machine learning logic. For example, the network architecture elements of an artificial neural network that are modified in the training phase (e.g. weights of the “neurons” of the layers) in combination with the network architecture may constitute an implicit predictive model (a “black box” model) adapted for providing predictions for biomedical questions. [0121] The prediction score is indicative of the certainty of the prediction and is a numerical value within a score range. This range can also be referred to as “interval of possible score values”. For example, the score range can be a predefined range between −1 and +1 and any original score value output by the model-based prediction logic is normalized to a numerical value between −1 and +1. Depending on the type of prediction, other score ranges may be used for normalization, e.g. a range between 0 and 1. In the following examples, a score range from −1 to +1 will be used, but this range is an example only and any other predefined score ranges may likewise be used for normalizing an originally provided score value. [0132] receive some (i.e. cohort) input data 969, e.g. a specification of the name of one or more target molecules of the drug of interest (i.e. common contextual attribute). 
The prediction unit 955 can then analyze the currently available literature for identifying documents or document abstracts mentioning the names of the one or more target molecules as well as the name of the disease to be treated and analyzing meta-data associated with the identified documents. For example, the predictor can analyze the author names, publication date, cross references to other documents, the names of diseases, metabolites, genes or drugs mentioned in the documents for extracting a plurality of literature-based features for the one or more target molecules provided as input (i.e. common contextual attribute). [0157] allows a user to easily compare the prediction scores and the quality of a plurality of predictions provided by different models, different model versions and/or different predictive tasks. Thus, a dense visualization of a plurality of highly heterogeneous predictive models and software programs is provided that allows a user to compare the prediction results and the prediction qualities provided by many different models. This is particularly advantageous in the context of life science research and drug development, because these technical fields are characterized by highly heterogeneous IT frameworks, a rapidly increasing amount of structured and unstructured data and a large plurality of different predictive approaches regarding the type of training and input data (literature, sequence data, expression profiles, 3-D structures, array data, image analysis), regarding the type of biomedical question (target prediction, toxicities prediction, drug identification, side effect prediction) and regarding the type of prediction algorithm used (neuronal networks, support vector machines, random forests, rules etc.).)
generating, by the one or more processors, a variance-based input data object cohort parameter for the input data object cohort; (in at least [0133] The prediction result comprises a normalized prediction score 216, a first confidence interval 256.1 and a second confidence interval 256.2 indicating score value sub-ranges with a particular low ratio of false positives or false negatives results. Optionally, the prediction result further comprises a prediction-variance-interval 254. [0186] the user may specify that a prediction-variance-interval 254 of a current prediction is significantly different from a prediction-variance-interval 254 obtained for the previous prediction for the same prediction task if the intersection interval of the two prediction-variance-intervals is more than 15% smaller than the prediction-variance-interval obtained for the previous prediction and/or if the size of the prediction-variance-interval 254 obtained for the new prediction is at least 15% larger than the size of the prediction-variance-interval 254 obtained for the previous prediction. [0187] The user can set different values for the first and second confidence intervals, the prediction-variance-interval and the score value difference threshold.)
generating, by the one or more processors, a predictive variance metric for an input data object of the input data object cohort based at least in part on the variance-based input data object cohort parameter; (in at least [0133] The prediction result comprises a normalized prediction score 216, a first confidence interval 256.1 and a second confidence interval 256.2 indicating score value sub-ranges with a particular low ratio of false positives or false negatives results. Optionally, the prediction result further comprises a prediction-variance-interval 254. [0134] If the prediction result is that the FDA will with 100% likelihood approve the drug, the prediction score (that may be optionally normalized) is, for example, 1. If the prediction result is that the FDA will with 100% likelihood reject approval of the drug, the prediction score (that may be optionally normalized) is, for example, −1. Typically, the prediction score will have a numerical number greater than the minimum value of the scale (greater than −1) and smaller than the maximum value of the scale (smaller than +1). [0146] The variance bar 258 is an optional element that visualizes the prediction-variance-interval, i.e., visualizes if the outcome of a prediction would be very different from the current prediction output if the input data used for the prediction would be modified slightly. If the prediction is robust against small changes of the input parameter values, the prediction variance bar is short, indicating that the prediction would not change much. If the prediction is sensitive to small changes of the input parameter values, the prediction variance bar is broad, indicating that the prediction would change significantly. In the depicted example, the scale is represented as a thick line along parts of the outline of the background area 220.)
generating, by the one or more processors, a predictive … metric for the input data object based at least in part on the …-based input data object parameter; (in at least [0121] The prediction score is indicative of the certainty of the prediction and is a numerical value within a score range. This range can also be referred to as “interval of possible score values”. For example, the score range can be a predefined range between −1 and +1 and any original score value output by the model-based prediction logic is normalized to a numerical value between −1 and +1. Depending on the type of prediction, other score ranges may be used for normalization, e.g. a range between 0 and 1. In the following examples, a score range from −1 to +1 will be used, but this range is an example only and any other predefined score ranges may likewise be used for normalizing an originally provided score value. [0131] the model may be based on a neural network architecture configured to receive input data and features of a particular type and whose weights in the different network layers have been adapted during the training phase thus that the trained neural network architecture is able to perform a prediction based on new input data that corresponds to the structure and type of data used in the training phase.)
generating, by the one or more processors, a predictive metric data object that represents an indication of efficiency of the developmental process based at least in part on the predictive variance metric and the predictive … metric for the input data object; (in at least [0063] the prediction logic may be used for predicting whether a particular drug D1 having a target molecule T1 will be approved by the FDA for treating disease D and may in addition be used for prediction whether a particular drug D2 having a target molecule T2 will be approved by the FDA for treating the disease D. Thus, the input data for the two predictions may differ because the names of the drug targets T1, T2 differ. The mobile device displays a prediction list on the display of the mobile device. Each list item represents one of the received prediction results and comprises at least a thumbnail-analog scale icon graphically representing said prediction result. [0064] the thumbnail analog scale icons may provide a user with an intuitive impression of the prediction result and the quality of the model used for the prediction. The user is enabled to manage technical task, such as comparing and interpreting a plurality of biomedical prediction results generated by one or many models or model versions, in a more efficient and accurate manner. [0133] The prediction result comprises a normalized prediction score 216, a first confidence interval 256.1 and a second confidence interval 256.2 indicating score value sub-ranges with a particular low ratio of false positives or false negatives results. Optionally, the prediction result further comprises a prediction-variance-interval 254. [0134] If the prediction result is that the FDA will with 100% likelihood approve the drug, the prediction score (that may be optionally normalized) is, for example, 1. 
If the prediction result is that the FDA will with 100% likelihood reject approval of the drug, the prediction score (that may be optionally normalized) is, for example, −1. Typically, the prediction score will have a numerical number greater than the minimum value of the scale (greater than −1) and smaller than the maximum value of the scale (smaller than +1). [0170] FIG. 9 depicts a further example of an analog scale icon displayed on a GUI 902. The icon depicted in this figure represents an integrated prediction generated by an integrated model. The integrated model uses the output generated by many different predictive models as input for generating an overall prediction result that may also comprise the elements 216, 254, 256 described already with reference to FIG. 2A. The GUI may comprise additional meta data, e.g. the predictive task, the number of publications identified mentioning the disease and the drug target, the specificity for positive and negative outcome and the like. [0175] The backend program 962 forwards the prediction result 960 to the one of the mobile devices 992, 994, 970 from which the request to perform the prediction was received or to the one of the mobile devices that is assigned to the user to whom the prediction task is assigned in the user-and task-registry 963 of the database 964. In addition, the prediction result can be stored in a prediction history 961 in the database 964. Preferentially, each prediction result stored in the history 961 has assigned some meta-data, e.g. a prediction task for which the prediction was performed, one or more users having assigned the prediction task, the date of the prediction, and ID, type and/or version of the model used for the prediction, and the like. 
The history allows obtaining profiles of the development of the prediction score and the model-based prediction quality and certainty over the time provided that a particular prediction is repeated multiple times on updated versions of the same model for the same predictive task.)
training, by the one or more processors and using the predictive metric data object, a machine learning model to generate a processing optimization action to reduce a processing time associated with the developmental process; and (in at least [0051] The method comprises sending the first prediction results selectively to the mobile devices of the users to which the prediction tasks for which the first prediction results were generated are assigned. In response to each re-training of the machine learning logic, the machine learning logic automatically performs each of the prediction tasks a further time, thereby respectively using the updated version of the biomedical model for generating a second prediction result. Then, the method comprises sending the second prediction results or a notification of their computation selectively to the mobile devices of the users to which the prediction tasks for which the first prediction results were generated are assigned. [0064] the thumbnail analog scale icons may provide a user with an intuitive impression of the prediction result and the quality of the model used for the prediction. The user is enabled to manage technical task, such as comparing and interpreting a plurality of biomedical prediction results generated by one or many models or model versions, in a more efficient and accurate manner. [0065] allow identifying the most appropriate target and, in general, to compare and evaluate different use case and data input scenarios to find the best solution for a particular biomedical task, e.g. the task of identifying a drug and/or a drug target. In this case, the same model is used for performing the predictions. [0074] repeatedly performing, by the program logic, the generation of the prediction result for the biomedical prediction task, thereby using repeatedly updated versions of the biomedical model. 
The method further comprises visualizing the change of the accuracy of the repeatedly updated biomedical model in the form of a moving image of the analog scale icon, wherein the size of the first and second sub-range indicators, the direction of the pointer and/or the size of the variance bar, if any, vary in the moving image over time. [0131] FIG. 2A depicts the generation and use of a predictive biomedical model according to an embodiment of the invention. For example, the model 958 can be an implicit model of an artificial neural network that was learned implicitly by the network 956 in a training phase. The machine learning logic 956 can be a prediction logic having been trained to predict, based on an analysis of biomedical literature, whether a particular drug will be accepted by the FDA as a treatment for a particular disease or not. For example, the network 956 can initially be trained on a large literature corpus such as the MEDLINE literature database used as training data 966. During the training of the machine learning logic on the training data 966, the model 958 is explicitly or implicitly learned. The functionality of learning a model 958 from training data 966 is illustrated as model generation unit 957, although the model generation process may be an implicit part of the machine learning logic 956 that has not been explicitly specified by a human programmer. The MLL can be implemented using a large variety of programming techniques and/or readily available machine learning tools, libraries and modules. In some embodiments, the logic for training the model and for applying the trained model on some new input data can be implemented in different program modules. In some other embodiments, the biological model is an integral part of the program logic that is trained and/or that performs the prediction, so it may not be possible to separate the biological model from the program logic that generates or uses it. 
For example, the model may be based on a neural network architecture configured to receive input data and features of a particular type and whose weights in the different network layers have been adapted during the training phase thus that the trained neural network architecture is able to perform a prediction based on new input data that corresponds to the structure and type of data used in the training phase. [0132] Once the model 958 has been generated, the machine learning logic 956 can use the model 958 to solve a particular prediction problem. For example, a model-based prediction unit 955 can receive some input data 969, e.g. a specification of the name of one or more target molecules of the drug of interest. The prediction unit 955 can then analyze the currently available literature for identifying documents or document abstracts mentioning the names of the one or more target molecules as well as the name of the disease to be treated and analyzing meta-data associated with the identified documents. For example, the predictor can analyze the author names, publication date, cross references to other documents, the names of diseases, metabolites, genes or drugs mentioned in the documents for extracting a plurality of literature-based features for the one or more target molecules provided as input. The feature extraction can be a data analysis step that is explicitly or implicitly specified in the code of the prediction logic 956. The extracted features are then used as input for the model 958 which generates a prediction whether the drug whose targets were provided as input 969 will be approved by the FDA in the future as a treatment for a particular disease or not. The feature extraction can also be performed in the training phase for extracting features from the training data that are actually fed into the model to be trained. 
[0134] If the prediction result is that the FDA will with 100% likelihood approve the drug, the prediction score (that may be optionally normalized) is, for example, 1. If the prediction result is that the FDA will with 100% likelihood reject approval of the drug, the prediction score (that may be optionally normalized) is, for example, −1. Typically, the prediction score will have a numerical number greater than the minimum value of the scale (greater than −1) and smaller than the maximum value of the scale (smaller than +1).)
storing, by the one or more processors, the machine learning model in association with the developmental process. (in at least [0136] the model generation based on the training data is performed fully automatically, e.g. within a computer implemented model generation and update framework. For example, the training data 966 can be updated and supplemented with additional data on a regular basis, e.g. once a week or once a month. This may be highly advantageous in biomedical domains where the amount of available data is rapidly increasing. This is the case for example for biomedical literature data. The model generation and update framework is preferably configured such that whenever the training data 966 is supplemented with additional training data or is modified by removing or replacing some parts of the training data, the machine learning logic 956 is automatically re-trained on the updated version of the training data 966. Thereby, also an updated version of the biomedical model 958 is automatically generated. If the updated version of the model is used for computing the same prediction on the same inputs data 969 a further time, the prediction result will differ from the previously generated prediction results, because the model has integrated additional, new knowledge that may have an impact on the outcome of a prediction.)
Although implied, Schoedl does not expressly disclose the following limitations, which, however, are taught by Nakae:
…weighting… (in at least [0204] “weighting coefficient” is a coefficient that is set so that an important element is calculated as more important in the calculation of the present disclosure, including approximation coefficients. For example, a coefficient can be obtained by approximating a function to data, but the coefficient itself only has a description indicating the degree of approximation. When coefficients are ranked or chosen/discarded on the basis of the magnitude or the like, a difference in contribution within the model is provided to a specific feature, so that this can be considered a weighting coefficient. A weighting coefficient is used in the same meaning as an approximation index of a differentiation function. Examples thereof include R2 value, correlation coefficient, regression coefficient, residual sum of squares (difference in feature from differentiation function))
At the time the invention was filed, it would have been obvious to one of ordinary skill in the art to have modified the teachings of Schoedl with those of Nakae above, with a reasonable expectation of success in arriving at the claimed invention. One of ordinary skill in the art would have been motivated to make this modification to the teachings of Schoedl with the motivation of, … provide a system and the like for augmenting supervisory data while maintaining the relationship among a plurality of supervisory data used for machine learning … when learning data obtained from an organism, the sample augmentation of the present disclosure can reduce the burden imposed on an organism as much as possible instead of imposing many stimulations on the organism in order to obtain many supervisory data when, for example, obtaining reaction data against the stimulation, in a scene where it is difficult to obtain many supervisory data … for improving a differentiation model. This function can be in a pain differentiation model generation unit, or comprised as a separate module… in order to improve accuracy of machine learning…. While simple augmentation of the number of a plurality of supervisory data is insufficient as supervisory data for machine learning and cannot achieve the intended prediction accuracy and high prediction accuracy is required when, for example, learning data obtained from an organism, the present disclosure can also improve the low reliability that can be seen in the prediction by learning the supervisory data in which the number of a plurality of supervisory data has been simply augmented…., as recited in Nakae.
As per Claim 2, Schoedl teaches: The computer-implemented method of claim 1,
wherein the predictive variance metric is indicative of … standard deviations between the input data object and the variance-based input data object cohort parameter for the input data object cohort. (in at least [0038] the variance bar describes the variance (e.g. the standard deviation) of the prediction score. The variance bar indicates the impact of small variations in the input data used for a particular prediction on the score value. An ideal/highly certain prediction can quickly identified by the user by determining that the pointer indicating the score value of the prediction lies within the range of one of the confidence intervals and is associated with a variance bar that is completely within the range of the confidence interval. Thus, complex information relating to the certainty of a model-based prediction can be perceived by a user quickly and intuitively.)
Although implied, Schoedl does not expressly disclose the following limitations, which, however, are taught by Nakae:
…a number of standard deviations… (in at least [0201] In machine learning, over-training (over-fitting) can occur. With over-training, empirical error (prediction error relative to training data) is small, but generalization error (prediction error relative to data from a true model) is large due to selecting a model that is overfitted to training data, such that the original objective of learning cannot be achieved. Generalization errors can be divided into three components, i.e., bias (error resulting from a candidate model set not including a true model; this error is greater for a more simple model set), variance (error resulting from selecting a different prediction model when training data is different; this error is greater for a more complex model set), and noise (deviation of a true model that cannot be fundamentally reduced, independent of the selection of a model set). Since bias and variance cannot be simultaneously reduced, the overall error is reduced by balancing the bias and variance. Since less training data tends to cause overlearning, the possibility of overlearning may be reduced by using the sample augmentation [0742] COVAS templates created beforehand were sorted in ascending order from the minimum value of 0 to the maximum value of 100. From the sorted COVAS templates, 19 ranges were cut out from the minimum value 0 to the maximum value 1000 in the unit of 10 while shifting 5 at a time. These 19 ranges are 19 types of standardization parameters, wherein the mean value and standard deviation of each of these 19 types of standardization parameters are calculated. 19 mean values and 19 standard deviations are each preserved for use upon the off-line chronological data analysis later on.)
The reason and rationale to combine Schoedl and Nakae are the same as recited above.
As per Claim 3, Schoedl teaches: (Currently Amended) The computer-implemented method of claim 2,
wherein generating the predictive variance metric for the input data object of the input data object cohort is further based on a first predictive metric attribute for the input data object, and (in at least [0063] the prediction logic may be used for predicting whether a particular drug D1 having a target molecule T1 will be approved by the FDA for treating disease D and may in addition be used for prediction whether a particular drug D2 having a target molecule T2 will be approved by the FDA for treating the disease D. Thus, the input data for the two predictions may differ because the names of the drug targets T1, T2 differ. The mobile device displays a prediction list on the display of the mobile device. Each list item represents one of the received prediction results and comprises at least a thumbnail-analog scale icon graphically representing said prediction result. [0132] receive some input data 969, e.g. a specification of the name of one or more target molecules of the drug of interest. The prediction unit 955 can then analyze the currently available literature for identifying documents or document abstracts mentioning the names of the one or more target molecules as well as the name of the disease to be treated and analyzing meta-data associated with the identified documents. For example, the predictor can analyze the author names, publication date, cross references to other documents, the names of diseases, metabolites, genes or drugs mentioned in the documents for extracting a plurality of literature-based features for the one or more target molecules provided as input. [0134] If the prediction result is that the FDA will with 100% likelihood approve the drug, the prediction score (that may be optionally normalized) is, for example, 1. If the prediction result is that the FDA will with 100% likelihood reject approval of the drug, the prediction score (that may be optionally normalized) is, for example, −1. 
Typically, the prediction score will have a numerical number greater than the minimum value of the scale (greater than −1) and smaller than the maximum value of the scale (smaller than +1). [0135] For example, a prediction score of 0.7 indicates that a particular drug whose respective target molecule name was provided as input to the prediction logic 956 is predicted to be highly likely approved by the FDA. A prediction score of −0.8 indicates that a particular drug whose respective target molecule name was provided as input to the prediction logic 956 is predicted to be highly likely rejected (not approved) by the FDA. A prediction score of about 0 indicates that the model is not able to clearly predict, for the input data 969 currently provided, whether or not the FDA will approve the drug or not, because the likelihood of refusal and the likelihood of acceptance are considered identical or highly similar by the model. A user can easily and intuitively understand the prediction result simply by having a short look on the position of the pointer: a pointer that points towards to a scale region close to the end of the scale representing the minimum scale value indicates a rejection of the hypothesis/a very low prediction score; a pointer that points towards a scale region close to the end of the scale representing the maximum scale value indicates acceptance of the hypothesis/a very high prediction score; a pointer that points towards the center region of the scale indicates that the prediction result is ambiguous and vague.)
wherein the first predictive metric attribute is indicative of … parameter for the input data object, and wherein generating the predictive variance metric for the input data object comprises: (in at least [0122] The first confidence interval 256.1 is a first sub-interval of the score range and is indicative of the model-specific sub-range of score values known to have a percentage of false negative (FN) predictions below a predefined FN-percentage threshold. For example, the first confidence interval 256.1 can be the sub-interval of the score ranges for which it is known, e.g. based on a statistical analysis of a plurality of model predictions, that any prediction score within this sub-interval has a likelihood of being a false negative score value that is less than the predefined FN-percentage threshold, e.g. less than 10%, or less than 5%, or less than 1%. The suitable size of the FN-percentage threshold preferably depends on the type of prediction that is computed: in case a false negative result would impose significant financial or health-related costs on a patient or the society, the predefined FN-percentage threshold is chosen such that the resulting first sub-range is comparatively narrow. For example, the first sub-range is chosen such that it covers only score values known to comprise a false negative (FN)-percentage of less than 5%. To the contrary, in case a false negative result would not impose significant financial or health-related costs on a patient or the society, the predefined FN-percentage threshold is chosen such that the resulting first sub-range is comparatively broad. For example, the first sub-range is chosen such that it covers only score values known to comprise a false negative percentage of less than 25%. In some embodiments, the first sub-range selectively covers score values known to comprise a false negative percentage of less than 10%. 
[0123] The second confidence interval 256.2 is a second sub-interval of the score range and is indicative of the model-specific sub-range of score values known to have a percentage of false positive (FP) predictions below a predefined FP-percentage threshold. For example, the second confidence interval 256.2 can be the sub-interval of the score ranges for which it is known, e.g. based on a statistical analysis of a plurality of model predictions, that any prediction score within this sub-interval has a likelihood of being a false positive score value that is less than the predefined FP-percentage threshold, e.g. less than 10%, or less than 5%, or less than 1%. The suitable size of the FP-percentage threshold preferably depends on the type of prediction that is computed: in case a false positive result would impose significant financial or health-related costs on a patient or the society, the predefined FP-percentage threshold is chosen such that the resulting second sub-range is comparatively narrow. For example, the second sub-range is chosen such that it covers only score values known to comprise a false positive-percentage of less than 5%. To the contrary, in case a false positive result would not impose significant financial or health-related costs on a patient or the society, the predefined FP-percentage threshold is chosen such that the resulting second sub-range is comparatively broad. For example, the second sub-range is chosen such that it covers only score values known to comprise a FP-percentage of less than 10%.)
generating the variance-based input data object cohort parameter for the input data object cohort based at least in part on a median … parameter of the input data object cohort; (in at least [0088] A “prediction score” as used herein is a numerical value that is indicative of a prediction result. For example, the prediction score may be a normalized numerical value. In some examples, in case the normalized prediction score is higher than the median of all possible normalized score values, the prediction result is that a given hypothesis is predicted to be true. In case the normalized prediction score is lower than the median of all possible normalized score values, the prediction result is that a given hypothesis is predicted to be false. Thus, according to embodiments of the invention, the prediction score is a numerical value that indicates which one out of two possible values or classes is likely correct. These two possible values can be, for example: “membership in a particular class: yes or no”; “drug approval by FDA for a particular drug in respect to a particular disease: yes or no”; etc. [0160] Plot 402 depicts changes in topics for publications related to successful and unsuccessful drugs over time, focusing on the topic “Drug therapy”. The displayed time range are 20 years before a time point “0” which references to a specific significant time point in the development of a drug, in this case the beginning of the earliest phase 2 trial. Each publication is annotated with a limited set of topics, also called Mesh terms. Plot 402 shows the percentage of publications that are annotated with the topic “Drug therapy” for two classes of publications: the “FDA approved” publications (the upper one of the two curves at the right border of the plot) are publications that mention target and an indication of drugs that were approved by the FDA; the “Failed” publications mention the target and an indication of drugs that were terminated in phase 2 or 3. 
The thick lines show the median, the shaded areas the confidence interval (and implicitly the variance) of the distribution.)
determining the … standard deviations between the input data object and the variance-based input data object cohort parameter based at least in part on the … parameter and the median … parameter; and (in at least [0038] the variance bar describes the variance (e.g. the standard deviation) of the prediction score. The variance bar indicates the impact of small variations in the input data used for a particular prediction on the score value. An ideal/highly certain prediction can quickly identified by the user by determining that the pointer indicating the score value of the prediction lies within the range of one of the confidence intervals and is associated with a variance bar that is completely within the range of the confidence interval. [0090] A “prediction-variance-interval” as used herein is a measure that is used to quantify the amount of variation or dispersion of a set of prediction scores computed by a model-based prediction logic. A small prediction-variance-interval indicates a small amount of variation, a large prediction-variance-interval indicates a large amount of variation. Hence, a small prediction-variance-interval (covering only about 7% of the score range or less) may indicate that the score values computed on the currently used input data values and on similar input data values tend to be close to an expected score value. A large prediction-variance-interval indicates that the score values computed on the currently used input data values and on similar input data values tend to be spread out over a wider range of values. The “prediction-variance-interval” can be implemented as a sub-range of score values whereby the width of this sub-range is a measure of the amount of variation or dispersion of the score values. [0091] the prediction-variance-interval represents a standard deviation of score values. 
This may be advantageous as the standard deviation is algebraically simpler than other measures of variance such as the average absolute deviation. However, there are also other measures of the deviation of a prediction score from an expected value, including average absolute deviation, which provide different mathematical properties from standard deviation. [0094] The higher the confidence-level, the broader the prediction-variance-interval and the variance bar: the prediction-variance-interval computed based on a confidence level of 95% will be smaller than a prediction-variance-interval computed based on a confidence level of 99%. )
generating the predictive variance metric for the input data object based at least in part on … of standard deviations between the input data object and the variance-based input data object cohort parameter. (in at least [0091] the prediction-variance-interval represents a standard deviation of score values. This may be advantageous as the standard deviation is algebraically simpler than other measures of variance such as the average absolute deviation. However, there are also other measures of the deviation of a prediction score from an expected value, including average absolute deviation, which provide different mathematical properties from standard deviation. [0093] the width of the prediction-variance-interval is computed for each prediction by the model-based prediction logic based on a predefined (e.g. user-defined or pre-configured) confidence-level, e.g. a confidence-level of 90%. For example, a prediction-variance-interval generated for a particular prediction based on a confidence-level of 90% and computed by a model-based predictor having been trained on a particular sample of training data is a score interval that fulfills the following condition: “were this prediction to be repeated by numerous other versions of the model-based predictor respectively having been trained on another sample of the training data, the fraction of calculated confidence intervals represented as prediction-variance-intervals (which would differ for each sample) that encompass the true population parameter (the true prediction/classification result) would tend toward 90%. [0096] The variability of predictions made by a model-based prediction logic, e.g. bagged learners and random forests, can be determined as described in Stefan Wager, Trevor Hastie and Bradley Efron “Confidence Intervals for Random Forests: The Jackknife and the Infinitesimal Jackknife”, Journal of Machine Learning Research 15 (2014) 1625-1651. The variability of a prediction can be expressed, e.g. 
in the form of a standard error and the width of the prediction-variance-interval can represent and correlate with the standard error.)
Although implied, Schoedl does not expressly disclose the following limitations, which, however, are taught by Nakae:
…cost… (in at least [0369] the calculation cost would be high and the learning process would be inefficient. In this manner, if sigmoid fitting of features or the like additively materialize previous to “machine learning process with contracting of the number of features”, an addition procedure for futilely performing machine learning would be required so that the calculation cost would be high. In view of the above, it is more advantageous to perform the contracting of the present disclosure first. In addition, in the case of an embodiment carrying out contraction, sample augmentation is more efficient when carried out after the contraction, and is thus advantageous. By carrying out sample augmentation right before carrying out machine learning, the machine learning can be carried out in a state in which there are enough samples. [0370] sample augmentation can be carried out before the contraction. In this case, sample augmentation may be carried out for the purpose of how high the precision should be upon carrying out the contraction. In this case, the more the sample is augmented, the higher the calculation cost for the contraction may be.)
… number of standard deviations… (in at least [0201] In machine learning, over-training (over-fitting) can occur. With over-training, empirical error (prediction error relative to training data) is small, but generalization error (prediction error relative to data from a true model) is large due to selecting a model that is overfitted to training data, such that the original objective of learning cannot be achieved. Generalization errors can be divided into three components, i.e., bias (error resulting from a candidate model set not including a true model; this error is greater for a more simple model set), variance (error resulting from selecting a different prediction model when training data is different; this error is greater for a more complex model set), and noise (deviation of a true model that cannot be fundamentally reduced, independent of the selection of a model set). Since bias and variance cannot be simultaneously reduced, the overall error is reduced by balancing the bias and variance. Since less training data tends to cause overlearning, the possibility of overlearning may be reduced by using the sample augmentation [0742] COVAS templates created beforehand were sorted in ascending order from the minimum value of 0 to the maximum value of 100. From the sorted COVAS templates, 19 ranges were cut out from the minimum value 0 to the maximum value 1000 in the unit of 10 while shifting 5 at a time. These 19 ranges are 19 types of standardization parameters, wherein the mean value and standard deviation of each of these 19 types of standardization parameters are calculated. 19 mean values and 19 standard deviations are each preserved for use upon the off-line chronological data analysis later on.)
The reason and rationale to combine Schoedl and Nakae are the same as recited above.
As per Claim 4, Schoedl teaches: (Currently Amended) The computer-implemented method of claim 3, wherein generating the predictive metric data object for the input data object comprises:
aggregating the first predictive metric attribute, the predictive variance metric, and the predictive … metric for the input data object. (in at least [0120] in FIG. 2A in further detail, the prediction result can comprise multiple data values. The prediction result comprises a prediction score 216, a first confidence interval 256.1 and a second confidence interval 256.2. [0160] Plot 402 depicts changes in topics for publications related to successful and unsuccessful drugs over time, focusing on the topic “Drug therapy”. The displayed time range are 20 years before a time point “0” which references to a specific significant time point in the development of a drug, in this case the beginning of the earliest phase 2 trial. Each publication is annotated with a limited set of topics, also called Mesh terms. Plot 402 shows the percentage of publications that are annotated with the topic “Drug therapy” for two classes of publications: the “FDA approved” publications (the upper one of the two curves at the right border of the plot) are publications that mention target and an indication of drugs that were approved by the FDA; the “Failed” publications mention the target and an indication of drugs that were terminated in phase 2 or 3. The thick lines show the median, the shaded areas the confidence interval (and implicitly the variance) of the distribution. [0161] Statistically significant differences of the distribution as assessed by a Wilcoxon test are marked by asterisks on top of the plot. The main hypothesis proven here is that publications leading up to the development of successful drugs are annotated with the topic “Drug therapy” significantly more often before the beginning of phase 2 trials. [0162] Plot 404 depicts a first and a second curve. 
The first curve “FDA approved” (the upper one of the two curves at the right border of the plot) indicates the number of articles in the Medline database mentioning the name of a particular drug target, whereby the drug was later approved by the FDA as a treatment for said disease. The second curve “failed” indicates the number of articles in the Medline database mentioning the name of a particular drug target, whereby said drug was later rejected by the FDA and was not allowed to be used for treating said disease. In the early years of an emerging research area, the two curves are very similar and it may not be possible for a literature-based model to make a clear prediction whether a particular drug will likely be approved by the FDA or not. However, after several years, it can be observed that the number of published articles mentioning a disease in combination with a drug target is higher for targets later approved by the FDA than for targets that failed. This is probably because positive results supporting a relationship between a particular drug target and a disease invite further research groups to work in this field, thereby increasing the number of publications mentioning only the target. This plot illustrates that literature-based models may reliably predict whether or not the FDA will approve a particular drug or not, in particular in later years when a sufficient number of documents is available. Thus, frequently updating literature based prediction models is key for providing high quality predictions.)
Although implied, Schoedl does not expressly disclose the following limitations, which, however, are taught by Nakae:
…weighting… (in at least [0066] c) augmenting the features that have been weighted after the contracting or combination thereof, comprising: i) deriving a covariance matrix from the features that have been weighted after the contracting or combination thereof; ii) decomposing the covariance matrix; and iii) applying a random number to the decomposed matrix; [0204] “weighting coefficient” is a coefficient that is set so that an important element is calculated as more important in the calculation of the present disclosure, including approximation coefficients. For example, a coefficient can be obtained by approximating a function to data, but the coefficient itself only has a description indicating the degree of approximation. When coefficients are ranked or chosen/discarded on the basis of the magnitude or the like, a difference in contribution within the model is provided to a specific feature, so that this can be considered a weighting coefficient. A weighting coefficient is used in the same meaning as an approximation index of a differentiation function. Examples thereof include R2 value, correlation coefficient, regression coefficient, residual sum of squares (difference in feature from differentiation function))
The reason and rationale to combine Schoedl and Nakae are the same as recited above.
As per Claim 5, Schoedl teaches: (Currently Amended) The computer-implemented method of claim 1, wherein generating the predictive variance metric for the input data object further comprises:
generating the predictive variance metric for the input data object of the input data object cohort based at least in part on a timing-based input data object cohort parameter and a third predictive metric attribute for the input data object, wherein the third predictive metric attribute is indicative of a timing associated with the input data object. (in at least [0160] Plot 402 depicts changes in topics for publications related to successful and unsuccessful drugs over time, focusing on the topic “Drug therapy”. The displayed time range are 20 years before a time point “0” which references to a specific significant time point in the development of a drug, in this case the beginning of the earliest phase 2 trial. Each publication is annotated with a limited set of topics, also called Mesh terms. Plot 402 shows the percentage of publications that are annotated with the topic “Drug therapy” for two classes of publications: the “FDA approved” publications (the upper one of the two curves at the right border of the plot) are publications that mention target and an indication of drugs that were approved by the FDA; the “Failed” publications mention the target and an indication of drugs that were terminated in phase 2 or 3. The thick lines show the median, the shaded areas the confidence interval (and implicitly the variance) of the distribution. [0165] FIG. 5 depicts a plot 502 correlating FDA approval of a particular drug with article count. The time when the earliest phase 2 trial of particular drug starts is the time “0” and the plot 502 thus shows the article counts 20 years ahead of this beginning of phase 2 and even further ahead of the final decision of the FDA upon approval or rejection of the drug. It can be seen that the article count is very similar in the time 20 years ahead of the decision time till about 7 years ahead of the entry into phase 2. 
Then, the publication count of documents mentioning a disease in combination with a particular drug target is significantly higher for drug targets of drugs that will later be approved by the FDA. Five years before entry into phase 2 of the drug the difference is statistically significant, as indicated by the asterisks on top of the plot. Thus, literature based models that use article count together with other discriminatory features such as those shown in FIG. 4 may be able to predict at the entry into phase 2 trials, years ahead of the actual decision of the FDA, if a particular drug should still be considered as a promising candidate for FDA approval and if further money and effort should be invested in pre-clinical and clinical research related to this drug.)
As per Claim 6, Schoedl teaches: The computer-implemented method of claim 1,
wherein the predictive … metric is indicative of a … of the input data object relative to the plurality of input data objects associated with the predictive entity. (in at least [0119] The biomedical model can be an explicitly specified model, e.g. a manually, semi-automatically or automatically specified model. Alternatively, the model can be an implicitly specified model that is generated during the training phase of a machine learning logic. For example, the network architecture elements of an artificial neural network that are modified in the training phase (e.g. weights of the “neurons” of the layers) in combination with the network architecture may constitute an implicit predictive model (a “black box” model) adapted for providing predictions for biomedical questions. [0121] The prediction score is indicative of the certainty of the prediction and is a numerical value within a score range. This range can also be referred to as “interval of possible score values”. For example, the score range can be a predefined range between −1 and +1 and any original score value output by the model-based prediction logic is normalized to a numerical value between −1 and +1. Depending on the type of prediction, other score ranges may be used for normalization, e.g. a range between 0 and 1. In the following examples, a score range from −1 to +1 will be used, but this range is an example only and any other predefined score ranges may likewise be used for normalizing an originally provided score value. [0132] Once the model 958 has been generated, the machine learning logic 956 can use the model 958 to solve a particular prediction problem. For example, a model-based prediction unit 955 can receive some input data 969, e.g. a specification of the name of one or more target molecules of the drug of interest. 
[0157] allows a user to easily compare the prediction scores and the quality of a plurality of predictions provided by different models, different model versions and/or different predictive tasks. Thus, a dense visualization of a plurality of highly heterogeneous predictive models and software programs is provided that allows a user to compare the prediction results and the prediction qualities provided by many different models. This is particularly advantageous in the context of life science research and drug development, because these technical fields are characterized by highly heterogeneous IT frameworks, a rapidly increasing amount of structured and unstructured data and a large plurality of different predictive approaches regarding the type of training and input data (literature, sequence data, expression profiles, 3-D structures, array data, image analysis), regarding the type of biomedical question (target prediction, toxicities prediction, drug identification, side effect prediction) and regarding the type of prediction algorithm used (neuronal networks, support vector machines, random forests, rules etc.).)
Although implied, Schoedl does not expressly disclose the following limitations, which, however, are taught by Nakae:
…weighting… magnitude… (in at least [0066] c) augmenting the features that have been weighted after the contracting or combination thereof, comprising: i) deriving a covariance matrix from the features that have been weighted after the contracting or combination thereof; ii) decomposing the covariance matrix; and iii) applying a random number to the decomposed matrix; [0204] “weighting coefficient” is a coefficient that is set so that an important element is calculated as more important in the calculation of the present disclosure, including approximation coefficients. For example, a coefficient can be obtained by approximating a function to data, but the coefficient itself only has a description indicating the degree of approximation. When coefficients are ranked or chosen/discarded on the basis of the magnitude or the like, a difference in contribution within the model is provided to a specific feature, so that this can be considered a weighting coefficient. A weighting coefficient is used in the same meaning as an approximation index of a differentiation function. Examples thereof include R2 value, correlation coefficient, regression coefficient, residual sum of squares (difference in feature from differentiation function))
The reason and rationale to combine Schoedl and Nakae are the same as recited above.
As per Claim 7, Schoedl teaches: (Currently Amended) The computer-implemented method of claim 6,
wherein generating the predictive … metric for the input data object is further based on a second predictive metric attribute for the input data object, and (in at least [0063] the prediction logic may be used for predicting whether a particular drug D1 having a target molecule T1 will be approved by the FDA for treating disease D and may in addition be used for prediction whether a particular drug D2 having a target molecule T2 will be approved by the FDA for treating the disease D. Thus, the input data for the two predictions may differ because the names of the drug targets T1, T2 differ. The mobile device displays a prediction list on the display of the mobile device. Each list item represents one of the received prediction results and comprises at least a thumbnail-analog scale icon graphically representing said prediction result. [0132] receive some input data 969, e.g. a specification of the name of one or more target molecules of the drug of interest. The prediction unit 955 can then analyze the currently available literature for identifying documents or document abstracts mentioning the names of the one or more target molecules as well as the name of the disease to be treated and analyzing meta-data associated with the identified documents. For example, the predictor can analyze the author names, publication date, cross references to other documents, the names of diseases, metabolites, genes or drugs mentioned in the documents for extracting a plurality of literature-based features for the one or more target molecules provided as input. [0134] If the prediction result is that the FDA will with 100% likelihood approve the drug, the prediction score (that may be optionally normalized) is, for example, 1. If the prediction result is that the FDA will with 100% likelihood reject approval of the drug, the prediction score (that may be optionally normalized) is, for example, −1. 
Typically, the prediction score will have a numerical number greater than the minimum value of the scale (greater than −1) and smaller than the maximum value of the scale (smaller than +1). [0135] For example, a prediction score of 0.7 indicates that a particular drug whose respective target molecule name was provided as input to the prediction logic 956 is predicted to be highly likely approved by the FDA. A prediction score of −0.8 indicates that a particular drug whose respective target molecule name was provided as input to the prediction logic 956 is predicted to be highly likely rejected (not approved) by the FDA. A prediction score of about 0 indicates that the model is not able to clearly predict, for the input data 969 currently provided, whether or not the FDA will approve the drug or not, because the likelihood of refusal and the likelihood of acceptance are considered identical or highly similar by the model. A user can easily and intuitively understand the prediction result simply by having a short look on the position of the pointer: a pointer that points towards to a scale region close to the end of the scale representing the minimum scale value indicates a rejection of the hypothesis/a very low prediction score; a pointer that points towards a scale region close to the end of the scale representing the maximum scale value indicates acceptance of the hypothesis/a very high prediction score; a pointer that points towards the center region of the scale indicates that the prediction result is ambiguous and vague.)
wherein the second predictive metric attribute is indicative of an advantage parameter for the input data object, wherein generating the predictive … metric comprises: (in at least [0133] The prediction result comprises a normalized prediction score 216, a first confidence interval 256.1 and a second confidence interval 256.2 indicating score value sub-ranges with a particular low ratio of false positives or false negatives results. Optionally, the prediction result further comprises a prediction-variance-interval 254. [0170] FIG. 9 depicts a further example of an analog scale icon displayed on a GUI 902. The icon depicted in this figure represents an integrated prediction generated by an integrated model. The integrated model uses the output generated by many different predictive models as input for generating an overall prediction result that may also comprise the elements 216, 254, 256 described already with reference to FIG. 2A. The GUI may comprise additional meta data, e.g. the predictive task, the number of publications identified mentioning the disease and the drug target, the specificity for positive and negative outcome and the like.)
generating the …-based input data object parameter for the input data object cohort based at least in part on an aggregate advantage parameter of the plurality of input data objects; (in at least [0120] in FIG. 2A in further detail, the prediction result can comprise multiple data values. The prediction result comprises a prediction score 216, a first confidence interval 256.1 and a second confidence interval 256.2. [0136] the model generation based on the training data is performed fully automatically, e.g. within a computer implemented model generation and update framework. For example, the training data 966 can be updated and supplemented with additional data on a regular basis, e.g. once a week or once a month. This may be highly advantageous in biomedical domains where the amount of available data is rapidly increasing. This is the case for example for biomedical literature data. The model generation and update framework is preferably configured such that whenever the training data 966 is supplemented with additional training data or is modified by removing or replacing some parts of the training data, the machine learning logic 956 is automatically re-trained on the updated version of the training data 966. Thereby, also an updated version of the biomedical model 958 is automatically generated. If the updated version of the model is used for computing the same prediction on the same inputs data 969 a further time, the prediction result will differ from the previously generated prediction results, because the model has integrated additional, new knowledge that may have an impact on the outcome of a prediction. [0152] The plurality of prediction results can comprise two or more prediction results provided by the same model for different prediction tasks, whereby different prediction tasks are associated with the provision of different input data to the prediction logic. 
For example, a first task may be the prediction whether a particular drug X having the target PDCD1 will be approved by the FDA as a treatment for melanoma, and a second task may be the prediction whether same drug X having the same target PDCD1 will be approved by the FDA as a treatment for breast neoplasms. For the two different prediction tasks whose prediction results are contained in the list 302, the same literature-based biomedical prediction model can be used.)
determining an … between the advantage parameter of the input data object and the aggregate advantage parameter of the plurality of input data objects; and (in at least [0136] the model generation based on the training data is performed fully automatically, e.g. within a computer implemented model generation and update framework. For example, the training data 966 can be updated and supplemented with additional data on a regular basis, e.g. once a week or once a month. This may be highly advantageous in biomedical domains where the amount of available data is rapidly increasing. This is the case for example for biomedical literature data. The model generation and update framework is preferably configured such that whenever the training data 966 is supplemented with additional training data or is modified by removing or replacing some parts of the training data, the machine learning logic 956 is automatically re-trained on the updated version of the training data 966. Thereby, also an updated version of the biomedical model 958 is automatically generated. If the updated version of the model is used for computing the same prediction on the same inputs data 969 a further time, the prediction result will differ from the previously generated prediction results, because the model has integrated additional, new knowledge that may have an impact on the outcome of a prediction. [0152] The plurality of prediction results can comprise two or more prediction results provided by the same model for different prediction tasks, whereby different prediction tasks are associated with the provision of different input data to the prediction logic. For example, a first task may be the prediction whether a particular drug X having the target PDCD1 will be approved by the FDA as a treatment for melanoma, and a second task may be the prediction whether same drug X having the same target PDCD1 will be approved by the FDA as a treatment for breast neoplasms. 
For the two different prediction tasks whose prediction results are contained in the list 302, the same literature-based biomedical prediction model can be used. [0170] FIG. 9 depicts a further example of an analog scale icon displayed on a GUI 902. The icon depicted in this figure represents an integrated prediction generated by an integrated model. The integrated model uses the output generated by many different predictive models as input for generating an overall prediction result that may also comprise the elements 216, 254, 256 described already with reference to FIG. 2A. The GUI may comprise additional meta data, e.g. the predictive task, the number of publications identified mentioning the disease and the drug target, the specificity for positive and negative outcome and the like.)
generating the predictive … metric for the input data object based at least in part on the …. (in at least [0120] in FIG. 2A in further detail, the prediction result can comprise multiple data values. The prediction result comprises a prediction score 216, a first confidence interval 256.1 and a second confidence interval 256.2. [0136] the model generation based on the training data is performed fully automatically, e.g. within a computer implemented model generation and update framework. For example, the training data 966 can be updated and supplemented with additional data on a regular basis, e.g. once a week or once a month. This may be highly advantageous in biomedical domains where the amount of available data is rapidly increasing. This is the case for example for biomedical literature data. The model generation and update framework is preferably configured such that whenever the training data 966 is supplemented with additional training data or is modified by removing or replacing some parts of the training data, the machine learning logic 956 is automatically re-trained on the updated version of the training data 966. Thereby, also an updated version of the biomedical model 958 is automatically generated. If the updated version of the model is used for computing the same prediction on the same inputs data 969 a further time, the prediction result will differ from the previously generated prediction results, because the model has integrated additional, new knowledge that may have an impact on the outcome of a prediction. [0170] FIG. 9 depicts a further example of an analog scale icon displayed on a GUI 902. The icon depicted in this figure represents an integrated prediction generated by an integrated model. The integrated model uses the output generated by many different predictive models as input for generating an overall prediction result that may also comprise the elements 216, 254, 256 described already with reference to FIG. 2A. 
The GUI may comprise additional meta data, e.g. the predictive task, the number of publications identified mentioning the disease and the drug target, the specificity for positive and negative outcome and the like.)
Although implied, Schoedl does not expressly disclose the following limitations, which, however, are taught by Nakae:
…weighting… (in at least [0066] c) augmenting the features that have been weighted after the contracting or combination thereof, comprising: i) deriving a covariance matrix from the features that have been weighted after the contracting or combination thereof; ii) decomposing the covariance matrix; and iii) applying a random number to the decomposed matrix; [0204] “weighting coefficient” is a coefficient that is set so that an important element is calculated as more important in the calculation of the present disclosure, including approximation coefficients. For example, a coefficient can be obtained by approximating a function to data, but the coefficient itself only has a description indicating the degree of approximation. When coefficients are ranked or chosen/discarded on the basis of the magnitude or the like, a difference in contribution within the model is provided to a specific feature, so that this can be considered a weighting coefficient. A weighting coefficient is used in the same meaning as an approximation index of a differentiation function. Examples thereof include R2 value, correlation coefficient, regression coefficient, residual sum of squares (difference in feature from differentiation function))
… advantage ratio…(in at least [0373] A sigmoid function is an elemental function used at various levels, which can certainly be a neuron firing principle, as well a pain reaction function, pain differentiator, or pain occurrence function (see FIG. 13), a feature contracting tool described above (see FIG. 14), or can be related to the selection process of a differentiation model described herein. Therefore, in a preferred embodiment, an asymptote of the minimum value and maximum value is derived by sigmoid approximation by first limiting the inflection area to that with a relatively large inflection range (amplitude). Next, the variation in differentiation accuracy is expressed as a representative value (minimum value, maximum value), and the value of improvement in differentiation accuracy (maximum value−minimum value) is calculated, and with the maximum value as the threshold value, the number of features that first exceeds this value can be presented as an economical differentiation model attaining the maximum gain in the percentage of improvement.)
The reason and rationale to combine Schoedl and Nakae are the same as recited above.
As per Claim 8, Schoedl teaches: (Currently Amended) The computer-implemented method of claim 1, further comprising:
generating a cohort predictive metric data object based at least in part on the predictive metric data object, wherein the cohort predictive metric data object is indicative of an average predictive metric data object for the input data object cohort. (in at least [0090] A “prediction-variance-interval” as used herein is a measure that is used to quantify the amount of variation or dispersion of a set of prediction scores computed by a model-based prediction logic. A small prediction-variance-interval indicates a small amount of variation, a large prediction-variance-interval indicates a large amount of variation. Hence, a small prediction-variance-interval (covering only about 7% of the score range or less) may indicate that the score values computed on the currently used input data values and on similar input data values tend to be close to an expected score value. A large prediction-variance-interval indicates that the score values computed on the currently used input data values and on similar input data values tend to be spread out over a wider range of values. The “prediction-variance-interval” can be implemented as a sub-range of score values whereby the width of this sub-range is a measure of the amount of variation or dispersion of the score values. [0091] the prediction-variance-interval represents a standard deviation of score values. This may be advantageous as the standard deviation is algebraically simpler than other measures of variance such as the average absolute deviation. However, there are also other measures of the deviation of a prediction score from an expected value, including average absolute deviation, which provide different mathematical properties from standard deviation. )
As per Claim 9, Schoedl teaches: (Currently Amended) The computer-implemented method of claim 8, further comprising:
identifying an optimization cohort cluster for the input data object cohort based at least in part on the cohort predictive metric data object, wherein the optimization cohort cluster comprises one or more … input data objects of the input data object cohort. (in at least [0090] A “prediction-variance-interval” as used herein is a measure that is used to quantify the amount of variation or dispersion of a set of prediction scores computed by a model-based prediction logic. A small prediction-variance-interval indicates a small amount of variation, a large prediction-variance-interval indicates a large amount of variation. Hence, a small prediction-variance-interval (covering only about 7% of the score range or less) may indicate that the score values computed on the currently used input data values and on similar input data values tend to be close to an expected score value. A large prediction-variance-interval indicates that the score values computed on the currently used input data values and on similar input data values tend to be spread out over a wider range of values. The “prediction-variance-interval” can be implemented as a sub-range of score values whereby the width of this sub-range is a measure of the amount of variation or dispersion of the score values. [0091] the prediction-variance-interval represents a standard deviation of score values. This may be advantageous as the standard deviation is algebraically simpler than other measures of variance such as the average absolute deviation. However, there are also other measures of the deviation of a prediction score from an expected value, including average absolute deviation, which provide different mathematical properties from standard deviation. [0135] a prediction score of 0.7 indicates that a particular drug whose respective target molecule name was provided as input to the prediction logic 956 is predicted to be highly likely approved by the FDA. 
A prediction score of −0.8 indicates that a particular drug whose respective target molecule name was provided as input to the prediction logic 956 is predicted to be highly likely rejected (not approved) by the FDA. A prediction score of about 0 indicates that the model is not able to clearly predict, for the input data 969 currently provided, whether or not the FDA will approve the drug or not, because the likelihood of refusal and the likelihood of acceptance are considered identical or highly similar by the model. [0122] The first confidence interval 256.1 is a first sub-interval of the score range and is indicative of the model-specific sub-range of score values known to have a percentage of false negative (FN) predictions below a predefined FN-percentage threshold. For example, the first confidence interval 256.1 can be the sub-interval of the score ranges for which it is known, e.g. based on a statistical analysis of a plurality of model predictions, that any prediction score within this sub-interval has a likelihood of being a false negative score value that is less than the predefined FN-percentage threshold, e.g. less than 10%, or less than 5%, or less than 1%. The suitable size of the FN-percentage threshold preferably depends on the type of prediction that is computed: in case a false negative result would impose significant financial or health-related costs on a patient or the society, the predefined FN-percentage threshold is chosen such that the resulting first sub-range is comparatively narrow. For example, the first sub-range is chosen such that it covers only score values known to comprise a false negative (FN)-percentage of less than 5%. To the contrary, in case a false negative result would not impose significant financial or health-related costs on a patient or the society, the predefined FN-percentage threshold is chosen such that the resulting first sub-range is comparatively broad. 
For example, the first sub-range is chosen such that it covers only score values known to comprise a false negative percentage of less than 25%. In some embodiments, the first sub-range selectively covers score values known to comprise a false negative percentage of less than 10%. [0123] The second confidence interval 256.2 is a second sub-interval of the score range and is indicative of the model-specific sub-range of score values known to have a percentage of false positive (FP) predictions below a predefined FP-percentage threshold. For example, the second confidence interval 256.2 can be the sub-interval of the score ranges for which it is known, e.g. based on a statistical analysis of a plurality of model predictions, that any prediction score within this sub-interval has a likelihood of being a false positive score value that is less than the predefined FP-percentage threshold, e.g. less than 10%, or less than 5%, or less than 1%. The suitable size of the FP-percentage threshold preferably depends on the type of prediction that is computed: in case a false positive result would impose significant financial or health-related costs on a patient or the society, the predefined FP-percentage threshold is chosen such that the resulting second sub-range is comparatively narrow. For example, the second sub-range is chosen such that it covers only score values known to comprise a false positive-percentage of less than 5%. To the contrary, in case a false positive result would not impose significant financial or health-related costs on a patient or the society, the predefined FP-percentage threshold is chosen such that the resulting second sub-range is comparatively broad. For example, the second sub-range is chosen such that it covers only score values known to comprise a FP-percentage of less than 10%)
Although implied, Schoedl does not expressly disclose the following limitations, which, however, are taught by Nakae:
…outlier…(in at least [0701] the detection results of outliers by the following sample augmentation methods are compared. 1. Sample augmentation method (OLD) 2. Sample augmentation method (PCA))
The reason and rationale to combine Schoedl and Nakae are the same as recited above.
As per Claim 10, Schoedl teaches: (Currently Amended) The computer-implemented method of claim 1,
wherein the processing optimization action comprises a processing recommendation for improving the cohort predictive metric data object of the input data object cohort. (in at least [0136] the model generation based on the training data is performed fully automatically, e.g. within a computer implemented model generation and update framework. For example, the training data 966 can be updated and supplemented with additional data on a regular basis, e.g. once a week or once a month. This may be highly advantageous in biomedical domains where the amount of available data is rapidly increasing. This is the case for example for biomedical literature data. The model generation and update framework is preferably configured such that whenever the training data 966 is supplemented with additional training data or is modified by removing or replacing some parts of the training data, the machine learning logic 956 is automatically re-trained on the updated version of the training data 966. Thereby, also an updated version of the biomedical model 958 is automatically generated. If the updated version of the model is used for computing the same prediction on the same inputs data 969 a further time, the prediction result will differ from the previously generated prediction results, because the model has integrated additional, new knowledge that may have an impact on the outcome of a prediction. [0158] provide an intuitive, dense overview for a plurality of different models and also allows a user to monitor trends having an effect on the quality of a model. For example, if for a particular prediction task a plurality of prediction results are available having been generated by different versions of a model, then in some embodiments the analog scale icons generated for the prediction results are combined into a single moving image, e.g. an animated gif or a video clip wherein the elements of the icon, e.g. 
the arrow 218, the sub-range indicators 202, 204, 206 and/or the variance bar 258 may change their respective position and size. When a user clicks on the moving image, the elements of the analog scale icons change their size and or position. For example, in case different versions of a model correspond to continuously increasing training data set, it may happen that the growing amount of data available may allow increasing the accuracy and predictive power of a particular model. Thus, while the prediction scores of the initial predictions may be ambiguous and close to zero and the sub-range indicators 202, 204 may be very narrow, the prediction scores generated by later versions of the models may clearly indicate a positive (or negative) answer and the sub-range indicators 202, 204 may be very broad. In some cases, the model quality may also deteriorate in case the additional available data comprises information that is in contrast to a hypothesis that was hitherto supported by the outdated versions of the training data set. Thus, a user can easily recognize, by watching a moving image generated from a plurality of analog scale icons representing the prediction results of many different versions of the same model for the same prediction tasks, whether the quality of a model changed over time and if the change results in an improvement or deterioration of the prediction quality. [0170] FIG. 9 depicts a further example of an analog scale icon displayed on a GUI 902. The icon depicted in this figure represents an integrated prediction generated by an integrated model. The integrated model uses the output generated by many different predictive models as input for generating an overall prediction result that may also comprise the elements 216, 254, 256 described already with reference to FIG. 2A. The GUI may comprise additional meta data, e.g. 
the predictive task, the number of publications identified mentioning the disease and the drug target, the specificity for positive and negative outcome and the like.)
As per Claim 11, Schoedl teaches: (Currently Amended) The computer-implemented method of claim 9,
wherein the processing optimization action for the predictive entity is based at least in part on one or more shared cluster attributes associated with at least one of the one or more … input data objects. (in at least [0122] The first confidence interval 256.1 is a first sub-interval of the score range and is indicative of the model-specific sub-range of score values known to have a percentage of false negative (FN) predictions below a predefined FN-percentage threshold. For example, the first confidence interval 256.1 can be the sub-interval of the score ranges for which it is known, e.g. based on a statistical analysis of a plurality of model predictions, that any prediction score within this sub-interval has a likelihood of being a false negative score value that is less than the predefined FN-percentage threshold, e.g. less than 10%, or less than 5%, or less than 1%. The suitable size of the FN-percentage threshold preferably depends on the type of prediction that is computed: in case a false negative result would impose significant financial or health-related costs on a patient or the society, the predefined FN-percentage threshold is chosen such that the resulting first sub-range is comparatively narrow. For example, the first sub-range is chosen such that it covers only score values known to comprise a false negative (FN)-percentage of less than 5%. To the contrary, in case a false negative result would not impose significant financial or health-related costs on a patient or the society, the predefined FN-percentage threshold is chosen such that the resulting first sub-range is comparatively broad. For example, the first sub-range is chosen such that it covers only score values known to comprise a false negative percentage of less than 25%. In some embodiments, the first sub-range selectively covers score values known to comprise a false negative percentage of less than 10%. 
[0123] The second confidence interval 256.2 is a second sub-interval of the score range and is indicative of the model-specific sub-range of score values known to have a percentage of false positive (FP) predictions below a predefined FP-percentage threshold. For example, the second confidence interval 256.2 can be the sub-interval of the score ranges for which it is known, e.g. based on a statistical analysis of a plurality of model predictions, that any prediction score within this sub-interval has a likelihood of being a false positive score value that is less than the predefined FP-percentage threshold, e.g. less than 10%, or less than 5%, or less than 1%. The suitable size of the FP-percentage threshold preferably depends on the type of prediction that is computed: in case a false positive result would impose significant financial or health-related costs on a patient or the society, the predefined FP-percentage threshold is chosen such that the resulting second sub-range is comparatively narrow. For example, the second sub-range is chosen such that it covers only score values known to comprise a false positive-percentage of less than 5%. To the contrary, in case a false positive result would not impose significant financial or health-related costs on a patient or the society, the predefined FP-percentage threshold is chosen such that the resulting second sub-range is comparatively broad. For example, the second sub-range is chosen such that it covers only score values known to comprise a FP-percentage of less than 10%.)
Although implied, Schoedl does not expressly disclose the following limitation, which, however, is taught by Nakae:
…outlier…(in at least [0701] the detection results of outliers by the following sample augmentation methods are compared. 1. Sample augmentation method (OLD) 2. Sample augmentation method (PCA))
The reason and rationale to combine Schoedl and Nakae are the same as recited above.
As per Claim 12, Schoedl teaches: (Currently Amended) The computer-implemented method of claim 11, further comprising:
initiating the processing optimization action; (in at least [0040] in response to a first prediction request of a user, a first prediction result with a prediction score of 0.7 and a short prediction variance bar, indicating a small range of variability of the score along the area of the scale (e.g. ranging from 0.69 to 0.71), is generated and displayed. The score value and the direction of the pointer may indicate that a hypothesis like “drug X will be approved by FDA for treating disease D” is predicted to be true with a score value of 0.7. The range 0.69 to 0.71 may be computed based on a user-defined or otherwise defined prediction-confidence-level, e.g. 95%. The pointer may be covered by a short variance bar representing and indicating the width of 0.02 score units. The user immediately and intuitively comprehends that the certainty of this particular result is very high, because the variance bar is short. [0051] the biomedical model used by the machine learning logic is a first biomedical model having been generated based on first training data. The mobile device is one of a plurality of mobile devices respectively assigned to one of a plurality of users. The method further comprises registering the plurality of users and a plurality of biomedical prediction tasks at a backend program. For example, the backend program may maintain and manage a user- and prediction task registry. Each registered user has assigned one or more of the prediction tasks. The machine learning logic performs each of the prediction tasks, thereby respectively using the first biomedical model for generating a first prediction result. [0052] prediction tasks are executed based on many different types of models, e.g. literature-based models, microarray-data based models, and the like. 
The user and task registry further comprises an assignment of prediction tasks and model-IDs and the background program is configured to select for each prediction task to be performed or re-performed the appropriate model based on the model-ID and task assignment in the registry. [0158] in case different versions of a model correspond to continuously increasing training data set, it may happen that the growing amount of data available may allow increasing the accuracy and predictive power of a particular model. Thus, while the prediction scores of the initial predictions may be ambiguous and close to zero and the sub-range indicators 202, 204 may be very narrow, the prediction scores generated by later versions of the models may clearly indicate a positive (or negative) answer and the sub-range indicators 202, 204 may be very broad. In some cases, the model quality may also deteriorate in case the additional available data comprises information that is in contrast to a hypothesis that was hitherto supported by the outdated versions of the training data set.)
generating a second iteration cohort predictive metric data object; and (in at least [0051] sending the first prediction results selectively to the mobile devices of the users to which the prediction tasks for which the first prediction results were generated are assigned. In response to each re-training of the machine learning logic, the machine learning logic automatically performs each of the prediction tasks a further time, thereby respectively using the updated version of the biomedical model for generating a second prediction result. Then, the method comprises sending the second prediction results or a notification of their computation selectively to the mobile devices of the users to which the prediction tasks for which the first prediction results were generated are assigned. [0158] in case different versions of a model correspond to continuously increasing training data set, it may happen that the growing amount of data available may allow increasing the accuracy and predictive power of a particular model. Thus, while the prediction scores of the initial predictions may be ambiguous and close to zero and the sub-range indicators 202, 204 may be very narrow, the prediction scores generated by later versions of the models may clearly indicate a positive (or negative) answer and the sub-range indicators 202, 204 may be very broad. In some cases, the model quality may also deteriorate in case the additional available data comprises information that is in contrast to a hypothesis that was hitherto supported by the outdated versions of the training data set.)
responsive to the second iteration cohort predictive metric data object not achieving an optimization threshold, generating a second iteration processing optimization action for the predictive entity. (in at least [0054] the backend program compares the first prediction result and the second prediction result computed for each prediction task. The sending of the second prediction results or the sending of the notification of their computation is performed selectively for those prediction tasks for which a first prediction result and a second prediction result were computed which fulfill one or more of the following conditions: the score value of the second prediction result but not the score value of the first prediction result lies within the first confidence interval; for example, this may mean that the new prediction result is observed to have entered a score range considered to be particularly reliably due to a low portion of FNs (e.g. a ratio of <10% FNs); or the score value of the first prediction result but not the score value of the second prediction result lies within the first confidence interval; for example, this may mean that the new prediction result is observed to have suddenly entered a score area considered to be non-reliable due to many FNs (e.g. a ratio of >10% FNs); or the score value of the first prediction result but not the score value of the second prediction result lies within the second confidence interval; for example, this may mean that the new prediction result is observed to have suddenly left a score area considered to be particularly reliable due to a low portion of FPs (e.g. a ratio of <10%FPs); or the score value of the second prediction result but not the score value of the first prediction result lies within the second confidence interval; for example, this may mean that the new prediction result is observed to have suddenly entered a score area considered to be particularly reliable due to a low ratio of FPs (e.g. 
a ratio of <10%FPs); or the score value of the first and second prediction result differ by more than a predefined score difference threshold; for example, this may mean that the model-based prediction suddenly improves or deteriorates significantly; or the size of the prediction-variance-interval of the first and second prediction result differ by more than a predefined interval length difference threshold. For example, this may mean that the quality of the model-based prediction suddenly improves or deteriorates significantly, e.g. due to changed variability of the training data.[0061] This may help avoiding network traffic and unnecessarily disturbing the registered users, because a user is notified of a prediction result only in case the prediction result generated based on the updated model is significantly different from the prediction result generated based on the previous model version and/or only in case the certainty of the model or the certainty of the prediction result generated based on the updated model is significantly different from the certainty of the model or the certainty of the prediction result generated based on the previous model version. [0187] The user can set different values for the first and second confidence intervals, the prediction- variance-interval and the score value difference threshold. )
As per Claims 13-16, directed to an apparatus (see at least Schoedl [0172]), these claims substantially recite the subject matter of Claims 1-4 and are rejected based on the same reasoning and rationale.
As per Claims 17-20, directed to a computer program product (see at least Schoedl [0172]), these claims substantially recite the subject matter of Claims 1 and 8-10 and are rejected based on the same reasoning and rationale.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PO HAN MAX LEE whose telephone number is (571)272-3821. The examiner can normally be reached on Mon-Thurs 8:00 am - 7:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Rutao Wu can be reached on (571) 272-6045. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/PO HAN LEE/Primary Examiner, Art Unit 3623