DETAILED ACTION
Notice of Pre-AIA or AIA Status
1. The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
2. Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.
Continued Examination Under 37 CFR 1.114
3. A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 12/26/2025 has been entered.
Response to Amendment
4. Receipt of Applicant’s Amendment filed on 12/26/2025 is acknowledged. The amendment amends claims 1, 6, and 9-11, cancels claim 2, and adds new claims 12-13.
Claim Rejections - 35 USC § 103
5. In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
6. The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
7. This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
8. Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Szeto et al. (U.S. PGPUB 2018/0018590), in view of Yurtsever et al. (Article entitled “A Traffic Flow Simulation Framework for Learning Driver Heterogeneity from Naturalistic Driving Data using Autoencoders”, dated 2019).
9. Regarding claim 9, Szeto teaches a learning device comprising:
A) processing circuitry (Paragraph 27);
B) a medical information database storing second medical data (Paragraphs 53 and 67, Figure 2);
C) wherein the processing circuitry is configured to: extract a data set that is used for updating, out of a second data set that is the second medical data, of data based on a second cohort associated with the second site on the basis of a first data distribution which is calculated by a first site of another medical facility and which is associated with a first data set that is a first medical data used to train a first model out of data sets based on a first cohort (Paragraphs 51, 106, and 121);
D) update the first model on the basis of at least part of the second data set based on the second cohort associated with the second site (Paragraphs 51, 106, and 121); and
E) transmit the updated first model to the first site (Paragraphs 51 and 121);
I) update the first model on the basis of the second data set based on the selected second cohort (Paragraphs 33, 51, and 106-108, Figure 1).
The examiner notes that Szeto teaches “processing circuitry” as “One of ordinary skill in the art should appreciate that the computing devices comprise one or more processors configured to execute software instructions that are stored on a tangible, non-transitory computer readable storage medium (e.g., hard drive, FPGA, PLA, PLD, solid state drive, RAM, flash, ROM, external drive, memory stick, etc.)” (Paragraph 27). The examiner further notes that the one or more processors teach the claimed processing circuitry. The examiner further notes that Szeto further teaches “a medical information database storing second medical data” as “The example presented in FIG. 2 illustrates the inventive concepts from the perspective of how private data server 224 interacts with a remote computing device and private data 222. In more preferred embodiments, private data 222 comprises local private healthcare data, or more specifically includes patient-specific data (e.g., name, SSN, normal WGS, tumor WGS, genomic diff objects, a patient identifier, etc.). Entity 220 typically is an institution having private local raw data and subject to restrictions as discussed above. Example entities include hospitals, labs, clinics, pharmacies, insurance companies, oncologist offices, or other entities having locally stored data” (Paragraph 53) and “Modeling engine 226 leverages the data selection criteria from model instructions 230 to create a result set from private data 222, possibly via submitting a query to the database storing private data 222” (Paragraph 67). The examiner further notes that a database housing private medical data 222 includes the claimed second medical data. 
The examiner further notes that Szeto teaches “wherein the processing circuitry is configured to: extract a data set that is used for updating, out of a second data set that is the second medical data, of data based on a second cohort associated with the second site on the basis of a first data distribution which is calculated by a first site of another medical facility and which is associated with a first data set that is a first medical data used to train a first model out of data sets based on a first cohort” as “a private data server 124 may receive proxy related information (including for example proxy data 260, proxy data distributions 362, proxy model parameters 475, other proxy related data combined with seeds, etc.) from a peer private data server (a different private data server 124). The private data server may generate models based on its own local private data, or based on both its own local private data and the received proxy related information from a peer private data server” (Paragraph 51), “Operation 580, performed by a global modeling engine or a peer private data machine, includes aggregating two or more proxy data sets from different private data servers. The aggregate proxy data sets (global proxy sets) are combined based on a given machine learning task and are generated according to the originally requested model instructions. Although each set of proxy data will likely be generated from different private data distributions, it should be appreciated that the corresponding private data training sets are constructed according to the same selection criteria. For example, a researcher might wish to build a prediction model on how well smokers respond to a lung cancer treatment. The research will request models to be built at many private hospitals where each hospital has its own private data. Each hospital receives the same data selection criteria; patients who are smokers, given the treatment, and their associated known outcome. 
Each hospital's local private data servers, via their modeling engines, constructs their own proxy data using training actual data as a foundation and based on the same data selection criteria. The global modeling engine then aggregates the individual proxy data sets together to create a global training data set. Operation 590 includes the global modeling engine train a global model on the aggregated sets of proxy data. The global model integrates the knowledge gained from each entity's private data. In some embodiments, the global modeling engine can create the trained global model by accumulating sets of actual model parameters and combining them into a single trained model” (Paragraph 106), and “The disclosed approach of distributed, online machine learning can leverage numerous techniques for validating trained models. One approach includes a first private data server sending its trained actual model to other private data servers. The other private data servers can then validate the trained actual model on their own local data and send the results back to the first private data server” (Paragraph 121). The examiner further notes that a second site can receive distributed data associated with a first cohort (which is used to update a local model at a first site with first medical data) from the first site that is used with second “calculated” distributed data associated with a second cohort for subsequent updating of its local model at the second site. The examiner further notes that Szeto teaches “update the first model on the basis of at least part of the second data set based on the second cohort associated with the second site” as “a private data server 124 may receive proxy related information (including for example proxy data 260, proxy data distributions 362, proxy model parameters 475, other proxy related data combined with seeds, etc.) from a peer private data server (a different private data server 124). 
The private data server may generate models based on its own local private data, or based on both its own local private data and the received proxy related information from a peer private data server” (Paragraph 51), “Operation 580, performed by a global modeling engine or a peer private data machine, includes aggregating two or more proxy data sets from different private data servers. The aggregate proxy data sets (global proxy sets) are combined based on a given machine learning task and are generated according to the originally requested model instructions. Although each set of proxy data will likely be generated from different private data distributions, it should be appreciated that the corresponding private data training sets are constructed according to the same selection criteria. For example, a researcher might wish to build a prediction model on how well smokers respond to a lung cancer treatment. The research will request models to be built at many private hospitals where each hospital has its own private data. Each hospital receives the same data selection criteria; patients who are smokers, given the treatment, and their associated known outcome. Each hospital's local private data servers, via their modeling engines, constructs their own proxy data using training actual data as a foundation and based on the same data selection criteria. The global modeling engine then aggregates the individual proxy data sets together to create a global training data set. Operation 590 includes the global modeling engine train a global model on the aggregated sets of proxy data. The global model integrates the knowledge gained from each entity's private data. 
In some embodiments, the global modeling engine can create the trained global model by accumulating sets of actual model parameters and combining them into a single trained model” (Paragraph 106), and “The disclosed approach of distributed, online machine learning can leverage numerous techniques for validating trained models. One approach includes a first private data server sending its trained actual model to other private data servers. The other private data servers can then validate the trained actual model on their own local data and send the results back to the first private data server” (Paragraph 121). The examiner further notes that a second site can receive distributed data associated with a first cohort (which is used to update a local model at a first site) from the first site that is used with second “calculated” distributed data associated with a second cohort for subsequent updating of its local model at the second site. Moreover, updated models themselves can be sent from a first site to a second site. Such updated models can be updated via the application of second distributed data at the second site. The examiner further notes that Szeto teaches “transmit the updated first model to the first site” as “a private data server 124 may receive proxy related information (including for example proxy data 260, proxy data distributions 362, proxy model parameters 475, other proxy related data combined with seeds, etc.) from a peer private data server (a different private data server 124). The private data server may generate models based on its own local private data, or based on both its own local private data and the received proxy related information from a peer private data server” (Paragraph 51), “Operation 580, performed by a global modeling engine or a peer private data machine, includes aggregating two or more proxy data sets from different private data servers. 
The aggregate proxy data sets (global proxy sets) are combined based on a given machine learning task and are generated according to the originally requested model instructions. Although each set of proxy data will likely be generated from different private data distributions, it should be appreciated that the corresponding private data training sets are constructed according to the same selection criteria. For example, a researcher might wish to build a prediction model on how well smokers respond to a lung cancer treatment. The research will request models to be built at many private hospitals where each hospital has its own private data. Each hospital receives the same data selection criteria; patients who are smokers, given the treatment, and their associated known outcome. Each hospital's local private data servers, via their modeling engines, constructs their own proxy data using training actual data as a foundation and based on the same data selection criteria. The global modeling engine then aggregates the individual proxy data sets together to create a global training data set. Operation 590 includes the global modeling engine train a global model on the aggregated sets of proxy data. The global model integrates the knowledge gained from each entity's private data. In some embodiments, the global modeling engine can create the trained global model by accumulating sets of actual model parameters and combining them into a single trained model” (Paragraph 106), and “The disclosed approach of distributed, online machine learning can leverage numerous techniques for validating trained models. One approach includes a first private data server sending its trained actual model to other private data servers. The other private data servers can then validate the trained actual model on their own local data and send the results back to the first private data server” (Paragraph 121). 
The examiner further notes that a second site can receive distributed data associated with a first cohort (which is used to update a local model at a first site) from the first site that is used with second “calculated” distributed data associated with a second cohort for subsequent updating of its local model at the second site. Moreover, updated models themselves can be sent from a first site to a second site. Such updated models can be updated (and subsequently sent back to the first site) via the application of second distributed data at the second site. The examiner further notes that Szeto teaches “update the first model on the basis of the second data set based on the selected second cohort” as “The following discussion is presented from a health care perspective, and more specifically with respect to building trained machine learning models from genomic sequence data associated with cancer patients. However, it is fully contemplated that the architecture described herein can be adapted to other forms of research beyond oncology and can be leveraged wherever raw data is secured or considered private; insurance data, financial data, social media profile data, human capital data, proprietary experimental data, gaming or gambling data, military data, network traffic data, shopping or marketing data, or other types of data for example” (Paragraph 33), “a private data server 124 may receive proxy related information (including for example proxy data 260, proxy data distributions 362, proxy model parameters 475, other proxy related data combined with seeds, etc.) from a peer private data server (a different private data server 124). 
The private data server may generate models based on its own local private data, or based on both its own local private data and the received proxy related information from a peer private data server” (Paragraph 51), “Operation 580, performed by a global modeling engine or a peer private data machine, includes aggregating two or more proxy data sets from different private data servers. The aggregate proxy data sets (global proxy sets) are combined based on a given machine learning task and are generated according to the originally requested model instructions. Although each set of proxy data will likely be generated from different private data distributions, it should be appreciated that the corresponding private data training sets are constructed according to the same selection criteria. For example, a researcher might wish to build a prediction model on how well smokers respond to a lung cancer treatment. The research will request models to be built at many private hospitals where each hospital has its own private data. Each hospital receives the same data selection criteria; patients who are smokers, given the treatment, and their associated known outcome. Each hospital's local private data servers, via their modeling engines, constructs their own proxy data using training actual data as a foundation and based on the same data selection criteria. The global modeling engine then aggregates the individual proxy data sets together to create a global training data set. Operation 590 includes the global modeling engine train a global model on the aggregated sets of proxy data. The global model integrates the knowledge gained from each entity's private data. 
In some embodiments, the global modeling engine can create the trained global model by accumulating sets of actual model parameters and combining them into a single trained model” (Paragraph 106), “In other embodiments, the global modeling engine also transmits the trained global model back to one or more of the private data servers. The private data servers can then leverage the global trained model to conduct local prediction studies in support of local clinical decision making workflows. In addition, the private data servers can also use the global model as a foundation for continued online learning. Thus, the global model becomes a basis for continued machine learning as new private data becomes available. As new data becomes available, method 500 can be repeated to improve the global modeling engine” (Paragraph 107), and “Machine learning systems may receive multiple inputs (e.g., private data), and through the machine learning process, may identify subsets of inputs that are the most important. Thus, it is contemplated that a given hospital may not collect exactly the same type of private data as other hospitals. Thus, the model instructions may be different for different hospitals or sites. However, by identifying which parameters are most predictive using the machine learning systems as described herein, data sets having in common these key predictive parameters may be combined. In other embodiments, model instructions may be modified, e.g., limited to include key predictive features, and used to regenerate proxy data, proxy data distributions, and other types of learned information. This regenerated information can then be sent to the global model server, where it is aggregated” (Paragraph 108). The examiner further notes that a global server updating a global model is based on acquired data distributions from various sites (including data (i.e. second medical data) based on a “selected” second cohort). 
Alternatively, a specific site acquiring data distributions from another site in accordance with a first cohort also updates its local model based on data (i.e. second medical data) corresponding to its “selected” second cohort.
Szeto does not explicitly teach:
F) calculate a first index value of the first data distribution and a second index value of a second data distribution for the second data set based on each of the plurality of second cohorts;
G) calculate similarity between the first data distribution and each second data distribution based on the first index value and each second index value;
H) select the second cohort of the second data distribution having the highest similarity from the plurality of second cohorts.
Yurtsever, however, teaches “calculate a first index value of the first data distribution and a second index value of a second data distribution for the second data set based on each of the plurality of second cohorts” as “We compared the histogram of travel times of the simulated vehicles with the histogram of the recorded travel time of the surveyed vehicles in the benchmark data. In Figure 11, an example of four travel time histograms are shown…We used KL-divergence score to compare the histograms…where P and Q are discrete probability distributions of travel times and I is the histogram bin index…The proposed framework got the lowest KL divergence score with respect to the benchmark data amongst the compared methods” (Page 92, Section 4.2), “calculate similarity between the first data distribution and each second data distribution based on the first index value and each second index value” as “We compared the histogram of travel times of the simulated vehicles with the histogram of the recorded travel time of the surveyed vehicles in the benchmark data. In Figure 11, an example of four travel time histograms are shown…We used KL-divergence score to compare the histograms…where P and Q are discrete probability distributions of travel times and I is the histogram bin index…The proposed framework got the lowest KL divergence score with respect to the benchmark data amongst the compared methods” (Page 92, Section 4.2), and “select the second cohort of the second data distribution having the highest similarity from the plurality of second cohorts” as “We compared the histogram of travel times of the simulated vehicles with the histogram of the recorded travel time of the surveyed vehicles in the benchmark data. 
In Figure 11, an example of four travel time histograms are shown…We used KL-divergence score to compare the histograms…where P and Q are discrete probability distributions of travel times and I is the histogram bin index…The proposed framework got the lowest KL divergence score with respect to the benchmark data amongst the compared methods” (Page 92, Section 4.2).
The examiner further notes that the secondary reference of Yurtsever teaches the concept of comparing histograms for subsequent selection via the use of KL divergence. Moreover, the instant application (See Page 15, lines 19-20) explicitly states that KL divergence is used in its selection process. The combination would result in using KL divergence to perform the selection of the second cohort in Szeto.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant invention to combine the teachings of the cited references because Yurtsever’s teachings would have provided Szeto’s system with a method for comparing histograms, as noted by Yurtsever (Page 92, Section 4.2).
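For illustration only, the selection concept the combination relies on (computing KL divergence between a first data distribution and each candidate second distribution, then selecting the candidate with the lowest divergence score, i.e., the highest similarity) can be sketched as follows. The function names, cohort labels, and histogram values below are hypothetical and do not appear in either cited reference:

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """D_KL(P || Q) over discrete histogram bins; eps guards against log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def select_most_similar_cohort(first_dist, second_dists):
    """Return the cohort whose distribution has the lowest KL divergence
    (i.e., the highest similarity) to the first data distribution."""
    return min(second_dists, key=lambda name: kl_divergence(first_dist, second_dists[name]))

# Hypothetical normalized histograms (e.g., per-bin frequencies of some measurement)
first = [0.1, 0.4, 0.3, 0.2]
candidates = {
    "cohort_A": [0.3, 0.2, 0.2, 0.3],
    "cohort_B": [0.1, 0.35, 0.35, 0.2],
}
print(select_most_similar_cohort(first, candidates))  # cohort_B: lower KL score
```

In this sketch, cohort_B is selected because its distribution diverges less from the first distribution than cohort_A's does, mirroring Yurtsever's use of the lowest KL divergence score as the measure of closeness between histograms.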
10. Claims 1, 3-4, 6, 8, and 10-13 are rejected under 35 U.S.C. 103 as being unpatentable over Szeto et al. (U.S. PGPUB 2018/0018590), in view of Yurtsever et al. (Article entitled “A Traffic Flow Simulation Framework for Learning Driver Heterogeneity from Naturalistic Driving Data using Autoencoders”, dated 2019) as applied to claim 9 above, and further in view of Vodencarevic et al. (U.S. PGPUB 2021/0097439).
11. Regarding claims 1 and 10-11, Szeto teaches a medical learning system, learning method, and non-transitory computer-readable storage medium comprising:
A) processing circuitry (Paragraph 27);
C) a medical information database storing first medical data and second medical data (Paragraphs 53 and 67, Figure 2);
D) wherein the processing circuitry is configured to: acquire a first data distribution for a first data set that is the first medical data out of data sets based on a first cohort associated with a target site of a medical facility (Paragraphs 33, 51, and 106-108, Figure 1);
E) select a second cohort that is used to update a first model that is used in the target site out of a plurality of second cohorts on the basis of the acquired first data distribution (Paragraphs 33, 51, and 106-108, Figure 1);
F) the second cohort being associated with a candidate site of another medical facility (Paragraphs 33, 51, and 106-108, Figure 1);
G) update the first model on the basis of a second data set that is the second medical data out of data sets based on the selected second cohort (Paragraphs 33, 51, and 106-108, Figure 1);
K) update the first model on the basis of the second data set based on the selected second cohort (Paragraphs 33, 51, and 106-108, Figure 1).
The examiner notes that Szeto teaches “processing circuitry” as “One of ordinary skill in the art should appreciate that the computing devices comprise one or more processors configured to execute software instructions that are stored on a tangible, non-transitory computer readable storage medium (e.g., hard drive, FPGA, PLA, PLD, solid state drive, RAM, flash, ROM, external drive, memory stick, etc.)” (Paragraph 27). The examiner further notes that the one or more processors teach the claimed processing circuitry. The examiner further notes that Szeto further teaches “a medical information database storing first medical data and second medical data” as “The example presented in FIG. 2 illustrates the inventive concepts from the perspective of how private data server 224 interacts with a remote computing device and private data 222. In more preferred embodiments, private data 222 comprises local private healthcare data, or more specifically includes patient-specific data (e.g., name, SSN, normal WGS, tumor WGS, genomic diff objects, a patient identifier, etc.). Entity 220 typically is an institution having private local raw data and subject to restrictions as discussed above. Example entities include hospitals, labs, clinics, pharmacies, insurance companies, oncologist offices, or other entities having locally stored data” (Paragraph 53) and “Modeling engine 226 leverages the data selection criteria from model instructions 230 to create a result set from private data 222, possibly via submitting a query to the database storing private data 222” (Paragraph 67). The examiner further notes that a database housing private medical data 222 includes first medical data and second medical data. 
The examiner further notes that Szeto teaches “wherein the processing circuitry is configured to: acquire a first data distribution for a first data set that is the first medical data out of data sets based on a first cohort associated with a target site of a medical facility” as “The following discussion is presented from a health care perspective, and more specifically with respect to building trained machine learning models from genomic sequence data associated with cancer patients. However, it is fully contemplated that the architecture described herein can be adapted to other forms of research beyond oncology and can be leveraged wherever raw data is secured or considered private; insurance data, financial data, social media profile data, human capital data, proprietary experimental data, gaming or gambling data, military data, network traffic data, shopping or marketing data, or other types of data for example” (Paragraph 33), “a private data server 124 may receive proxy related information (including for example proxy data 260, proxy data distributions 362, proxy model parameters 475, other proxy related data combined with seeds, etc.) from a peer private data server (a different private data server 124). The private data server may generate models based on its own local private data, or based on both its own local private data and the received proxy related information from a peer private data server” (Paragraph 51), “Operation 580, performed by a global modeling engine or a peer private data machine, includes aggregating two or more proxy data sets from different private data servers. The aggregate proxy data sets (global proxy sets) are combined based on a given machine learning task and are generated according to the originally requested model instructions. 
Although each set of proxy data will likely be generated from different private data distributions, it should be appreciated that the corresponding private data training sets are constructed according to the same selection criteria. For example, a researcher might wish to build a prediction model on how well smokers respond to a lung cancer treatment. The research will request models to be built at many private hospitals where each hospital has its own private data. Each hospital receives the same data selection criteria; patients who are smokers, given the treatment, and their associated known outcome. Each hospital's local private data servers, via their modeling engines, constructs their own proxy data using training actual data as a foundation and based on the same data selection criteria. The global modeling engine then aggregates the individual proxy data sets together to create a global training data set. Operation 590 includes the global modeling engine train a global model on the aggregated sets of proxy data. The global model integrates the knowledge gained from each entity's private data. In some embodiments, the global modeling engine can create the trained global model by accumulating sets of actual model parameters and combining them into a single trained model” (Paragraph 106), “In other embodiments, the global modeling engine also transmits the trained global model back to one or more of the private data servers. The private data servers can then leverage the global trained model to conduct local prediction studies in support of local clinical decision making workflows. In addition, the private data servers can also use the global model as a foundation for continued online learning. Thus, the global model becomes a basis for continued machine learning as new private data becomes available. 
As new data becomes available, method 500 can be repeated to improve the global modeling engine” (Paragraph 107), and “Machine learning systems may receive multiple inputs (e.g., private data), and through the machine learning process, may identify subsets of inputs that are the most important. Thus, it is contemplated that a given hospital may not collect exactly the same type of private data as other hospitals. Thus, the model instructions may be different for different hospitals or sites. However, by identifying which parameters are most predictive using the machine learning systems as described herein, data sets having in common these key predictive parameters may be combined. In other embodiments, model instructions may be modified, e.g., limited to include key predictive features, and used to regenerate proxy data, proxy data distributions, and other types of learned information. This regenerated information can then be sent to the global model server, where it is aggregated” (Paragraph 108). The examiner further notes that a global server acquiring data distributions from various sites includes specific data distribution from a specific site for a specific cohort (such as smokers). Alternatively, a specific site acquiring data distributions from another site in accordance with a cohort also teaches the claimed acquiring. The examiner further notes that Szeto teaches “select a second cohort that is used to update a first model that is used in the target site out of a plurality of second cohorts on the basis of the acquired first data distribution” as “The following discussion is presented from a health care perspective, and more specifically with respect to building trained machine learning models from genomic sequence data associated with cancer patients. 
However, it is fully contemplated that the architecture described herein can be adapted to other forms of research beyond oncology and can be leveraged wherever raw data is secured or considered private; insurance data, financial data, social media profile data, human capital data, proprietary experimental data, gaming or gambling data, military data, network traffic data, shopping or marketing data, or other types of data for example” (Paragraph 33), “a private data server 124 may receive proxy related information (including for example proxy data 260, proxy data distributions 362, proxy model parameters 475, other proxy related data combined with seeds, etc.) from a peer private data server (a different private data server 124). The private data server may generate models based on its own local private data, or based on both its own local private data and the received proxy related information from a peer private data server” (Paragraph 51), “Operation 580, performed by a global modeling engine or a peer private data machine, includes aggregating two or more proxy data sets from different private data servers. The aggregate proxy data sets (global proxy sets) are combined based on a given machine learning task and are generated according to the originally requested model instructions. Although each set of proxy data will likely be generated from different private data distributions, it should be appreciated that the corresponding private data training sets are constructed according to the same selection criteria. For example, a researcher might wish to build a prediction model on how well smokers respond to a lung cancer treatment. The research will request models to be built at many private hospitals where each hospital has its own private data. Each hospital receives the same data selection criteria; patients who are smokers, given the treatment, and their associated known outcome. 
Each hospital's local private data servers, via their modeling engines, constructs their own proxy data using training actual data as a foundation and based on the same data selection criteria. The global modeling engine then aggregates the individual proxy data sets together to create a global training data set. Operation 590 includes the global modeling engine train a global model on the aggregated sets of proxy data. The global model integrates the knowledge gained from each entity's private data. In some embodiments, the global modeling engine can create the trained global model by accumulating sets of actual model parameters and combining them into a single trained model” (Paragraph 106), “In other embodiments, the global modeling engine also transmits the trained global model back to one or more of the private data servers. The private data servers can then leverage the global trained model to conduct local prediction studies in support of local clinical decision making workflows. In addition, the private data servers can also use the global model as a foundation for continued online learning. Thus, the global model becomes a basis for continued machine learning as new private data becomes available. As new data becomes available, method 500 can be repeated to improve the global modeling engine” (Paragraph 107), and “Machine learning systems may receive multiple inputs (e.g., private data), and through the machine learning process, may identify subsets of inputs that are the most important. Thus, it is contemplated that a given hospital may not collect exactly the same type of private data as other hospitals. Thus, the model instructions may be different for different hospitals or sites. However, by identifying which parameters are most predictive using the machine learning systems as described herein, data sets having in common these key predictive parameters may be combined. 
In other embodiments, model instructions may be modified, e.g., limited to include key predictive features, and used to regenerate proxy data, proxy data distributions, and other types of learned information. This regenerated information can then be sent to the global model server, where it is aggregated” (Paragraph 108). The examiner further notes that a global server updating a global model is based on acquired data distributions from various sites (including a “selected” second cohort). Alternatively, a specific site acquiring data distributions from another site in accordance with a first cohort also “selects” a second cohort from its own site to update its local model. The examiner further notes that Szeto teaches “the second cohort being associated with a candidate site of another medical facility” as “The following discussion is presented from a health care perspective, and more specifically with respect to building trained machine learning models from genomic sequence data associated with cancer patients. However, it is fully contemplated that the architecture described herein can be adapted to other forms of research beyond oncology and can be leveraged wherever raw data is secured or considered private; insurance data, financial data, social media profile data, human capital data, proprietary experimental data, gaming or gambling data, military data, network traffic data, shopping or marketing data, or other types of data for example” (Paragraph 33), “a private data server 124 may receive proxy related information (including for example proxy data 260, proxy data distributions 362, proxy model parameters 475, other proxy related data combined with seeds, etc.) from a peer private data server (a different private data server 124). 
The private data server may generate models based on its own local private data, or based on both its own local private data and the received proxy related information from a peer private data server” (Paragraph 51), “Operation 580, performed by a global modeling engine or a peer private data machine, includes aggregating two or more proxy data sets from different private data servers. The aggregate proxy data sets (global proxy sets) are combined based on a given machine learning task and are generated according to the originally requested model instructions. Although each set of proxy data will likely be generated from different private data distributions, it should be appreciated that the corresponding private data training sets are constructed according to the same selection criteria. For example, a researcher might wish to build a prediction model on how well smokers respond to a lung cancer treatment. The research will request models to be built at many private hospitals where each hospital has its own private data. Each hospital receives the same data selection criteria; patients who are smokers, given the treatment, and their associated known outcome. Each hospital's local private data servers, via their modeling engines, constructs their own proxy data using training actual data as a foundation and based on the same data selection criteria. The global modeling engine then aggregates the individual proxy data sets together to create a global training data set. Operation 590 includes the global modeling engine train a global model on the aggregated sets of proxy data. The global model integrates the knowledge gained from each entity's private data. 
In some embodiments, the global modeling engine can create the trained global model by accumulating sets of actual model parameters and combining them into a single trained model” (Paragraph 106), “In other embodiments, the global modeling engine also transmits the trained global model back to one or more of the private data servers. The private data servers can then leverage the global trained model to conduct local prediction studies in support of local clinical decision making workflows. In addition, the private data servers can also use the global model as a foundation for continued online learning. Thus, the global model becomes a basis for continued machine learning as new private data becomes available. As new data becomes available, method 500 can be repeated to improve the global modeling engine” (Paragraph 107), and “Machine learning systems may receive multiple inputs (e.g., private data), and through the machine learning process, may identify subsets of inputs that are the most important. Thus, it is contemplated that a given hospital may not collect exactly the same type of private data as other hospitals. Thus, the model instructions may be different for different hospitals or sites. However, by identifying which parameters are most predictive using the machine learning systems as described herein, data sets having in common these key predictive parameters may be combined. In other embodiments, model instructions may be modified, e.g., limited to include key predictive features, and used to regenerate proxy data, proxy data distributions, and other types of learned information. This regenerated information can then be sent to the global model server, where it is aggregated” (Paragraph 108). The examiner further notes that a global server updating a global model is based on acquired data distributions from various sites (including a “selected” second cohort that is “associated” with a candidate medical site).
Alternatively, a specific site acquiring data distributions from another site in accordance with a first cohort also “selects” a second cohort that is “associated” with its own medical site to update its local model. The examiner further notes that Szeto teaches “update the first model on the basis of a second data set that is the second medical data out of data sets based on the selected second cohort” as “The following discussion is presented from a health care perspective, and more specifically with respect to building trained machine learning models from genomic sequence data associated with cancer patients. However, it is fully contemplated that the architecture described herein can be adapted to other forms of research beyond oncology and can be leveraged wherever raw data is secured or considered private; insurance data, financial data, social media profile data, human capital data, proprietary experimental data, gaming or gambling data, military data, network traffic data, shopping or marketing data, or other types of data for example” (Paragraph 33), “a private data server 124 may receive proxy related information (including for example proxy data 260, proxy data distributions 362, proxy model parameters 475, other proxy related data combined with seeds, etc.) from a peer private data server (a different private data server 124). The private data server may generate models based on its own local private data, or based on both its own local private data and the received proxy related information from a peer private data server” (Paragraph 51), “Operation 580, performed by a global modeling engine or a peer private data machine, includes aggregating two or more proxy data sets from different private data servers. The aggregate proxy data sets (global proxy sets) are combined based on a given machine learning task and are generated according to the originally requested model instructions.
Although each set of proxy data will likely be generated from different private data distributions, it should be appreciated that the corresponding private data training sets are constructed according to the same selection criteria. For example, a researcher might wish to build a prediction model on how well smokers respond to a lung cancer treatment. The research will request models to be built at many private hospitals where each hospital has its own private data. Each hospital receives the same data selection criteria; patients who are smokers, given the treatment, and their associated known outcome. Each hospital's local private data servers, via their modeling engines, constructs their own proxy data using training actual data as a foundation and based on the same data selection criteria. The global modeling engine then aggregates the individual proxy data sets together to create a global training data set. Operation 590 includes the global modeling engine train a global model on the aggregated sets of proxy data. The global model integrates the knowledge gained from each entity's private data. In some embodiments, the global modeling engine can create the trained global model by accumulating sets of actual model parameters and combining them into a single trained model” (Paragraph 106), “In other embodiments, the global modeling engine also transmits the trained global model back to one or more of the private data servers. The private data servers can then leverage the global trained model to conduct local prediction studies in support of local clinical decision making workflows. In addition, the private data servers can also use the global model as a foundation for continued online learning. Thus, the global model becomes a basis for continued machine learning as new private data becomes available. 
As new data becomes available, method 500 can be repeated to improve the global modeling engine” (Paragraph 107), and “Machine learning systems may receive multiple inputs (e.g., private data), and through the machine learning process, may identify subsets of inputs that are the most important. Thus, it is contemplated that a given hospital may not collect exactly the same type of private data as other hospitals. Thus, the model instructions may be different for different hospitals or sites. However, by identifying which parameters are most predictive using the machine learning systems as described herein, data sets having in common these key predictive parameters may be combined. In other embodiments, model instructions may be modified, e.g., limited to include key predictive features, and used to regenerate proxy data, proxy data distributions, and other types of learned information. This regenerated information can then be sent to the global model server, where it is aggregated” (Paragraph 108). The examiner further notes that a global server updating a global model is based on acquired data distributions from various sites (including data (i.e. second medical data) based on a “selected” second cohort). Alternatively, a specific site acquiring data distributions from another site in accordance with a first cohort also updates its local model based on data (i.e. second medical data) corresponding to its “selected” second cohort. The examiner further notes that Szeto teaches “update the first model on the basis of the second data set based on the selected second cohort” as “The following discussion is presented from a health care perspective, and more specifically with respect to building trained machine learning models from genomic sequence data associated with cancer patients. 
However, it is fully contemplated that the architecture described herein can be adapted to other forms of research beyond oncology and can be leveraged wherever raw data is secured or considered private; insurance data, financial data, social media profile data, human capital data, proprietary experimental data, gaming or gambling data, military data, network traffic data, shopping or marketing data, or other types of data for example” (Paragraph 33), “a private data server 124 may receive proxy related information (including for example proxy data 260, proxy data distributions 362, proxy model parameters 475, other proxy related data combined with seeds, etc.) from a peer private data server (a different private data server 124). The private data server may generate models based on its own local private data, or based on both its own local private data and the received proxy related information from a peer private data server” (Paragraph 51), “Operation 580, performed by a global modeling engine or a peer private data machine, includes aggregating two or more proxy data sets from different private data servers. The aggregate proxy data sets (global proxy sets) are combined based on a given machine learning task and are generated according to the originally requested model instructions. Although each set of proxy data will likely be generated from different private data distributions, it should be appreciated that the corresponding private data training sets are constructed according to the same selection criteria. For example, a researcher might wish to build a prediction model on how well smokers respond to a lung cancer treatment. The research will request models to be built at many private hospitals where each hospital has its own private data. Each hospital receives the same data selection criteria; patients who are smokers, given the treatment, and their associated known outcome. 
Each hospital's local private data servers, via their modeling engines, constructs their own proxy data using training actual data as a foundation and based on the same data selection criteria. The global modeling engine then aggregates the individual proxy data sets together to create a global training data set. Operation 590 includes the global modeling engine train a global model on the aggregated sets of proxy data. The global model integrates the knowledge gained from each entity's private data. In some embodiments, the global modeling engine can create the trained global model by accumulating sets of actual model parameters and combining them into a single trained model” (Paragraph 106), “In other embodiments, the global modeling engine also transmits the trained global model back to one or more of the private data servers. The private data servers can then leverage the global trained model to conduct local prediction studies in support of local clinical decision making workflows. In addition, the private data servers can also use the global model as a foundation for continued online learning. Thus, the global model becomes a basis for continued machine learning as new private data becomes available. As new data becomes available, method 500 can be repeated to improve the global modeling engine” (Paragraph 107), and “Machine learning systems may receive multiple inputs (e.g., private data), and through the machine learning process, may identify subsets of inputs that are the most important. Thus, it is contemplated that a given hospital may not collect exactly the same type of private data as other hospitals. Thus, the model instructions may be different for different hospitals or sites. However, by identifying which parameters are most predictive using the machine learning systems as described herein, data sets having in common these key predictive parameters may be combined. 
In other embodiments, model instructions may be modified, e.g., limited to include key predictive features, and used to regenerate proxy data, proxy data distributions, and other types of learned information. This regenerated information can then be sent to the global model server, where it is aggregated” (Paragraph 108). The examiner further notes that a global server updating a global model is based on acquired data distributions from various sites (including data (i.e. second medical data) based on a “selected” second cohort). Alternatively, a specific site acquiring data distributions from another site in accordance with a first cohort also updates its local model based on data (i.e. second medical data) corresponding to its “selected” second cohort.
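For illustration only, the aggregation scheme quoted above from Szeto's Paragraphs 106-107 (Operations 580 and 590: each private data server contributes a proxy data set built under the same selection criteria, and the global modeling engine either combines the proxy sets into a global training set or accumulates and combines per-site model parameters) can be sketched as follows. All function and variable names are hypothetical and are not part of the Szeto disclosure; the parameter-combination step is shown as a simple unweighted average, which is one possible reading of "combining them into a single trained model."

```python
import numpy as np

# Illustrative sketch only; names are hypothetical and not part of the
# Szeto disclosure. Each private data server contributes a proxy data
# set built under the same selection criteria (Operation 580); the
# global modeling engine combines them into a global training set on
# which a global model can be trained (Operation 590).

def aggregate_proxy_sets(proxy_sets):
    """Concatenate per-site proxy data sets into one global training set."""
    features = np.vstack([site["X"] for site in proxy_sets])
    outcomes = np.concatenate([site["y"] for site in proxy_sets])
    return features, outcomes

def combine_model_parameters(param_sets):
    """Alternative embodiment: accumulate sets of actual model parameters
    and combine them into a single trained model (here, an unweighted
    average of the per-site parameter vectors)."""
    return np.mean(np.stack(param_sets), axis=0)
```

Under this reading, no raw private data leaves a site; only proxy data or model parameters are transmitted, consistent with the privacy focus of Szeto's Paragraph 45.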
Szeto does not explicitly teach:
H) calculate a first index value of the first data distribution and a second index value of a second data distribution for the second data set based on each of the plurality of second cohorts;
I) calculate similarity between the first data distribution and each second data distribution based on the first index value and each second index value;
J) select the second cohort of the second data distribution having the highest similarity from the plurality of second cohorts.
Yurtsever, however, teaches “calculate a first index value of the first data distribution and a second index value of a second data distribution for the second data set based on each of the plurality of second cohorts” as “We compared the histogram of travel times of the simulated vehicles with the histogram of the recorded travel time of the surveyed vehicles in the benchmark data. In Figure 11, an example of four travel time histograms are shown…We used KL-divergence score to compare the histograms…where P and Q are discrete probability distributions of travel times and I is the histogram bin index…The proposed framework got the lowest KL divergence score with respect to the benchmark data amongst the compared methods” (Page 92, Section 4.2), “calculate similarity between the first data distribution and each second data distribution based on the first index value and each second index value” as “We compared the histogram of travel times of the simulated vehicles with the histogram of the recorded travel time of the surveyed vehicles in the benchmark data. In Figure 11, an example of four travel time histograms are shown…We used KL-divergence score to compare the histograms…where P and Q are discrete probability distributions of travel times and I is the histogram bin index…The proposed framework got the lowest KL divergence score with respect to the benchmark data amongst the compared methods” (Page 92, Section 4.2), and “select the second cohort of the second data distribution having the highest similarity from the plurality of second cohorts” as “We compared the histogram of travel times of the simulated vehicles with the histogram of the recorded travel time of the surveyed vehicles in the benchmark data. 
In Figure 11, an example of four travel time histograms are shown…We used KL-divergence score to compare the histograms…where P and Q are discrete probability distributions of travel times and I is the histogram bin index…The proposed framework got the lowest KL divergence score with respect to the benchmark data amongst the compared methods” (Page 92, Section 4.2).
The examiner further notes that the secondary reference of Yurtsever teaches the concept of comparing histograms for subsequent selection via the use of KL divergence. Moreover, the instant application (See Page 15, lines 19-20) explicitly states that KL divergence is used in its selection process. The combination would result in using KL divergence to perform the selection of the second cohort in Szeto.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant invention to combine the teachings of the cited references because Yurtsever’s teaching would have allowed Szeto to provide a method for comparing histograms, as noted by Yurtsever (Page 92, Section 4.2).
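For illustration only, the KL-divergence comparison quoted above from Yurtsever (Page 92, Section 4.2), as applied to cohort selection in the proposed combination, can be sketched as follows. All names are hypothetical; the sketch simply computes KL(P || Q) = Σᵢ P(i)·log(P(i)/Q(i)) over histogram bins i and selects the candidate cohort with the lowest score (i.e., the highest similarity).

```python
import numpy as np

# Illustrative sketch only; names are hypothetical. Per Yurtsever, a
# lower KL-divergence score indicates two distributions are more similar.

def kl_divergence(p, q, eps=1e-12):
    """KL(P || Q) for two histograms given as bin counts."""
    p = np.asarray(p, dtype=float) + eps  # eps avoids log(0) / division by 0
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()  # normalize counts to probabilities
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)))

def select_most_similar_cohort(first_hist, candidate_hists):
    """Return the candidate cohort whose data distribution has the lowest
    KL divergence (highest similarity) to the first data distribution."""
    return min(candidate_hists,
               key=lambda name: kl_divergence(first_hist, candidate_hists[name]))
```

In the combination, `first_hist` would correspond to the first data distribution of the target site and `candidate_hists` to the second data distributions of the plurality of second cohorts.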
Szeto and Yurtsever do not explicitly teach:
B) a model information database;
E) a first model that is stored in the model information database.
Vodencarevic, however, teaches “a model information database” as “The step of storing may be seen as a step of archiving the machine learned model in an appropriate repository or database of the central server unit such that the machine learned models may be retrieved from the repository for further use. For instance, the machine learned models may be stored according to their fields of application or local sites/client units of origin. The repository or database may comprise a plurality of different machine learned models” (Paragraph 98) and “a first model that is stored in the model information database” as “The step of storing may be seen as a step of archiving the machine learned model in an appropriate repository or database of the central server unit such that the machine learned models may be retrieved from the repository for further use. For instance, the machine learned models may be stored according to their fields of application or local sites/client units of origin. The repository or database may comprise a plurality of different machine learned models” (Paragraph 98).
The examiner further notes that although Szeto clearly has multiple models that are stored, there is no explicit teaching that such models are stored in a database. Nevertheless, the secondary reference of Vodencarevic teaches the concept of storing trained ML models in a database. The combination would result in the ML models of Szeto being stored in a database.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant invention to combine the teachings of the cited references because Vodencarevic’s teaching would have allowed Szeto and Yurtsever to provide a method for retrieving ML models from a repository for further use, as noted by Vodencarevic (Paragraph 98).
Regarding claim 3, Szeto further teaches a medical learning system comprising:
A) wherein the processing circuitry is further configured to extract a data set that is used to update the first model from the second data set on the basis of the first data distribution (Paragraphs 33, 51, and 106-108, Figure 1).
The examiner notes that Szeto teaches “wherein the processing circuitry is further configured to extract a data set that is used to update the first model from the second data set on the basis of the first data distribution” as “The following discussion is presented from a health care perspective, and more specifically with respect to building trained machine learning models from genomic sequence data associated with cancer patients. However, it is fully contemplated that the architecture described herein can be adapted to other forms of research beyond oncology and can be leveraged wherever raw data is secured or considered private; insurance data, financial data, social media profile data, human capital data, proprietary experimental data, gaming or gambling data, military data, network traffic data, shopping or marketing data, or other types of data for example” (Paragraph 33), “a private data server 124 may receive proxy related information (including for example proxy data 260, proxy data distributions 362, proxy model parameters 475, other proxy related data combined with seeds, etc.) from a peer private data server (a different private data server 124). The private data server may generate models based on its own local private data, or based on both its own local private data and the received proxy related information from a peer private data server” (Paragraph 51), “Operation 580, performed by a global modeling engine or a peer private data machine, includes aggregating two or more proxy data sets from different private data servers. The aggregate proxy data sets (global proxy sets) are combined based on a given machine learning task and are generated according to the originally requested model instructions. Although each set of proxy data will likely be generated from different private data distributions, it should be appreciated that the corresponding private data training sets are constructed according to the same selection criteria. 
For example, a researcher might wish to build a prediction model on how well smokers respond to a lung cancer treatment. The research will request models to be built at many private hospitals where each hospital has its own private data. Each hospital receives the same data selection criteria; patients who are smokers, given the treatment, and their associated known outcome. Each hospital's local private data servers, via their modeling engines, constructs their own proxy data using training actual data as a foundation and based on the same data selection criteria. The global modeling engine then aggregates the individual proxy data sets together to create a global training data set. Operation 590 includes the global modeling engine train a global model on the aggregated sets of proxy data. The global model integrates the knowledge gained from each entity's private data. In some embodiments, the global modeling engine can create the trained global model by accumulating sets of actual model parameters and combining them into a single trained model” (Paragraph 106), “In other embodiments, the global modeling engine also transmits the trained global model back to one or more of the private data servers. The private data servers can then leverage the global trained model to conduct local prediction studies in support of local clinical decision making workflows. In addition, the private data servers can also use the global model as a foundation for continued online learning. Thus, the global model becomes a basis for continued machine learning as new private data becomes available. As new data becomes available, method 500 can be repeated to improve the global modeling engine” (Paragraph 107), and “Machine learning systems may receive multiple inputs (e.g., private data), and through the machine learning process, may identify subsets of inputs that are the most important. 
Thus, it is contemplated that a given hospital may not collect exactly the same type of private data as other hospitals. Thus, the model instructions may be different for different hospitals or sites. However, by identifying which parameters are most predictive using the machine learning systems as described herein, data sets having in common these key predictive parameters may be combined. In other embodiments, model instructions may be modified, e.g., limited to include key predictive features, and used to regenerate proxy data, proxy data distributions, and other types of learned information. This regenerated information can then be sent to the global model server, where it is aggregated” (Paragraph 108). The examiner further notes that a global server updating a global model is based on acquired data distributions from various sites (including data based on a “selected” second cohort) (which can include data sets that can be extracted). Alternatively, a specific site acquiring data distributions from another site in accordance with a first cohort also updates its local model based on data corresponding to its “selected” second cohort. Such data distributions can have data sets that can be extracted.
Regarding claim 4, Szeto further teaches a medical learning system comprising:
A) wherein a data volume of the data sets based on the second cohort is greater than a data volume of the data sets based on the first cohort (Paragraphs 54 and 72, Figures 1-2).
The examiner notes that Szeto teaches “wherein a data volume of the data sets based on the second cohort is greater than a data volume of the data sets based on the first cohort” as “The private data server 224 provides access to private data 222 on behalf of the stakeholders of entity 220. In more preferred embodiments, private data server 224 represents a local cache of specific patient data, especially data sets of large sizes” (Paragraph 54) and “In some embodiments where the amount of private data 222 is considered of high quality and is of sufficient size, transmitting actual model parameters 245 to a remote device can be of great benefit. However, one should further appreciate that entity 220 might not have a sufficiently large amount of local data to complete a research task” (Paragraph 72). The examiner further notes that different sites can house varying amounts of patient data (i.e. a first site can house a larger amount of data for a first cohort in comparison to a second site housing a smaller amount of data for a second cohort).
Regarding claim 6, Szeto teaches a learning system comprising:
A) a plurality of sites configured to collect a data set based on a cohort and to operate a trained model (Paragraphs 42 and 45, Figure 1); and
B) a central server configured to acquire a data distribution of the data set collected by each of the plurality of sites (Paragraphs 42 and 45, Figure 1);
C) wherein the plurality of sites comprise a first site and a second site (Paragraph 42, Figure 1);
D) wherein the first site comprises: processing circuitry (Paragraph 27);
F) a medical information database storing first medical data (Paragraphs 53 and 67, Figure 2);
G) wherein the processing circuitry is configured to: calculate a first data distribution for a first data set that is the first medical data, out of data sets based on a first cohort with the first site (Paragraphs 51, 106, and 121);
H) wherein the second site comprises: first processing circuitry (Paragraph 27); and
I) a medical information database storing second medical data (Paragraphs 53 and 67, Figure 2);
J) wherein the first processing circuitry is configured to: update a first model on the basis of at least part of a second data set that is the second medical data based on a second cohort associated with the second site (Paragraphs 51, 106, and 121);
K) wherein the central server comprises: second processing circuitry (Paragraph 27); and
M) wherein the second processing circuitry is configured to: acquire the calculated first data distribution (Paragraphs 33 and 106-108, Figure 1);
S) wherein the first processing circuitry is configured to update the first model on the basis of the second data set based on the selected second cohort (Paragraphs 33, 51, and 106-108, Figure 1).
The examiner notes that Szeto teaches “a plurality of sites configured to collect a data set based on a cohort and to operate a trained model” as “a researcher has permission to access a central machine learning hub represented as non-private computing device 130, possibly executing as a global modeling engine 136. Non-private computing device 130 can comprise one or more global model servers (e.g., cloud, SaaS, PaaS, IaaS, LaaS, farm, etc.) that offer distributed machine learning services to the researcher. However, data of interest to the researcher resides on one or more of private data servers 124A, 124B, through 124N (collectively referred to as private data servers 124) located at one or more entities 120A through 120N over network 115 (e.g., wireless network, an intranet, a cellular network, a packet switched network, an ad-hoc network, the Internet, WAN, VPN, LAN, P2P, etc.). Network 115 can include any combination of the aforementioned networks. The entities can include hospital 120A, clinic 120B, through laboratory 120N (collectively referred to as entities 120). Each of entity 120 has access to its own local private data 122A through 122N (collectively referred to as private data 122), possibly stored on a local storage facility (e.g., a RAID system, a file server, a NAS, a SAN, a network accessible storage device, a storage area network device, a local computer readable memory, a hard disk drive, an optical storage device, a tape drive, a tape library, a solid state disk, etc.)” (Paragraph 42) and “In the ecosystem/system presented in FIG. 1, the issues associated with privacy restrictions of private data 122 are addressed by focusing on the knowledge gained from a trained machine learning algorithm rather than the raw data itself. Rather than requesting raw data from each of entity 120, the researcher is able to define a desired machine learning model that he/she wishes to create. 
The researcher may interface with system 100 through the non-private computing device 130; through one of the private data servers 124, provided that the researcher has been granted access to the private data server; or through a device external to system 100 that can interface with non-private computing device 130. The programmatic model instructions on how to create the desired model are then submitted to each relevant private data server 124, which also has a corresponding modeling engine 126 (i.e., 126A through 126N). Each local modeling engine 126 accesses its own local private data 122 and creates local trained models according to model instructions created by the researcher. As each modeling engine 126 gains new learned information, the new knowledge is transmitted back to the researcher at non-private computing device 130 once transmission criteria have been met. The new knowledge can then be aggregated into a trained global model via global modeling engine 136” (Paragraph 45). The examiner further notes that sites 120a-120n teach the claimed plurality of sites. Moreover, each site 120a-120n operates a trained model based on local data (i.e. a data set). The examiner further notes that Szeto teaches “a central server configured to acquire a data distribution of the data set collected by each of the plurality of sites” as “a researcher has permission to access a central machine learning hub represented as non-private computing device 130, possibly executing as a global modeling engine 136. Non-private computing device 130 can comprise one or more global model servers (e.g., cloud, SaaS, PaaS, IaaS, LaaS, farm, etc.) that offer distributed machine learning services to the researcher. 
However, data of interest to the researcher resides on one or more of private data servers 124A, 124B, through 124N (collectively referred to as private data servers 124) located at one or more entities 120A through 120N over network 115 (e.g., wireless network, an intranet, a cellular network, a packet switched network, an ad-hoc network, the Internet, WAN, VPN, LAN, P2P, etc.). Network 115 can include any combination of the aforementioned networks. The entities can include hospital 120A, clinic 120B, through laboratory 120N (collectively referred to as entities 120). Each of entity 120 has access to its own local private data 122A through 122N (collectively referred to as private data 122), possibly stored on a local storage facility (e.g., a RAID system, a file server, a NAS, a SAN, a network accessible storage device, a storage area network device, a local computer readable memory, a hard disk drive, an optical storage device, a tape drive, a tape library, a solid state disk, etc.)” (Paragraph 42) and “Each local modeling engine 126 accesses its own local private data 122 and creates local trained models according to model instructions created by the researcher. As each modeling engine 126 gains new learned information, the new knowledge is transmitted back to the researcher at non-private computing device 130 once transmission criteria have been met. The new knowledge can then be aggregated into a trained global model via global modeling engine 136” (Paragraph 45). The examiner further notes that central machine learning hub 130 teaches the claimed central server as depicted in Figure 1. Moreover, such a hub 130 receives distributed data from each site as depicted in Figure 1. 
The examiner further notes that Szeto teaches “wherein the plurality of sites comprise a first site and a second site” as “a researcher has permission to access a central machine learning hub represented as non-private computing device 130, possibly executing as a global modeling engine 136. Non-private computing device 130 can comprise one or more global model servers (e.g., cloud, SaaS, PaaS, IaaS, LaaS, farm, etc.) that offer distributed machine learning services to the researcher. However, data of interest to the researcher resides on one or more of private data servers 124A, 124B, through 124N (collectively referred to as private data servers 124) located at one or more entities 120A through 120N over network 115 (e.g., wireless network, an intranet, a cellular network, a packet switched network, an ad-hoc network, the Internet, WAN, VPN, LAN, P2P, etc.). Network 115 can include any combination of the aforementioned networks. The entities can include hospital 120A, clinic 120B, through laboratory 120N (collectively referred to as entities 120). Each of entity 120 has access to its own local private data 122A through 122N (collectively referred to as private data 122), possibly stored on a local storage facility (e.g., a RAID system, a file server, a NAS, a SAN, a network accessible storage device, a storage area network device, a local computer readable memory, a hard disk drive, an optical storage device, a tape drive, a tape library, a solid state disk, etc.)” (Paragraph 42). The examiner further notes that 120a and 120b (amongst sites 120a-120n) teach the claimed first and second sites respectively. 
The examiner further notes that Szeto teaches “wherein the first site comprises: processing circuitry” as “One of ordinary skill in the art should appreciate that the computing devices comprise one or more processors configured to execute software instructions that are stored on a tangible, non-transitory computer readable storage medium (e.g., hard drive, FPGA, PLA, PLD, solid state drive, RAM, flash, ROM, external drive, memory stick, etc.)” (Paragraph 27). The examiner further notes that the one or more processors teach the claimed processing circuitry. The examiner further notes that Szeto further teaches “a medical information database storing first medical data” as “The example presented in FIG. 2 illustrates the inventive concepts from the perspective of how private data server 224 interacts with a remote computing device and private data 222. In more preferred embodiments, private data 222 comprises local private healthcare data, or more specifically includes patient-specific data (e.g., name, SSN, normal WGS, tumor WGS, genomic diff objects, a patient identifier, etc.). Entity 220 typically is an institution having private local raw data and subject to restrictions as discussed above. Example entities include hospitals, labs, clinics, pharmacies, insurance companies, oncologist offices, or other entities having locally stored data” (Paragraph 53) and “Modeling engine 226 leverages the data selection criteria from model instructions 230 to create a result set from private data 222, possibly via submitting a query to the database storing private data 222” (Paragraph 67). The examiner further notes that a database housing private medical data 222 includes first medical data. 
The examiner further notes that Szeto teaches “wherein the processing circuitry is configured to: calculate a first data distribution for a first data set that is the first medical data, out of data sets based on a first cohort with the first site” as “a private data server 124 may receive proxy related information (including for example proxy data 260, proxy data distributions 362, proxy model parameters 475, other proxy related data combined with seeds, etc.) from a peer private data server (a different private data server 124). The private data server may generate models based on its own local private data, or based on both its own local private data and the received proxy related information from a peer private data server” (Paragraph 51), “Operation 580, performed by a global modeling engine or a peer private data machine, includes aggregating two or more proxy data sets from different private data servers. The aggregate proxy data sets (global proxy sets) are combined based on a given machine learning task and are generated according to the originally requested model instructions. Although each set of proxy data will likely be generated from different private data distributions, it should be appreciated that the corresponding private data training sets are constructed according to the same selection criteria. For example, a researcher might wish to build a prediction model on how well smokers respond to a lung cancer treatment. The research will request models to be built at many private hospitals where each hospital has its own private data. Each hospital receives the same data selection criteria; patients who are smokers, given the treatment, and their associated known outcome. Each hospital's local private data servers, via their modeling engines, constructs their own proxy data using training actual data as a foundation and based on the same data selection criteria. 
The global modeling engine then aggregates the individual proxy data sets together to create a global training data set. Operation 590 includes the global modeling engine train a global model on the aggregated sets of proxy data. The global model integrates the knowledge gained from each entity's private data. In some embodiments, the global modeling engine can create the trained global model by accumulating sets of actual model parameters and combining them into a single trained model” (Paragraph 106), and “The disclosed approach of distributed, online machine learning can leverage numerous techniques for validating trained models. One approach includes a first private data server sending its trained actual model to other private data servers. The other private data servers can then validate the trained actual model on their own local data and send the results back to the first private data server” (Paragraph 121). The examiner further notes that a second site can receive, from the first site, distributed data associated with a first cohort (which is used to update a local model at the first site with first medical data) that is “calculated” before being transmitted to the second site. The examiner further notes that Szeto teaches “wherein the second site comprises: first processing circuitry” as “One of ordinary skill in the art should appreciate that the computing devices comprise one or more processors configured to execute software instructions that are stored on a tangible, non-transitory computer readable storage medium (e.g., hard drive, FPGA, PLA, PLD, solid state drive, RAM, flash, ROM, external drive, memory stick, etc.)” (Paragraph 27). The examiner further notes that the one or more processors teach the claimed first processing circuitry. The examiner further notes that Szeto further teaches “a medical information database storing second medical data” as “The example presented in FIG. 
2 illustrates the inventive concepts from the perspective of how private data server 224 interacts with a remote computing device and private data 222. In more preferred embodiments, private data 222 comprises local private healthcare data, or more specifically includes patient-specific data (e.g., name, SSN, normal WGS, tumor WGS, genomic diff objects, a patient identifier, etc.). Entity 220 typically is an institution having private local raw data and subject to restrictions as discussed above. Example entities include hospitals, labs, clinics, pharmacies, insurance companies, oncologist offices, or other entities having locally stored data” (Paragraph 53) and “Modeling engine 226 leverages the data selection criteria from model instructions 230 to create a result set from private data 222, possibly via submitting a query to the database storing private data 222” (Paragraph 67). The examiner further notes that a database housing private medical data 222 includes second medical data. The examiner further notes that Szeto teaches “wherein the first processing circuitry is configured to: update a first model on the basis of at least part of a second data set that is the second medical data based on a second cohort associated with the second site” as “a private data server 124 may receive proxy related information (including for example proxy data 260, proxy data distributions 362, proxy model parameters 475, other proxy related data combined with seeds, etc.) from a peer private data server (a different private data server 124). The private data server may generate models based on its own local private data, or based on both its own local private data and the received proxy related information from a peer private data server” (Paragraph 51), “Operation 580, performed by a global modeling engine or a peer private data machine, includes aggregating two or more proxy data sets from different private data servers. 
The aggregate proxy data sets (global proxy sets) are combined based on a given machine learning task and are generated according to the originally requested model instructions. Although each set of proxy data will likely be generated from different private data distributions, it should be appreciated that the corresponding private data training sets are constructed according to the same selection criteria. For example, a researcher might wish to build a prediction model on how well smokers respond to a lung cancer treatment. The research will request models to be built at many private hospitals where each hospital has its own private data. Each hospital receives the same data selection criteria; patients who are smokers, given the treatment, and their associated known outcome. Each hospital's local private data servers, via their modeling engines, constructs their own proxy data using training actual data as a foundation and based on the same data selection criteria. The global modeling engine then aggregates the individual proxy data sets together to create a global training data set. Operation 590 includes the global modeling engine train a global model on the aggregated sets of proxy data. The global model integrates the knowledge gained from each entity's private data. In some embodiments, the global modeling engine can create the trained global model by accumulating sets of actual model parameters and combining them into a single trained model” (Paragraph 106), and “The disclosed approach of distributed, online machine learning can leverage numerous techniques for validating trained models. One approach includes a first private data server sending its trained actual model to other private data servers. The other private data servers can then validate the trained actual model on their own local data and send the results back to the first private data server” (Paragraph 121). 
The examiner further notes that a second site can receive, from the first site, distributed data associated with a first cohort (which is used to update a local model at the first site); this received data is used with second “calculated” distributed data associated with a second cohort for subsequent updating of the local model at the second site. Moreover, updated models themselves can be sent from a first site to a second site. Such models can be further updated via the application of second distributed data at the second site. The examiner further notes that Szeto teaches “wherein the central server comprises: second processing circuitry” as “One of ordinary skill in the art should appreciate that the computing devices comprise one or more processors configured to execute software instructions that are stored on a tangible, non-transitory computer readable storage medium (e.g., hard drive, FPGA, PLA, PLD, solid state drive, RAM, flash, ROM, external drive, memory stick, etc.)” (Paragraph 27). The examiner further notes that the one or more processors teach the claimed second processing circuitry. The examiner notes that Szeto teaches “wherein the second processing circuitry is configured to: acquire the calculated first data distribution” as “The following discussion is presented from a health care perspective, and more specifically with respect to building trained machine learning models from genomic sequence data associated with cancer patients. 
However, it is fully contemplated that the architecture described herein can be adapted to other forms of research beyond oncology and can be leveraged wherever raw data is secured or considered private; insurance data, financial data, social media profile data, human capital data, proprietary experimental data, gaming or gambling data, military data, network traffic data, shopping or marketing data, or other types of data for example” (Paragraph 33), “Operation 580, performed by a global modeling engine or a peer private data machine, includes aggregating two or more proxy data sets from different private data servers. The aggregate proxy data sets (global proxy sets) are combined based on a given machine learning task and are generated according to the originally requested model instructions. Although each set of proxy data will likely be generated from different private data distributions, it should be appreciated that the corresponding private data training sets are constructed according to the same selection criteria. For example, a researcher might wish to build a prediction model on how well smokers respond to a lung cancer treatment. The research will request models to be built at many private hospitals where each hospital has its own private data. Each hospital receives the same data selection criteria; patients who are smokers, given the treatment, and their associated known outcome. Each hospital's local private data servers, via their modeling engines, constructs their own proxy data using training actual data as a foundation and based on the same data selection criteria. The global modeling engine then aggregates the individual proxy data sets together to create a global training data set. Operation 590 includes the global modeling engine train a global model on the aggregated sets of proxy data. The global model integrates the knowledge gained from each entity's private data. 
In some embodiments, the global modeling engine can create the trained global model by accumulating sets of actual model parameters and combining them into a single trained model” (Paragraph 106), “In other embodiments, the global modeling engine also transmits the trained global model back to one or more of the private data servers. The private data servers can then leverage the global trained model to conduct local prediction studies in support of local clinical decision making workflows. In addition, the private data servers can also use the global model as a foundation for continued online learning. Thus, the global model becomes a basis for continued machine learning as new private data becomes available. As new data becomes available, method 500 can be repeated to improve the global modeling engine” (Paragraph 107), and “Machine learning systems may receive multiple inputs (e.g., private data), and through the machine learning process, may identify subsets of inputs that are the most important. Thus, it is contemplated that a given hospital may not collect exactly the same type of private data as other hospitals. Thus, the model instructions may be different for different hospitals or sites. However, by identifying which parameters are most predictive using the machine learning systems as described herein, data sets having in common these key predictive parameters may be combined. In other embodiments, model instructions may be modified, e.g., limited to include key predictive features, and used to regenerate proxy data, proxy data distributions, and other types of learned information. This regenerated information can then be sent to the global model server, where it is aggregated” (Paragraph 108). The examiner further notes that a global server acquiring data distributions from various sites includes acquiring a specific data distribution from a specific site for a specific cohort (such as smokers). 
The examiner further notes that Szeto teaches “select the second cohort that is used to update the first model out of a plurality of second cohorts on the basis of the first data distribution” as “The following discussion is presented from a health care perspective, and more specifically with respect to building trained machine learning models from genomic sequence data associated with cancer patients. However, it is fully contemplated that the architecture described herein can be adapted to other forms of research beyond oncology and can be leveraged wherever raw data is secured or considered private; insurance data, financial data, social media profile data, human capital data, proprietary experimental data, gaming or gambling data, military data, network traffic data, shopping or marketing data, or other types of data for example” (Paragraph 33), “Operation 580, performed by a global modeling engine or a peer private data machine, includes aggregating two or more proxy data sets from different private data servers. The aggregate proxy data sets (global proxy sets) are combined based on a given machine learning task and are generated according to the originally requested model instructions. Although each set of proxy data will likely be generated from different private data distributions, it should be appreciated that the corresponding private data training sets are constructed according to the same selection criteria. For example, a researcher might wish to build a prediction model on how well smokers respond to a lung cancer treatment. The research will request models to be built at many private hospitals where each hospital has its own private data. Each hospital receives the same data selection criteria; patients who are smokers, given the treatment, and their associated known outcome. 
Each hospital's local private data servers, via their modeling engines, constructs their own proxy data using training actual data as a foundation and based on the same data selection criteria. The global modeling engine then aggregates the individual proxy data sets together to create a global training data set. Operation 590 includes the global modeling engine train a global model on the aggregated sets of proxy data. The global model integrates the knowledge gained from each entity's private data. In some embodiments, the global modeling engine can create the trained global model by accumulating sets of actual model parameters and combining them into a single trained model” (Paragraph 106), “In other embodiments, the global modeling engine also transmits the trained global model back to one or more of the private data servers. The private data servers can then leverage the global trained model to conduct local prediction studies in support of local clinical decision making workflows. In addition, the private data servers can also use the global model as a foundation for continued online learning. Thus, the global model becomes a basis for continued machine learning as new private data becomes available. As new data becomes available, method 500 can be repeated to improve the global modeling engine” (Paragraph 107), and “Machine learning systems may receive multiple inputs (e.g., private data), and through the machine learning process, may identify subsets of inputs that are the most important. Thus, it is contemplated that a given hospital may not collect exactly the same type of private data as other hospitals. Thus, the model instructions may be different for different hospitals or sites. However, by identifying which parameters are most predictive using the machine learning systems as described herein, data sets having in common these key predictive parameters may be combined. 
In other embodiments, model instructions may be modified, e.g., limited to include key predictive features, and used to regenerate proxy data, proxy data distributions, and other types of learned information. This regenerated information can then be sent to the global model server, where it is aggregated” (Paragraph 108). The examiner further notes that a global server updates a global model based on acquired data distributions from various sites (including data based on a “selected” second cohort amongst multiple cohorts). The examiner further notes that Szeto teaches “wherein the processing circuitry of the first site is configured to verify an aptitude of the updated first model to the first cohort” as “As proxy data 260 is generated and relayed to the global model server 130, the global model server aggregates the data and generates an updated global model. Once the global model is updated, it can be determined whether the updated global model is an improvement over the previous version of the global model. If the updated global model is an improvement (e.g., the predictive accuracy is improved), new parameters may be provided to the private data servers via the updated model instructions 230. At the private data server 124, the performance of the trained actual model (e.g., whether the model improves or worsens) can be evaluated to determine whether the models instructions provided by the updated global model result in an improved trained actual model. Parameters associated with various machine learning model versions may be stored so that earlier machine learning models may be later retrieved, if needed” (Paragraph 50) and “As proxy data 260 is generated and relayed to the global model server 130, the global model server aggregates the data and generates an updated global model. Once the global model is updated, it can be determined whether the updated global model is an improvement over the previous version of the global model. 
If the updated global model is an improvement (e.g., the predictive accuracy is improved), new parameters may be provided to the private data servers via the updated model instructions 230. At the private data server 124, the performance of the trained actual model (e.g., whether the model improves or worsens) can be evaluated to determine whether the models instructions provided by the updated global model result in an improved trained actual model. Parameters associated with various machine learning model versions may be stored so that earlier machine learning models may be later retrieved, if needed” (Paragraph 50) and “The disclosed approach of distributed, online machine learning can leverage numerous techniques for validating trained models. One approach includes a first private data server sending its trained actual model to other private data servers. The other private data servers can then validate the trained actual model on their own local data and send the results back to the first private data server. Additionally, a global modeling engine could also execute one or more cross-fold validation steps on the trained actual models using the global collection of aggregated proxy data. The reverse is also true. The global modeling engine can send the global mode to one or more private data servers to have the global model validated on each private data server's local data. One should appreciate that the validation of the various models is to be performed on data sets selected according to the same data selection requirements to ensure a proper analysis” (Paragraph 121). The examiner further notes that the evaluation of an updated model (which can include validation of local models) teaches the claimed verifying. 
The examiner further notes that Szeto teaches “wherein the first processing circuitry is configured to update the first model on the basis of the second data set based on the selected second cohort” as “The following discussion is presented from a health care perspective, and more specifically with respect to building trained machine learning models from genomic sequence data associated with cancer patients. However, it is fully contemplated that the architecture described herein can be adapted to other forms of research beyond oncology and can be leveraged wherever raw data is secured or considered private; insurance data, financial data, social media profile data, human capital data, proprietary experimental data, gaming or gambling data, military data, network traffic data, shopping or marketing data, or other types of data for example” (Paragraph 33), “a private data server 124 may receive proxy related information (including for example proxy data 260, proxy data distributions 362, proxy model parameters 475, other proxy related data combined with seeds, etc.) from a peer private data server (a different private data server 124). The private data server may generate models based on its own local private data, or based on both its own local private data and the received proxy related information from a peer private data server” (Paragraph 51), “Operation 580, performed by a global modeling engine or a peer private data machine, includes aggregating two or more proxy data sets from different private data servers. The aggregate proxy data sets (global proxy sets) are combined based on a given machine learning task and are generated according to the originally requested model instructions. Although each set of proxy data will likely be generated from different private data distributions, it should be appreciated that the corresponding private data training sets are constructed according to the same selection criteria. 
For example, a researcher might wish to build a prediction model on how well smokers respond to a lung cancer treatment. The research will request models to be built at many private hospitals where each hospital has its own private data. Each hospital receives the same data selection criteria; patients who are smokers, given the treatment, and their associated known outcome. Each hospital's local private data servers, via their modeling engines, constructs their own proxy data using training actual data as a foundation and based on the same data selection criteria. The global modeling engine then aggregates the individual proxy data sets together to create a global training data set. Operation 590 includes the global modeling engine train a global model on the aggregated sets of proxy data. The global model integrates the knowledge gained from each entity's private data. In some embodiments, the global modeling engine can create the trained global model by accumulating sets of actual model parameters and combining them into a single trained model” (Paragraph 106), “In other embodiments, the global modeling engine also transmits the trained global model back to one or more of the private data servers. The private data servers can then leverage the global trained model to conduct local prediction studies in support of local clinical decision making workflows. In addition, the private data servers can also use the global model as a foundation for continued online learning. Thus, the global model becomes a basis for continued machine learning as new private data becomes available. As new data becomes available, method 500 can be repeated to improve the global modeling engine” (Paragraph 107), and “Machine learning systems may receive multiple inputs (e.g., private data), and through the machine learning process, may identify subsets of inputs that are the most important. 
Thus, it is contemplated that a given hospital may not collect exactly the same type of private data as other hospitals. Thus, the model instructions may be different for different hospitals or sites. However, by identifying which parameters are most predictive using the machine learning systems as described herein, data sets having in common these key predictive parameters may be combined. In other embodiments, model instructions may be modified, e.g., limited to include key predictive features, and used to regenerate proxy data, proxy data distributions, and other types of learned information. This regenerated information can then be sent to the global model server, where it is aggregated” (Paragraph 108). The examiner further notes that the global server updates the global model based on data distributions acquired from various sites (including data (i.e., second medical data) based on a “selected” second cohort), which teaches the claimed updating. Alternatively, a specific site acquiring data distributions from another site in accordance with a first cohort likewise updates its local model based on data (i.e., second medical data) corresponding to its “selected” second cohort.
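For illustration only (not a characterization of Szeto's actual implementation), the quoted notion of "accumulating sets of actual model parameters and combining them into a single trained model" (Paragraph 106) can be sketched as a weighted combination of per-site parameters; the site names, parameter values, and example counts below are hypothetical:

```python
import numpy as np

# Hypothetical per-site model parameters (e.g., linear-model weights),
# each trained locally on that site's private data.
site_params = {
    "hospital_1": np.array([0.2, 1.1, -0.5]),
    "hospital_2": np.array([0.4, 0.9, -0.3]),
    "hospital_3": np.array([0.3, 1.0, -0.4]),
}

# Hypothetical number of training examples per site, used as weights.
site_counts = {"hospital_1": 100, "hospital_2": 300, "hospital_3": 200}

def combine_global(params, counts):
    """Combine local parameter sets into a single global model
    via a weighted average, one sketch of 'accumulating sets of
    actual model parameters and combining them'."""
    total = sum(counts.values())
    return sum((counts[s] / total) * params[s] for s in params)

global_params = combine_global(site_params, site_counts)
```

Under this sketch, sites with more local training data contribute proportionally more to the combined global model, which the global server could then transmit back to the private data servers as a basis for continued online learning.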
Szeto does not explicitly teach:
P) calculate a first index value of the first data distribution and a second index value of a second data distribution for the second data set based on each of the plurality of second cohorts;
Q) calculate similarity between the first data distribution and each second data distribution based on the first index value and each second index value;
R) select the second cohort of the second data distribution having the highest similarity from the plurality of second cohorts.
Yurtsever, however, teaches “calculate a first index value of the first data distribution and a second index value of a second data distribution for the second data set based on each of the plurality of second cohorts” as “We compared the histogram of travel times of the simulated vehicles with the histogram of the recorded travel time of the surveyed vehicles in the benchmark data. In Figure 11, an example of four travel time histograms are shown…We used KL-divergence score to compare the histograms…where P and Q are discrete probability distributions of travel times and I is the histogram bin index…The proposed framework got the lowest KL divergence score with respect to the benchmark data amongst the compared methods” (Page 92, Section 4.2), “calculate similarity between the first data distribution and each second data distribution based on the first index value and each second index value” as “We compared the histogram of travel times of the simulated vehicles with the histogram of the recorded travel time of the surveyed vehicles in the benchmark data. In Figure 11, an example of four travel time histograms are shown…We used KL-divergence score to compare the histograms…where P and Q are discrete probability distributions of travel times and I is the histogram bin index…The proposed framework got the lowest KL divergence score with respect to the benchmark data amongst the compared methods” (Page 92, Section 4.2), and “select the second cohort of the second data distribution having the highest similarity from the plurality of second cohorts” as “We compared the histogram of travel times of the simulated vehicles with the histogram of the recorded travel time of the surveyed vehicles in the benchmark data. 
In Figure 11, an example of four travel time histograms are shown…We used KL-divergence score to compare the histograms…where P and Q are discrete probability distributions of travel times and I is the histogram bin index…The proposed framework got the lowest KL divergence score with respect to the benchmark data amongst the compared methods” (Page 92, Section 4.2).
The examiner further notes that the secondary reference of Yurtsever teaches the concept of comparing histograms for subsequent selection via the use of KL divergence. Moreover, the instant application (See Page 15, lines 19-20) explicitly states that KL divergence is used in its selection process. The combination would result in using KL divergence to perform the selection of the second cohort in Szeto.
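For illustration only, the KL-divergence comparison described by Yurtsever, as applied in the combination to the claimed selection of the second cohort, can be sketched as follows; the histograms and cohort labels are hypothetical:

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """Discrete KL divergence D(P || Q) over histogram bins,
    where P and Q are normalized histograms and the sum runs
    over the histogram bin index."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    # eps guards against log(0) and division by zero in empty bins
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

# Hypothetical first data distribution (histogram of some factor of the data)
first_hist = [10, 30, 40, 20]

# Hypothetical second data distributions, one per candidate second cohort
second_hists = {
    "cohort_A": [12, 28, 38, 22],
    "cohort_B": [40, 30, 20, 10],
}

# Lower KL divergence means higher similarity; select the second cohort
# whose distribution is most similar to the first data distribution.
selected = min(second_hists, key=lambda c: kl_divergence(first_hist, second_hists[c]))
```

In this sketch the normalized bin counts serve as the claimed index values, the KL-divergence score serves as the similarity measure, and the cohort with the lowest score (highest similarity) is selected.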
It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant invention to combine the teachings of the cited references because Yurtsever’s teachings would have allowed Szeto’s system to provide a method for comparing histograms, as noted by Yurtsever (Page 92, Section 4.2).
Szeto and Yurtsever do not explicitly teach:
E) a model information database; and
J) a first model that is stored in the model information database in the first site;
L) a distribution database;
N) store the first data distribution in the distribution database.
Vodencarevic, however, teaches “a model information database” as “The step of storing may be seen as a step of archiving the machine learned model in an appropriate repository or database of the central server unit such that the machine learned models may be retrieved from the repository for further use. For instance, the machine learned models may be stored according to their fields of application or local sites/client units of origin. The repository or database may comprise a plurality of different machine learned models” (Paragraph 98), “a first model that is stored in the model information database in the first site” as “The step of storing may be seen as a step of archiving the machine learned model in an appropriate repository or database of the central server unit such that the machine learned models may be retrieved from the repository for further use. For instance, the machine learned models may be stored according to their fields of application or local sites/client units of origin. The repository or database may comprise a plurality of different machine learned models” (Paragraph 98), “a distribution database” as “The step of storing may be seen as a step of archiving the machine learned model in an appropriate repository or database of the central server unit such that the machine learned models may be retrieved from the repository for further use. For instance, the machine learned models may be stored according to their fields of application or local sites/client units of origin. The repository or database may comprise a plurality of different machine learned models” (Paragraph 98), and “store the first data distribution in the distribution database” as “The step of storing may be seen as a step of archiving the machine learned model in an appropriate repository or database of the central server unit such that the machine learned models may be retrieved from the repository for further use. 
For instance, the machine learned models may be stored according to their fields of application or local sites/client units of origin. The repository or database may comprise a plurality of different machine learned models” (Paragraph 98).
The examiner further notes that although Szeto clearly has multiple models that are stored, there is no explicit teaching that such models are stored in a database. Nevertheless, the secondary reference of Vodencarevic teaches the concept of storing trained ML models in a database. The combination would result in the ML models of Szeto being stored in a database. Moreover, although Szeto clearly teaches a central server that receives distribution data (See Figure 1), there is no explicit teaching of storing such distribution data in an actual database. Nevertheless, Vodencarevic teaches the concept of a central server storing data in a database. The combination would result in the distribution data received at the central server of Szeto being stored in a database at that central server.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant invention to combine the teachings of the cited references because Vodencarevic’s teachings would have allowed the combined system of Szeto and Yurtsever to provide a method for retrieving ML models from a repository for further use, as noted by Vodencarevic (Paragraph 98).
Regarding claim 8, Szeto further teaches a learning system comprising:
A) wherein the first site is configured to request the central server to update the first model (Paragraph 65).
The examiner notes that Szeto teaches “wherein the first site is configured to request the central server to update the first model” as “the private data server may recognize that model accuracy is low, and may request additional updates from the global model server. The global model server using the global modeling engine aggregates data from different locations into a global model. For example, if an improved cancer survival model is requested, and data of the same type is not available, data from different tissue types may be combined to improve predictive accuracy of the cancer survival model” (Paragraph 65). The examiner further notes that a private server (which is housed at a first site) requesting updates to its local model from a global server teaches the claimed request to update the first model.
Regarding claim 12, Szeto further teaches a medical learning system comprising:
A) wherein the processing circuitry is further configured to verify an aptitude of the updated first model to the first cohort (Paragraphs 50 and 121).
The examiner notes that Szeto teaches “wherein the processing circuitry is further configured to verify an aptitude of the updated first model to the first cohort” as “As proxy data 260 is generated and relayed to the global model server 130, the global model server aggregates the data and generates an updated global model. Once the global model is updated, it can be determined whether the updated global model is an improvement over the previous version of the global model. If the updated global model is an improvement (e.g., the predictive accuracy is improved), new parameters may be provided to the private data servers via the updated model instructions 230. At the private data server 124, the performance of the trained actual model (e.g., whether the model improves or worsens) can be evaluated to determine whether the models instructions provided by the updated global model result in an improved trained actual model. Parameters associated with various machine learning model versions may be stored so that earlier machine learning models may be later retrieved, if needed” (Paragraph 50) and “The disclosed approach of distributed, online machine learning can leverage numerous techniques for validating trained models. One approach includes a first private data server sending its trained actual model to other private data servers. The other private data servers can then validate the trained actual model on their own local data and send the results back to the first private data server. Additionally, a global modeling engine could also execute one or more cross-fold validation steps on the trained actual models using the global collection of aggregated proxy data. The reverse is also true. The global modeling engine can send the global mode to one or more private data servers to have the global model validated on each private data server's local data. One should appreciate that the validation of the various models is to be performed on data sets selected according to the same data selection requirements to ensure a proper analysis” (Paragraph 121). The examiner further notes that the evaluation of an updated model (which can include validation of local models) teaches the claimed verifying.
Regarding claim 13, Szeto does not explicitly teach a medical learning system comprising:
A) wherein each of the first data distribution and the second data distribution is a distribution in which factors of medical data are represented as a histogram.
Yurtsever, however, teaches “wherein each of the first data distribution and the second data distribution is a distribution in which factors of medical data are represented as a histogram” as “We compared the histogram of travel times of the simulated vehicles with the histogram of the recorded travel time of the surveyed vehicles in the benchmark data. In Figure 11, an example of four travel time histograms are shown…We used KL-divergence score to compare the histograms…where P and Q are discrete probability distributions of travel times and I is the histogram bin index…The proposed framework got the lowest KL divergence score with respect to the benchmark data amongst the compared methods” (Page 92, Section 4.2).
The examiner further notes that the secondary reference of Yurtsever teaches the concept of comparing histograms for subsequent selection via the use of KL divergence. The combination would result in representing the medical data of Szeto in histogram form.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant invention to combine the teachings of the cited references because Yurtsever’s teachings would have allowed Szeto’s system to provide a method for comparing histograms, as noted by Yurtsever (Page 92, Section 4.2).
Response to Arguments
12. Applicant’s arguments with respect to claims 1, 3-4, 6, and 8-13 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument (See newly applied secondary reference of Yurtsever).
Conclusion
13. The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
U.S. PGPUB 2021/0097381 to Daykin et al., published 01 April 2021. The subject matter disclosed therein is pertinent to that of claims 1, 3-4, 6, and 8-13 (e.g., methods to perform federated learning in a healthcare environment).
U.S. PGPUB 2022/0392048 to Henry et al., published 08 December 2022. The subject matter disclosed therein is pertinent to that of claims 1, 3-4, 6, and 8-13 (e.g., methods to perform federated learning in a healthcare environment).
Contact Information
14. Any inquiry concerning this communication or earlier communications from the examiner should be directed to Mahesh Dwivedi whose telephone number is (571) 272-2731. The examiner can normally be reached on Monday to Friday 8:20 am – 4:40 pm.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Charles Rones, can be reached at (571) 272-4085. The fax number for the organization where this application or proceeding is assigned is (571) 273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
Mahesh Dwivedi
Primary Examiner
Art Unit 2168
February 24, 2026
/MAHESH H DWIVEDI/Primary Examiner, Art Unit 2168