Prosecution Insights
Last updated: April 19, 2026
Application No. 16/926,534

PEER-TO-PEER TRAINING OF A MACHINE LEARNING MODEL

Status: Final Rejection (§103)
Filed: Jul 10, 2020
Examiner: LEY, SALLY THI
Art Unit: 2147
Tech Center: 2100 — Computer Architecture & Software
Assignee: The Regents of the University of California
OA Round: 5 (Final)
Grant Probability: 15% (At Risk)
OA Rounds: 6-7
To Grant: 3y 10m
With Interview: 44%

Examiner Intelligence

Career Allow Rate: 15% (5 granted / 33 resolved; -39.8% vs TC avg). Grants only 15% of cases.
Interview Lift: +28.8% for resolved cases with an interview vs. without (a strong lift of roughly +29%).
Avg Prosecution (typical timeline): 3y 10m
Currently Pending: 35
Total Applications (career history): 68 across all art units
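
The career allow rate and interview lift above are simple ratios over this examiner's resolved cases. Below is a minimal sketch of that computation, assuming hypothetical ResolvedCase records; the field names are illustrative, not a real USPTO data schema or this product's actual pipeline.

```python
# Hedged sketch: how a career allow rate and interview lift like the figures above
# could be computed from resolved-case records. Field names are hypothetical.
from dataclasses import dataclass
from typing import List

@dataclass
class ResolvedCase:
    granted: bool         # True if the application issued as a patent
    had_interview: bool   # True if at least one examiner interview is of record

def allow_rate(cases: List[ResolvedCase]) -> float:
    """Fraction of resolved cases that ended in a grant."""
    return sum(c.granted for c in cases) / len(cases) if cases else 0.0

def interview_lift(cases: List[ResolvedCase]) -> float:
    """Allow rate with an interview minus allow rate without."""
    with_iv = [c for c in cases if c.had_interview]
    without_iv = [c for c in cases if not c.had_interview]
    return allow_rate(with_iv) - allow_rate(without_iv)

# Example: 33 resolved cases with 5 grants gives a career allow rate of about 15%.
```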

Statute-Specific Performance

§101: 29.2% (-10.8% vs TC avg)
§103: 50.2% (+10.2% vs TC avg)
§102: 10.8% (-29.2% vs TC avg)
§112: 9.8% (-30.2% vs TC avg)
Tech Center averages are estimates. Based on career data from 33 resolved cases.
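
The statute-level figures can be read the same way. The sketch below assumes each percentage is the allowance rate among resolved cases that received at least one rejection under that statute, compared against a Tech Center baseline; both that interpretation and the placeholder baseline values are assumptions, not confirmed definitions of this chart.

```python
# Hedged sketch of a statute-level breakdown. The interpretation of the percentages
# and the TC-average baselines are assumptions, not real USPTO figures.
from collections import defaultdict

def statute_performance(cases, tc_average):
    """cases: iterable of (statutes_rejected_under, granted) pairs."""
    counts = defaultdict(lambda: [0, 0])          # statute -> [grants, total]
    for statutes, granted in cases:
        for s in statutes:
            counts[s][1] += 1
            counts[s][0] += int(granted)
    report = {}
    for s, (grants, total) in counts.items():
        rate = grants / total
        report[s] = (rate, rate - tc_average.get(s, rate))   # (rate, delta vs TC avg)
    return report

example_cases = [({"103"}, True), ({"103", "101"}, False), ({"102", "103"}, False)]
print(statute_performance(example_cases, {"101": 0.40, "102": 0.40, "103": 0.40}))
```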

Office Action

§103
DETAILED ACTION Notice of Pre-AIA or AIA Status The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA . Status of Claims This Office Action is in response to the communication filed on 02 October 2025. Claims 1-3, 5, 8-9, 11-13, 15, and 18-21 are being considered on the merits. Objections The third to last limitations of claim 20 are filled with typographical errors, including reciting “the a second third node” “a second third local machine learning model”, “a second third training data at the second third node”. Claim Rejections - 35 USC § 103 In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status. The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made. Claims 1-3, 5, 8-9, 11-13, 15, and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Szeto, et. al. (US 2018/0018590 A1; hereinafter “Szeto”), in view of Shahin Shahrampour and Ali Jadbabaie (arXiv:1309.2350v1 [cs.LG] 10 Sep 2013; hereinafter, “Shahrampour”) and further in view of Zhu, et. al. (“Blockchain-Based Privacy Preserving Deep Learning.” In: Guo, F., Huang, X., Yung, M. (eds) Information Security and Cryptology. Inscrypt 2018. Lecture Notes in Computer Science(), vol 11449; hereinafter “Zhu”) Regarding Claim 1, Szeto teaches: A system, comprising: at least one processor; and (Szeto, para. 0027: “One of ordinary skill in the art should appreciate that the computing devices comprise one or more processors configured to execute software instructions that are stored on a tangible, non-transitory computer readable storage medium (e.g., hard drive, FPGA, PLA, PLD, solid state drive, RAM, flash, ROM, external drive, memory stick, etc.)”) at least one memory including program code which when executed by the at least one processor provides operations comprising: (Szeto, para. 0027: “One of ordinary skill in the art should appreciate that the computing devices comprise one or more processors configured to execute software instructions that are stored on a tangible, non-transitory computer readable storage medium (e.g., hard drive, FPGA, PLA, PLD, solid state drive, RAM, flash, ROM, external drive, memory stick, etc.)”) performing, based at least on a parameter set of the first local machine learning model trained based on the first training data, Bayesian inference to learn the approximate posterior distribution of the parameter set of the global machine learning model, (Szeto, para. 0068-0069: “Modeling engine 226 creates trained actual model 240 as a function of the results set representing at least some of private data 222. 
This is achieved by modeling engine 226 training the desired implementation of machine learning algorithm 295 on the private data 222 results set. In view that the desired machine learning algorithm 295 could include a wide variety of possible algorithms, model instructions 230 can include instructions that define the condition under which training occurs…Machine learning algorithms 295 can include quite a large number of different types of algorithms including…a Bayesian classifier algorithm… naive bayes, gaussian naive bayes, multinomial naive bayes, averaged one-dependence estimators (AODE), bayesian belief network (BBN), bayesian network (BN)” Examiner notes that Szeto teaches modeling an engine using any number of bayes algorithms for learning i.e. inferring, where use of a bayes algorithm necessarily involves the calculation of posterior probability and where multiple values are involved, a posterior distribution). updating, based at least on the training of the first local machine learning model and on the Bayesian inference of the parameter set of the global machine learning model (Szeto, para. 0068-0069: “Modeling engine 226 creates trained actual model 240 as a function of the results set representing at least some of private data 222. This is achieved by modeling engine 226 training the desired implementation of machine learning algorithm 295 on the private data 222 results set. In view that the desired machine learning algorithm 295 could include a wide variety of possible algorithms, model instructions 230 can include instructions that define the condition under which training occurs…Machine learning algorithms 295 can include quite a large number of different types of algorithms including…a Bayesian classifier algorithm… naive bayes, gaussian naive bayes, multinomial naive bayes, averaged one-dependence estimators (AODE), bayesian belief network (BBN), bayesian network (BN)”), a first local belief of the parameter set of the global machine learning model; (Szeto, para. 0050 and 0124: “As proxy data 260 is generated and relayed to the global model server 130, the global model server aggregates the data and generates an updated global model. Once the global model is updated, it can be determined whether the updated global model is an improvement over the previous version of the global model. If the updated global model is an improvement (e.g., the predictive accuracy is improved), new parameters may be provided to the private data servers via the updated model instructions 230.” “Still further, as the patient experiences treatment, their data can be fed back into the trained actual, proxy and global models to ensure the trained models are updated through additional training.” Examiner notes that Szeto teaches updating a global server and local servers and data feeding back into proxy and global models for updating i.e. training) wherein the updating comprises aggregating the first local belief of the parameter set of the global machine learning model with at least the second local belief of the parameter set of the global machine learning model to learn, at the first node, an updated first local belief of the parameter set of the global machine learning model. (Szeto, para. 0045 and Fig. 2: “Each local modeling engine 126 accesses its own local private data 122 and creates local trained models according to model instructions created by the researcher. 
As each modeling engine 126 gains new learned information, the new knowledge is transmitted back to the researcher at non-private computing device 130 once transmission criteria have been met. The new knowledge can then be aggregated into a trained global model via global modeling engine 136. Examples of knowledge include (see, e.g., FIG. 2) but are not limited to proxy data 260, trained actual models 240, trained proxy models 270, proxy model parameters, model similarity scores, or other types of data that have been de-identified. In some embodiments, the global model server 130 analyzes sets of proxy related information (including for example proxy data 260, proxy data distributions 362, proxy model parameters 475, other proxy related data combined with seeds, etc.) to determine whether the proxy related information from one of private data server 124 has the same shape and/or overall properties as the proxy related data from another private data server 124, prior to combining such information.” Examiner notes that fig. 2 is an example embodiment with 3 data servers sharing information with each other and a non-private computing device which Szeto teaches may aggregate and average the private data collected from each) Szeto does not explicitly disclose, but Shahrampour teaches: learning a parameter set of a global machine learning model using a decentralized peer-to- peer network of nodes that learn an approximate posterior distribution of the parameter set of the global machine learning model using a Bayesian inference, (Shahrampour, sec. II(C): “We derived a closed-form solution for µt(θ) that essentially performs the Bayesian update; each agent aggregates information up to time t, and then, infers the posterior from prior”) the second local belief representing a Bayesian inference of the parameter set of the global machine learning model (Zhu, Fig. 1 supra), the second belief having been updated based at least on the second node training a second local machine learning model, and the second local machine learning model being trained based at least on a second training data available at the second node; and (Shahrampour, sec. II(C): “We derived a closed-form solution for µt(θ) that essentially performs the Bayesian update; each agent aggregates information up to time t, and then, infers the posterior from prior”) the third local belief representing a Bayesian inference statistically inferring the parameter set of the global machine learning model (Zhu, Fig. 1 supra), the third local belief having been updated based at least on a third node training a third local machine learning model, and the third local machine learning model being trained based at least on a third training data available at the third node; (Shahrampour, sec. II(C): “We derived a closed-form solution for µt(θ) that essentially performs the Bayesian update; each agent aggregates information up to time t, and then, infers the posterior from prior”) averaging a logarithm of the second local belief and a logarithm of the third local belief; and (Shahrampour, sec. V: “Using a randomized, gossip dual averaging, agents aggregate local log-likelihood functions, and then perform a Bayes-like update on the averaged information to collectively recover the truth.”) updating, based at least on the averaged logarithm of the second and third local beliefs, the first local belief of the parameter set of the global machine learning model, (Shahrampour, sec. 
II(C): “We derived a closed-form solution for µt(θ) that essentially performs the Bayesian update; each agent aggregates information up to time t, and then, infers the posterior from prior”) It would have obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Shahrampour into Szeto. Szeto teaches a distributed, online machine learning system using proxy data; Shahrampour teaches an optimization-based view of distributed parameter estimation and observational social learning in networks. One of ordinary skill would have motivation to combine the teachings of Shahrampour into Szeto in order to achieve exponentially fast convergence among distributed agents (Shahrampour, sec. II). Finally, Zhu teaches: the plurality of peer-to-peer network of nodes comprising a first node, a second node, and a third node, wherein the second node and the third node each have an edge going to the first node: (Zhu, Fig. 1: Examiner notes Zhu figure 1 shows a distributed peer-to-peer network consisting of 4 nodes). training, based at least on a first training data available at the first node of a plurality of nodes in the decentralized peer-to-peer network, a first local machine learning model; (Zhu, sec. 3.1: “We assume that each participant ( R i ) relies on the deep learning models (DLMs), ( M i   ) when conducting a inference. Specifically, R i   checkouts M i from its local hub and uses it to conduct the inference. During this, R i   stores the information captured by its sensors, producing the inference data ( I D i ). I D i   allow R i   to update/improve the existing M i .” Examiner notes that Mi is a local machine learning model) receiving, from the second node of a plurality of nodes in the decentralized peer-to-peer network, a second local belief of the parameter set of the global machine learning model, and receiving, from the third node of the plurality of nodes in the decentralized peer-to-peer network, a third local belief of the parameter set of the global machine learning model, (Zhu, Fig. 1: Examiner notes Zhu figure 1 shows a distributed peer-to-peer network consisting of 4 nodes i.e. a first, second, third and forth node each with their own local beliefs). It would have obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Zhu into Szeto. Szeto teaches a distributed, online machine learning system using proxy data; Zhu teaches a blockchain-based approach in a collective deep-learning scenario. One of ordinary skill would have motivation to combine the teachings of Zhu into Szeto in order to track use of personal data and increase the users’ trust in the system and provides a rich source of information that can be used to better design future services (Szeto sec. 2.2). Regarding Claim 2, Szeto, as modified, teaches claim 1 above. Zhu further teaches: sending, to the second node, the updated first local belief of the parameter set of the global machine learning model such that the second local belief of the second node is further updated based on the updated first local belief of the first node. (Zhu, sec. 3.3: “First, the source participant ( R ( s ) ) ‘advertises’ the new candidate model ( M C s ) by announcing the DLM updates to the entire network. 
Then, the destination participants ( R ( d , i ) ), where i = 1 … N denotes the target hubs, are notified by their local hub that there is an update available in the network.”) It would have obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Zhu into Szeto, as modified, as set forth above with respect to claim 1. Regarding Claim 3, Szeto, as modified, teaches claim 1 above. Shahrampour further teaches: wherein the second local belief and the third local belief are not normalized before averaging to improve computational efficiency of the learning of the parameter set of the global machine learning model using the decentralized peer-to-peer network of nodes; (Shahrampour sec. IV: “The communication structure is based on a randomized gossip scheme. Let the global Poisson clock at the beginning of the t-th slot tick for agent i (with probability 1 n ), and let agent i contact a neighboring node j (with probability Pij). Then, agents i and j average their accumulated observations from previous slots” Examiner notes Shahrampour teaches averaging without normalizing before). It would have obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Zhu into Szeto, as modified, as set forth above with respect to claim 1. Regarding Claim 5, Szeto, as modified, teaches claim 1 above. Zhu further teaches: wherein the second local belief of the second node is further updated based at least on a third local belief of a third node in the decentralized peer-to-peer network. (Zhu, sec. 3.3: “During the next stage, R ( s ) is to consolidate the feedback information from the destination hubs. This can be achieved using time-constraints (i.e., waiting for a pre-defined period of time to receive the feedback), and/or when a target consensus is achieved. If the score for the M C is higher than for the currently accepted model ( M s )… R s creates a new model M ( s + 1 ) , which is then published to all connected hubs, and committed to their participants’ local repositories…Additionally, in order to leverage the new local data, the destination hubs can also return the model updates to the source hub” Examiner notes that the Zhu teaches the exchange of model updates between nodes such that a local belief of a second node is further updated based on information received from a third node) It would have obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Zhu into Szeto, as modified, as set forth above with respect to claim 1. Regarding Claim 8, Szeto, as modified, teaches claim 1 above. Zhu further teaches: wherein the global machine learning model comprises a neural network, and (Zhu, sec. 4.3: “We used a Multi-Layer Perceptron as the supervised learning algorithm for recognising activity using accelerometer traces. A Multi-Layer Perceptron or MLP is a type of feed-forward Artificial Neural Network that consists of two layers, input and output, and one or more hidden layers between these two layers.”) wherein the parameter set includes one or more weights applied by the neural network. (Zhu, sec. 4.3 and fig. 4: “Figure 4 shows a graphical representation of a MLP with a single hidden layer. Each node in a layer is connected to all the nodes in the previous layer. 
Training this structure is equivalent to finding proper weights and bias for all the connections between consecutive layers such that a desired output is generated for a corresponding input.”) It would have obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Zhu into Szeto, as modified, as set forth above with respect to claim 1. Regarding Claim 9, Szeto, as modified, teaches claim 1 above. Zhu further teaches: The system of claim 1, wherein the global machine learning model comprises a regression model, and (Zhu, sec. 4.4: “We set up a Multilayer Perceptron with 2 layers for activity recognition, including 1 hidden layer with 128 nodes and 1 logistic regression layer, resulting in 6, 406 parameters to be determined during training.”) wherein the parameter set includes a relationship between one or more independent variables and dependent variables. (Zhu, sec. 3.1: “Specifically, a supervised machine learning approach is adopted: I D i   are used as input to M i , while the class scores O i   (e.g. human activities) as the target output. Then, the fine-tunning of the DLM parameters W i   to the newly acquired data I D i   is accomplished using the standard back-propagation technique and by selecting an optimizer. With these new parameters, the participant is expected to increase its competences and adaptability to target inference.” Examiner notes that Zhu teaches a machine learning relationship between independent variables, inputs, and dependent variables, output) It would have obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Zhu into Szeto, as modified, as set forth above with respect to claim 1. Regarding Claim 11, Szeto teaches: performing, based at least on a parameter set of the first local machine learning model trained based on the first training data, Bayesian inference to learn the approximate posterior distribution of the parameter set of the global machine learning model, (Szeto, para. 0068-0069: “Modeling engine 226 creates trained actual model 240 as a function of the results set representing at least some of private data 222. This is achieved by modeling engine 226 training the desired implementation of machine learning algorithm 295 on the private data 222 results set. In view that the desired machine learning algorithm 295 could include a wide variety of possible algorithms, model instructions 230 can include instructions that define the condition under which training occurs…Machine learning algorithms 295 can include quite a large number of different types of algorithms including…a Bayesian classifier algorithm… naive bayes, gaussian naive bayes, multinomial naive bayes, averaged one-dependence estimators (AODE), bayesian belief network (BBN), bayesian network (BN)” Examiner notes that Szeto teaches modeling an engine using any number of bayes algorithms for learning i.e. inferring, where use of a bayes algorithm necessarily involves the calculation of posterior probability and where multiple values are involved, a posterior distribution). updating, based at least on the training of the first local machine learning model and on the Bayesian inference of the parameter set of the global machine learning model (Szeto, para. 0068-0069: “Modeling engine 226 creates trained actual model 240 as a function of the results set representing at least some of private data 222. 
This is achieved by modeling engine 226 training the desired implementation of machine learning algorithm 295 on the private data 222 results set. In view that the desired machine learning algorithm 295 could include a wide variety of possible algorithms, model instructions 230 can include instructions that define the condition under which training occurs…Machine learning algorithms 295 can include quite a large number of different types of algorithms including…a Bayesian classifier algorithm… naive bayes, gaussian naive bayes, multinomial naive bayes, averaged one-dependence estimators (AODE), bayesian belief network (BBN), bayesian network (BN)”), a first local belief of the parameter set of the global machine learning model; (Szeto, para. 0050 and 0124: “As proxy data 260 is generated and relayed to the global model server 130, the global model server aggregates the data and generates an updated global model. Once the global model is updated, it can be determined whether the updated global model is an improvement over the previous version of the global model. If the updated global model is an improvement (e.g., the predictive accuracy is improved), new parameters may be provided to the private data servers via the updated model instructions 230.” “Still further, as the patient experiences treatment, their data can be fed back into the trained actual, proxy and global models to ensure the trained models are updated through additional training.” Examiner notes that Szeto teaches updating a global server and local servers and data feeding back into proxy and global models for updating i.e. training) Shahrampour teaches: learning a parameter set of a global machine learning model using a decentralized peer-to- peer network of nodes that learn an approximate posterior distribution of the parameter set of the global machine learning model using a Bayesian inference, (Shahrampour, sec. II(C): “We derived a closed-form solution for µt(θ) that essentially performs the Bayesian update; each agent aggregates information up to time t, and then, infers the posterior from prior”) the second local belief representing a Bayesian inference of the parameter set of the global machine learning model (Zhu, Fig. 1 supra), the second belief having been updated based at least on the second node training a second local machine learning model, and the second local machine learning model being trained based at least on a second training data available at the second node; and (Shahrampour, sec. II(C): “We derived a closed-form solution for µt(θ) that essentially performs the Bayesian update; each agent aggregates information up to time t, and then, infers the posterior from prior”) the third local belief representing a Bayesian inference statistically inferring the parameter set of the global machine learning model (Zhu, Fig. 1 supra), the third local belief having been updated based at least on a third node training a third local machine learning model, and the third local machine learning model being trained based at least on a third training data available at the third node; (Shahrampour, sec. II(C): “We derived a closed-form solution for µt(θ) that essentially performs the Bayesian update; each agent aggregates information up to time t, and then, infers the posterior from prior”) averaging a logarithm of the second local belief and a logarithm of the third local belief; and (Shahrampour, sec. 
V: “Using a randomized, gossip dual averaging, agents aggregate local log-likelihood functions, and then perform a Bayes-like update on the averaged information to collectively recover the truth.”) updating, based at least on the averaged logarithm of the second and third local beliefs, the first local belief of the parameter set of the global machine learning model, (Shahrampour, sec. II(C): “We derived a closed-form solution for µt(θ) that essentially performs the Bayesian update; each agent aggregates information up to time t, and then, infers the posterior from prior”) It would have obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Shahrampour into Szeto. Szeto teaches a distributed, online machine learning system using proxy data; Shahrampour teaches an optimization-based view of distributed parameter estimation and observational social learning in networks. One of ordinary skill would have motivation to combine the teachings of Shahrampour into Szeto in order to achieve exponentially fast convergence among distributed agents (Shahrampour, sec. II). Finally, Zhu teaches: the plurality of peer-to-peer network of nodes comprising a first node, a second node, and a third node, wherein the second node and the third node each have an edge going to the first node: (Zhu, Fig. 1: Examiner notes Zhu figure 1 shows a distributed peer-to-peer network consisting of 4 nodes). training, based at least on a first training data available at the first node of a plurality of nodes in the decentralized peer-to-peer network, a first local machine learning model; (Zhu, sec. 3.1: “We assume that each participant ( R i ) relies on the deep learning models (DLMs), ( M i   ) when conducting a inference. Specifically, R i   checkouts M i from its local hub and uses it to conduct the inference. During this, R i   stores the information captured by its sensors, producing the inference data ( I D i ). I D i   allow R i   to update/improve the existing M i .” Examiner notes that Mi is a local machine learning model) receiving, from the second node of a plurality of nodes in the decentralized peer-to-peer network, a second local belief of the parameter set of the global machine learning model, and receiving, from the third node of the plurality of nodes in the decentralized peer-to-peer network, a third local belief of the parameter set of the global machine learning model, (Zhu, Fig. 1: Examiner notes Zhu figure 1 shows a distributed peer-to-peer network consisting of 4 nodes i.e. a first, second, third and forth node each with their own local beliefs). It would have obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Zhu into Szeto, as modified. Szeto teaches a distributed, online machine learning system using proxy data; Zhu teaches a blockchain-based approach in a collective deep-learning scenario. One of ordinary skill would have motivation to combine the teachings of Zhu into Szeto in order to track use of personal data and increase the users’ trust in the system and provides a rich source of information that can be used to better design future services (Szeto sec. 2.2). Regarding Claim 12, Szeto, as modified, teaches claim 11 above. 
Zhu further teaches: The method of claim 11, further comprising: sending, to the second node, the updated first local belief of the parameter set of the global machine learning model such that the second local belief of the second node is further updated based on the updated first local belief of the first node. (Zhu, sec. 3.3: “First, the source participant ( R ( s ) ) ‘advertises’ the new candidate model ( M C s ) by announcing the DLM updates to the entire network. Then, the destination participants ( R ( d , i ) ), where i = 1 … N denotes the target hubs, are notified by their local hub that there is an update available in the network.”) It would have obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Zhu into Szeto, as modified, as set forth above with respect to claim 1. Regarding Claim 13, Szeto, as modified, teaches claim 11 above. Zhu further teaches: wherein the second local belief and the third local belief are not normalized before averaging to improve computational efficiency of the learning of the parameter set of the global machine learning model using the decentralized peer-to-peer network of nodes; (Shahrampour sec. IV: “The communication structure is based on a randomized gossip scheme. Let the global Poisson clock at the beginning of the t-th slot tick for agent i (with probability 1 n ), and let agent i contact a neighboring node j (with probability Pij). Then, agents i and j average their accumulated observations from previous slots” Examiner notes Shahrampour teaches averaging without normalizing before). It would have obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Zhu into Szeto, as modified, as set forth above with respect to claim 1. Regarding Claim 15, Szeto, as modified, teaches claim 11 above. Zhu further teaches: The method of claim 11, wherein the second local belief of the second node is further updated based at least on a third local belief of a third node in the decentralized peer-to-peer network. (Zhu, sec. 3.3: “During the next stage, R ( s ) is to consolidate the feedback information from the destination hubs. This can be achieved using time-constraints (i.e., waiting for a pre-defined period of time to receive the feedback), and/or when a target consensus is achieved. If the score for the M C is higher than for the currently accepted model ( M s )… R s creates a new model M ( s + 1 ) , which is then published to all connected hubs, and committed to their participants’ local repositories…Additionally, in order to leverage the new local data, the destination hubs can also return the model updates to the source hub” Examiner notes that the Zhu teaches the exchange of model updates between nodes such that a local belief of a second node is further updated based on information received from a third node) It would have obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Zhu into Szeto, as modified, as set forth above with respect to claim 1. Regarding Claim 18, Szeto, as modified, teaches claim 11 above. Zhu further teaches: The method of claim 11, wherein the global machine learning model comprises a neural network, and (Zhu, sec. 4.3: “We used a Multi-Layer Perceptron as the supervised learning algorithm for recognising activity using accelerometer traces. 
A Multi-Layer Perceptron or MLP is a type of feed-forward Artificial Neural Network that consists of two layers, input and output, and one or more hidden layers between these two layers.”) wherein the parameter set includes one or more weights applied by the neural network. (Zhu, sec. 4.3 and fig. 4: “Figure 4 shows a graphical representation of a MLP with a single hidden layer. Each node in a layer is connected to all the nodes in the previous layer. Training this structure is It would have obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Zhu into Szeto, as modified, as set forth above with respect to claim 1. Regarding Claim 19, Szeto, as modified, teaches claim 11 above. Zhu further teaches: The method of claim 11, wherein the global machine learning model comprises a regression model, (Zhu, sec. 4.4: “We set up a Multilayer Perceptron with 2 layers for activity recognition, including 1 hidden layer with 128 nodes and 1 logistic regression layer, resulting in 6, 406 parameters to be determined during training.”) and wherein the parameter set includes a relationship between one or more independent variables and dependent variables. (Zhu, sec. 3.1: “Specifically, a supervised machine learning approach is adopted: I D i   are used as input to M i , while the class scores O i   (e.g. human activities) as the target output. Then, the fine-tunning of the DLM parameters W i   to the newly acquired data I D i   is accomplished using the standard back-propagation technique and by selecting an optimizer. With these new parameters, the participant is expected to increase its competences and adaptability to target inference.” Examiner notes that Zhu teaches a machine learning relationship between independent variables, inputs, and dependent variables, output) It would have obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Zhu into Szeto, as modified, as set forth above with respect to claim 1. Regarding Claim 20, Szeto teaches: A non-transitory computer readable medium storing instructions, which when executed by at least one data processor, result in operations comprising: (Szeto, para. 0027: “One of ordinary skill in the art should appreciate that the computing devices comprise one or more processors configured to execute software instructions that are stored on a tangible, non-transitory computer readable storage medium (e.g., hard drive, FPGA, PLA, PLD, solid state drive, RAM, flash, ROM, external drive, memory stick, etc.)”) performing, based at least on a parameter set of the first local machine learning model trained based on the first training data, Bayesian inference to learn the approximate posterior distribution of the parameter set of the global machine learning model, (Szeto, para. 0068-0069: “Modeling engine 226 creates trained actual model 240 as a function of the results set representing at least some of private data 222. This is achieved by modeling engine 226 training the desired implementation of machine learning algorithm 295 on the private data 222 results set. 
In view that the desired machine learning algorithm 295 could include a wide variety of possible algorithms, model instructions 230 can include instructions that define the condition under which training occurs…Machine learning algorithms 295 can include quite a large number of different types of algorithms including…a Bayesian classifier algorithm… naive bayes, gaussian naive bayes, multinomial naive bayes, averaged one-dependence estimators (AODE), bayesian belief network (BBN), bayesian network (BN)” Examiner notes that Szeto teaches modeling an engine using any number of bayes algorithms for learning i.e. inferring, where use of a bayes algorithm necessarily involves the calculation of posterior probability and where multiple values are involved, a posterior distribution). updating, based at least on the training of the first local machine learning model and on the Bayesian inference of the parameter set of the global machine learning model (Szeto, para. 0068-0069: “Modeling engine 226 creates trained actual model 240 as a function of the results set representing at least some of private data 222. This is achieved by modeling engine 226 training the desired implementation of machine learning algorithm 295 on the private data 222 results set. In view that the desired machine learning algorithm 295 could include a wide variety of possible algorithms, model instructions 230 can include instructions that define the condition under which training occurs…Machine learning algorithms 295 can include quite a large number of different types of algorithms including…a Bayesian classifier algorithm… naive bayes, gaussian naive bayes, multinomial naive bayes, averaged one-dependence estimators (AODE), bayesian belief network (BBN), bayesian network (BN)”), a first local belief of the parameter set of the global machine learning model; (Szeto, para. 0050 and 0124: “As proxy data 260 is generated and relayed to the global model server 130, the global model server aggregates the data and generates an updated global model. Once the global model is updated, it can be determined whether the updated global model is an improvement over the previous version of the global model. If the updated global model is an improvement (e.g., the predictive accuracy is improved), new parameters may be provided to the private data servers via the updated model instructions 230.” “Still further, as the patient experiences treatment, their data can be fed back into the trained actual, proxy and global models to ensure the trained models are updated through additional training.” Examiner notes that Szeto teaches updating a global server and local servers and data feeding back into proxy and global models for updating i.e. training) Shahrampour teaches: learning a parameter set of a global machine learning model using a decentralized peer-to- peer network of nodes that learn an approximate posterior distribution of the parameter set of the global machine learning model using a Bayesian inference, (Shahrampour, sec. II(C): “We derived a closed-form solution for µt(θ) that essentially performs the Bayesian update; each agent aggregates information up to time t, and then, infers the posterior from prior”) the second local belief representing a Bayesian inference of the parameter set of the global machine learning model (Zhu, Fig. 
1 supra), the second belief having been updated based at least on the second node training a second local machine learning model, and the second local machine learning model being trained based at least on a second training data available at the second node; and (Shahrampour, sec. II(C): “We derived a closed-form solution for µt(θ) that essentially performs the Bayesian update; each agent aggregates information up to time t, and then, infers the posterior from prior”) the third local belief representing a Bayesian inference statistically inferring the parameter set of the global machine learning model (Zhu, Fig. 1 supra), the third local belief having been updated based at least on the a second third node training a second third local machine learning model, and the third local machine learning model being trained based at least on a second third training data available at the third node; (Shahrampour, sec. II(C): “We derived a closed-form solution for µt(θ) that essentially performs the Bayesian update; each agent aggregates information up to time t, and then, infers the posterior from prior”; examiner notes for examination purposes only “a second third node” is interpreted as a fourth node as taught by Zhu). averaging a logarithm of the second local belief and a logarithm of the third local belief; and (Shahrampour, sec. V: “Using a randomized, gossip dual averaging, agents aggregate local log-likelihood functions, and then perform a Bayes-like update on the averaged information to collectively recover the truth.”) updating, based at least on the averaged logarithm of the second and third local beliefs, the first local belief of the parameter set of the global machine learning model, (Shahrampour, sec. II(C): “We derived a closed-form solution for µt(θ) that essentially performs the Bayesian update; each agent aggregates information up to time t, and then, infers the posterior from prior”) It would have obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Shahrampour into Szeto. Szeto teaches a distributed, online machine learning system using proxy data; Shahrampour teaches an optimization-based view of distributed parameter estimation and observational social learning in networks. One of ordinary skill would have motivation to combine the teachings of Shahrampour into Szeto in order to achieve exponentially fast convergence among distributed agents (Shahrampour, sec. II). Finally, Zhu teaches: the plurality of peer-to-peer network of nodes comprising a first node, a second node, and a third node, wherein the second node and the third node each have an edge going to the first node: (Zhu, Fig. 1: Examiner notes Zhu figure 1 shows a distributed peer-to-peer network consisting of 4 nodes). training, based at least on a first training data available at the first node of a plurality of nodes in the decentralized peer-to-peer network, a first local machine learning model; (Zhu, sec. 3.1: “We assume that each participant ( R i ) relies on the deep learning models (DLMs), ( M i   ) when conducting a inference. Specifically, R i   checkouts M i from its local hub and uses it to conduct the inference. During this, R i   stores the information captured by its sensors, producing the inference data ( I D i ). 
I D i   allow R i   to update/improve the existing M i .” Examiner notes that Mi is a local machine learning model) receiving, from the second node of a plurality of nodes in the decentralized peer-to-peer network, a second local belief of the parameter set of the global machine learning model, and receiving, from the third node of the plurality of nodes in the decentralized peer-to-peer network, a third local belief of the parameter set of the global machine learning model, (Zhu, Fig. 1: Examiner notes Zhu figure 1 shows a distributed peer-to-peer network consisting of 4 nodes i.e. a first, second, third and forth node each with their own local beliefs). It would have obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Zhu into Szeto. Szeto teaches a distributed, online machine learning system using proxy data; Zhu teaches a blockchain-based approach in a collective deep-learning scenario. One of ordinary skill would have motivation to combine the teachings of Zhu into Szeto in order to track use of personal data and increase the users’ trust in the system and provides a rich source of information that can be used to better design future services (Szeto sec. 2.2). Claim 21 is rejected under 35 U.S.C. 103 as being unpatentable over Szeto, et. al. (US 2018/0018590 A1; hereinafter “Szeto”) in view of Shahin Shahrampour and Ali Jadbabaie (arXiv:1309.2350v1 [cs.LG] 10 Sep 2013; hereinafter, “Shahrampour”), in view of Zhu, et. al. (“Blockchain-Based Privacy Preserving Deep Learning.” In: Guo, F., Huang, X., Yung, M. (eds) Information Security and Cryptology. Inscrypt 2018. Lecture Notes in Computer Science(), vol 11449; hereinafter “Zhu”) and further in view of in view of Lalitha, Anusha and Tara Javidi (“On the rate of learning in distributed hypothesis testing," 2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton), Monticello, IL, USA, 2015, pp. 1-8; hereinafter, “Lalitha”), Regarding Claim 21, Zhu, as modified, teaches claim 1 above. Zhu further teaches: The system of claim 1, wherein an interaction of the plurality of nodes is characterized by an aperiodic, irreducible matrix. (Lalitha, sec. III(B): “Along with the weights, the network can be thought of as a weighted strongly connected network. Hence, from Assumption 3, we have that weight matrix W is irreducible and aperiodic.”) It would have obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Lalitha into Szeto, as modified. Szeto teaches a distributed, online machine learning system using proxy data; Lalitha teaches a Bayesian of a parameter based on local observation, communication of updates to neighbors, and a “non-Bayesian” linear consensus using the log-beliefs of neighbors. One of ordinary skill would have motivation to combine the teachings of Lalitha into Szeto, as modified, in order to obtain stronger results for probability of concentration of learning rate (Lalitha, introduction). Response to Applicant Arguments and Remarks 35 U.S.C §103 Applicant argues that Zhu, as previously cited, does not teach the limitations as currently amended. However, the rejections have been updated in light of applicant’s amendments. 
To applicant’s argument beginning at the bottom of page 13 of applicant’s remarks that Zhu does not teach decentralized learning among nodes to learn an approximate posterior distribution: Such claims are taught by Shahrampour, as referenced above. Likewise, to applicant’s argument at the bottom of page 14 of applicant’s remarks regarding averaging logarithms, such claims are similarly taught by Shahrampour. At the bottom of page 15, applicant remarks that independent claims 11 and 20 recite similar features to independent claim 1 and that all such arguments supporting claim 1 are equally applicable to claims 11 and 20. However, as the arguments are not persuasive with respect to claim 1, they are similarly not persuasive as to claims 11 and 20. Applicant makes no independent argument regarding other dependent claims. Therefore, such claims remain rejected at least as a result of their dependency on rejected independent claims but also for the reasons set forth in the rejection above. Conclusion THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. /STL/Examiner, Art Unit 2147 /VIKER A LAMARDO/Supervisory Patent Examiner, Art Unit 2147 Any inquiry concerning this communication or earlier communications from the examiner should be directed to Sally T. Ley whose telephone number is (571)272-3406. The examiner can normally be reached Monday - Thursday, 10:00am - 6:00pm ET. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Viker Lamardo can be reached at (571) 270-5871. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /STL/Examiner, Art Unit 2147
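
As context for the claim mapping above: the independent claims, and the Shahrampour passages the examiner quotes, describe a decentralized, Bayes-like update in which a node averages the logarithms of the beliefs received from its neighbors and then folds in the likelihood of its own locally available training data. The following is a minimal sketch of that style of update over a discrete parameter grid; the Gaussian likelihood, the grid, and the two-neighbor topology are illustrative assumptions, not the applicant's or the cited references' implementation.

```python
import numpy as np

# Minimal sketch of a log-belief-averaging, Bayes-like update of the kind described
# in the claims and in the quoted Shahrampour passages. The discrete parameter grid,
# Gaussian likelihood, and two-neighbor topology are illustrative assumptions only.

rng = np.random.default_rng(0)
theta_grid = np.linspace(-2.0, 2.0, 41)   # candidate values of the global model parameter
true_theta = 0.5                          # unknown value the nodes are trying to learn

def uniform_belief() -> np.ndarray:
    belief = np.ones_like(theta_grid)
    return belief / belief.sum()

def local_likelihood(observation: float, noise_std: float = 1.0) -> np.ndarray:
    """Likelihood of one local training datum under each candidate parameter value."""
    return np.exp(-0.5 * ((observation - theta_grid) / noise_std) ** 2)

def update_belief(neighbor_beliefs, observation: float) -> np.ndarray:
    """Average the logarithms of the neighbors' beliefs, apply a Bayes-like update
    with the node's own local likelihood, and renormalize."""
    log_avg = np.mean([np.log(b) for b in neighbor_beliefs], axis=0)
    posterior = np.exp(log_avg) * local_likelihood(observation)
    return posterior / posterior.sum()

# One round: the second and third nodes send their current beliefs to the first node
# (edges into the first node), which updates using a training datum available locally.
belief_2, belief_3 = uniform_belief(), uniform_belief()
local_datum = true_theta + rng.normal(scale=1.0)
belief_1 = update_belief([belief_2, belief_3], local_datum)
print("first node's MAP estimate:", theta_grid[np.argmax(belief_1)])
```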

Prosecution Timeline

Jul 10, 2020: Application Filed
Jan 08, 2024: Non-Final Rejection — §103
Feb 22, 2024: Examiner Interview Summary
Feb 22, 2024: Applicant Interview (Telephonic)
Apr 05, 2024: Response Filed
May 14, 2024: Non-Final Rejection — §103
Sep 23, 2024: Response Filed
Dec 12, 2024: Final Rejection — §103
Mar 13, 2025: Applicant Interview (Telephonic)
Mar 13, 2025: Examiner Interview Summary
Mar 17, 2025: Request for Continued Examination
Mar 24, 2025: Response after Non-Final Action
May 28, 2025: Non-Final Rejection — §103
Oct 02, 2025: Response Filed
Nov 01, 2025: Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12443830: COMPRESSED WEIGHT DISTRIBUTION IN NETWORKS OF NEURAL PROCESSORS (2y 5m to grant; granted Oct 14, 2025)
Patent 12135927: EXPERT-IN-THE-LOOP AI FOR MATERIALS DISCOVERY (2y 5m to grant; granted Nov 05, 2024)
Patent 11880776: GRAPH NEURAL NETWORK (GNN)-BASED PREDICTION SYSTEM FOR TOTAL ORGANIC CARBON (TOC) IN SHALE (2y 5m to grant; granted Jan 23, 2024)
Study what changed to get past this examiner. Based on 3 most recent grants.

AI Strategy Recommendation

Get an AI-powered prosecution strategy using examiner precedents, rejection analysis, and claim mapping.

Prosecution Projections

Expected OA Rounds: 6-7
Grant Probability: 15%
With Interview: 44% (+28.8%)
Median Time to Grant: 3y 10m
PTA Risk: High
Based on 33 resolved cases by this examiner. Grant probability derived from career allow rate.
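
As a rough check on the figures above, the with-interview projection appears to be the career allow rate plus the interview lift. The additive model below is an assumption about how the dashboard derives that number, not a documented formula.

```python
# Assumed additive model for the "With Interview" projection; the product's actual
# model may weight or cap this differently.
career_allow_rate = 0.15   # base grant probability (career allow rate)
interview_lift = 0.288     # lift observed for resolved cases with an interview
with_interview = min(career_allow_rate + interview_lift, 1.0)
print(f"Projected grant probability with interview: {with_interview:.0%}")  # -> 44%
```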
