Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 1 December 2025 has been entered.
Claim Rejections - 35 USC § 103
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Claim(s) 1, 7, 10-11, 17, and 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Luo et al. (NPL: “HFEL: Joint Edge Association and Resource Allocation for Cost-Efficient Hierarchical Federated Edge Learning”, published 26 June 2020, hereinafter “Luo”) in view of Roy et al. (NPL: “BrainTorrent: A Peer-to-Peer Environment for Decentralized Federated Learning”, published May 2019, hereinafter “Roy”).
Regarding Claim 1, Luo teaches a system for decentralized federated learning, comprising:
multiple agents coupled to a communication network, each agent comprising a data collector collecting raw data; a memory storing the raw data and a local machine learning model; and a processor training the local machine learning model on the collected raw data to update the local machine learning model (Luo, Section 2 Paragraph 1 – “In the HFEL framework, we assume a set of mobile devices N={n:n=1,…,N} , a set of edge servers K={i:i=1,…,K} and a cloud server S . Let Ni⊆N represent the set of available mobile devices communicated with edge server i . In addition, each device n owns a local data set Dn={(xj,yj)}|Dn|j=1 where xj denotes the j -th input sample and yj is the corresponding labeled output of xj for n ’s federated learning task.” & Section 2A Paragraph 3 - “Step 1. Local Model Computation: At this step for a device n , it needs to solve the machine learning model parameter ω which characterizes each output value yj with loss function fn(xj,yj,ω) . The loss function on the data set of device n is defined as” – teaches multiple agents (devices) coupled to a communication network, each agent comprising a data collector collecting raw data (each device owns a local data set Dn), a memory storing the raw data and a local machine learning model (for a device n, it needs to solve machine learning model parameter), and a processor training the local machine learning model on the collected raw data to update the local machine learning model (parameter characterizes each output with a loss function)), and
multiple cluster aggregators coupled to the communication network, each cluster aggregator being uniquely associated with a respective subset of the agents (Luo, Section 2 Paragraph 1 – “In the HFEL framework, we assume a set of mobile devices N={n:n=1,…,N} , a set of edge servers K={i:i=1,…,K} and a cloud server S . Let Ni⊆N represent the set of available mobile devices communicated with edge server i” – teaches multiple cluster aggregators (edge servers) coupled to the communication network, each cluster aggregator uniquely associated with a respective subset of the agents (let Ni represent the set of available mobile devices communicated with edge server i)),
each cluster aggregator comprising a model collector collecting the local machine learning models from the associated agents; a memory storing the collected local machine learning models; and a processor creating a respective cluster machine learning model, which is a first-level aggregated model, from the collected local machine learning models from its associated agents (Luo, Section 2A Paragraphs 6 — “Step 2. Local model transmission: After finishing L(θ) local iterations, each device n will transmit its local model parameters ωtn to a selected edge server i , which incurs wireless transmission delay and energy. Then for an edge server i , we characterize the set of devices who choose to transmit their model parameters to i as Si⊆Ni” & Paragraph 9 - “Step 3. Edge model aggregation: At this step, each edge server i receives the updated model parameters from its connected devices Si and then averages them as” – teaches each aggregator comprising a model collector collecting the local machine learning models from the associated agents (each device transmits model parameters to a selected edge server), a memory storing the collected local machine learning models (server receives updated model parameters from devices), and a processor creating a respective cluster machine learning model, which is a first-level aggregated model, from the collected local machine learning models from its associated agents (aggregates collected model parameters from connected devices)),
wherein each of the multiple cluster aggregators is configured to act as a peer in a distributed network of aggregators and to communicate directly with at least a subset of other cluster aggregators within a group of cluster aggregators (Luo, Section 4, Paragraph 11 – “Based on the wireless communication between devices and edge servers, each device reports all its detailed information (including computing and communication parameters) to its available edge servers. Then each edge server i will calculate its own utility v(Si) , communicate with the other edge servers through cellular links and manage the edge association adjustments.” – teaches each of the multiple cluster aggregators configured to act as a peer in a distributed network of aggregators (each edge servers communicates with other edge servers) and communicate directly with at least a subset of other cluster aggregators (each edge server communicates with other edge servers through cellular links and manage edge association adjustments, which includes device transferring and device exchange (Luo, Definitions 4 and 5))),
each of the cluster aggregators is configured to send its synthesized semi-global machine learning model to its associated agents, and each of the agents is configured to update its local machine learning model with the semi-global machine learning model received from the associated cluster aggregator (Luo, Section 2A, Paragraph 18 — “pushing the global parameter w to all the devices (agents) via the edge servers (aggregators)”, and in addition to the previously cited passage, Luo further teaches in Section 2A, Paragraph 10 – “After that, edge server i broadcasts ωi to its devices in Si for the next round of local model computation (i.e. step 1)” – teaches each of the cluster aggregators (edge servers) configured to send the synthesized semi-global machine learning model (edge server broadcasts aggregated parameter w to its devices) to its associated agents, and each of the agents is configured to update its local machine learning model with the semi-global machine learning model received from the associated cluster aggregator (devices use aggregated model parameter w in the next round of local model computation, thus the agents are updated with the semi-global machine learning model received from the associated cluster aggregator)).
Luo fails to explicitly teach wherein each of the multiple cluster aggregators is configured to communicate directly with at least a subset of other cluster aggregators within a group of cluster aggregators to exchange their respective cluster machine learning models therebetween, each said cluster aggregator further configured to synthesize a semi-global machine learning model, which is a second-level aggregated model, from the cluster machine learning models received from said at least subset of other aggregators.
However, in the same field of endeavor of federated learning, Roy teaches:
wherein each of the multiple cluster aggregators is configured to act as a peer in a distributed network of aggregators and to communicate directly with at least a subset of other cluster aggregators within a group of cluster aggregators to exchange their respective cluster machine learning models therebetween, each said cluster aggregator further configured to synthesize a semi-global machine learning model, which is a second-level aggregated model, from the cluster machine learning models received from said at least subset of other aggregators (Roy, Fig. 1 and Section 2.2 Paragraph 1 – “Firstly, all clients {Ci} N i=1 in this environment are connected directly in a peer-to-peer fashion, as indicated in Fig. 1 (b). Unlike FLS, along with the model, each client maintains a vector v ∈ N N containing its own version and the last versions of models it used during merging. At the start, every entry is initialized to zero. Every time a fine-tuning step occurs, it increments its own version number. The training process at any step is conducted with the following steps:” and in Section 2.2 Steps 3 and 4 – “3. All clients Cj with updates, i.e., v j old < vj new, send their weights Wj and the training sample size aj to Ci . 4. This subset of models is merged with Ci ’s current model to a single model by weighted averaging. Then return to Step 1” – teaches aggregators (clients) configured to act as peers in a distributed network and configured to communicate with at least a subset of other aggregators within a group of aggregators to exchange their respective machine learning models therebetween (clients connected in peer-to-peer fashion, and in Step 3 all clients with updates send their weights to the designated client or aggregator), each said aggregator configured to synthesize a semi-global machine learning model, which is a second-level aggregated model, from the machine learning models received from said at least subset of other aggregators (Step 4, subset of models merged or aggregated with Ci’s current model into a single model by weighted averaging)).
Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate the model communication of Roy into the edge servers containing aggregated models of Luo in order to exchange and aggregate the models between the edge servers to synthesize a semi-global model. Doing so would enable mobile devices to first upload local models to proximate edge servers for partial model aggregation, which can offer a faster response rate and relieve core network congestion (Luo, Section 6), and would enable centers to collaborate and benefit from each other without sharing data among them to reach an accuracy similar to a model trained by pooling the data from all the clients (Roy, Introduction).
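The two-level aggregation arrived at in the Luo–Roy combination can be illustrated with a minimal sketch (illustrative Python only; the `weighted_average` helper, the toy parameter vectors, and the sample-count weights are hypothetical choices, not drawn from Luo or Roy):

```python
def weighted_average(models, weights):
    # FedAvg-style aggregation: each model is a flat list of parameters,
    # weighted by the number of training samples behind it.
    total = sum(weights)
    return [sum(w * m[k] for m, w in zip(models, weights)) / total
            for k in range(len(models[0]))]

# Level 1 (Luo, Step 3): each cluster aggregator (edge server) averages the
# local models received from its associated agents.
cluster_a = weighted_average([[1.0, 2.0], [3.0, 4.0]], [10, 10])  # [2.0, 3.0]
cluster_b = weighted_average([[5.0, 6.0]], [20])                  # [5.0, 6.0]

# Level 2 (Roy, Steps 3-4): aggregators exchange cluster models peer-to-peer,
# and each synthesizes a semi-global model from what it receives.
semi_global = weighted_average([cluster_a, cluster_b], [20, 20])  # [3.5, 4.5]

# The semi-global model is then broadcast back to the agents (Luo, Step 1).
```

Level 1 corresponds to Luo's edge model aggregation; level 2 corresponds to Roy's peer-to-peer weighted merging.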
Claim 11 is similar to Claim 1, hence rejected similarly.
Regarding Claim 7, the combination of Luo and Roy teach the system for decentralized federated learning according to Claim 1, and Luo teaches creating a global machine learning model that aggregates learnings from agents associated with said different distinct groups of cluster aggregators (Luo, Section 2A, Paragraph 15 — “Step 1. Edge model uploading: Let ri denote the edge server i ’s transmission rate to the remote cloud for edge model uploading, pi the transmission power per sec and di the edge server i ’s model parameter size” – teaches wherein the cluster aggregators upload their models to create a global machine learning model that aggregates learnings from agents associated with said different cluster aggregators, and in addition to the previously cited passages, Luo further teaches in Section 2A Paragraph 16 – “Cloud model aggregation: At this final step, the remote cloud receives the updated models from all the edge servers and aggregates them as” – further teaching the aggregation of the cluster aggregators (edge servers) to create a global model that aggregates learnings from agents associated with the cluster aggregators).
Luo fails to explicitly teach wherein the multiple cluster aggregators form a plurality of distinct groups of cluster aggregators, and wherein cluster aggregators belonging to different said groups, communicate with each other periodically to exchange their respective synthesized semi-global machine learning models.
However, in the same field of federated learning, Roy teaches:
wherein the multiple cluster aggregators form a plurality of distinct groups of cluster aggregators, and wherein cluster aggregators belonging to different said groups, communicate with each other periodically to exchange their respective synthesized semi-global machine learning models (Roy, Fig. 1 and Section 2.2 Paragraph 1 – “Firstly, all clients {Ci} N i=1 in this environment are connected directly in a peer-to-peer fashion, as indicated in Fig. 1 (b). Unlike FLS, along with the model, each client maintains a vector v ∈ N N containing its own version and the last versions of models it used during merging. At the start, every entry is initialized to zero. Every time a fine-tuning step occurs, it increments its own version number. The training process at any step is conducted with the following steps:” and in Section 2.2 Steps 3 and 4 – “3. All clients Cj with updates, i.e., v j old < vj new, send their weights Wj and the training sample size aj to Ci . 4. This subset of models is merged with Ci ’s current model to a single model by weighted averaging. Then return to Step 1” – teaches aggregators forming a plurality of distinct groups to communicate and exchange their respective synthesized semi-global machine learning models periodically (Step 3 shows aggregators forming groups based on updates, step 4 shows aggregation of models within group)).
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the aggregators exchanging models of Roy to the cluster aggregators forming a global model of Luo in order to create a global model based on the semi-global models synthesized by exchange. Doing so would enable mobile devices to first upload local models to proximate edge servers for partial model aggregation which can offer faster response rate and relieve core network congestion (Luo, Section 6) and enable centers to collaborate and benefit from each other without sharing data among them (Roy, Introduction).
Claim 17 is similar to Claim 7, hence rejected similarly.
Regarding Claim 10, the combination of Luo and Roy teach the system for decentralized federated learning according to Claim 1, wherein the agents retain the raw data and send only the trained local machine learning model, which comprises model parameters or model parameter updates but not the collected raw data itself, to the cluster aggregators (Luo, Section 2A Paragraph 6 — “Step 2. Local model transmission: After finishing L(θ) local iterations, each device n will transmit its local model parameters ωtn to a selected edge server i , which incurs wireless transmission delay and energy. Then for an edge server i , we characterize the set of devices who choose to transmit their model parameters to i as Si⊆Ni”, and in addition to the previously cited passages, Luo further teaches in Section 2A, Paragraph 11 – “We can observe that each edge server i won’t access the local data Dn of each device n , thus preserving personal data privacy” – teaches wherein the agents retain the raw data and send only the trained local machine learning model, which comprises model parameters or model parameter updates but not the collected raw data itself, to the cluster aggregators (devices transmit local model parameters to a selected edge server, and it is observed that each edge server will not access the local data of each device)).
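The parameters-only transmission mapped above can be sketched minimally (illustrative Python only; the `make_update` helper and its field names are hypothetical, not drawn from Luo):

```python
# Each agent retains its raw data locally and uploads only trained model
# parameters plus a sample count, mirroring Luo's Step 2 (local model
# transmission): the payload deliberately contains no raw-data field.
def make_update(local_params, num_samples):
    return {"params": list(local_params), "n": num_samples}

raw_data = [(0.1, 1), (0.2, 0)]           # stays in the agent's memory
update = make_update([0.5, -0.3], len(raw_data))
print(update)                             # {'params': [0.5, -0.3], 'n': 2}
```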
Claim 20 is similar to Claim 10, hence rejected similarly.
Claim(s) 2-3 and 12-13 is/are rejected under 35 U.S.C. 103 as being unpatentable over Luo and Roy as applied to claims 1 and 11 above, and further in view of Qu et al. (NPL: “Decentralized Privacy Using Blockchain-Enabled Federated Learning in Fog Computing”, published March 2020, hereinafter “Qu”).
Regarding Claim 2, the combination of Luo and Roy teach the system for decentralized federated learning according to Claim 1, further comprising a distributed database operatively coupled to the multiple cluster aggregators and configured to store the local machine learning models and the respective cluster machine learning models created by the cluster aggregators (Luo, Algorithm 3, Paragraph 1 – “Specially, a historical group set hi is maintained for each edge server i to record the group composition it has formed before with the corresponding utility value so that repeated calculations can be avoided” – teaches a database operatively coupled to the multiple cluster aggregators and configured to store the local machine learning models and the respective cluster machine learning models created by the cluster aggregators (a historical group set is maintained for each edge server, recording group composition, thus recording the local machine learning models and the cluster machine learning models (groups))).
Luo fails to explicitly teach a distributed database operatively coupled to the multiple cluster aggregators and configured to store the semi-global machine learning model synthesized by said cluster aggregators.
However, analogous to the field of the claimed invention, Roy teaches:
store the semi-global machine learning model synthesized by said cluster aggregators (Roy, Section 2.2 Paragraph 1 – “Unlike FLS, along with the model, each client maintains a vector v ∈ N N containing its own version and the last versions of models it used during merging” – teaches a database operatively coupled to the multiple aggregators and configured to store the semi-global machine learning model synthesized by said aggregators (each client stores its own version and the last versions of models it used during merging)).
Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate the storing of the semi-global machine learning model synthesized by said aggregators of Roy into the storing of the local and cluster machine learning models of Luo in order to store all of the machine learning models of the system. Doing so would allow repeated calculations to be avoided (Luo, Algorithm 3) and would allow clients to determine which other clients have a newer model when determining an update (Roy, Fig. 1).
The combination of Luo and Roy fails to explicitly teach wherein the distributed database identifies each of the local machine learning models, the respective cluster machine learning models, and the semi-global machine learning models by a globally unique hash value to provide accountability and traceability of model provenance across the decentralized federated learning system.
However, in the field of federated learning, Qu teaches:
wherein the distributed database identifies each of the local machine learning models, the respective cluster machine learning models and the semi-global machine learning models by a globally unique hash value to provide accountability and traceability of model provenance across the decentralized federated learning system (Qu, Section 3B Paragraph 1 – “Each block in the distributed ledger contains body and header sectors. In the classic blockchain structure, the body sector usually contains a set of verified transactions. In FL-Block, the body stores the local model updates of the devices in V, i.e., (ω(l) i ,{fk(ω(l) )}sk∈Si) for the device Vi at the lth epoch, as well as its local computation time T(l) {local,i}. Extended from the classic blockchain structure, the header sector stores the information of a pointer to the previous block, block generation rate λ, and the output value of the consensus algorithm, namely, the nonce” – teaches a distributed database (distributed ledger) that stores machine learning models, and in Section 4A Paragraph 3 – “In Protocol 1, we illustrate an instance of a sole end device identity owner (uo) and two guest end devices, namely, identity recipients (ug). The identity contains generating key pairs for the sole owner and two guests. In addition, a symmetric key is required to both encrypt and decrypt the data. In this way, these data are restricted to the other end devices in fog computing” and Section 4C Paragraphs 2-3 – “There are only pointers information encrypted with a hash function inside a public ledger. Even if we consider the case that an adversary controls one or some of the nodes in the DHT network, the adversary cannot learn anything about the data. The rationale behind this is that the data are encrypted with keys that no other nodes have access” – teaches wherein the distributed database (distributed ledger) identifies stored models by a globally unique hash value (pointers encrypted with hash functions, and a symmetric key is used to encrypt and decrypt the model data) to provide accountability and traceability of model provenance).
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the distributed database (ledger) identifying stored models by a globally unique hash value of Qu with the local, cluster, and semi-global machine learning models of Luo and Roy in order to store and identify all the models in a distributed database. Doing so would enable the advantages of blockchain such as decentralization, nontampering, open autonomy, and anonymous traceability, and would enable decentralized privacy protection while preventing single-point failure (Qu, Introduction).
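Hash-based model identification of the kind combined here can be sketched minimally (illustrative Python using SHA-256; Qu's FL-Block uses blockchain pointers and a DHT rather than this exact scheme, and the `model_id` helper and metadata fields are hypothetical):

```python
import hashlib
import json

def model_id(params, meta):
    # Derive a globally unique identifier by hashing the model parameters
    # together with metadata identifying the model's level and round, so
    # every stored model is individually traceable.
    blob = json.dumps({"params": params, "meta": meta}, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()

local_id = model_id([0.5, -0.3], {"level": "local", "round": 3})
cluster_id = model_id([0.5, -0.3], {"level": "cluster", "round": 3})

# Identical parameters at different levels still yield distinct ids.
print(local_id != cluster_id, len(local_id))  # True 64
```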
Claim 12 is similar to Claim 2, hence rejected similarly.
Regarding Claim 3, the combination of Luo, Roy, and Qu teach the system for decentralized federated learning according to Claim 2, wherein each of the local machine learning models, the respective cluster machine learning models, and the semi-global machine learning models on the distributed database has meta information, the meta information including at least one of a model generation time, a size of training samples, or a task type, so that the models are searchable by specifying queries based on said meta information (Qu, Section 3B Paragraph 1 – “Each block in the distributed ledger contains body and header sectors. In the classic blockchain structure, the body sector usually contains a set of verified transactions. In FL-Block, the body stores the local model updates of the devices in V, i.e., (ω(l) i ,{fk(ω(l) )}sk∈Si) for the device Vi at the lth epoch, as well as its local computation time T(l) {local,i}. Extended from the classic blockchain structure, the header sector stores the information of a pointer to the previous block, block generation rate λ, and the output value of the consensus algorithm, namely, the nonce” – teaches wherein each of the models on the distributed database (distributed ledger) has meta information, the meta information including at least one of a model generation time (block generation rate, local computation time)).
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the distributed database (ledger) identifying stored models with meta information of Qu with the local, cluster, and semi-global machine learning models of Luo and Roy in order to store all the models in a distributed database with meta information. Doing so would enable the advantages of decentralization, nontampering, open autonomy, and anonymous traceability, and would enable decentralized privacy protection while preventing single-point failure (Qu, Introduction).
Claim 13 is similar to Claim 3, hence rejected similarly.
Claim(s) 4 and 14 is/are rejected under 35 U.S.C. 103 as being unpatentable over Luo and Roy as applied to claims 1 and 11 above, and further in view of Choudhary et al. (US Patent No. 11,593,634, effective filing date of June 2019, hereinafter “Choudhary”).
Regarding Claim 4, the combination of Luo and Roy teach the system for decentralized federated learning according to Claim 1, wherein performances of the local machine learning models (Luo, Section 2A, Paragraph 4 — “To achieve a local accuracy θ∈(0,1) which is common to all the devices for a same model, device n needs to run a number of local iterations formulated as L(θ)=μlog(1/θ) for a wide range of iterative algorithms. Constant μ depends on the data size and the machine learning task. At t -th local iteration, each device n ’s task is to figure out its local update as” – teaches evaluating the performance of the local machine learning models) and the respective cluster machine learning models (Luo, Section 2A, Paragraph 11 — “In other words, step 1 to step 3 of edge aggregation will iterate until edge server i reaches an edge accuracy ε which is the same for all the edge servers. We can observe that each edge server i won’t access the local data Dn of each device n , thus preserving personal data privacy. In order to achieve the required model accuracy, for a general convex machine learning task, the number of edge iterations is shown to be” – teaches evaluating cluster machine learning model accuracy (edge accuracy ε)), said performances including at least model accuracy and convergence metrics (Luo, Section 2A, Paragraph 11 – “In other words, step 1 to step 3 of edge aggregation will iterate until edge server i reaches an edge accuracy ε which is the same for all the edge servers. We can observe that each edge server i won’t access the local data Dn of each device n , thus preserving personal data privacy. In order to achieve the required model accuracy, for a general convex machine learning task, the number of edge iterations is shown to be Eq. (9)… Note that our analysis framework can also be applied when the relation between the convergence iterations and model accuracy is known in non-convex learning tasks” – teaches performance evaluations including at least model accuracy (edge accuracy) and convergence metrics (Eq. 9 shows the number of iterations required to reach the required accuracy, and thus Eq. 9 describes metrics for convergence)), are constantly evaluated (Luo, Section 4, Paragraph 13 — “An edge association strategy DS∗ is at a stable system point if no edge server i will change S∗i∈DS∗ to obtain lower global training overhead with S∗−i∈DS∗ unchanged” – teaches constantly evaluating the system until the performance reaches a stable point, and in addition to the previously cited passages, Luo further teaches in Section 4 Paragraph 7 – “Next, we can solve the overhead minimization problem by constantly adjusting edge association strategy DS , i.e., each edge server’s training group formation, to gain lower overhead in accordance with preference order ⊳ . The edge association adjusting will result in termination with a stable DS∗ where no edge server i in the system will deviate its local training group from S∗i∈DS∗” – further teaches constantly evaluating system performance).
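The convergence relation quoted above from Luo, L(θ) = μ log(1/θ), can be checked numerically (illustrative Python; μ = 1.0 is an arbitrary hypothetical constant, since Luo leaves μ dependent on the data size and the learning task):

```python
import math

def local_iterations(theta, mu=1.0):
    # L(theta) = mu * log(1/theta): the number of local iterations a device
    # needs to reach local accuracy theta (Luo, Section 2A, Paragraph 4).
    return mu * math.log(1.0 / theta)

# A tighter (smaller) local accuracy target requires more local iterations,
# which is the sense in which L(theta) serves as a convergence metric.
for theta in (0.5, 0.1, 0.01):
    print(round(local_iterations(theta), 3))   # 0.693, 2.303, 4.605
```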
Luo fails to explicitly teach the synthesized semi-global machine learning models.
However, analogous to the field of the claimed invention, Roy teaches:
the synthesized semi-global machine learning models (Roy, Section 2.2 Step 4 – “This subset of models is merged with Ci ’s current model to a single model by weighted averaging. Then return to Step 1” – teaches the synthesized semi-global machine learning models (subset of models is merged with Ci’s current model into a single model by weighted averaging))
Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate the synthesized semi-global machine learning models of Roy into the performance evaluations and systems of Luo in order to constantly evaluate the performance of every model in the system. Doing so would enable more robust training of clients through highly dynamic updates, reaching performance similar to a model trained on data pooled across clients (Roy, Conclusion).
The combination of Luo and Roy fails to explicitly teach wherein performances are visualized on a graphical user interface of the agents to allow users to monitor the learning progress and the effectiveness of the received semi-global machine learning models.
However, analogous to the field of the claimed invention, Choudhary teaches:
visualized on a graphical user interface of the agents to allow users to monitor the learning progress and the effectiveness of the received semi-global machine learning models (Choudhary, Page 30 Col. 28 Line 59 – Page 31 Col. 29 Line 2 – “Turning now to FIG. 7, this figure illustrates a graphical user interface 700 of a client device presenting various performance parameters for the client device executing a local machine learning model in accordance with one or more embodiments. The graphical user interface 700 depicts a screenshot of actual performance parameters. As shown in FIG. 7, the client device corresponding to the graphical user interface 700 is a resource-constrained device, such as a smartphone. As indicated by the graphical user interface 700, the client device is executing a native application during a training iteration of a local machine learning model.” – teaches a graphical user interface of the agents visualizing performance to allow users to monitor the learning progress and the effectiveness of the received semi-global machine learning models. In addition to the previously cited passages, Choudhary further teaches in Page 31 Col. 29 Lines 18-27 – “FIGS. 8A and 8B respectively illustrate graphical user interfaces 800a and 800b of a spam-email-detector application showing the accuracy of a machine learning model in multiple training iterations of classifying emails in accordance with one or more embodiments.” – teaches a graphical user interface of the agents allowing users to monitor the learning progress and effectiveness of received models (accuracy of model in multiple training iterations of classification)).
Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate the graphical user interface of the agents visualizing performance of the received models of Choudhary into the local, cluster, and semi-global machine learning models and the performance evaluations including at least model accuracy and convergence metrics of Luo and Roy in order to visualize performance, including model accuracy and convergence, on an interface of the agents. Doing so would allow researchers to monitor the learning and accuracy of the models (Choudhary, Page 31 Col. 29).
Claim 14 is similar to Claim 4, hence rejected similarly.
Claim(s) 5 and 15 is/are rejected under 35 U.S.C. 103 as being unpatentable over Luo and Roy as applied to claims 1 and 11 above, and further in view of Jacobs et al. (US Patent No. 11,544,743, effective filing date of Oct. 2017, hereinafter “Jacobs”).
Regarding Claim 5, the combination of Luo and Roy teach the system for decentralized federated learning according to Claim 1, wherein at least one of the agents utilizes a neural network to train the local machine learning model (Luo, Section 2A Paragraph 3 - “Step 1. Local Model Computation: At this step for a device n , it needs to solve the machine learning model parameter ω which characterizes each output value yj with loss function fn(xj,yj,ω) . The loss function on the data set of device n is defined as” – teaches wherein at least one of the agents (devices) train the local machine learning model).
Luo and Roy fail to explicitly teach the neural network comprising an embedding block configured to receive a state of the local machine learning model as an input and convert the state into a common representation by applying a learned transformation that maps heterogeneous local model states from different agents into said common representation space, thereby accounting for heterogeneity of the local machine learning models across said agents, an inference block configured to receive the common representation of the input from the embedding block and to produce an output indicative of a task-specific inference, and a transfer block configured to receive the common representation of the output from the inference block and to convert the common representation of the output into an output value compatible with a process model utilized by the agent.
However, in the same field of federated learning, Jacobs teaches
the neural network comprising an embedding block configured to receive a state of the local machine learning model as an input and convert the state into a common representation by applying a learned transformation that maps heterogeneous local model states from different agents into said common representation space, thereby accounting for heterogeneity of the local machine learning models across said agents (Jacobs, Page 16 Col. 6 Lines 50-58 – “FIG. 2 depicts a system 200 and FIG. 3 depicts a procedure 300 in an example implementation in which a software development kit is obtained having functionality to embed a machine learning module 124 as part of an application 120. FIG. 4 depicts a system 400 in an example implementation in which a model 126, embedded as part of a machine learning module 124 within an application 120, is trained within a context of execution of the application 120 by a client device 104.” – teaches a neural network comprising an embedding block configured to convert a state of a machine learning model into a common representation by applying a learned transformation that maps the model states into common representation space, thereby accounting for heterogeneity of the local machine learning models across agents (model 126 embedded as part of a machine learning module within a system, trained within context of execution)), an inference block configured to receive the common representation of the input from the embedding block and to produce an output indicative of a task-specific inference (Jacobs, Page 16 Col. 6 Lines 54-65 – “FIG. 4 depicts a system 400 in an example implementation in which a model 126, embedded as part of a machine learning module 124 within an application 120, is trained within a context of execution of the application 120 by a client device 104. FIG. 5 depicts a system 500 in an example implementation in which the model 126 is employed within execution of the application 120 to control digital content access. FIG.
6 depicts a procedure 600 in an example implementation in which a model 126 embedded as part of an application 120 is trained within a context of execution of the application and used to generate a recommendation to control output of an item of digital content.” and in Page 18 Col. 9 Lines 25-28 – “Accordingly, the model 126, once trained, is usable to infer likely user preferences of a user that is a source of this user interaction and thus use these inferred preferences to personalize user interaction with the application 120.” – teaches an inference block configured to receive the common representation of the input from the embedding block (receives embedding of machine learning model) and to produce an output indicative of a task-specific inference (trains embedded model in context of execution of application to produce an output of an item, thus infers an output)), and a transfer block configured to receive the common representation of the output from the inference block and to convert the common representation of the output into an output value compatible with a process model utilized by the agent (Jacobs, Page 18 Col. 9 Lines 37-46 – “A recommendation 502 is then generated by processing the data using machine learning based on the embedded trained model 126 (block 608) of the machine learning module 124. The recommendation 502, for instance, may describe digital marketing content 112 that is likely to cause conversion or other types of digital content 504, digital images 506 (e.g., stock images), digital videos 508 and digital audio 510 (e.g., from a streaming service system or available for local download), or other types of digital media 512.” and in Page 18 Col. 9 Lines 52-65 – “In the illustrated example, the recommendation 502 is transmitted for receipt by the service provider system 106 via the network 108 (block 610), e.g., without identifying how the model 126 is trained or even identifying a user associated with the model.
The service provider system 106 then uses the recommendation 502 to select digital content 504 from a plurality of items of digital content 402 to be provided back to the client device 104. The recommendation 502, for instance, may identify particular characteristics of digital content 504 that is likely to be of interest, e.g., genres, products or services in a digital marketing scenario, and so forth. In another instance, the recommendation 502 identifies the particular items of digital content 504 itself based on the previously processed data.” – teaches a transfer block that receives the output from the inference block and converts the common representation of the output into a value compatible with a process model utilized by the agent (recommendation 502 produced by inference block is transmitted for receipt by service provider system 106 (thus the transfer block) to convert the recommendation to a selected digital content provided back to the client (thus converting the common representation into a value compatible with the process model of the agent))).
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the embedding, inference, and transfer blocks of Jacobs to the local, cluster, and semi-global machine learning models and system of Luo and Roy in order to convert the machine learning models into common representations, produce an output based on the representation, and output a value compatible with a model utilized by the agent. Doing so would employ the model locally at agent devices without exposing the model or information as to how the model is trained, thereby preserving user privacy while still supporting rich personalization in a computationally efficient manner (Jacobs, Page 18, Col. 9).
Claim 15 is similar to Claim 5, hence rejected similarly.
Claim(s) 6 and 16 is/are rejected under 35 U.S.C. 103 as being unpatentable over Luo and Roy as applied to claims 1 and 11 above, and further in view of Xu et al. (US Pub. No. 2021/0143987, effective filing date of Nov. 2019, hereinafter “Xu”).
Regarding Claim 6, the combination of Luo and Roy teach the system for decentralized federated learning according to Claim 1.
The combination of Luo and Roy fails to explicitly teach wherein at least one of the cluster aggregators comprises an agent simulator configured to emulate an agent’s operational environment and test at least one of the local machine learning models received from an associated agent, the respective cluster machine learning model, and the synthesized semi-global machine learning model created by said cluster aggregator, prior to distributing said semi-global machine learning model, to verify a validity of the tested machine learning model against a predefined test case and performance thresholds, wherein the validity relates to whether the machine learning model will perform a task on an agent to a required level of performance without causing operational failure.
However, analogous to the field of the claimed invention, Xu teaches:
wherein at least one of the cluster aggregators comprises an agent simulator configured to emulate an agent’s operational environment and test at least one of the local machine learning models received from an associated agent, the respective cluster machine learning model, and the synthesized semi-global machine learning model created by said cluster aggregator, prior to distributing said semi-global machine learning model, to verify a validity of the tested machine learning model against a predefined test case and performance thresholds, wherein the validity relates to whether the machine learning model will perform a task on an agent to a required level of performance without causing operational failure (Xu, [0075] – “The Validation Component 550 receives decryption requests from the Aggregator 110, where the request specifies an aggregation vector. In an embodiment, the Validation Component 550 validates the request by confirming that there is a quorum (e.g., that the number of non-zero weights and/or the number of weights exceeding a predefined minimum threshold, satisfies predefined criteria). In some embodiments, the Validation Component 550 further compares the non-zero weights to each other, to confirm that the difference between each is satisfactory. For example, the Validation Component 550 can confirm that all of the non-zero weights are equal. If the vector is successfully validated, the Validation Component 550 generates and transmits a corresponding secret key to the Aggregator 110. In some embodiments, if the vector is insufficient, the Validation Component 550 rejects it and asks the Aggregator 110 to send another request. 
Additionally, in at least one embodiment, the Validation Component 550 can notify one or more of the Participants 120 about this failure.” and in [0073] – “Although depicted as discrete components for conceptual clarity, in embodiments, the operations of the Key Component 540, Distribution Component 545, and Validation Component 550 can be combined or distributed across any number of components.” – teaches wherein at least one of the aggregators comprises an agent simulator (validation component can be combined or distributed across any components) and test at least one of the received machine learning models (aggregation vector received by request) to verify a validity of the tested machine learning model against predefined test cases and performance thresholds (verifies aggregated vector by thresholds such as quorum, equal non-zero weights, difference between weights, thus predefined test cases and performance thresholds), to verify that the model will perform a task on an agent to a required level of performance without causing operational failure (if aggregated vector is rejected, agents are notified of the operational failure of the model, thus verifying if the model can perform tasks on an agent to a required level of performance)).
Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate the agent simulators testing machine learning models to verify a validity of the tested machine learning models against performance thresholds of Xu to the local, cluster, cluster aggregators, and semi-global machine learning models of Luo and Roy in order to verify a validity of the tested machine learning models of the decentralized federated learning system. Doing so would allow agents to identify untrustworthy aggregators (Xu, [0075]) and improve efficiency in federated learning environments while maintaining privacy (Xu, [0030]).
Claim 16 is similar to Claim 6, hence rejected similarly.
Claim(s) 8 and 18 is/are rejected under 35 U.S.C. 103 as being unpatentable over Luo and Roy as applied to claims 1 and 11 above, and further in view of Kushwah et al. (US Pub. No. 2021/0327562, effective filing date of April 2020, hereinafter “Kushwah”).
Regarding Claim 8, the combination of Luo and Roy teach the system for decentralized federated learning according to Claim 7, further comprising
updates its semi-global machine learning model with the retrieved global machine learning model (Luo, Fig. 1 – shows the cluster aggregators (edge servers) updating their models with the retrieved global machine learning model (edge servers download global model)).
The combination of Luo and Roy fails to explicitly teach a model repository, accessible by the plurality of cluster aggregators, storing a plurality of the global machine learning models previously created by the system as Crowdsourced Global Manipulation Models and meta-data indicating tasks used for training the respective global machine learning models and wherein at least one of the cluster aggregators comprises a Model Selector function configured to, upon receiving a new task from one of the associated agents, as a preliminary step before training for the new task, compute similarity distances between the received new task and the tasks described in the meta-data of the global machine learning models stored in the model repository, and to select and retrieve from the model repository a global machine learning model having a smallest similarity distance to the new task, and updates its semi-global machine learning model with the retrieved global machine learning model having the smallest similarity distance as a pre-trained starting point to initialize or enhance learning for the new task.
However, analogous to the field of the claimed invention, Kushwah teaches:
a model repository, accessible by the plurality of cluster aggregators, storing a plurality of the global machine learning models previously created by the system as Crowdsourced Global Manipulation Models and meta-data indicating tasks used for training the respective global machine learning models (Kushwah, [0055] – “The infectious disease detection system 160 also includes an analytical model database 168 and a repository storing multiple analytical models 170. The analytical model database 168 includes several algorithms, training data set, rules for development of analytical models. Each algorithm is selected based on multiple factors and depending upon the data available in the database, training data, type of data and other variables such as number of input devices 120 to be utilized for detection of the infectious disease(s). For example, the type of data may be an image data, a voice data, clinical test data, and/or a body temperature of the individual or some other type of data.” – teaches a model repository, accessible by the network (as in [0057]), storing a plurality of machine learning models previously created by the system as Crowdsourced Global Manipulation Models and meta-data indicating tasks used for training the respective global machine learning models (model database and repository storing multiple analytical models and meta data including algorithms, data, and rules for development. Algorithms selected based on data available, training data, and type of data to account for the various data received by input devices)),
wherein at least one of the cluster aggregators comprises a Model Selector function configured to, upon receiving a new task from one of the associated agents, as a preliminary step before training for the new task, compute similarity distances between the received new task and the tasks described in the meta-data of the global machine learning models stored in the model repository, and to select and retrieve from the model repository a global machine learning model having a smallest similarity distance to the new task, and updates its machine learning model with the retrieved global machine learning model having the smallest similarity distance as a pre-trained starting point to initialize or enhance learning for the new task (Kushwah, [0078] – “In certain embodiments, the processed data is passed to the detection module 176. The detection module 176 access the analytical model selector module 184 to select an analytical model based on a pre-determined criterion. In embodiments, the pre-determined criteria for selection of an analytical model may be based on parameter type, feature type, feature characteristics, gender, age, medical history or some other variable associated with the individual 110.
Once the analytical model selector module 184 has selected an analytical model to be applied for detection of the infectious disease from the analytical models 170, the process of analysis of infectious disease using the selected analytical model is initiated.” – teaches a Model Selector function (model selector module) that, upon receiving a new task (processed input passed into the module), selects and retrieves from the model repository a machine learning model having a smallest similarity distance to the new task, and updates a machine learning model with the retrieved machine learning model having the smallest similarity distance as a pre-trained starting point to initialize or enhance learning for the new task (the model selector selects a model based on a pre-determined criterion, thus prior to training for the new task, where the pre-determined criterion may be based on task similarities (feature type, feature characteristics, some associated variable), and the selected model analysis process initiates, thus the selected model is a pre-trained starting point to initialize or enhance learning for the new task)).
Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate the model repository and model selection function of Kushwah to the local, cluster, semi-global, and global machine learning models and federated learning system of Luo and Roy in order to update the semi-global model with a model best fit for a new task. Doing so would enable systems to select models best fit for tasks or situations (Kushwah, [0055]) and provide a storage for models created by the system (Kushwah, [0056]).
Claim 18 is similar to Claim 8, hence rejected similarly.
Claim(s) 9 and 19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Luo and Roy as applied to claims 1 and 11 above, and further in view of Deng et al. (NPL: “Adaptive Personalized Federated Learning”, published March 2020, hereinafter “Deng”).
Regarding Claim 9, the combination of Luo and Roy teaches the system for decentralized federated learning according to Claim 1.
The combination of Luo and Roy fails to explicitly teach wherein at least one of the agents is further configured to generate a learnable personalization rate as a parameter to be optimized, said personalization rate initialized within a range from more than 0 to less than 1; performs a given number of gradient descents for parameters of the global machine learning model received from its associated cluster aggregator, parameters of its updated local machine learning models and for the learnable personalization rate itself to iteratively update the parameter and the personalization rate; obtains a personalized machine learning model by combining its updated local machine learning model and the semi-global machine learning model using the iteratively updated learnable personalization rate, where the updated learnable personalization rate measures an extent to which the personalized machine learning model mixes the updated local and the semi-global machine learning models; tests the personalized model to check whether a certain performance criteria is met; and, when the performance criteria is met, outputs the personalized machine learning model.
However, analogous to the field of the claimed invention, Deng teaches:
wherein at least one of the agents is further configured to generate a learnable personalization rate as a parameter to be optimized (Deng, Section 4 Paragraph 3 – “Motivated by the trade-off between the global model and local model generalization errors in Theorem 1, we need to learn a personalized model as in (1) to optimize the local empirical risk. To this end, each client needs to solve the following optimization problem over its local data: Eq. (8)” – teaches wherein at least one of the agents (in Deng, the client is an agent) is further configured to generate a learnable personalization rate as a parameter to be optimized (each client needs to solve the optimization problem for the personalization parameter α)), said personalization rate initialized within a range from more than 0 to less than 1; (Deng, Section 5.1 Paragraph 1 – “As Theorem 1 suggests the generalization bound is a linear function of α, which means its optimal is reached when α = 0 or 1” – teaches the learnable personalization rate as a parameter to be optimized, wherein the personalization rate is within a range from more than 0 to less than 1)
performs a given number of gradient descents for parameters of the global machine learning model received from its associated cluster aggregator, parameters of its updated local machine learning models and for the learnable personalization rate itself to iteratively update the parameter and the personalization rate (Deng, Section 4.1 Paragraph 1 – “Each selected client will maintain three models at iteration t: local version of the global model w_i^(t), its own local model v_i^(t), and the mixed personalized model v̄_i^(t) = α_i v_i^(t) + (1 − α_i) w_i^(t). Then, selected clients will perform the following updates locally on their own data: Eq. (9) Eq. (10) where ∇f_i(·; ξ) denotes the stochastic gradient of f(·) evaluated at mini-batch ξ. Then, using the updated version of the global model and the local model, we update the personalized model v̄_i^(t) as well. The clients that are not selected in this round will keep their previous step local model v_i^(t) = v_i^(t−1)” and in Section 5.2 Paragraph 1 – “where we can use the gradient descent to optimize it at every communication round, using the following step: Eq. (22)” – teaches performing a given number of gradient descents for parameters of the global machine learning model received from the associated cluster aggregator, parameters of its updated local machine learning models and for the learnable personalization rate itself to iteratively update the parameters and the personalization rate (clients perform updates on local, global, and personalized models locally, thus optimizing the parameters, and perform gradient descent to optimize the personalization rate α).
In addition to the previously cited passages, Deng further teaches in Algorithm 1 updating parameters of local models, global models, and the personalization rate (contained within the personalized model) independently and within a single training iteration to iteratively update the parameters and personalization rate (each model is updated independently at iteration t));
obtains a personalized machine learning model by combining its updated local machine learning model and the semi-global machine learning model using the iteratively updated learnable personalization rate, where the updated learnable personalization rate measures an extent to which the personalized machine learning model mixes the updated local and the semi-global machine learning models; (Deng, Section 3.1 Paragraph 2 – “In the adaptive personalized federated learning the goal is to find the optimal combination of the global model and the local model, in order to achieve a better client-specific model. In this setting, each user trains a local model while incorporating part of the global model, with some mixing weight α_i, i.e.,” – teaches obtaining a personalized machine learning model by combining its updated local machine learning model and the global machine learning model using the iteratively updated learnable personalization rate, where the updated learnable personalization rate measures an extent to which the personalized machine learning model mixes the updated local and semi-global machine learning models (the mixing weight α indicates how much the local model incorporates the global model, thus the learnable personalization rate, when optimized, measures the extent to which the models are mixed));
tests the personalized model to check whether a certain performance criteria is met; and, when the performance criteria is met, outputs the personalized machine learning model. (Deng, Section 4.1 Paragraph 2 – “Then the server will choose another set of K clients for the next round of training and broadcast this new model to them. This process continues until convergence” and Theorem 2 – teaches testing the personalized model to check whether a certain performance criteria is met and when the performance criteria is met, outputs the personalized machine learning model (Theorem 2 states when client’s objective function satisfies assumptions and parameters satisfy certain thresholds, model reaches convergence)).
Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate the learnable personalization rate of Deng to the local, cluster, and semi-global machine learning models and cluster aggregators of Luo and Roy in order to create personalized models for agents based on a learnable parameter that indicates how much the models are to be mixed. Doing so would adaptively learn a personalized model by leveraging the relatedness between its local and global model as learning proceeds, and the personalized model is more robust to increasing diversity of data and can generalize well (Deng, Introduction).
Claim 19 is similar to Claim 9, hence rejected similarly.
Claim(s) 21-22 is/are rejected under 35 U.S.C. 103 as being unpatentable over Luo and Roy as applied to claims 1 and 11 above, and further in view of Rajamoni et al. (US Pub. No. 2021/0304062, effective filing date of March 2020, hereinafter “Rajamoni”).
Regarding claim 21, the combination of Luo and Roy teaches the system of claim 1, wherein
at least one agent that was associated with the first cluster aggregator is automatically re-assigned to a second cluster aggregator of the multiple cluster aggregators, and wherein a state of the first cluster aggregator is recovered by the second cluster aggregator to seamlessly continue the decentralized federated learning without interrupting the agents (Luo, Section 4 Paragraph 9 – “Definition 4: A device transferring adjustment by n means that device n∈Si with |Si|>2 retreats its current training group Si and joins another training group S−i. Causing a change from DS1 to DS2, the device transferring adjustment is permitted if and only if DS2⊳DS1.” – teaches at least one agent that was associated with the first cluster aggregator is automatically re-assigned to a second cluster aggregator of the multiple cluster aggregators, and wherein a state of the first cluster aggregator is recovered by the second cluster aggregator to seamlessly continue the decentralized learning without interrupting the agents (device retreats current training group and joins another training group, occurs during training without interruption of the agents)).
The combination of Luo and Roy fails to explicitly teach a failure of the first cluster aggregator of the multiple cluster aggregators.
However, analogous to the field of the claimed invention, Rajamoni teaches:
a failure of the first cluster aggregator of the multiple cluster aggregators (Rajamoni, [0048] - “The system 300 has high requirements for infrastructure at the aggregator 310 to ensure failure recovery and high availability. For example, the system 300 comprises a backup aggregator 320 for the aggregator 310.” – teaches an aggregator of multiple aggregators failing (system ensures failure recovery)).
Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate the aggregator failure of Rajamoni to the device transfer of Luo and Roy in order to transfer devices in response to an aggregator failure. Doing so would provide a scalable and fault tolerant solution to federated learning that is simple, robust, and cost efficient (Rajamoni, [0078]).
Claim 22 is similar to Claim 21, hence rejected similarly.
Response to Arguments
Applicant's arguments filed 1 December 2025 have been fully considered but they are not persuasive. Regarding the rejection of claims 1 and 11, Applicant argues that the Examiner’s analogy of Roy’s clients is improper and that the Examiner has not provided sufficient motivation to combine prior art references. Applicant further argues that the combination of Roy to Luo would fundamentally alter Luo’s architecture. Examiner respectfully disagrees. Roy’s clients are equated to aggregators as the clients aggregate models received by other clients (See Roy at Fig. 1 & Section 2.2 Steps 3 & 4). Incorporating Roy’s communication between clients to aggregate client models would not fundamentally alter the architecture of Luo.
In response to applicant’s argument that there is no teaching, suggestion, or motivation to combine the references, the examiner recognizes that obviousness may be established by combining or modifying the teachings of the prior art to produce the claimed invention where there is some teaching, suggestion, or motivation to do so found either in the references themselves or in the knowledge generally available to one of ordinary skill in the art. See In re Fine, 837 F.2d 1071, 5 USPQ2d 1596 (Fed. Cir. 1988), In re Jones, 958 F.2d 347, 21 USPQ2d 1941 (Fed. Cir. 1992), and KSR International Co. v. Teleflex, Inc., 550 U.S. 398, 82 USPQ2d 1385 (2007). In this case, a person of ordinary skill in the art would have found it obvious to incorporate Roy’s model communication to the models stored in the edge servers of Luo in order to synthesize a semi-global model. As explained in the current 35 U.S.C. 103 rejection, the motivation to combine Luo and Roy is the enablement of mobile devices to first upload local models to proximate edge servers for partial model aggregation which can offer faster response rate and relieve core network congestion (Luo, Section 6) and enablement of centers to collaborate and benefit from each other without sharing data among them to reach an accuracy similar to a model trained with pooling the data from all the clients (Roy, Introduction).
In response to applicant's argument that the examiner's conclusion of obviousness is based upon improper hindsight reasoning, it must be recognized that any judgment on obviousness is in a sense necessarily a reconstruction based upon hindsight reasoning. But so long as it takes into account only knowledge which was within the level of ordinary skill at the time the claimed invention was made, and does not include knowledge gleaned only from the applicant's disclosure, such a reconstruction is proper. See In re McLaughlin, 443 F.2d 1392, 170 USPQ 209 (CCPA 1971). The rejection of claims 1 and 11 is maintained.
Applicant's arguments filed 1 December 2025 have been fully considered but they are not persuasive. Regarding the 35 U.S.C. 103 rejection of claims 2-3 and 12-13, Applicant argues that the motivation to incorporate Qu’s blockchain ledger is lacking and that Qu fails to meet the requirements of the claim regarding the hash value providing accountability and traceability. Examiner respectfully disagrees. The specification at [0042] defines traceability as the accountability of decisions during learning, stating that it is crucial to track the learning history of models, and [0067] states that a distributed database such as Blockchain collaboratively guarantees the accountability of global model updates. The specification at [0104] further states a “Blockchain-based Model Update Recording for Accountability” which includes a Blockchain ledger. Qu states in Section 3B Paragraph 1 – “Each block in the distributed ledger contains body and header sectors. In the classic blockchain structure, the body sector usually contains a set of verified transactions. In FL-Block, the body stores the local model updates of the devices in V, i.e., (ω_i^(l), {f_k(ω^(l))}_{s_k∈S_i}) for the device V_i at the l-th epoch, as well as its local computation time T_{local,i}^(l).” and in Section 4C Paragraphs 2-3 – “There are only pointers information encrypted with a hash function inside a public ledger.” Qu teaches storing local model updates of devices V_i at the l-th epoch as a hash function inside a distributed ledger, thus satisfying that the hash value provides traceability of model provenance (FL-Block stores local model updates, thus tracking learning history) and provides accountability of model provenance (Blockchain guarantees accountability; traceability is accountability of decisions made by learning).
In response to applicant’s argument that there is no teaching, suggestion, or motivation to combine the references, the examiner recognizes that obviousness may be established by combining or modifying the teachings of the prior art to produce the claimed invention where there is some teaching, suggestion, or motivation to do so found either in the references themselves or in the knowledge generally available to one of ordinary skill in the art. See In re Fine, 837 F.2d 1071, 5 USPQ2d 1596 (Fed. Cir. 1988), In re Jones, 958 F.2d 347, 21 USPQ2d 1941 (Fed. Cir. 1992), and KSR International Co. v. Teleflex, Inc., 550 U.S. 398, 82 USPQ2d 1385 (2007). In this case, a person of ordinary skill in the art would have found it obvious to incorporate the distributed ledger, or distributed database, of Qu into the system of Luo and Roy in order to store models in a distributed database. Doing so would provide the advantages of Blockchain, such as decentralization, non-tampering, open autonomy, and anonymous traceability, and would enable decentralized privacy protection while preventing single-point failure (Qu, Introduction). The rejection of claims 2-3 and 12-13 is maintained.
Applicant's arguments filed 1 December 2025 have been fully considered but they are not persuasive. Regarding the 35 U.S.C. 103 rejection of claims 4 and 14, Applicant argues that Choudhary only visualizes system load (CPU/GPU), and not model performance metrics. Examiner respectfully disagrees. Upon further review, Choudhary further teaches at Page 31, Col. 29, Lines 18-27 – “FIGS. 8A and 8B respectively illustrate graphical user interfaces 800a and 800b of a spam-email-detector application showing the accuracy of a machine learning model in multiple training iterations of classifying emails in accordance with one or more embodiments.” Choudhary thus teaches visualizing model accuracy on graphical user interfaces. Accordingly, it would have been obvious to a person of ordinary skill in the art to incorporate Choudhary’s graphical user interface of the agents visualizing performance of the received models into the local, cluster, and semi-global machine learning models and performance evaluations, including at least model accuracy and convergence metrics, of Luo and Roy in order to visualize performance, including model accuracy and convergence, on an interface of the agents. Doing so would allow researchers to monitor the learning and accuracy of the models (Choudhary, Page 31, Col. 29). The rejection of claims 4 and 14 is maintained.
Applicant’s arguments, see page 5, filed 1 December 2025, with respect to the rejection(s) of claim(s) 5 and 15 under 35 U.S.C. 103 over Luo in view of Roy, and further in view of Arivazhagan, have been fully considered and are persuasive. Therefore, the rejection has been withdrawn. However, upon further consideration, a new ground(s) of rejection is made over Luo in view of Roy, further in view of Jacobs et al. (US Patent No. 11,544,743, effective filing date of Oct. 2017). The combination of Luo and Roy teaches the limitation of claim 5 regarding “wherein at least one of the agents utilize a neural network to train the local machine learning model,” and Jacobs teaches the limitations of claim 5 regarding “the neural network comprising an embedding block… an inference block… and a transfer block…”.
Applicant’s arguments, see pages 3-4, filed 1 December 2025, with respect to the rejection(s) of claim(s) 6 and 16 under 35 U.S.C. 103 over Luo in view of Roy, and further in view of Qu, have been fully considered and are persuasive. Therefore, the rejection has been withdrawn. However, upon further consideration, a new ground(s) of rejection is made over Luo in view of Roy, and further in view of Xu et al. (US Pub. No. 2021/0143987, effective filing date of Nov. 2019). Xu teaches the limitations of claim 6 regarding “wherein at least one of the cluster aggregators comprises an agent simulator configured to…”.
Applicant’s arguments, see pages 5-6, filed 1 December 2025, with respect to the rejection(s) of claim(s) 8 and 18 under 35 U.S.C. 103 over Luo in view of Roy, and further in view of Faulhaber, have been fully considered and are persuasive. Therefore, the rejection has been withdrawn. However, upon further consideration, a new ground(s) of rejection is made over Luo in view of Roy, and further in view of Kushwah et al. (US Pub. No. 2021/0327562, effective filing date of April 2020). The combination of Luo and Roy teaches the limitation of claim 8 regarding “updates its semi-global machine learning model with the retrieved global machine learning model,” and Kushwah teaches the limitations of claim 8 regarding “a model repository…” and “wherein at least one of the cluster aggregators comprises a Model Selector function…”.
Applicant's arguments filed 1 December 2025 have been fully considered but they are not persuasive. Regarding the 35 U.S.C. 103 rejection of claims 9 and 19, Applicant argues that Deng fails to teach performing gradient descent for the parameters and personalization rate independently and within a single training iteration. Examiner respectfully disagrees and points to Deng, Algorithm 1, which teaches updating the local parameters, global parameters, and personalization rate independently (each has a respective update function) and within a single training iteration (iteration t). In response to applicant's argument that the references fail to show certain features of the invention, it is noted that the features upon which applicant relies (i.e., “independent and within a single training iteration”) are not recited in the rejected claim(s). Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993). The rejection of claims 9 and 19 is maintained.
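For illustration of the update structure at issue, the following scalar sketch shows how, within one training iteration t, the global parameter, the local parameter, and the personalization (mixing) rate can each receive an independent gradient step. This is a hypothetical simplification for exposition (the loss, learning rate, and function names are assumptions), not Deng’s actual Algorithm 1:

```python
def loss_grad(x, target):
    """Gradient of the quadratic loss f(x) = (x - target)**2."""
    return 2.0 * (x - target)

def train_step(w, v, alpha, target, lr=0.05):
    """One iteration: independent steps for w (global), v (local), alpha (mix rate)."""
    mixed = alpha * v + (1.0 - alpha) * w     # personalized model
    g = loss_grad(mixed, target)
    w_new = w - lr * loss_grad(w, target)     # global-parameter step
    v_new = v - lr * g * alpha                # local-parameter step (chain rule in v)
    a_new = alpha - lr * g * (v - w)          # personalization-rate step (chain rule in alpha)
    a_new = min(max(a_new, 0.0), 1.0)         # keep the rate in [0, 1]
    return w_new, v_new, a_new

w, v, alpha = 0.0, 1.0, 0.5
for _ in range(200):
    w, v, alpha = train_step(w, v, alpha, target=2.0)
```

Each of the three quantities is updated by its own gradient expression within the same iteration, which is the structure the examiner reads onto Algorithm 1.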
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LOUIS C NYE whose telephone number is (571)272-0636. The examiner can normally be reached Monday - Friday 9:00AM - 5:00PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, MATTHEW ELL, can be reached at 571-270-3264. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/LOUIS CHRISTOPHER NYE/Examiner, Art Unit 2141
/MATTHEW ELL/Supervisory Patent Examiner, Art Unit 2141