Prosecution Insights
Last updated: April 19, 2026
Application No. 18/009,425

GROUPING NODES IN A SYSTEM

Final Rejection: §101, §102, §103, §112
Filed: Dec 09, 2022
Examiner: SIPPEL, MOLLY CLARKE
Art Unit: 2122
Tech Center: 2100 — Computer Architecture & Software
Assignee: Telefonaktiebolaget LM Ericsson (publ)
OA Round: 2 (Final)

Grant Probability: 50% (Moderate)
OA Rounds: 3-4
To Grant: 3y 7m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 50% (grants 50% of resolved cases; 7 granted / 14 resolved; -5.0% vs TC avg)
Interview Lift: +58.3% (resolved cases with interview)
Avg Prosecution: 3y 7m (typical timeline)
Currently Pending: 25
Total Applications: 39 (career history, across all art units)

Statute-Specific Performance

§101: 33.8% (-6.2% vs TC avg)
§103: 32.0% (-8.0% vs TC avg)
§102: 9.8% (-30.2% vs TC avg)
§112: 23.6% (-16.4% vs TC avg)

Baseline is the Tech Center average estimate; based on career data from 14 resolved cases.

Office Action

§101 §102 §103 §112
DETAILED ACTION

This action is responsive to the amendment filed on 01/07/2026. Claims 1-17 and 30 are pending in the case. Claims 1 and 30 are independent claims. Claims 1, 14, and 30 are currently amended.

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority

Acknowledgment is made of applicant's claim for domestic priority based on PCT application number PCT/EP2020/066235 filed on 06/11/2020.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claim 7 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.

Regarding claim 7, the claim recites: “the group” on line 1. The parent claim recites: “a group of a plurality of groups” on lines 3-4 and “a first group” on line 5. It is unclear if applicant is attempting to refer to “a group of a plurality of groups” or “a first group”. For examination purposes, this limitation has been interpreted to mean “the first group” because claim 7 refers to the “worker nodes of the group” and claim 1 recites “subgrouping worker nodes within a first group”.

Claim Rejections - 35 USC § 101

35 U.S.C.
101 reads as follows:

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-17 and 30 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Regarding claim 1:

Step 1 Statutory Category: Claim 1 recites, in part, “grouping each worker node of the plurality of worker nodes into a group of a plurality of groups based on characteristics of a data distribution of each of the plurality of worker nodes”. This limitation, under the broadest reasonable interpretation, covers the recitation of a mental process that can practically be performed in the human mind, with or without the use of a physical aid such as pen and paper (including an observation, evaluation, judgment, opinion), in this case evaluation. See MPEP § 2106.04(a)(2)(III).

Further, the claim recites: “subgrouping worker nodes within a first group of the plurality of groups into subgroups based on characteristics of a worker neural network model of each worker node from the first group of the plurality of groups”. This limitation, under the broadest reasonable interpretation, covers the recitation of a mental process that can practically be performed in the human mind, with or without the use of a physical aid such as pen and paper (including an observation, evaluation, judgment, opinion), in this case evaluation. See MPEP § 2106.04(a)(2)(III).

Further, the claim recites: “averaging the worker neural network models of worker nodes within one of the subgroups to generate a subgroup average model”.
This limitation, under the broadest reasonable interpretation, covers the recitation of a mathematical calculation, as directed to “a claim that recites a mathematical calculation, when the claim is given its broadest reasonable interpretation in light of the specification, will be considered as falling within the "mathematical concepts" grouping. A mathematical calculation is a mathematical operation (such as multiplication) or an act of calculating using mathematical methods to determine a variable or number”. See MPEP §2106.04(a)(2)(I)(C).

Step 2A Prong 2 Integration into a Practical Application: This judicial exception is not integrated into a practical application. In particular, the claim recites: “a machine learning system comprising a master node and a plurality of worker nodes”. This limitation is an additional element that generally links the use of the judicial exception to a particular technological environment or field of use. See MPEP §2106.05(h). Further, the claim recites: “distributing the subgroup average model”. This limitation is an additional element that amounts to a post-solution step for transmitting data output – a nominal addition to the claim that does not meaningfully limit the claim, thus this is an additional element that amounts to adding insignificant extra-solution activity to the judicial exception. See MPEP §2106.05(g).

Step 2B Significantly More: The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element: “a machine learning system comprising a master node and a plurality of worker nodes” generally links the use of the judicial exception to a particular technological environment or field of use.
Elements that merely generally link the use of the judicial exception to a particular technological environment or field of use cannot provide an inventive concept. Further, the claim recites the additional element: “distributing the subgroup average model” that amounts to adding insignificant extra-solution activity to the judicial exception. Further, this element is directed to receiving or transmitting data over a network which courts have recognized as well-understood, routine, and conventional when they are claimed in a generic manner, see MPEP §2106.05(d)(II). The claim is not patent eligible.

Regarding claim 2, the rejection of claim 1 is incorporated, and further, the claim recites: “after the grouping of the worker nodes, first determining if there is a substantial change in any local dataset of a worker node from among the plurality of worker nodes; wherein if there is no substantial change in any of the local datasets, the method proceeds to the subgrouping; or if there is a substantial change in any of the local datasets, the grouping is repeated”. This limitation recites mental processes in addition to those identified in the rejection of the parent claim, and thus the claim recites a judicial exception. The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible.

Regarding claim 3, the rejection of claim 1 is incorporated, and further, the claim recites: “after the subgrouping of the worker nodes, second determining if there is a substantial change in any local data sets of the plurality of worker nodes; wherein if there is no substantial change in any of the local datasets, the subgrouping is repeated; or if there is a substantial change in any of the local datasets, the method is repeated from the grouping”.
This limitation recites mental processes in addition to those identified in the rejection of the parent claim, and thus the claim recites a judicial exception. The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible.

Regarding claim 4, the rejection of claim 1 is incorporated, and further, the claim recites: “updating the worker neural network model of each worker node of the subgroup with the subgroup average model”. This limitation is an additional element that amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer in its ordinary capacity as a tool to perform an existing process. See MPEP §2106.05(f). Elements that amount to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer in its ordinary capacity as a tool to perform an existing process cannot provide an inventive concept. The claim is not patent eligible.

Regarding claim 5, the rejection of claim 1 is incorporated, and further, the claim recites: “after the grouping, averaging the worker neural network model of each worker node of a group of the plurality of groups to generate a group average model”. This limitation recites mathematical concepts in addition to those identified in the rejection of the parent claim. Thus, the claim recites a judicial exception. The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible.
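The "averaging ... to generate a group average model" limitation the examiner characterizes as a mathematical calculation is, in the plainest reading, element-wise averaging of model parameters. A minimal sketch of that calculation, with illustrative names not taken from the application:

```python
# Element-wise averaging of several workers' model parameters into one
# "group average model". Parameter vectors are plain Python lists here;
# this is an illustrative sketch, not the application's implementation.

def average_models(worker_models):
    """Average a list of equally-sized parameter vectors element-wise."""
    n = len(worker_models)
    length = len(worker_models[0])
    return [sum(m[i] for m in worker_models) / n for i in range(length)]

group = [[0.25, 1.0], [0.75, 3.0]]   # two workers' flattened parameters
print(average_models(group))          # [0.5, 2.0]
```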
Regarding claim 6, the rejection of claim 5 is incorporated, and further, the claim recites: “updating the worker neural network model of each worker node of the group with the corresponding group average model”. This limitation is an additional element that amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer in its ordinary capacity as a tool to perform an existing process. See MPEP §2106.05(f). Elements that amount to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer in its ordinary capacity as a tool to perform an existing process cannot provide an inventive concept. The claim is not patent eligible.

Regarding claim 7, the rejection of claim 1 is incorporated, and further, the claim recites: “wherein the worker nodes of the group comprise data distributions with similar characteristics”. This limitation is a continuation of the “grouping each worker node of the plurality of worker nodes into a group of a plurality of groups based on characteristics of a data distribution of each of the plurality of worker nodes” limitation identified as an abstract idea in the rejection of the parent claim. Thus, the claim recites a judicial exception. The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible.

Regarding claim 8, the rejection of claim 1 is incorporated, and further, the claim recites: “wherein the worker nodes of the subgroup comprise neural network models with similar characteristics”.
This limitation is a continuation of the “subgrouping worker nodes within the group of the plurality of groups into subgroups based on characteristics of a worker neural network model of each worker node from the group of the plurality of groups” limitation identified as an abstract idea in the rejection of the parent claim. Thus, the claim recites a judicial exception. The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible.

Regarding claim 9, the rejection of claim 1 is incorporated, and further, the claim recites: “wherein the grouping and/or the subgrouping is performed using a clustering algorithm”. This limitation is an additional element that amounts to generally linking the use of the judicial exception to a particular technological environment or field of use. See MPEP §2106.05(h). Elements that merely amount to generally linking the use of the judicial exception to a particular technological environment or field of use cannot provide an inventive concept. The claim is not patent eligible.

Regarding claim 10, the rejection of claim 1 is incorporated, and further, the claim recites: “wherein a representative data set is used to perform the grouping”. This limitation is a continuation of the “grouping each worker node of the plurality of worker nodes into a group of a plurality of groups based on characteristics of a data distribution of each of the plurality of worker nodes” limitation identified as an abstract idea in the rejection of the parent claim, thus the claim recites a judicial exception. The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible.
Regarding claim 11, the rejection of claim 10 is incorporated, and further, the claim recites: “the representative dataset is encoded … to generate encoded data”. This limitation recites mathematical concepts in addition to those identified in the rejection of the parent claim. Thus, the claim recites a judicial exception. Further, the claim recites: “in the grouping, an encoder model is trained using the representative data set”. This limitation is an additional element that amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer in its ordinary capacity as a tool to perform an existing process. See MPEP §2106.05(f). Elements that merely amount to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer in its ordinary capacity as a tool to perform an existing process cannot provide an inventive concept. Further, the claim recites: “using the encoder model”. This limitation is an additional element that amounts to generally linking the use of the judicial exception to a particular technological environment or field of use. See MPEP §2106.05(h). Elements that merely amount to generally linking the use of the judicial exception to a particular technological environment or field of use cannot provide an inventive concept. The claim is not patent eligible.

Regarding claim 12, the rejection of claim 11 is incorporated, and further, the claim recites: “determine clusters” and “a cluster representative for each cluster is identified, wherein each cluster representative corresponds to a group of the plurality of groups”. These limitations recite mental processes in addition to those identified in the rejection of the parent claim, thus the claim recites a judicial exception.
Further, the claim recites: “in the grouping, a clustering algorithm is run on the encoded data”. This limitation is an additional element that amounts to generally linking the use of the judicial exception to a particular technological environment or field of use. See MPEP §2106.05(h). Elements that merely amount to generally linking the use of the judicial exception to a particular technological environment or field of use cannot provide an inventive concept. The claim is not patent eligible.

Regarding claim 13, the rejection of claim 12 is incorporated, and further, the claim recites: “wherein, in the grouping, the method further comprises determining to which group a worker node belongs by encoding the local data set of a worker node using the encoder model and using the cluster representative for each cluster”. This limitation is a continuation of the “grouping each worker node of the plurality of worker nodes into a group of a plurality of groups based on characteristics of a data distribution of each of the plurality of worker nodes” limitation identified as an abstract idea in the rejection of the parent claim, thus the claim recites a judicial exception. The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible.

Regarding claim 14, the rejection of claim 1 is incorporated, and further, the claim recites: “wherein the subgrouping further comprises: computing an inverse of a neural network of each of the worker nodes to generate a backward neural network”. This limitation recites mathematical concepts in addition to those identified in the parent claim. Further, the claim recites: “generate a set of representations”. This limitation recites mathematical concepts in addition to those identified in the rejection of the parent claim.
Further, the claim recites: “generate a set of predicted responses”. This limitation recites mathematical concepts in addition to those identified in the rejection of the parent claim. Further, the claim recites: “determining a loss value between the set of responses and the set of predicted responses”. This limitation recites mathematical concepts in addition to those identified in the rejection of the parent claim. Further, the claim recites: “group the worker nodes into subgroups”. This limitation recites mental processes in addition to those identified in the rejection of the parent claim. Further, the claim recites: “obtaining a set of responses using the representative dataset”, “feeding the set of responses into the backward neural network”, and “feeding the set of representations into the neural network”. These limitations are additional elements that amount to adding insignificant extra-solution activity to the judicial exception. See MPEP §2106.05(g). Further, these limitations are directed to receiving or transmitting data over a network which courts have recognized as well-understood, routine, and conventional when they are claimed in a generic manner, see MPEP §2106.05(d)(II). Further, the claim recites: “running a clustering algorithm on the loss values”. This limitation is an additional element that generally links the use of the judicial exception to a particular technological environment or field of use. See MPEP §2106.05(h). Elements that amount to generally linking the use of the judicial exception to a particular technological environment or field of use cannot provide an inventive concept. The claim is not patent eligible.

Regarding claim 15, the rejection of claim 1 is incorporated, and further, the claim recites: “wherein each of the worker nodes comprise the same neural network architecture for at least a portion of the neural network of each worker node”.
This limitation is an additional element that amounts to generally linking the use of the judicial exception to a particular technological environment or field of use. See MPEP §2106.05(h). Elements that merely generally link use of the judicial exception to a particular technological environment or field of use cannot provide an inventive concept. The claim is not patent eligible.

Regarding claim 16, the rejection of claim 1 is incorporated, and further, the claim recites: “wherein the dataset of the worker node is at least one of: time series data generated from network performance measurements, counters, sensor data from IoT devices, temperature, vibration, data from computer/cloud deployments, CPU usage, memory usage”. This limitation is an additional element that generally links the use of the judicial exception to a particular technological environment or field of use. See MPEP §2106.05(h). Elements that amount to generally linking the use of the judicial exception to a particular technological environment or field of use cannot provide an inventive concept. The claim is not patent eligible.

Regarding claim 17, the rejection of claim 1 is incorporated, and further, the claim recites: “wherein at least one worker node of the plurality of worker nodes is grouped into multiple groups of the plurality of groups”. This limitation is a continuation of the “grouping each worker node of the plurality of worker nodes into a group of a plurality of groups based on characteristics of a data distribution of each of the plurality of worker nodes” limitation identified as an abstract idea in the rejection of the parent claim. Thus, the claim recites a judicial exception. The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible.
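The grouping step the rejections repeatedly return to (claims 1, 9, and 12: clustering workers by data-distribution characteristics and keeping a cluster representative per group) can be sketched in a few lines. The distribution features (mean and standard deviation) and the greedy threshold clustering below are illustrative assumptions, not the application's algorithm:

```python
# Sketch of grouping worker nodes by characteristics of their local data
# distribution. Features and the greedy clustering rule are illustrative.
import math

def distribution_features(samples):
    """Summarize a node's local dataset as (mean, std dev)."""
    mean = sum(samples) / len(samples)
    var = sum((x - mean) ** 2 for x in samples) / len(samples)
    return (mean, math.sqrt(var))

def group_workers(features, threshold):
    """Greedy clustering: a worker joins the first cluster whose
    representative is within `threshold`, else it starts a new cluster."""
    reps, groups = [], []
    for idx, f in enumerate(features):
        for g, rep in enumerate(reps):
            if math.dist(f, rep) <= threshold:
                groups[g].append(idx)
                break
        else:
            reps.append(f)       # this worker's features become the
            groups.append([idx])  # new cluster's representative
    return groups, reps

feats = [distribution_features(d) for d in
         [[1, 2, 3], [1.1, 2.1, 3.1], [10, 20, 30]]]
groups, reps = group_workers(feats, threshold=1.0)
print(groups)  # [[0, 1], [2]]
```

Workers 0 and 1 have near-identical distributions and land in one group; worker 2's distribution differs and forms its own group, mirroring the "data distributions with similar characteristics" language of claim 7.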
Regarding claim 30:

Step 1 Statutory Category: Claim 30 recites, in part, “grouping each worker node of the plurality of worker nodes into a group of a plurality of groups based on characteristics of a data distribution of each of the plurality of worker nodes”. This limitation, under the broadest reasonable interpretation, covers the recitation of a mental process that can practically be performed in the human mind, with or without the use of a physical aid such as pen and paper (including an observation, evaluation, judgment, opinion), in this case evaluation. See MPEP § 2106.04(a)(2)(III).

Further, the claim recites: “subgrouping worker nodes within a first group of the plurality of groups into subgroups based on characteristics of a worker neural network model of each worker node from the first group of the plurality of groups”. This limitation, under the broadest reasonable interpretation, covers the recitation of a mental process that can practically be performed in the human mind, with or without the use of a physical aid such as pen and paper (including an observation, evaluation, judgment, opinion), in this case evaluation. See MPEP § 2106.04(a)(2)(III).

Further, the claim recites: “averaging the worker neural network models of worker nodes within one of the subgroups to generate a subgroup average model”. This limitation, under the broadest reasonable interpretation, covers the recitation of a mathematical calculation, as directed to “a claim that recites a mathematical calculation, when the claim is given its broadest reasonable interpretation in light of the specification, will be considered as falling within the "mathematical concepts" grouping. A mathematical calculation is a mathematical operation (such as multiplication) or an act of calculating using mathematical methods to determine a variable or number”. See MPEP §2106.04(a)(2)(I)(C).
Step 2A Prong 2 Integration into a Practical Application: This judicial exception is not integrated into a practical application. In particular, the claim recites: “a master node configured to communicate with a plurality of worker nodes in a machine learning system”. This limitation is an additional element that generally links the use of the judicial exception to a particular technological environment or field of use. See MPEP §2106.05(h). Further, the claim recites: “processing circuitry and a non-transitory machine-readable medium storing instructions”. This limitation is an additional element that amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer in its ordinary capacity as a tool to perform an existing process. See MPEP §2106.05(f). Further, the claim recites: “distributing the subgroup average model”. This limitation is an additional element that amounts to a post-solution step for transmitting data output – a nominal addition to the claim that does not meaningfully limit the claim, thus this is an additional element that amounts to adding insignificant extra-solution activity to the judicial exception. See MPEP §2106.05(g).

Step 2B Significantly More: The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element: “a master node configured to communicate with a plurality of worker nodes in a machine learning system” generally links the use of the judicial exception to a particular technological environment or field of use. Elements that merely generally link the use of the judicial exception to a particular technological environment or field of use cannot provide an inventive concept.
Further, the claim recites the additional element: “processing circuitry and a non-transitory machine-readable medium storing instructions” that amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer in its ordinary capacity as a tool to perform an existing process. Elements that merely amount to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer in its ordinary capacity as a tool to perform an existing process cannot provide an inventive concept. Further, the claim recites the additional element: “distributing the subgroup average model” that amounts to adding insignificant extra-solution activity to the judicial exception. Further, this element is directed to receiving or transmitting data over a network which courts have recognized as well-understood, routine, and conventional when they are claimed in a generic manner, see MPEP §2106.05(d)(II). The claim is not patent eligible.

Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless – (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1, 4-9, 15, and 30 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Sattler et al., Clustered Federated Learning: Model-Agnostic Distributed Multi-Task Optimization under Privacy Constraints, 10/4/2019, https://arxiv.org/pdf/1910.01991, hereinafter referred to as “Sattler”.
Regarding claim 1, Sattler teaches A method for grouping worker nodes in a machine learning system comprising a master node and a plurality of worker nodes (Sattler, Page 1, Abstract, Lines 6-10, “we present Clustered Federated Learning (CFL), a novel Federated Multi-Task Learning (FMTL) framework, which exploits geometric properties of the FL loss surface, to group the client population into clusters with jointly trainable data distributions”), the method comprising:

grouping each worker node of the plurality of worker nodes into a group of a plurality of groups based on characteristics of a data distribution of each of the plurality of worker nodes (Sattler, Page 5, Section 3, Col 2, Lines 10-15, “the server separates the clients into two clusters in such a way that the maximum similarity between clients from different clusters is minimized … This optimal bi-partitioning problem at the core of CFL can be solved in O(m³) using Algorithm 1”; Sattler, Page 3, Col 1, Paragraph 2, Lines 1-3, “a bi-partitioning is called correct, if clients with the same data generating distribution end up in the same cluster”);

subgrouping worker nodes within a first group of the plurality of groups into subgroups based on characteristics of a worker neural network model of each worker node from the first group of the plurality of groups (Sattler, Page 5, Section 3, Paragraph 3, Lines 1-6, “CFL is recursively re-applied to each of the two separate groups starting from the stationary solution θ*”.
Splitting recursively continues on until (after at most k - 1 recursions) none of the sub-clusters violate the stopping criterion anymore, at which point all groups of mutually congruent clients C = {c1, …, ck} have been identified”; Sattler, Page 3, Col 2, Lines 5-9, “in a stationary solution of the Federated Learning objective θ*, we can distinguish clients based on their hidden data generating distribution by inspecting the cosine similarity between their gradient updates” The “cosine similarity based clustering” method used by the reference clusters based on the cosine similarity between clients’ “gradient updates” which are considered to be “characteristics of a worker neural network model of each worker node”);

averaging the worker neural network models of worker nodes within one of the subgroups to generate a subgroup average model (Sattler, Page 5, Section 3, Paragraph 2, Lines 1-3, “CFL is recursively re-applied to each of the two separate groups starting from the stationary solution”; Sattler, Page 6, Algorithm 3, Steps 2, 7, and 8; Sattler, Page 6, Algorithm 2 (Specifically step 7); Sattler, Page 1, Section 1, Col 2, Lines 3-7 and Equation 2, “all clients upload their computed weight-updates to the server, where they are aggregated by weighted averaging according to θ_{t+1} = θ_t + Σ_{i=1}^{m} (|D_i|/|D|) Δθ_i^{t+1} to create the next master model.
The procedure is summarized in Algorithm 2”; When steps 7 and 8 of Algorithm 3 are called on a subgroup, “Algorithm 2” is performed on the subgroup and step 7 of Algorithm 2 performs “averaging the worker neural network models of worker nodes within one of the subgroups” and the resulting θ is considered to be the “subgroup average model”); and

distributing the subgroup average model (Sattler, Page 5, Section 3, Paragraph 2, Lines 1-3, “CFL is recursively re-applied to each of the two separate groups starting from the stationary solution”; Sattler, Page 6, Algorithm 3, Steps 2, 7, and 8; Sattler, Page 6, Algorithm 2 (Specifically steps 3-4); Sattler, Page 1, Section 1, Lines 7-9, “in every communication round t, the clients first synchronize with the server by downloading the latest master model θ_t.”; When steps 7 and 8 of Algorithm 3 are called on a subgroup, “Algorithm 2” is performed on the subgroup and steps 3-4 of Algorithm 2 perform “distributing the subgroup average model”).

Regarding claim 4, the rejection of claim 1 is incorporated, and further, Sattler teaches the method further comprising updating the worker neural network model of each worker node of the subgroup with the subgroup average model (Sattler, Page 5, Section 3, Paragraph 2, Lines 1-3, “CFL is recursively re-applied to each of the two separate groups starting from the stationary solution”; Sattler, Page 6, Algorithm 3, Steps 2, 7, and 8; Sattler, Page 6, Algorithm 2 (Specifically step 5); Sattler, Page 1, Section 1, Lines 9-12, “Every client then proceeds to improve the downloaded model, by performing multiple iterations of stochastic gradient descent with mini-batches sampled from it’s local data Di”; When steps 7 and 8 of Algorithm 3 are called on a subgroup, “Algorithm 2” is performed on the subgroup and step 5 of Algorithm 2 performs “updating the worker neural network model”).
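The weighted averaging quoted from Sattler's Equation 2 (each worker's update weighted by its share |D_i|/|D| of the total data) can be sketched as follows; the function and variable names are illustrative, not from the reference:

```python
# Sketch of the weighted model averaging described by Sattler's Equation 2:
# theta_{t+1} = theta_t + sum_{i=1..m} (|D_i| / |D|) * delta_theta_i.
# Workers with more local data contribute proportionally more.

def aggregate(theta, updates, data_sizes):
    """theta: current master parameters; updates: per-worker parameter
    deltas; data_sizes: |D_i| for each worker."""
    total = sum(data_sizes)                 # |D| = sum of |D_i|
    new_theta = list(theta)
    for delta, n_i in zip(updates, data_sizes):
        w = n_i / total                     # weight |D_i| / |D|
        for j, d in enumerate(delta):
            new_theta[j] += w * d
    return new_theta

theta = [1.0, 1.0]
updates = [[1.0, 0.0], [3.0, 2.0]]   # deltas uploaded by two workers
print(aggregate(theta, updates, data_sizes=[1, 3]))  # [3.5, 2.5]
```

Applied to a subgroup's workers only, the same aggregation yields what the mapping calls the "subgroup average model"; applied to a group, the "group average model".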
Regarding claim 5, the rejection of claim 1 is incorporated, and further, Sattler teaches the method further comprising, after the grouping, averaging the worker neural network model of each worker node of a group of the plurality of groups to generate a group average model (Sattler, Page 5, Section 3, Paragraph 2, Lines 1-3, “CFL is recursively re-applied to each of the two separate groups starting from the stationary solution”; Sattler, Page 6, Algorithm 3, Steps 2, 7, and 8; Sattler, Page 6, Algorithm 2 (Specifically step 7); Sattler, Page 1, Section 1, Col 2, Lines 3-7 and Equation 2, “all clients upload their computed weight-updates to the server, where they are aggregated by weighted averaging according to θ_{t+1} = θ_t + ∑_{i=1}^{m} (|D_i|/|D|) ∆θ_i^{t+1} to create the next master model. The procedure is summarized in Algorithm 2”; When steps 7 and 8 of Algorithm 3 are called on a group, “Algorithm 2” is performed on the group and step 7 of algorithm 2 performs “averaging the worker neural network model of each worker node of a group” and the resulting θ is considered to be the “group average model”). Regarding claim 6, the rejection of claim 5 is incorporated, and further, Sattler teaches updating the worker neural network model of each worker node of the group with the corresponding group average model (Sattler, Page 5, Section 3, Paragraph 2, Lines 1-3, “CFL is recursively re-applied to each of the two separate groups starting from the stationary solution”; Sattler, Page 6, Algorithm 3, Steps 2, 7, and 8; Sattler, Page 6, Algorithm 2 (Specifically step 5); Sattler, Page 1, Section 1, Lines 9-12, “Every client then proceeds to improve the downloaded model, by performing multiple iterations of stochastic gradient descent with mini-batches sampled from it’s local data D_i”; When steps 7 and 8 of Algorithm 3 are called on a group, “Algorithm 2” is performed on the group and step 5 of algorithm 2 performs “updating the worker neural network model”). 
Regarding claim 7, the rejection of claim 1 is incorporated, and further, Sattler teaches wherein the worker nodes of the group comprise data distributions with similar characteristics (Sattler, Page 5, Section 3, Col 2, Lines 10-15, “the server separates the clients into two clusters in such a way that the maximum similarity between clients from different clusters is minimized … This optimal bi-partitioning problem at the core of CFL can be solved in O(m³) using Algorithm 1”; Sattler, Page 3, Col 1, Paragraph 2, Lines 1-3, “a bi-partitioning is called correct, if clients with the same data generating distribution end up in the same cluster”; Sattler, Page 5, Lines 1-5, “This means that using the cosine similarity criterion (25) we can readily find a correct bi-partitioning c1, c2 even if the number of data generating distributions is high and the empirical risk on every client’s data is only a very loose approximation of the true risk”; The method readily finds a “correct bi-partitioning” which means each “group” contains clients “with the same data generating distribution” which is considered to be “data distributions with similar characteristics”). Regarding claim 8, the rejection of claim 1 is incorporated, and further, Sattler teaches wherein the worker nodes of the subgroup comprise neural network models with similar characteristics (Sattler, Page 5, Section 3, Paragraph 3, Lines 1-6, “CFL is recursively re-applied to each of the two separate groups starting from the stationary solution θ*”. 
Splitting recursively continues on until (after at most k - 1 recursions) none of the sub-clusters violate the stopping criterion anymore, at which point all groups of mutually congruent clients C = {c1, …, ck} have been identified”; Sattler, Page 3, Col 2, Lines 5-9, “in a stationary solution of the Federated Learning objective θ * , we can distinguish clients based on their hidden data generating distribution by inspecting the cosine similarity between their gradient updates” The “cosine similarity based clustering” method used by the reference clusters based on the cosine similarity between clients’ “gradient updates” which are considered to be “characteristics of a worker neural network model of each worker node”). Regarding claim 9, the rejection of claim 1 is incorporated, and further, Sattler teaches wherein the grouping and the subgrouping is performed using a clustering algorithm (Sattler, Page 6, Algorithm 1, “Optimal Bipartition”, Line 2, “output: bi-partitioning c1, c2 satisfying (25)”; Sattler, Page 2, Section 2, “Cosine Similarity Based Clustering” The grouping and subgrouping are performed using the “Optimal Bipartition” algorithm, which satisfies “(25)” as required by “Cosine Similarity Based Clustering” which is considered to be “a clustering algorithm”). Regarding claim 15, the rejection of claim 1 is incorporated, and further, Sattler teaches wherein each of the worker nodes comprise the same neural network architecture for at least a portion of the neural network of each worker node (Sattler, Page 9, Col 1, Lines 4-5, “In all experiments we train multi-layer convolutional neural networks”; Since the “worker nodes” are all “multi-layer convolutional neural networks” they must have the same network architecture for at least a portion of the neural network). 
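The bi-partitioning criterion quoted above (split the clients so that the maximum cosine similarity between clients in different clusters is minimized) can be illustrated with a small brute-force sketch; this is not the reference's O(m³) Algorithm 1, and every name below is hypothetical:

```python
# Brute-force sketch of the quoted splitting criterion: separate clients into
# two clusters so that the maximum cosine similarity between clients from
# *different* clusters is minimized. Illustrative only (the cited reference
# solves this in O(m^3)); all names are hypothetical.
from itertools import combinations
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def optimal_bipartition(updates):
    m = len(updates)
    sim = [[cosine(updates[i], updates[j]) for j in range(m)] for i in range(m)]
    best, best_score = None, float("inf")
    for r in range(1, m // 2 + 1):               # sizes of the smaller cluster
        for c1 in combinations(range(m), r):
            c2 = tuple(i for i in range(m) if i not in c1)
            cross = max(sim[i][j] for i in c1 for j in c2)
            if cross < best_score:               # smaller max cross-similarity wins
                best, best_score = (c1, c2), cross
    return best

# Two clients whose gradient updates point one way, two pointing the other:
g = [np.array([1.0, 0.1]), np.array([1.0, -0.1]),
     np.array([-1.0, 0.1]), np.array([-1.0, -0.1])]
c1, c2 = optimal_bipartition(g)                  # splits {0, 1} from {2, 3}
```

Clients with the same underlying data-generating distribution produce similar gradient-update directions, so they end up on the same side of the split, which is the behavior the examiner maps to the claimed grouping.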
Regarding claim 30, Sattler teaches A master node configured to communicate with a plurality of worker nodes in a machine learning system (Sattler, Page 1, Abstract, Lines 6-10, “we present Clustered Federated Learning (CFL), a novel Federated Multi-Task Learning (FMTL) framework, which exploits geometric properties of the FL loss surface, to group the client population into clusters with jointly trainable data distributions”), the master node comprising processing circuitry and a non-transitory machine-readable medium storing instructions, wherein the master node is configured to perform (Sattler, Page 5, Section 3, Col 2, Lines 15-18, “Since in Federated Learning it is assumed that the server has far greater computational power than the clients, the overhead of clustering will typically be negligible” The “server” is considered to be the “master node”; Sattler, Page 10, Section C, Lines 1-8, “we apply CFL as described in Algorithm 5 to different Federated Learning setups, which are inspired by our motivating examples in the introduction. In all experiments, the clients perform 3 epochs of local training at a batch-size of 100 in every communication round. Label permutation on Cifar-10: We split the CIFAR-10 training data randomly and evenly among m = 20 clients, which we group into k = 4 different clusters”; Sattler, Page 10, Section C, Paragraph 2, 8-10, “The clients then jointly train a 5-layer convolutional neural network on the modified data using CFL with 3 epochs of local training at a batch-size of 100” A person of ordinary skill in the art would recognize that these experiments must be done on a computer, thus providing evidence for “processing circuitry” and “a non-transitory machine-readable storage medium”. 
Further, a person of ordinary skill in the art would recognize that a “server” contains “processing circuitry” and “a non-transitory machine-readable storage medium”) a method comprising: grouping each worker node of the plurality of worker nodes into a group of a plurality of groups based on characteristics of a data distribution of each of the plurality of worker nodes (Sattler, Page 5, Section 3, Col 2, Lines 10-15, “the server separates the clients into two clusters in such a way that the maximum similarity between clients from different clusters is minimized … This optimal bi-partitioning problem at the core of CFL can be solved in O(m³) using Algorithm 1”; Sattler, Page 3, Col 1, Paragraph 2, Lines 1-3, “a bi-partitioning is called correct, if clients with the same data generating distribution end up in the same cluster”); subgrouping worker nodes within a first group of the plurality of groups into subgroups based on characteristics of a worker neural network model of each worker node from the first group of the plurality of groups (Sattler, Page 5, Section 3, Paragraph 3, Lines 1-6, “CFL is recursively re-applied to each of the two separate groups starting from the stationary solution θ*”. 
Splitting recursively continues on until (after at most k - 1 recursions) none of the sub-clusters violate the stopping criterion anymore, at which point all groups of mutually congruent clients C = {c1, …, ck} have been identified”; Sattler, Page 3, Col 2, Lines 5-9, “in a stationary solution of the Federated Learning objective θ*, we can distinguish clients based on their hidden data generating distribution by inspecting the cosine similarity between their gradient updates” The “cosine similarity based clustering” method used by the reference clusters based on the cosine similarity between clients’ “gradient updates” which are considered to be “characteristics of a worker neural network model of each worker node”); averaging the worker neural network models of worker nodes within one of the subgroups to generate a subgroup average model (Sattler, Page 5, Section 3, Paragraph 2, Lines 1-3, “CFL is recursively re-applied to each of the two separate groups starting from the stationary solution”; Sattler, Page 6, Algorithm 3, Steps 2, 7, and 8; Sattler, Page 6, Algorithm 2 (Specifically step 7); Sattler, Page 1, Section 1, Col 2, Lines 3-7 and Equation 2, “all clients upload their computed weight-updates to the server, where they are aggregated by weighted averaging according to θ_{t+1} = θ_t + ∑_{i=1}^{m} (|D_i|/|D|) ∆θ_i^{t+1} to create the next master model. 
The procedure is summarized in Algorithm 2”; When steps 7 and 8 of Algorithm 3 are called on a subgroup, “Algorithm 2” is performed on the subgroup and step 7 of algorithm 2 performs “averaging the worker neural network models of worker nodes within one of the subgroups” and the resulting θ is considered to be the “subgroup average model”); and distributing the subgroup average model (Sattler, Page 5, Section 3, Paragraph 2, Lines 1-3, “CFL is recursively re-applied to each of the two separate groups starting from the stationary solution”; Sattler, Page 6, Algorithm 3, Steps 2, 7, and 8; Sattler, Page 6, Algorithm 2 (Specifically steps 3-4); Sattler, Page 1, Section 1, Lines 7-9, “in every communication round t, the clients first synchronize with the server by downloading the latest master model θ_t.”; When steps 7 and 8 of Algorithm 3 are called on a subgroup, “Algorithm 2” is performed on the subgroup and steps 3-4 of algorithm 2 perform “distributing the subgroup average model”). Claim Rejections - 35 USC § 103 The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made. Claims 2-3, 10 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Sattler in view of Desai et al., U.S. Patent Application Publication No. 20190325350, hereinafter referred to as “Desai”. 
Regarding claim 2, the rejection of claim 1 is incorporated, and further, Sattler teaches after the grouping of the worker nodes, first determining and if there [condition is not met], the method proceeds to the subgrouping (Sattler, Page 5, Lines 3-8, “Splitting recursively continues on until (after at most k - 1 recursions) none of the sub-clusters violate the stopping criterion anymore, at which point all groups of mutually congruent clients C = {c1,…,ck} have been identified, and the Clustered Federated Learning problem characterized by Assumption 2 is solved”). Sattler does not explicitly teach determining if there is a substantial change in any local dataset of a worker node from among the plurality of worker nodes and if there is a substantial change in any of the local datasets, the grouping is repeated. Desai teaches determining if there is a substantial change in any local dataset of a worker node from among the plurality of worker nodes and if there is a substantial change in any of the local datasets, the grouping is repeated (Desai, Paragraph 0058, “During the iterative distributed learning process described herein, the composition of a cluster can change. For example, a node can become active and exhibit characteristics similar to a cluster, causing an embodiment to join the node to the existing cluster. As another example, a node might become inactive, causing an embodiment to remove the node from the existing cluster. As another example, a node can change its behavior and begin exhibiting characteristics similar to a different cluster, causing an embodiment to remove the node from its original cluster and join the node to the other cluster” By the nodes moving clusters during the method, the grouping is repeated). 
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to have modified the clustered federated learning method of Sattler to include repeating the grouping if there is a substantial change in any of the local datasets as taught by Desai. The motivation for doing so would have been to keep the nodes with similar characteristics within the same group (Desai, Paragraph 0058). Regarding claim 3, the rejection of claim 1 is incorporated, and further, Sattler teaches after the subgrouping of the worker nodes, second determining and if there [condition is not met], the subgrouping is repeated (Sattler, Page 5, Lines 3-8, “Splitting recursively continues on until (after at most k - 1 recursions) none of the sub-clusters violate the stopping criterion anymore, at which point all groups of mutually congruent clients C = {c1,…,ck} have been identified, and the Clustered Federated Learning problem characterized by Assumption 2 is solved”). Sattler does not explicitly teach determining if there is a substantial change in any local data sets of the plurality of worker nodes and if there is a substantial change in any of the local datasets, the method is repeated from the grouping. Desai teaches determining if there is a substantial change in any local data sets of the plurality of worker nodes and if there is a substantial change in any of the local datasets, the method is repeated from the grouping (Desai, Paragraph 0058, “During the iterative distributed learning process described herein, the composition of a cluster can change. For example, a node can become active and exhibit characteristics similar to a cluster, causing an embodiment to join the node to the existing cluster. As another example, a node might become inactive, causing an embodiment to remove the node from the existing cluster. 
As another example, a node can change its behavior and begin exhibiting characteristics similar to a different cluster, causing an embodiment to remove the node from its original cluster and join the node to the other cluster” By the nodes moving clusters during the method, the grouping is repeated). It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention to have modified the clustered federated learning method of Sattler to include repeating the subgrouping if there is a substantial change in any of the local datasets as taught by Desai. The motivation for doing so would have been to keep the nodes with similar characteristics within the same group (Desai, Paragraph 0058). Regarding claim 10, the rejection of claim 1 is incorporated. Sattler does not explicitly teach wherein a representative data set is used to perform the grouping. Desai teaches wherein a representative data set is used to perform the grouping (Desai, Paragraph 0042, Lines 1-7, “An embodiment at the training system receives meta-metrics from at least some of the nodes, and preferably the complete set of nodes. Using the meta-metrics, the embodiment determines that a subset of the nodes has at least one meta-metric whose value is same or similar within a tolerance for all nodes within the subset. The embodiment groups the nodes in such a subset into one cluster” The “meta-metrics” are considered to be the “representative data set”). It would have been obvious, to a person of ordinary skill in the art, before the effective filing date of the invention to have modified the clustered federated learning method taught by Sattler to include grouping using a representative dataset as taught by Desai. The motivation for doing so would have been to group the worker nodes without local data export (Desai, Paragraphs 0018-0019; Desai, Figures 3A and 3B). Regarding claim 17, the rejection of claim 1 is incorporated. 
Sattler does not explicitly teach wherein at least one worker node of the plurality of worker nodes is grouped into multiple groups of the plurality of groups. Desai teaches wherein at least one worker node of the plurality of worker nodes is grouped into multiple groups of the plurality of groups (Desai, Paragraph 0043, Line 7, “A node can belong to more than one clusters”). It would have been obvious, to a person of ordinary skill in the art before the effective filing date of the invention, to have modified the clustered federated learning method of Sattler to include grouping at least one worker node into multiple groups as taught by Desai. The motivation for doing so would have been to properly represent the overlaps among the data distributions of the worker nodes (Desai, Paragraph 0043, Lines 1-7, “A cluster may be based on one or more meta-metrics whose values are similar within the cluster. Any number of clusters can be formed in this manner. Different clusters have different sets of meta-metrics in which similarities exist (hereinafter referred to as cluster meta-metrics), but different clusters can have overlaps in such sets of meta-metrics”). Claims 11-13 are rejected under 35 U.S.C. 103 as being unpatentable over Sattler in view of Desai in further view of Huang et al., Patient clustering improves efficiency of federated machine learning to predict mortality and hospital stay time using distributed electronic medical records, Journal of Biomedical Informatics, Volume 99, November 2019, https://doi.org/10.1016/j.jbi.2019.103291, hereinafter referred to as “Huang”. Regarding claim 11, the rejection of claim 10 is incorporated. The proposed combination does not explicitly teach wherein, in the grouping, an encoder model is trained using the representative data set, and the representative dataset is encoded using the encoder model to generate encoded data. 
Huang teaches wherein, in the grouping, an encoder model is trained using the representative data set, and the representative dataset is encoded using the encoder model to generate encoded data (Huang, Page 3, Lines 1-3, “During encoder training, each client (that is, hospital) learnt a denoising autoencoder f_autoencoder”; Huang, Algorithm 1, Steps 1-6; Huang, Page 3, Paragraph 2, Lines 1-2, “During k-means clustering, each client used f_encoder to transform its data into representations X_c”; Huang, Algorithm 1, Steps 8-9; The “f_autoencoder” is considered to be the “encoder model” and the “representations X_c” are considered to be the “encoded data”). It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to have modified the clustered federated learning method of the proposed combination to include encoding the representative dataset with an encoder model trained on the dataset as taught by Huang. The motivation for doing so would have been the ability to use the method on non-identically independently distributed data (Huang, Page 2, Paragraph 2, Lines 2-7, “To tackle this non-IID challenge and inspired by deep embedding clustering [31], we proposed a community-based federated learning (CBFL) algorithm that clustered EMR data into several communities and simultaneously trained one model per community, so that the learning process became markedly more efficient than FL”). 
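The community-based flow Huang describes (encode data with a shared encoder, fit k-means centroids on the encodings, then assign each worker to its nearest centroid) can be sketched as follows; the fixed linear map `W` stands in for a trained autoencoder's encoder, and all names here are hypothetical rather than Huang's:

```python
# Minimal sketch of the clustering flow described above: encode local data
# with a shared encoder, fit k-means centroids on the encodings, then assign
# each worker to the nearest centroid (its group). The fixed linear map W
# is a stand-in for a trained autoencoder's encoder; names are hypothetical.
import numpy as np

W = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]])  # 4-D -> 2-D

def encode(X):
    return X @ W                              # encoder: map data into code space

def kmeans(Z, k=2, iters=10):
    """Plain k-means on encoded rows Z; returns the k centroids."""
    centroids = Z[[0, -1]].astype(float)      # deterministic init (k = 2 here)
    for _ in range(iters):
        labels = np.argmin(((Z[:, None] - centroids[None]) ** 2).sum(-1), axis=1)
        centroids = np.array([Z[labels == j].mean(axis=0) for j in range(k)])
    return centroids

def assign_group(X_local, centroids):
    """Encode a worker's local data and pick the nearest centroid (its group)."""
    z = encode(X_local).mean(axis=0)          # one representation per worker
    return int(np.argmin(((centroids - z) ** 2).sum(-1)))

rng = np.random.default_rng(0)
X1 = 5.0 + 0.1 * rng.normal(size=(10, 4))     # worker 1's local data
X2 = -5.0 + 0.1 * rng.normal(size=(10, 4))    # worker 2's local data
centroids = kmeans(encode(np.vstack([X1, X2])), k=2)
# workers with well-separated data land in different groups
```

Because only encodings (not raw records) are clustered, the grouping can be computed without exporting each worker's local data, which is the motivation the examiner attributes to the combination.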
Regarding claim 12, the rejection of claim 11 is incorporated, and further, the proposed combination teaches wherein, in the grouping, a clustering algorithm is run on the encoded data to determine clusters, and a cluster representative for each cluster is identified, wherein each cluster representative corresponds to a group of the plurality of groups (Huang, Page 3, Paragraph 2, Lines 3-4, “Then, the server learnt a k-means clustering model f_kmeans with K centroids (that is, communities) on X_cs from all clients”; Huang, Page 3, Paragraph 3, Lines 4-5, “f_encoder and f_kmeans were used to determine which cluster each example belonged to”; Huang, Page 3, Algorithm 1, Steps 11-14; The “centroids” are considered to be the “cluster representative”). Regarding claim 13, the rejection of claim 12 is incorporated, and further, the proposed combination teaches wherein, in the grouping, the method further comprises determining to which group a worker node belongs by encoding the local data set of a worker node using the encoder model and using the cluster representative for each cluster (Huang, Algorithm 1, Steps 1-6; Huang, Page 3, Paragraph 2, Lines 1-2, “During k-means clustering, each client used f_encoder to transform its data into representations X_c”; Huang, Algorithm 1, Steps 8-9; Huang, Page 3, Paragraph 2, Lines 3-4, “Then, the server learnt a k-means clustering model f_kmeans with K centroids (that is, communities) on X_cs from all clients”; Huang, Page 3, Paragraph 3, Lines 4-5, “f_encoder and f_kmeans were used to determine which cluster each example belonged to”; Huang, Page 3, Algorithm 1, Steps 11-14; The “centroids” are considered to be the “cluster representative”). Claim 14 is rejected under 35 U.S.C. 
103 as being unpatentable over Sattler in view of Cortes et al., Optimization of operating conditions for compressor performance by means of neural network inverse, Applied Energy, Volume 86, Issue 11, 2009, Pages 2487-2493, ISSN 0306-2619, https://doi.org/10.1016/j.apenergy.2009.03.001, hereinafter referred to as “Cortes” in further view of Solmer et al., U.S. Patent Application Publication No. 20120233127, hereinafter referred to as Solmer. Regarding claim 14, the rejection of claim 1 is incorporated. Sattler does not explicitly teach wherein the subgrouping further comprises: computing an inverse of a neural network of each of the worker nodes to generate a backward neural network; obtaining a set of responses using the representative dataset; feeding the set of responses into the backward neural network to generate a set of representations; feeding the set of representations into the neural network to generate a set of predicted responses; determining a loss value between the set of responses and the set of predicted responses; and running a clustering algorithm on the loss values to group the worker nodes into subgroups. Cortes teaches computing an inverse of a neural network of each of the worker nodes to generate a backward neural network (Cortes, Page 2487, Abstract, Lines 2-4, “It inverts the neural network to find the optimum parameter value under given conditions (artificial neural network inverse, ANNi)”. It would have been obvious, to a person of ordinary skill in the art, before the effective filing date of the invention, to have modified the clustered federated learning method taught by Sattler, to include computing the inverse of a neural network as taught by Cortes. The motivation for doing so would have been to optimize the parameters related to device performance (Cortes, Page 2487, Abstract, Lines 1-4). 
Sattler in view of Cortes does not explicitly teach obtaining a set of responses using the representative dataset; feeding the set of responses into the backward neural network to generate a set of representations; feeding the set of representations into the neural network to generate a set of predicted responses; determining a loss value between the set of responses and the set of predicted responses; and running a clustering algorithm on the loss values to group the worker nodes into subgroups. Solmer teaches obtaining a set of responses using the representative dataset; feeding the set of responses into the backward neural network to generate a set of representations; feeding the set of representations into the neural network to generate a set of predicted responses; determining a loss value between the set of responses and the set of predicted responses; and running a clustering algorithm on the loss values to group the worker nodes into subgroups (Solmer, Paragraph 0094, Lines 10-24, “In some embodiments, when applied to classification, the autoencoder may be expanded to include another layer (in addition to the typical three layers) when the labels for different classes are made available. In this case, the number of inputs of the additional layer is equal to the dimensionality of the code layer and the number of outputs of the added layer equals to the number of underlying categories. The input weights of the added layer may be initialized with small random values and then trained with, e.g., gradient descent or conjugate gradient for a few epochs while keeping the rest of the weights in the neural network fixed. Once this added "classification layer" is trained for a few epochs, the entire network is then trained using, e.g., back propagation. 
Such a trained ANN can then be used for classification of incoming data into different classes”; See also Solmer, Fig 3(a) Clustering the autoencoder residuals is considered to be “running a clustering algorithm on the loss values”; the encoder is considered to be the “backward neural network” and the decoder is considered to be the “neural network”). It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to have modified the subgrouping method of the proposed combination to include the clustering of loss values as taught by Solmer. The motivation for doing so would have been the ability to have unified information representation (Solmer, Paragraph 0094). Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over Sattler in view of Liu et al., Privacy-preserving Traffic Flow Prediction: A Federated Learning Approach, 03/19/2020, https://arxiv.org/pdf/2003.08725, hereinafter referred to as “Liu”. Regarding claim 16, the rejection of claim 1 is incorporated. Sattler does not explicitly teach wherein the dataset of the worker node is at least one of: time series data generated from network performance measurements, counters, sensor data from IoT devices, temperature, vibration, data from computer/cloud deployments, CPU usage, memory usage. Liu teaches wherein the dataset of the worker node is sensor data from IoT devices (Liu, Page 7, Section 5, Lines 1-10, “the proposed FedGRU and clustering based FedGRU algorithms are applied to the real-world data collected from the Caltrans Performance Measurement System (PeMS) [48] database for performance demonstration. The traffic flow data in PeMS database was collected from over 39,000 individual detectors in real time. These sensors span the freeway system across all major metropolitan areas of the State of California [1]. 
In this paper, traffic flow data collected during the first three months of 2013 is used for experiments”; Liu, Page 3, Col 2, Lines 1-4, “We use the term “client” to describe computing nodes that correspond to one or multiple sensors in FL and use the term “device” to describe the sensor in the organizations”). It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to have modified the clustered federated learning method taught by Sattler to include using sensor data from IoT devices as taught by Liu. The motivation for doing so would have been that using the model on sensor data allows for traffic flow predictions which are useful for urban residents and businesses (Liu, Page 1, Section 1, Lines 1-6, “Contemporary urban residents, taxi drivers, business sectors, and government agencies have a strong need of accurate and timely traffic flow information [1] as these road users can utilize such information to alleviate traffic congestion, control traffic light appropriately, and improve the efficiency of traffic operations [2]–[4]”). Response to Arguments Applicant’s arguments with regard to claims 2 and 3 with respect to the 35 U.S.C. 112(b) indefiniteness rejections have been fully considered, and are persuasive. Consequently, the 35 U.S.C. 112(b) indefiniteness rejections to claims 2 and 3 have been withdrawn. Applicant’s amendments to claim 14 with respect to 35 U.S.C. 112(b) indefiniteness rejections have been fully considered, and overcome the rejections set forth in the nonfinal office action dated 10/10/2025. Consequently, the rejection to claim 14 has been withdrawn. Applicant’s arguments regarding the 35 U.S.C. 101 rejections of the claims have been fully considered but are unpersuasive. 
Applicant first argues, on page 8, paragraph 2 – page 9, paragraph 2 of the response, that the claims are not abstract because the claims recite specific elements that improve upon existing technological methods and nodes configured to communicate with a plurality of worker nodes in a machine learning system, and directly points to “a technologically improved method for grouping worker nodes in a machine learning system”. Examiner respectfully disagrees. An improvement to grouping worker nodes may be an improvement in an abstract idea, but not an improvement in the functioning of a computer, as a computer. It is important to note that an improvement in the abstract idea itself (e.g. a recited mental process) is not an improvement in technology, see MPEP 2106.05(a)(II). Further, applicant points to applicant’s specification paragraphs 0024-0025 which disclose improved model performance. However, claiming the improved speed or efficiency inherent with applying the abstract idea on a computer does not integrate a judicial exception into a practical application or provide an inventive concept, see MPEP 2106.05(f). Applicant's arguments regarding the remainder of the claims rely upon the arguments asserted with respect to the independent claims, and are thus unpersuasive. Applicant’s arguments regarding the 35 U.S.C. 102 rejections of the claims have been fully considered but are unpersuasive. Applicant argues, on page 9, paragraph 5 – page 10, paragraph 1 of the response, that claim 1 is not anticipated by Sattler because it does not teach “grouping each worker node of the plurality of worker nodes into a group of a plurality of groups based on characteristics of a data distribution of each of the plurality of worker nodes”. Examiner respectfully disagrees. 
The grouping is determined to be correct based on “if clients with the same data generating distribution end up in the same cluster” (Sattler, Page 3, Col 1, Paragraph 2, Lines 1-3) and thus, the grouping is considered to be “based on” characteristics of a data distribution of each of the plurality of worker nodes. Applicant's arguments regarding the remainder of the claims rely upon the arguments asserted with respect to the independent claims, and are thus unpersuasive. Conclusion Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOLLY CLARKE SIPPEL whose telephone number is (571)272-3270. The examiner can normally be reached Monday - Friday, 7:30 a.m. - 4:30 p.m. ET. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki, can be reached at (571) 272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/M.C.S./
Examiner, Art Unit 2122

/KAKALI CHAKI/
Supervisory Patent Examiner, Art Unit 2122

Prosecution Timeline

Dec 09, 2022
Application Filed
Sep 30, 2025
Non-Final Rejection — §101, §102, §103
Jan 07, 2026
Response Filed
Feb 26, 2026
Final Rejection — §101, §102, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602592
NOISE COMMUNICATION FOR FEDERATED LEARNING
2y 5m to grant Granted Apr 14, 2026
Patent 12596916
CONSTRAINED MASKING FOR SPARSIFICATION IN MACHINE LEARNING
2y 5m to grant Granted Apr 07, 2026
Study what changed to get past this examiner. Based on 2 most recent grants.


Prosecution Projections

3-4
Expected OA Rounds
50%
Grant Probability
99%
With Interview (+58.3%)
3y 7m
Median Time to Grant
Moderate
PTA Risk
Based on 14 resolved cases by this examiner. Grant probability derived from career allow rate.
