Office Action Analysis: 17544307 — ADAPTIVE MODEL PRUNING TO IMPROVE PERFORMANCE OF FEDERATED LEARNING

Examiner Intelligence

Examiner: SUSSMAN MOSS, JACOB ZACHARY
Career Allow Rate: 14% (1 granted / 7 resolved; -40.7% vs TC avg)
Interview Lift: -20.0% (minimal; resolved cases with interview)
Avg Prosecution: 3y 3m (typical timeline)
Currently pending: 26
Total Applications: 33 (across all art units)
Statute-Specific Performance

§101: 37.3% (-2.7% vs TC avg)
§103: 35.2% (-4.8% vs TC avg)
§102: 11.9% (-28.1% vs TC avg)
§112: 15.5% (-24.5% vs TC avg)
Based on career data from 7 resolved cases
Office Action

§101 §103
DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is in response to amendments filed September 9th, 2025, in which claims 1, 8 and 15 have been amended. No claims have been cancelled or added. The amendments have been entered, and claims 1-20 are currently pending in the case. Claims 1, 8 and 15 are independent claims.

Specification
The use of the terms “BLUETOOTH” (¶12) and “WI-FI” (¶12), each of which is a trade name or a mark used in commerce, has been noted in this application. Each term should be accompanied by the generic terminology; furthermore, each term should be capitalized wherever it appears or, where appropriate, include a proper symbol indicating use in commerce such as ™, SM, or ® following the term.
Although the use of trade names and marks used in commerce (i.e., trademarks, service marks, certification marks, and collective marks) are permissible in patent applications, the proprietary nature of the marks should be respected and every effort made to prevent their use in any manner which might adversely affect their validity as commercial marks.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.  The following sections follow the 2019 PEG guidelines for analyzing subject matter eligibility.

Regarding claim 1:
Step 1: Claim 1 is directed to a system, therefore it falls under the statutory category of machine.
Step 2A Prong 1: The claim recites, in part:
“determine a loss reduction for each received data set of the plurality of data sets, representing a loss reduction since a previous local loss value included in a previous received data set corresponding to the given vehicle;” This encompasses the mental determination of a loss reduction value by comparing previously observed local loss values.
“determine whether the loss reduction for each received data set of the plurality of data sets exceeds a predefined threshold cutoff value that varies over successive rounds of training and which results in exclusion of at least a plurality of data sets having a positive loss reduction but which positive loss reduction is insufficient to exceed the predefined threshold cutoff value” This encompasses the mental determination of whether an observed loss value for observed data exceeds a predefined threshold value that varies, such that some observed values fail to exceed the threshold, despite a positive observed loss. Further, this limitation is a mathematical concept.
Step 2A Prong 2: The judicial exception is not integrated into a practical application; the remaining limitations of the claim are as follows:
“a processor configured to:” the limitation is an additional element that amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or merely uses a computer in its ordinary capacity as a tool to perform an existing process.  See MPEP §§ 2106.04(d), 2106.05(f)(2). “receive a plurality of data sets relating to differently trained versions of a global machine learning model, from a plurality of vehicles, the data sets including at least a present local loss value experienced by a current version of the global model executing on a given vehicle for which a data set of the plurality of data sets was received;”, “train the global model using federated learning and based on the data sets of the plurality of data sets for which the loss reduction exceeds the predefined cutoff value.” the limitation is an additional element that amounts to adding insignificant extra-solution activity to the judicial exception.  See MPEP §§ 2106.04(d), 2106.05(g). 
Step 2B: The claim does not contain significantly more than the judicial exception. The limitations
“a processor configured to:” the limitation is an additional element that amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer. See MPEP § 2106.05(f)(1). Further, “receive a plurality of data sets relating to differently trained versions of a global machine learning model, from a plurality of vehicles, the data sets including at least a present local loss value experienced by a current version of the global model executing on a given vehicle for which a data set of the plurality of data sets was received;” is an additional element that amounts to adding insignificant extra-solution activity to the judicial exception. See MPEP § 2106.05(g). Furthermore, the additional element is directed to receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d. See MPEP § 2106.05(d)(II). Further, “train the global model using federated learning and based on the data sets of the plurality of data sets for which the loss reduction exceeds the predefined cutoff value.” is an additional element that amounts to adding insignificant extra-solution activity to the judicial exception. See MPEP § 2106.05(g). Furthermore, the additional element is well‐understood, routine, and conventional as taught by Vijaya et al. (WO 2021213626 A1), page 2, lines 8-11 “A typical structure of a federated learning framework is as shown in the Fig.1. Thus, Fig. 1 shows a typical federated learning system, wherein a top node is the global model, which is trained from using client models such as UEs, IoT capable devices, etc.”. See MPEP § 2106.05(d).
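For readers weighing the "mental process" characterization above, the following is a minimal sketch of the selection logic recited in claim 1, not the applicant's implementation. The function names, the vehicle IDs, and the decaying cutoff schedule are all hypothetical; the claim only requires that the cutoff vary over successive rounds.

```python
from typing import Dict, List


def select_data_sets(
    present_loss: Dict[str, float],
    previous_loss: Dict[str, float],
    cutoff: float,
) -> List[str]:
    """Return IDs of vehicles whose loss reduction exceeds the cutoff."""
    selected = []
    for vehicle_id, loss_now in present_loss.items():
        # Loss reduction since the previously reported local loss value.
        reduction = previous_loss[vehicle_id] - loss_now
        if reduction > cutoff:
            selected.append(vehicle_id)
    return selected


def cutoff_for_round(round_index: int, initial: float = 0.10, decay: float = 0.5) -> float:
    """Hypothetical schedule: the cutoff shrinks as training progresses."""
    return initial * (decay ** round_index)


# Hypothetical reports from three vehicles.
previous = {"v1": 0.90, "v2": 0.80, "v3": 0.70}
present = {"v1": 0.70, "v2": 0.77, "v3": 0.75}

# Round 0: v2 has a positive reduction (about 0.03) but is still excluded.
chosen = select_data_sets(present, previous, cutoff_for_round(0))
```

In this example v2 illustrates the amended limitation: its loss reduction is positive yet insufficient to exceed the cutoff, so its data set is left out of that round of training.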

Regarding claim 2, the rejection of claim 1 is incorporated and further:
Step 2A Prong 1: The claim recites, in part: “wherein the loss reduction is an average loss reduction over a plurality of reporting cycles, wherein a reporting cycle represents an iteration of the data set, including the present local loss value, received for the given vehicle.” This encompasses the mental process of averaging observed data over time. 
Step 2A Prong 2: The judicial exception is not integrated into a practical application.
Step 2B: The claim does not contain significantly more than the judicial exception. 
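The averaging recited in claim 2 can be illustrated with a short sketch. It assumes the server keeps the sequence of local loss values a vehicle reported over consecutive reporting cycles; the function name and that data layout are hypothetical.

```python
from typing import Sequence


def average_loss_reduction(losses: Sequence[float]) -> float:
    """Mean per-cycle loss reduction over consecutive reporting cycles."""
    if len(losses) < 2:
        raise ValueError("need at least two reported loss values")
    # Reduction between each pair of consecutive reports (earlier minus later).
    reductions = [earlier - later for earlier, later in zip(losses, losses[1:])]
    return sum(reductions) / len(reductions)


# Three reporting cycles for one hypothetical vehicle.
avg = average_loss_reduction([1.0, 0.5, 0.25])
```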

Regarding claim 3, the rejection of claim 1 is incorporated and further:
Step 2A Prong 1: The claim recites, in part:
“set a mask vector for each vehicle of the plurality of vehicles for which the loss reduction does not exceed the predefined cutoff value;” This encompasses the mental setting of a mask vector for data that does not exceed the threshold. 
“modified by the mask vector, such that data sets from each vehicle for which the loss reduction does not exceed the predefined cutoff value are masked out when training the global model.” This encompasses the mental process of excluding certain data from further steps, when it does not meet certain criteria.
Step 2A Prong 2: The judicial exception is not integrated into a practical application; the remaining limitations of the claim are as follows:
“wherein the training is based on the received plurality of data sets” The limitation is an additional element that amounts to adding insignificant extra-solution activity to the judicial exception.  See MPEP §§ 2106.04(d), 2106.05(g). 
Step 2B: The claim does not contain significantly more than the judicial exception. The limitation
“wherein the training is based on the received plurality of data sets” is an additional element that amounts to adding insignificant extra-solution activity to the judicial exception.  See MPEP § 2106.05(g). Furthermore the additional element is well‐understood, routine, and conventional as taught by the background section of specification of the application ¶3 “Machine learning models often utilize large data sets gathered from thousands of sources.”.
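One way to picture the mask vector of claim 3 is sketched below, under the simplifying assumption of one mask entry per vehicle and plain averaging of the surviving updates. Both helper names and the averaging step are assumptions; the claim does not specify how the mask is encoded or how the global model is aggregated.

```python
from typing import List, Sequence


def build_mask(reductions: Sequence[float], cutoff: float) -> List[int]:
    """1 keeps a vehicle's data set, 0 masks it out of training."""
    return [1 if r > cutoff else 0 for r in reductions]


def masked_average(updates: Sequence[Sequence[float]], mask: Sequence[int]) -> List[float]:
    """Average only the updates whose mask entry is 1."""
    kept = [u for u, m in zip(updates, mask) if m]
    if not kept:
        raise ValueError("every data set was masked out")
    # zip(*kept) walks the updates coordinate by coordinate.
    return [sum(column) / len(kept) for column in zip(*kept)]


mask = build_mask([0.5, 0.25, -0.5], cutoff=0.125)
aggregate = masked_average([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], mask)
```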

Regarding claim 4, the rejection of claim 1 is incorporated and further:
Step 2A Prong 1: The claim recites, in part:
“wait for all of the plurality of vehicles to complete at least one reporting cycle since a prior reporting cycle before training the global model, wherein a reporting cycle represents an iteration of the data set, including the present local loss value, received for the given vehicle.” This encompasses the mental process of waiting for all data to be reported before going on to a next step.
Step 2A Prong 2: The judicial exception is not integrated into a practical application; the remaining limitations of the claim are as follows: “the processor is configured to” The limitation is an additional element that amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or merely uses a computer in its ordinary capacity as a tool to perform an existing process.  See MPEP §§ 2106.04(d), 2106.05(f)(2). 
Step 2B: The claim does not contain significantly more than the judicial exception. The limitation “the processor is configured to” is an additional element that amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer.  See MPEP § 2106.05(f)(1).

Regarding claim 5, the rejection of claim 1 is incorporated and further:
Step 2A Prong 1: The claim is a continuation of the abstract idea identified in the parent claim.
“wait for at least one of a predefined total number or percentage of the plurality of vehicles to complete at least one reporting cycle since a prior reporting cycle before training the global model, wherein a reporting cycle represents an iteration of the data set, including the present local loss value, received for the given vehicle.” This encompasses the mental process of waiting for some predetermined amount of data to be reported before going on to a next step.
Step 2A Prong 2: The judicial exception is not integrated into a practical application; the remaining limitations of the claim are as follows: “the processor is configured to” The limitation is an additional element that amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or merely uses a computer in its ordinary capacity as a tool to perform an existing process.  See MPEP §§ 2106.04(d), 2106.05(f)(2). 
Step 2B: The claim does not contain significantly more than the judicial exception. The limitation “the processor is configured to” is an additional element that amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer.  See MPEP § 2106.05(f)(1).
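The waiting condition of claim 5 ("at least one of a predefined total number or percentage") can be sketched as a simple quorum check. The function name and parameters are hypothetical, and the fleet size is assumed to be known to the server.

```python
from typing import Optional


def enough_reports(
    reported: int,
    fleet_size: int,
    min_count: Optional[int] = None,
    min_fraction: Optional[float] = None,
) -> bool:
    """True once either the count or the fraction requirement is met."""
    if min_count is None and min_fraction is None:
        raise ValueError("set min_count, min_fraction, or both")
    if fleet_size <= 0:
        raise ValueError("fleet_size must be positive")
    count_ok = min_count is not None and reported >= min_count
    fraction_ok = min_fraction is not None and reported / fleet_size >= min_fraction
    return count_ok or fraction_ok
```

A server loop would keep collecting data sets and call this check before each training round, starting training only when it returns True.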

Regarding claim 6, the rejection of claim 5 is incorporated and further:
Step 2A Prong 1: The claim recites, in part:
“determine whether at least one of a total number or percentage of vehicles for which the loss reduction exceeds the predefined threshold cutoff value exceeds a predefined value representing sufficient training data;” This encompasses the mental determination of whether observed loss values exceed a threshold of sufficient training data.
“responsive to the at least one of the total number or percentage not exceeding the predefined value, wait for data sets to be received from an additional number or additional percentage of the plurality of vehicles.” This encompasses the mental process of waiting for data to observe from further sources.
Step 2A Prong 2: The judicial exception is not integrated into a practical application.
Step 2B: The claim does not contain significantly more than the judicial exception. 

Regarding claim 7, the rejection of claim 1 is incorporated and further:
Step 2A Prong 1: The claim recites, in part:
“determine whether at least one of a total number or percentage of vehicles for which the loss reduction exceeds the predefined threshold cutoff value exceeds a predefined value representing sufficient training data;” This encompasses the mental determination of whether observed loss values exceed a threshold of sufficient training data. 
“responsive to the at least one of the total number or percentage not exceeding the predefined value, decrement the threshold cutoff value to include, in the training, data sets of additional vehicles for which the loss reduction did not exceed the threshold cutoff value prior to decrementing the cutoff value.” This encompasses the mental process of decrementing the threshold value to include further datasets to observe.
Step 2A Prong 2: The judicial exception is not integrated into a practical application.
Step 2B: The claim does not contain significantly more than the judicial exception. 
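The decrementing step of claim 7 can be sketched as a loop that lowers the cutoff until enough vehicles qualify. The fixed step size and the floor are assumptions added so the loop terminates; the claim itself only recites decrementing the cutoff when too few vehicles exceed it.

```python
from typing import Sequence


def relax_cutoff(
    reductions: Sequence[float],
    cutoff: float,
    min_count: int,
    step: float,
    floor: float = 0.0,
) -> float:
    """Lower the cutoff by `step` until `min_count` reductions exceed it.

    Stops early if another decrement would take the cutoff below `floor`.
    """
    while sum(r > cutoff for r in reductions) < min_count and cutoff - step >= floor:
        cutoff -= step
    return cutoff


# Only one vehicle exceeds 0.5, so the cutoff is lowered until two do.
relaxed = relax_cutoff([0.75, 0.3, 0.2, -0.1], cutoff=0.5, min_count=2, step=0.125)
```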

Regarding claim 8:
Step 1: Claim 8 is directed to a method, therefore it falls under the statutory category of a process.
Step 2A Prong 1: The claim recites, in part:
“determining a loss reduction for each received data set of the plurality of data sets, representing a loss reduction since a previous local loss value included in a previous received data set corresponding to the given vehicle;” This encompasses the mental determination of a loss reduction value by comparing previously observed local loss values.
“determine whether the loss reduction for each received data set of the plurality of data sets exceeds a predefined threshold cutoff value that varies over successive rounds of training and which results in exclusion of at least a plurality of data sets having a positive loss reduction but which positive loss reduction is insufficient to exceed the predefined threshold cutoff value” This encompasses the mental determination of whether an observed loss value for observed data exceeds a predefined threshold value that varies, such that some observed values fail to exceed the threshold, despite a positive observed loss. Further, this limitation is a mathematical concept.
Step 2A Prong 2: The judicial exception is not integrated into a practical application; the remaining limitations of the claim are as follows:
“receiving a plurality of data sets relating to differently trained versions of a global machine learning model, from a plurality of vehicles, the data sets including at least a present local loss value experienced by a current version of the global model executing on a given vehicle for which a data set of the plurality of data sets was received;”, “training the global model using federated learning and based on the data sets of the plurality of data sets for which the loss reduction exceeds the predefined cutoff value.” the limitation is an additional element that amounts to adding insignificant extra-solution activity to the judicial exception.  See MPEP §§ 2106.04(d), 2106.05(g). 
Step 2B: The claim does not contain significantly more than the judicial exception. The limitations
“receiving a plurality of data sets relating to differently trained versions of a global machine learning model, from a plurality of vehicles, the data sets including at least a present local loss value experienced by a current version of the global model executing on a given vehicle for which a data set of the plurality of data sets was received;” is an additional element that amounts to adding insignificant extra-solution activity to the judicial exception. See MPEP § 2106.05(g). Furthermore, the additional element is directed to receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d. See MPEP § 2106.05(d)(II). Further, “training the global model using federated learning and based on the data sets of the plurality of data sets for which the loss reduction exceeds the predefined cutoff value.” is an additional element that amounts to adding insignificant extra-solution activity to the judicial exception. See MPEP § 2106.05(g). Furthermore, the additional element is well‐understood, routine, and conventional as taught by Vijaya et al. (WO 2021213626 A1), page 2, lines 8-11 “A typical structure of a federated learning framework is as shown in the Fig.1. Thus, Fig. 1 shows a typical federated learning system, wherein a top node is the global model, which is trained from using client models such as UEs, IoT capable devices, etc.”. See MPEP § 2106.05(d).

Regarding claims 9-14:
The rejection of claim 8 is incorporated; the rejections of claims 2-7 are equally applicable to claims 9-14, respectively.

Regarding claim 15:
Step 1: Claim 15 is directed to “A non-transitory storage medium”, therefore it falls under the statutory category of a manufacture.
Step 2A Prong 1: The claim recites, in part:
“determining a loss reduction for each received data set of the plurality of data sets, representing a loss reduction since a previous local loss value included in a previous received data set corresponding to the given vehicle;” This encompasses the mental determination of a loss reduction value by comparing previously observed local loss values.
“determine whether the loss reduction for each received data set of the plurality of data sets exceeds a predefined threshold cutoff value that varies over successive rounds of training and which results in exclusion of at least a plurality of data sets having a positive loss reduction but which positive loss reduction is insufficient to exceed the predefined threshold cutoff value” This encompasses the mental determination of whether an observed loss value for observed data exceeds a predefined threshold value that varies, such that some observed values fail to exceed the threshold, despite a positive observed loss. Further, this limitation is a mathematical concept.
Step 2A Prong 2: The judicial exception is not integrated into a practical application; the remaining limitations of the claim are as follows:
“receiving a plurality of data sets relating to differently trained versions of a global machine learning model, from a plurality of vehicles, the data sets including at least a present local loss value experienced by a current version of the global model executing on a given vehicle for which a data set of the plurality of data sets was received;”, “training the global model using federated learning and based on the data sets of the plurality of data sets for which the loss reduction exceeds the predefined cutoff value.” the limitation is an additional element that amounts to adding insignificant extra-solution activity to the judicial exception.  See MPEP §§ 2106.04(d), 2106.05(g). 
Step 2B: The claim does not contain significantly more than the judicial exception. The limitations
“receiving a plurality of data sets relating to differently trained versions of a global machine learning model, from a plurality of vehicles, the data sets including at least a present local loss value experienced by a current version of the global model executing on a given vehicle for which a data set of the plurality of data sets was received;” is an additional element that amounts to adding insignificant extra-solution activity to the judicial exception. See MPEP § 2106.05(g). Furthermore, the additional element is directed to receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d. See MPEP § 2106.05(d)(II). Further, “training the global model using federated learning and based on the data sets of the plurality of data sets for which the loss reduction exceeds the predefined cutoff value.” is an additional element that amounts to adding insignificant extra-solution activity to the judicial exception. See MPEP § 2106.05(g). Furthermore, the additional element is well‐understood, routine, and conventional as taught by Vijaya et al. (WO 2021213626 A1), page 2, lines 8-11 “A typical structure of a federated learning framework is as shown in the Fig.1. Thus, Fig. 1 shows a typical federated learning system, wherein a top node is the global model, which is trained from using client models such as UEs, IoT capable devices, etc.”. See MPEP § 2106.05(d).

Regarding claims 16-20:
The rejection of claim 15 is incorporated; the rejections of claims 2-3 and 5-7 are equally applicable to claims 16-20, respectively.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3-8, 10-15 and 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over Tuor et al. (US20210158099A1, cited in IDS filed on 07 December 2021) hereinafter Tuor in view of Xu et al. ("Accelerating Federated Learning for IoT in Big Data Analytics With Pruning, Quantization and Selective Updating", Xu et al.) hereinafter Xu in further view of Deng ("FAIR: Quality-Aware Federated Learning with Precise User Incentive and Model Aggregation", Deng et al., July 2021) hereinafter Deng. 

Regarding claim 1, Tuor teaches A system comprising: 
a processor configured to (Tuor, Claim 15 “execution by at least one of the one or more processors capable of performing a method, the method comprising:”):
receive a plurality of data sets relating to differently trained versions of a global machine learning model, from a plurality of vehicles, (Tuor, ¶34 “The federated exchange client 112 may be configured to prepare datasets that are to be exchanged for purposes of generating a global model. The federated exchange client 112 may receive the datasets from the further clients and/or programs that train the local models on the local data samples and generate packets of the datasets that are to be transmitted to a further processing component that generates the global model.”, here the “further clients” can be vehicles as shown in Tuor, Fig 4, 54N “automobile computer system”) 
train the global model using federated learning and based on the data sets of the plurality of data sets (Tuor, ¶3 “The federated learning may be used to train a machine learning algorithm such as a global model (e.g., deep neural network) on a plurality of localized datasets”)
Tuor does not teach “the data sets including at least a present local loss value experienced by a current version of the global model executing on a given vehicle for which a data set of the plurality of data sets was received; 
determine a loss reduction for each received data set of the plurality of data sets, representing a loss reduction since a previous local loss value included in a previous received data set corresponding to the given vehicle;
determine whether the loss reduction for each received data set of the plurality of data sets exceeds a predefined threshold cutoff value; and
for which the loss reduction exceeds the predefined cutoff value.” 
However, Xu teaches the data sets including at least a present local loss value experienced by a current version of the global model executing on a given client for which a data set of the plurality of data sets was received (Xu, page 38459-38460, col 2, section III/A “• Step 2 (Training and updating local model): In round t, based on the global model $M_G^t$, client c updates the local model parameters $M_c^t$ using its local data. Specifically, it aims to obtain optimal parameters $M_c^t$ that minimize the loss function $L(M_c^t)$. Then, client c uploads the updated local parameters to S.
[Equation image: media_image1.png]
• Step 3 (Aggregating and updating global model): In round t, S aggregates all the received local models, with the objective to minimize the global loss function [3]:”
Since S is the central server which compiles the local loss model, partly by summing $L(M_c^t)$, it is shown that the local loss value experienced by a current version of the global model is sent to the server.)
determine a loss reduction for each received data set of the plurality of data sets, representing a loss reduction since a previous local loss value included in a previous received data set corresponding to the given client (Xu, page 38462, col 1, section E, ¶2 “Intuitively, the loss function is anticipated to be converged in each round, i.e., the loss value should be gradually reduced and closer to the optimal value. However, the loss value often fluctuates erratically in the training process. In our design, each client records the loss value $l_{last}$ in the last upload, and compares it with the current value $l_{current}$.”);
determine whether the loss reduction for each received data set of the plurality of data sets exceeds a predefined threshold cutoff value (Xu, page 38462, col 1, section E, ¶2 “If $l_{last} > l_{current}$, this indicates that the current update is beneficial to the loss function and should be uploaded to the FL server. Otherwise, this indicates that in current round $l_{current}$ is undesirable”. Here, because updates whose loss value is higher than in the previous round are treated as undesirable, the loss reduction threshold is effectively set to 0 (i.e., only those with a positive loss reduction are desirable, which “exceeds a predefined threshold value”)); and
for which the loss reduction exceeds the predefined cutoff value (Xu, page 38462, col 1, section E, ¶2 “If $l_{last} > l_{current}$, this indicates that the current update is beneficial to the loss function and should be uploaded to the FL server. Otherwise, this indicates that in current round $l_{current}$ is undesirable. Accordingly, the current update are prevented by the client to be uploaded, and the FL server will have to use the latest update of this client instead.”).
Tuor and Xu are analogous art because both references concern methods for federated learning. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Tuor’s federated learning algorithm on vehicles to incorporate the loss reduction taught by Xu. The motivation for doing so would have been to only incorporate those local models in the federated learning that have shown an accuracy improvement from the last iteration, and are thus beneficial. Xu, page 38462, Section E, ¶2 “If $l_{last} > l_{current}$, this indicates that the current update is beneficial to the loss function and should be uploaded to the FL server. Otherwise, this indicates that in current round $l_{current}$ is undesirable.”
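The selective-updating rule quoted from Xu can be sketched as client-side logic. This is an illustration under stated assumptions, not Xu's code: the class name is hypothetical, and uploading unconditionally on the first round (when no prior loss has been recorded) is an assumption the quoted passage does not address.

```python
from typing import Optional


class SelectiveClient:
    """Uploads an update only when the loss improved since the last upload."""

    def __init__(self) -> None:
        # Loss value recorded at the last upload; None before any upload.
        self.last_uploaded_loss: Optional[float] = None

    def should_upload(self, current_loss: float) -> bool:
        # Assumption: the very first update is always uploaded.
        if self.last_uploaded_loss is None or self.last_uploaded_loss > current_loss:
            self.last_uploaded_loss = current_loss
            return True
        # Otherwise the server keeps using this client's latest uploaded update.
        return False


client = SelectiveClient()
decisions = [client.should_upload(loss) for loss in [1.0, 0.8, 0.9, 0.7]]
```

Note that this rule amounts to a fixed cutoff of zero on the loss reduction, which is the reading the examiner relies on before turning to Deng for a varying cutoff.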
Tuor in view of Xu does not teach “that varies over successive rounds of training and which results in exclusion of at least a plurality of data sets having a positive loss reduction but which positive loss reduction is insufficient to exceed the predefined threshold cutoff value”
However Deng teaches that varies over successive rounds of training and which results in exclusion of at least a plurality of data sets having a positive loss reduction but which positive loss reduction is insufficient to exceed the predefined threshold cutoff value (Deng, page 4, col 2, ¶1 “Suppose the average test loss value of task $l_j^t$’s global model at time $t_s$ is $\mathrm{loss}_j(t_s)$ and the average training loss value of node i’s local model at time $t_e$ is $\mathrm{loss}_{i,j}(t_e)$. We define the training data quality of node i in iteration t as
$m_{i,j}^t = \mathrm{loss}_j(t_s) - \mathrm{loss}_{i,j}(t_e)$. (7)
Combining the amount of data (denoted by $D_{i,j}^t$) used for training, the learning quality of node i in iteration t is defined as follows
[Equation image: media_image2.png]
” here, it can be seen that the threshold defined in equation 8 varies with successive rounds of training, and further can exclude (set to 0) those nodes which do not meet the loss reduction threshold, because the threshold is set from an average of loss reductions, it can exclude nodes even if the loss reduction is positive.)
Tuor in view of Xu and Deng are analogous art because both references concern methods for federated learning. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Tuor/Xu’s federated learning algorithm on vehicles to incorporate the varying loss reduction threshold taught by Deng. The motivation for doing so would have been to effectively aggregate the model updates and generate a superior globe learning model as stated in Deng, page 2, col 1, ¶6 “Extensive experiments are carried out to demonstrate the efficacy of FAIR, where the incentive mechanism can stimulate more high-quality model updates, and the devised aggregation algorithm can effectively aggregate the model updates, collectively contributing to a superior globe learning model.”
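
The examiner's reading of Deng can be illustrated with a minimal sketch. It computes the per-node loss reduction of equation (7) and then applies a cutoff equal to the mean of the round's reductions. The mean-based cutoff is an assumption drawn from the examiner's characterization ("set from an average of loss reductions"); it is not Deng's equation (8), which is not reproduced here. The function names and sample values are hypothetical.

```python
from statistics import mean


def quality_scores(global_test_loss, local_train_losses):
    """Per-node loss reduction m_i = loss_j(t_s) - loss_{i,j}(t_e), per Eq. (7)."""
    return {node: global_test_loss - loss
            for node, loss in local_train_losses.items()}


def split_by_mean_cutoff(scores):
    """Apply a hypothetical cutoff: the mean of this round's scores.

    This stands in for Eq. (8), which is only available as an image.
    """
    cutoff = mean(scores.values())
    kept = {n for n, m in scores.items() if m > cutoff}
    # Nodes that improved (m > 0) but not enough to pass the cutoff.
    excluded_positive = {n for n, m in scores.items() if 0 < m <= cutoff}
    return cutoff, kept, excluded_positive


# Hypothetical round: global test loss 1.0, three nodes' local training losses.
scores = quality_scores(1.0, {"a": 0.25, "b": 0.875, "c": 0.75})
cutoff, kept, excluded_positive = split_by_mean_cutoff(scores)
```

In this made-up round, nodes "b" and "c" both reduce the loss yet fall at or below the round's average, so they are excluded despite a positive loss reduction. Because the cutoff is recomputed from each round's scores, it changes from round to round.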

Regarding claim 3, Tuor in view of Xu in further view of Deng teaches The system of claim 1, wherein the processor is further configured to:
set a mask vector for each client of the plurality of clients for which the loss reduction does not exceed the predefined cutoff value (Xu, page 38462, col 1, section E, ¶2 “If $l_{last} > l_{current}$, this indicates that the current update is beneficial to the loss function and should be uploaded to the FL server. Otherwise, this indicates that in current round $l_{current}$ is undesirable.” Here, by not uploading the current update to the server, it can be considered masked from training.); and
wherein the training is based on the received plurality of data sets modified by the mask vector, such that data sets from each client for which the loss reduction does not exceed the predefined cutoff value are masked out when training the global model (Xu, page 38462, col 1, section E, ¶2 “If $l_{last} > l_{current}$, this indicates that the current update is beneficial to the loss function and should be uploaded to the FL server. Otherwise, this indicates that in current round $l_{current}$ is undesirable.” Here, only those that exceeded a cutoff value are used for training, the equivalent of being masked out. Further, Xu, page 38460, col 1, section B, ¶1 “Finally, we propose selective updating to avoid unnecessary local model updates from some clients, so as to reduce the total number of updates in training”)
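
The masking limitation of claim 3 can be sketched as follows, under stated assumptions. The mask is 1 where a client's loss reduction (previous loss minus current loss) exceeds the cutoff and 0 otherwise; a cutoff of 0 reproduces the Xu test $l_{last} > l_{current}$. The plain unweighted mean used for aggregation is an assumption for illustration, not the aggregation rule of Xu or of the application, and all names are hypothetical.

```python
def mask_vector(previous_losses, current_losses, cutoff=0.0):
    """Return 1 for clients whose loss reduction exceeds the cutoff, else 0."""
    return [1 if (prev - cur) > cutoff else 0
            for prev, cur in zip(previous_losses, current_losses)]


def masked_average(updates, mask):
    """Average only the unmasked client updates (plain mean, an assumption)."""
    selected = [u for u, m in zip(updates, mask) if m]
    if not selected:
        return None  # nothing passed the cutoff this round
    return [sum(column) / len(selected) for column in zip(*selected)]


# Hypothetical round with three clients and two-parameter updates.
mask = mask_vector([1.0, 1.0, 1.0], [0.5, 1.5, 0.75])
aggregate = masked_average([[2.0, 4.0], [100.0, 100.0], [4.0, 8.0]], mask)
```

The second client's loss rose, so its update is masked out and does not influence the aggregate.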

Regarding claim 4, Tuor in view of Xu in further view of Deng teaches The system of claim 1, wherein the processor is configured to wait for all of the plurality of clients to complete at least one reporting cycle since a prior reporting cycle before training the global model, wherein a reporting cycle represents an iteration of the data set, including the present local loss value, received for the given client (Xu, Page 38459, col 2, section III/A, ¶1 “Each client c uses its local dataset $d_c$ to train a local model $m_c$, and then sends the local parameters as an update to S. The FL server S collects all the local models $m = \bigcup_{c \in C} m_c$, and finally obtains the global model $M_G$ according to some aggregation rule [2].” Here, it is shown that the server collects all local models. Further, as shown above, the server collecting the models includes receiving the local loss value, Xu, page 38459, col 2, section A “S aggregates all the received local models, with the objective to minimize the global loss function [3]”).

Regarding claim 5, Tuor in view of Xu in further view of Deng teaches The system of claim 1, wherein the processor is configured to wait for at least one of a predefined total number or percentage of the plurality of clients to complete at least one reporting cycle since a prior reporting cycle before training the global model, wherein a reporting cycle represents an iteration of the data set, including the present local loss value, received for the given client (Xu, Page 38459, col 2, section III/A, ¶1 “Each client c uses its local dataset $d_c$ to train a local model $m_c$, and then sends the local parameters as an update to S. The FL server S collects all the local models $m = \bigcup_{c \in C} m_c$, and finally obtains the global model $M_G$ according to some aggregation rule [2].” Here, it is shown that the server collects all local models, which under the broadest reasonable interpretation means waiting for “at least one of a predefined total number or percentage”. Further, as shown above, the server collecting the models includes receiving the local loss value, Xu, page 38459, col 2, section A “S aggregates all the received local models, with the objective to minimize the global loss function [3]”).
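
The difference between the waiting conditions of claims 4 and 5 can be made concrete with a short sketch. It assumes the server tracks how many clients have completed a reporting cycle since the last training round. With no threshold supplied it waits for all clients (claim 4); otherwise it proceeds once either a count threshold or a fraction threshold is met (one reading of "at least one of" in claim 5). The function and parameter names are hypothetical and do not come from the application or the cited references.

```python
import math


def ready_to_train(num_reported, num_clients, min_count=None, min_fraction=None):
    """Return True once enough clients have completed a reporting cycle.

    With neither threshold given, every client must report (wait-for-all).
    """
    if min_count is None and min_fraction is None:
        return num_reported >= num_clients
    needed = []
    if min_count is not None:
        needed.append(min_count)
    if min_fraction is not None:
        needed.append(math.ceil(min_fraction * num_clients))
    # "At least one of": satisfying either threshold is treated as sufficient.
    return any(num_reported >= n for n in needed)
```

Under this sketch, waiting for all clients is the special case in which the required count equals the number of clients, which mirrors the examiner's broadest-reasonable-interpretation argument.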

Regarding claim 6, Tuor in view of Xu in further view of Deng teaches The system of claim 5, wherein the processor is further configured to:
determine whether at least one of a total number or percentage of vehicles for which the loss reduction exceeds the predefined threshold cutoff value exceeds a predefined value representing sufficient training data (Tuor, ¶45 “When the contribution program 132 determines that the currently available contributors have a sufficient usefulness to perform the federated learning process, the contribution program 132 may determine that the modelling program 122 should perform the federated learning process based on the datasets of the currently available contributors.” here the “contributors” can be vehicles as shown in Tuor, Fig 4, 54N “automobile computer system”); and
responsive to the at least one of the total number or percentage not exceeding the predefined value, wait for data sets to be received from an additional number or additional percentage of the plurality of vehicles (Tuor, ¶45 “However, when the contribution program 132 determines that the currently available contributors have an insufficient usefulness to perform the federated learning process, the contribution program 132 may determine that the modelling program 122 should hold on performing the federated learning process and wait for more and/or different contributors to become available.” here the “contributors” can be vehicles as shown in Tuor, Fig 4, 54N “automobile computer system”).

Regarding claim 7, Tuor in view of Xu in further view of Deng teaches The system of claim 1, wherein the processor is further configured to:
determine whether at least one of a total number or percentage of vehicles for which the loss reduction exceeds the predefined threshold cutoff value exceeds a predefined value representing sufficient training data (Tuor, ¶45 “When the contribution program 132 determines that the currently available contributors have a sufficient usefulness to perform the federated learning process, the contribution program 132 may determine that the modelling program 122 should perform the federated learning process based on the datasets of the currently available contributors.” here the “contributors” can be vehicles as shown in Tuor, Fig 4, 54N “automobile computer system”); and
responsive to the at least one of the total number or percentage not exceeding the predefined value, decrement the threshold cutoff value to include, in the training, data sets of additional vehicles for which the loss reduction did not exceed the threshold cutoff value prior to decrementing the cutoff value (Tuor, ¶68 “The exemplary embodiments are also described with regard to the usefulness threshold (e.g., local and/or global) being set to a predetermined value. However, the exemplary embodiments may modify the usefulness threshold to be dynamic and/or adaptable according to the conditions being experienced by the federated learning system 100. For example, the contribution program 132 may have determined that the global usefulness metric has been below the global usefulness threshold for a predetermined duration of time. As a result of such a condition, the contribution program 132 may select to modify the global usefulness threshold (e.g., lower the usefulness threshold).” Here, the “usefulness threshold” can be a loss reduction as shown from the fact the usefulness score may be determined by the datasets Tuor, ¶60 “The execution server 130 may determine a usefulness of the datasets that are provided by the currently available contributors (step 204). For example, the execution server 130 may utilize a local and/or a global usefulness metric relative to a local and/or global usefulness threshold, respectively.” and that the datasets can contain loss information, over iterations Tuor, ¶48 “For example, the distance of features of the dataset from the contributor and the validation dataset may not be too large or too small. The features of the dataset may include, for example, an output from lower layers of a cellular neural network (CNN) or a deep neural network, a gradient of a model parameter, a gradient computed on a minibatch of training data during a model training process, etc.”).
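
The threshold-decrementing behavior recited in claim 7 can be sketched as a simple loop, offered as an illustration under assumptions rather than as the applicant's or Tuor's implementation. The sketch lowers the cutoff by a fixed step until the number of vehicles whose loss reduction exceeds the cutoff is greater than a required count, or until a floor is reached. The fixed step, the floor, and all names are hypothetical.

```python
def select_with_adaptive_cutoff(loss_reductions, cutoff, required, step, floor=0.0):
    """Lower the cutoff until more than `required` vehicles pass, or the floor is hit.

    loss_reductions maps a vehicle id to (previous loss - current loss).
    Returns the selected vehicle ids and the cutoff finally used.
    """
    while True:
        selected = [v for v, r in loss_reductions.items() if r > cutoff]
        if len(selected) > required or cutoff <= floor:
            return selected, cutoff
        # Not enough training data yet: decrement the cutoff and re-check.
        cutoff = max(floor, cutoff - step)


# Hypothetical fleet of three vehicles.
selected, final_cutoff = select_with_adaptive_cutoff(
    {"v1": 0.5, "v2": 0.25, "v3": 0.125}, cutoff=0.375, required=1, step=0.125
)
```

Here "v2" did not exceed the initial cutoff of 0.375 but is included once the cutoff has been decremented to 0.125, which is the effect the claim describes.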

Regarding claim 8, Tuor teaches A method comprising:
receiving a plurality of data sets relating to differently trained versions of a global machine learning model, from a plurality of vehicles, (Tuor, ¶34 “The federated exchange client 112 may be configured to prepare datasets that are to be exchanged for purposes of generating a global model. The federated exchange client 112 may receive the datasets from the further clients and/or programs that train the local models on the local data samples and generate packets of the datasets that are to be transmitted to a further processing component that generates the global model.”, here the “further clients” can be vehicles as shown in Tuor, Fig 4, 54N “automobile computer system”) 
training the global model using federated learning and based on the data sets of the plurality of data sets (Tuor, ¶3 “The federated learning may be used to train a machine learning algorithm such as a global model (e.g., deep neural network) on a plurality of localized datasets”)
Tuor does not teach “the data sets including at least a present local loss value experienced by a current version of the global model executing on a given vehicle for which a data set of the plurality of data sets was received; 
determining a loss reduction for each received data set of the plurality of data sets, representing a loss reduction since a previous local loss value included in a previous received data set corresponding to the given vehicle;
determining whether the loss reduction for each received data set of the plurality of data sets exceeds a predefined threshold cutoff value; and
for which the loss reduction exceeds the predefined cutoff value.” 
However, Xu teaches the data sets including at least a present local loss value experienced by a current version of the global model executing on a given client for which a data set of the plurality of data sets was received (Xu, page 38459-38460, col 2, section III/A “• Step 2 (Training and updating local model): In round t, based on the global model $M_G^t$, client c updates the local model parameters $M_c^t$ using its local data. Specifically, it aims to obtain optimal parameters $M_c^t$ that minimize the loss function $L(M_c^t)$. Then, client c uploads the updated local parameters to S.
[Equation appears only as an image in the original: media_image1.png]
• Step 3 (Aggregating and updating global model): In round t, S aggregates all the received local models, with the objective to minimize the global loss function [3]:”
Since S is the central server which compiles the local loss model, partly by summing $L(M_c^t)$, it is shown that the local loss value experienced by a current version of the global model is sent to the server.)
determining a loss reduction for each received data set of the plurality of data sets, representing a loss reduction since a previous local loss value included in a previous received data set corresponding to the given client (Xu, page 38462
Prosecution Timeline

Dec 07, 2021: Application Filed
Apr 03, 2025: Non-Final Rejection, §101, §103
Sep 09, 2025: Response Filed
Nov 14, 2025: Final Rejection, §101, §103 (current)
Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 14%
With Interview (-20.0%): -6%
Median Time to Grant: 3y 3m
PTA Risk: Moderate
Based on 7 resolved cases by this examiner. Grant probability derived from career allow rate.