Prosecution Insights
Last updated: April 19, 2026
Application No. 18/365,487

UTILIZING ELASTIC WEIGHT CONSOLIDATION (EWC) LOSS TERM(S) TO MITIGATE CATASTROPHIC FORGETTING IN TRAINING MACHINE LEARNING MODEL(S)

Status: Non-Final OA (§102, §103)
Filed: Aug 04, 2023
Examiner: KHAN, USMAN A
Art Unit: 2637
Tech Center: 2600 (Communications)
Assignee: Google LLC
OA Round: 1 (Non-Final)
Grant Probability: 75% (Favorable)
Expected OA Rounds: 1-2
Estimated Time to Grant: 2y 9m
Grant Probability with Interview: 87%

Examiner Intelligence

Career Allowance Rate: 75% (above average; 646 granted / 866 resolved; +12.6% vs TC average)
Interview Lift: +12.5% (moderate) on resolved cases with interview
Typical Timeline: 2y 9m average prosecution; 29 applications currently pending
Career History: 895 total applications across all art units

Statute-Specific Performance

§101: 4.1% (-35.9% vs TC avg)
§103: 46.6% (+6.6% vs TC avg)
§102: 32.6% (-7.4% vs TC avg)
§112: 13.0% (-27.0% vs TC avg)

Tech Center averages are estimates. Based on career data from 866 resolved cases.
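The headline figures above are simple ratios over the examiner's resolved cases. A minimal sketch of the arithmetic, using the counts reported on this page (how the dashboard rounds the interview lift is an assumption):

```python
# Reproduce the dashboard's headline examiner statistics from the raw counts.
granted = 646          # career grants (from this page)
resolved = 866         # career resolved cases (from this page)

allow_rate = granted / resolved * 100          # career allowance rate, in percent
interview_lift = 12.5                          # percentage-point lift with interview (from this page)
with_interview = allow_rate + interview_lift   # estimated rate when an interview is held

print(f"Career allowance rate: {allow_rate:.0f}%")      # ~75%
print(f"With interview:        {with_interview:.0f}%")  # ~87%
```

646 / 866 is 74.6%, which the dashboard rounds to 75%; adding the 12.5-point interview lift gives roughly 87%, matching the "With Interview" card.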

Office Action

Rejections: §102, §103

DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Information Disclosure Statement

The information disclosure statements (IDS) submitted on 08/04/2023, 10/11/2023, and 11/15/2024 have been considered by the examiner. The submissions are in compliance with the provisions of 37 CFR 1.97.

Claim Rejections - 35 USC § 102

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
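The rejections that follow map the claims' federated EWC flow onto the cited FedFisher reference. As context for the client-side steps recited in claim 1 (receive global weights, compute a Fisher information matrix over local data, transmit it back), here is a minimal sketch of a client computing a diagonal Fisher matrix; the toy logistic model and data are assumptions for illustration, not the applicant's or the reference's actual code:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def diagonal_fisher(global_weights, X, y):
    """Empirical diagonal Fisher at the received global weights.

    Averages the squared per-example gradient of the log-likelihood of a
    logistic model, the standard diagonal approximation of the Fisher matrix.
    """
    fisher = np.zeros_like(global_weights)
    for xi, yi in zip(X, y):
        p = sigmoid(xi @ global_weights)
        grad = (yi - p) * xi              # d log p(y | x, w) / dw for this example
        fisher += grad ** 2
    return fisher / len(X)

# Toy client data set, accessible only locally at the device.
rng = np.random.default_rng(0)
X = rng.normal(size=(32, 4))
y = (X[:, 0] > 0).astype(float)
w_global = np.zeros(4)                    # global weights received from the remote system

F = diagonal_fisher(w_global, X, y)       # transmitted to the server instead of raw data
print(F.shape)  # (4,)
```

The privacy-relevant point the claims rely on is visible here: only `F` (one nonnegative value per weight) leaves the device, never `X` or `y`.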
Claims 1 – 15 and 19 – 20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Jhunjhunwala Divyansh et al. (Towards a Theoretical and Practical Understanding of One-Shot Federated Learning with Fisher Information [Reference provided by applicant in IDS dated 11/15/2024]).

Regarding claim 1, Jhunjhunwala Divyansh et al. teaches a method implemented by one or more processors, the method comprising: receiving, at a client device and from a remote system, global weights of a global machine learning (ML) model (Page 1, paragraph starting with [Most standard FL algorithms…]; between clients and server in order to train a global model in federated settings; also Algorithm 1 lines 2 - 3); obtaining, at the client device, a client data set that is accessible locally at the client device and that is not accessible by the remote system (Page 1, paragraph starting with [Data collection and storage is becoming…]; decentralized data that is distributed across a network of clients under the supervision of a central server; also, Page 1, paragraph starting with [Here, M is the number…]; M is the number of clients, D_i is the i-th client's local dataset consisting of input-label pairs); determining, at the client device, and based on the global weights of the global ML model, a Fisher information matrix for the client data set (Abstract, novel algorithm for one-shot FL that makes use of the Fisher information matrices computed at the local models of clients); transmitting, from the client device and to the remote system, the Fisher information matrix for the client data set (Page 2, paragraph starting with [Our Contributions. In this…]; using the local models at each client and the Fisher information matrices computed at these local models; also equation 4 and Algorithm 1 lines 9 - 10); determining, at the remote system, based on the Fisher information matrix received from the client device and based on a plurality of additional Fisher information matrices received from corresponding additional client devices, a corresponding elastic weight consolidation (EWC) loss term for each of the global weights (Page 3, paragraph starting with [Computing Mode of Global Posterior.…]; clients compute and send back w_i and F_i to the server; also Algorithm 1 lines 9 - 10); generating, at the remote system, and based on processing corresponding server data remotely at the remote system and using the global ML model, and based on the corresponding EWC loss term for each of the global weights, a server update for the global ML model (Page 3, paragraph starting with [Computing Mode of Global Posterior.…]; clients compute and send back w_i and F_i to the server; also equations 6 – 7); and updating, at the remote system, and based on the server update, the global weights of the ML model to generate an updated global ML model (Page 3, paragraph starting with [Computing Mode of Global Posterior.…]; clients compute and send back w_i and F_i to the server; also equations 6 – 7).

Regarding claim 2, as mentioned above in the discussion of claim 1, Jhunjhunwala Divyansh et al. teaches all of the limitations of the parent claim. Additionally, Jhunjhunwala Divyansh et al.
teaches prior to receiving the global weights of the global ML model: pre-training the global ML model in a decentralized manner for a plurality of rounds of decentralized learning (Page 1, paragraph starting with [Data collection and storage.…]; Federated Learning (FL) is a framework that is designed to learn the parameters w ∈ ℝ^d of a model f(w, ·) on decentralized data that is distributed across a network of clients under the supervision of a central server (1; 2; 3). Our focus is particularly on the case where f(w, ·) is a neural network as is usually the case in practice).

Regarding claim 3, as mentioned above in the discussion of claim 2, Jhunjhunwala Divyansh et al. teaches all of the limitations of the parent claim. Additionally, Jhunjhunwala Divyansh et al. teaches wherein pre-training the global ML model in the decentralized manner for a given round of decentralized learning, of the plurality of rounds of decentralized learning, comprises: identifying, at the remote system, a plurality of client devices that will participate in the given round of decentralized learning (Page 1, paragraph starting with [Data collection and storage.…]; Federated Learning (FL) is a framework that is designed to learn the parameters w ∈ ℝ^d of a model f(w, ·) on decentralized data that is distributed across a network of clients under the supervision of a central server (1; 2; 3). Our focus is particularly on the case where f(w, ·) is a neural network as is usually the case in practice); transmitting, from the remote system and to each of the plurality of client devices, the global weights of the global ML model (Page 1, paragraph starting with [Data collection and storage.…]; Federated Learning (FL) is a framework that is designed to learn the parameters w ∈ ℝ^d of a model f(w, ·) on decentralized data that is distributed across a network of clients under the supervision of a central server (1; 2; 3). Our focus is particularly on the case where f(w, ·) is a neural network as is usually the case in practice); receiving, at the remote system and from a given client device, of the plurality of client devices, a corresponding client update for the global ML model, the corresponding client update being generated locally at the given client device and based on processing client device data, that is accessible locally at the given client device and that is not accessible by the remote system, using the global ML model; and updating, at the remote system, and based on the corresponding client update received from the given client device and one or more additional corresponding client updates received from one or more further additional client devices, of the plurality of client devices, the global weights of the global ML model (Page 1, paragraph starting with [Data collection and storage.…]; Federated Learning (FL) is a framework that is designed to learn the parameters w ∈ ℝ^d of a model f(w, ·) on decentralized data that is distributed across a network of clients under the supervision of a central server (1; 2; 3). Our focus is particularly on the case where f(w, ·) is a neural network as is usually the case in practice).

Regarding claim 4, as mentioned above in the discussion of claim 1, Jhunjhunwala Divyansh et al. teaches all of the limitations of the parent claim. Additionally, Jhunjhunwala Divyansh et al. teaches for n iterations, where n is a positive integer: continue generating, at the remote system, and based on processing the corresponding server data remotely at the remote system and using the global ML model, and based on the corresponding EWC loss term for each of the global weights, additional server updates for the global ML model (Page 19, paragraph starting with [DENSE.
For DENSE.…]; We use the validation data to determine the model which achieves the best validation performance during the server training and use this as the final model); and continue updating, at the remote system, and based on the additional server updates, the global weights of the ML model to generate a further updated global ML model (Page 19, paragraph starting with [DENSE. For DENSE.…]; We use the validation data to determine the model which achieves the best validation performance during the server training and use this as the final model).

Regarding claim 5, as mentioned above in the discussion of claim 4, Jhunjhunwala Divyansh et al. teaches all of the limitations of the parent claim. Additionally, Jhunjhunwala Divyansh et al. teaches comprising: subsequent to the n iterations: determining, at the remote system, whether one or more conditions for deploying the further updated global ML model are satisfied (Page 19, paragraph starting with [DENSE. For DENSE.…]; We use the validation data to determine the model which achieves the best validation performance during the server training and use this as the final model); and in response to determining that the one or more conditions for deploying the further updated global ML model are satisfied (Page 19, paragraph starting with [DENSE. For DENSE.…]; We use the validation data to determine the model which achieves the best validation performance during the server training and use this as the final model): causing, by the remote system, the further updated global ML model to be deployed (Page 19, paragraph starting with [DENSE. For DENSE.…]; We use the validation data to determine the model which achieves the best validation performance during the server training and use this as the final model).

Regarding claim 6, as mentioned above in the discussion of claim 5, Jhunjhunwala Divyansh et al. teaches all of the limitations of the parent claim. Additionally, Jhunjhunwala Divyansh et al. teaches wherein the one or more conditions comprise one or more of: whether a threshold quantity of server updates have been utilized in updating the further updated global ML model, whether a threshold duration of time has elapsed since the further updated global ML model was updated, or whether performance of the further updated global ML model satisfies a threshold performance measure (Page 19, paragraph starting with [DENSE. For DENSE.…]; We use the validation data to determine the model which achieves the best validation performance during the server training and use this as the final model).

Regarding claim 7, as mentioned above in the discussion of claim 5, Jhunjhunwala Divyansh et al. teaches all of the limitations of the parent claim. Additionally, Jhunjhunwala Divyansh et al. teaches subsequent to the n iterations: in response to determining that the one or more conditions for deploying the further updated global ML model are not satisfied: receiving, at an additional client device and from the remote system, the global weights of the further updated global ML model (Page 19, paragraph starting with [FedFisher (Diag) and FedFisher (K-FAC).…]; We measure the validation performance after every 100 steps and use the model which achieves the best validation performance as the final FedFisher model); obtaining, at the additional client device, an additional client data set that is accessible locally at the additional client device and that is not accessible by the remote system (Page 19, paragraph starting with [FedFisher (Diag) and FedFisher (K-FAC).…]; We measure the validation performance after every 100 steps and use the model which achieves the best validation performance as the final FedFisher model); determining, at the additional client device, and based on the global weights of the further updated global ML model, an updated Fisher information matrix for the additional client data set (Page 1, paragraph starting with [Most standard FL algorithms…];
between clients and server in order to train a global model in federated settings; also Algorithm 1 lines 2 - 3); and transmitting, from the additional client device and to the remote system, the updated Fisher information matrix for the additional client data set (Page 2, paragraph starting with [Our Contributions. In this…]; using the local models at each client and the Fisher information matrices computed at these local models; also equation 4 and Algorithm 1 lines 9 - 10).

Regarding claim 8, as mentioned above in the discussion of claim 7, Jhunjhunwala Divyansh et al. teaches all of the limitations of the parent claim. Additionally, Jhunjhunwala Divyansh et al. teaches determining, at the remote system, based on the updated Fisher information matrix received from the additional client device and based on a plurality of additional updated Fisher information matrices received from corresponding further additional client devices, an updated corresponding EWC loss term for each of the global weights (Page 19, paragraph starting with [FedFisher (Diag) and FedFisher (K-FAC).…]; We measure the validation performance after every 100 steps and use the model which achieves the best validation performance as the final FedFisher model; also, Page 19, paragraph starting with [The local optimization procedure.…]; For hyperparameter tuning we assume that the server has access to a dataset of 500 samples, sampled uniformly from the original training set. We describe the hyperparameters tuned for each of the algorithms below.); and for m iterations, where m is a positive integer: continue generating, at the remote system, and based on processing the corresponding server data remotely at the remote system and using the global ML model, and based on the corresponding updated EWC loss term for each of the global weights, further additional server updates for the global ML model (Page 19, paragraph starting with [FedFisher (Diag) and FedFisher (K-FAC).…]; We measure the validation performance after every 100 steps and use the model which achieves the best validation performance as the final FedFisher model; also, Page 19, paragraph starting with [The local optimization procedure.…]; For hyperparameter tuning we assume that the server has access to a dataset of 500 samples, sampled uniformly from the original training set. We describe the hyperparameters tuned for each of the algorithms below.); and continue updating, at the remote system, and based on the further additional server updates, the global weights of the ML model to generate a yet further updated global ML model (Page 19, paragraph starting with [FedFisher (Diag) and FedFisher (K-FAC).…]; We measure the validation performance after every 100 steps and use the model which achieves the best validation performance as the final FedFisher model; also, Page 19, paragraph starting with [The local optimization procedure.…]; For hyperparameter tuning we assume that the server has access to a dataset of 500 samples, sampled uniformly from the original training set. We describe the hyperparameters tuned for each of the algorithms below.).

Regarding claim 9, as mentioned above in the discussion of claim 1, Jhunjhunwala Divyansh et al. teaches all of the limitations of the parent claim. Additionally, Jhunjhunwala Divyansh et al.
teaches wherein determining the corresponding EWC loss term for each of the global weights based on the Fisher information matrix received from the client device and based on the plurality of additional Fisher information matrices received from corresponding additional client devices comprises: combining the Fisher information matrix received from the client device with the plurality of additional Fisher information matrices received from corresponding additional client devices to generate an aggregated Fisher information matrix (Page 3, paragraph starting on page 2 and continuing into page 3 with [Now if w_i fits the data at client.…]; clients can replace the full Fisher F_i with another practical approximation F̂_i such as the diagonal Fisher or K-FAC (21); also equation 7 and page 4 section 4 Practical Variants of FedFisher and Experiments); and determining the corresponding EWC loss term for each of the global weights based on the aggregated Fisher information matrix (Page 3, paragraph starting on page 2 and continuing into page 3 with [Now if w_i fits the data at client.…]; clients can replace the full Fisher F_i with another practical approximation F̂_i such as the diagonal Fisher or K-FAC (21); also equation 7 and page 4 section 4 Practical Variants of FedFisher and Experiments).

Regarding claim 10, as mentioned above in the discussion of claim 9, Jhunjhunwala Divyansh et al. teaches all of the limitations of the parent claim. Additionally, Jhunjhunwala Divyansh et al. teaches wherein the corresponding EWC loss term for each of the global weights corresponds to a corresponding diagonal element of the aggregated Fisher information matrix (Page 3, paragraph starting on page 2 and continuing into page 3 with [Now if w_i fits the data at client.…]; clients can replace the full Fisher F_i with another practical approximation F̂_i such as the diagonal Fisher or K-FAC (21); also equation 7 and page 4 section 4 Practical Variants of FedFisher and Experiments).
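The aggregation recited in claims 9-11 reduces to averaging the clients' diagonal Fisher matrices and treating each diagonal element as the EWC penalty weight for the corresponding parameter. A hedged sketch of that server-side step with toy values; the quadratic penalty shown is the standard EWC form and may differ in detail from the application's exact formulation:

```python
import numpy as np

def ewc_terms(client_fishers):
    """Average per-client diagonal Fishers into one EWC weight per parameter (claim 11)."""
    return np.mean(np.stack(client_fishers), axis=0)

def ewc_loss(weights, anchor_weights, ewc_weights, lam=1.0):
    """Standard EWC penalty: (lam / 2) * sum_j F_j * (w_j - w*_j)^2."""
    return 0.5 * lam * np.sum(ewc_weights * (weights - anchor_weights) ** 2)

# Diagonal Fishers received from three clients (toy values, one entry per weight).
fishers = [np.array([1.0, 0.0]), np.array([3.0, 2.0]), np.array([2.0, 4.0])]
F_agg = ewc_terms(fishers)                 # element-wise average: [2.0, 2.0]

w_star = np.array([0.5, -0.5])             # global weights the penalty anchors to
w = np.array([1.0, -0.5])                  # candidate weights during server training
print(ewc_loss(w, w_star, F_agg))          # 0.5 * 1 * (2*0.25 + 2*0) = 0.25
```

During the server update (claim 13), this penalty would be added to the ordinary task loss on the server data, discouraging movement of weights the clients' data deems important.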
Regarding claim 11, as mentioned above in the discussion of claim 9, Jhunjhunwala Divyansh et al. teaches all of the limitations of the parent claim. Additionally, Jhunjhunwala Divyansh et al. teaches wherein combining the Fisher information matrix received from the client device with the plurality of additional Fisher information matrices received from corresponding additional client devices to generate the aggregated Fisher information matrix comprises: averaging the Fisher information matrix received from the client device with the plurality of additional Fisher information matrices received from corresponding additional client devices to generate the aggregated Fisher information matrix (Page 3, paragraph starting on page 2 and continuing into page 3 with [Now if w_i fits the data at client.…]; clients can replace the full Fisher F_i with another practical approximation F̂_i such as the diagonal Fisher or K-FAC (21); also equation 7 and page 4 section 4 Practical Variants of FedFisher and Experiments).

Regarding claim 12, as mentioned above in the discussion of claim 1, Jhunjhunwala Divyansh et al. teaches all of the limitations of the parent claim. Additionally, Jhunjhunwala Divyansh et al. teaches wherein determining the Fisher information matrix for the client data set based on the global weights of the global ML model comprises: identifying a portion of the client data set (Page 3, paragraph starting with [Computing Mode of Global Posterior.…]; clients compute and send back w_i and F_i to the server, we can use Proposition 2, Equation (3) and Equation (5) to approximate the logarithm of the global posterior); and determining, based on the portion of the client data set and based on the global weights of the global ML model, the Fisher information matrix (Page 3, paragraph starting with [Computing Mode of Global Posterior.…]; clients compute and send back w_i and F_i to the server, we can use Proposition 2, Equation (3) and Equation (5) to approximate the logarithm of the global posterior).

Regarding claim 13, as mentioned above in the discussion of claim 1, Jhunjhunwala Divyansh et al. teaches all of the limitations of the parent claim. Additionally, Jhunjhunwala Divyansh et al.
teaches wherein generating the server update for the global ML model based on processing the corresponding server data and based on the corresponding EWC loss term for each of the global weights comprises: obtaining the corresponding server data (Page 19, paragraph starting with [FedFisher (Diag) and FedFisher (K-FAC).…]; We measure the validation performance after every 100 steps and use the model which achieves the best validation performance as the final FedFisher model; also equations 6 - 7); processing, using the global ML model, the corresponding server data to generate predicted output (Page 19, paragraph starting with [FedFisher (Diag) and FedFisher (K-FAC).…]; We measure the validation performance after every 100 steps and use the model which achieves the best validation performance as the final FedFisher model; also equations 6 - 7); determining, based on the predicted output, a loss (Page 19, paragraph starting with [FedFisher (Diag) and FedFisher (K-FAC).…]; We measure the validation performance after every 100 steps and use the model which achieves the best validation performance as the final FedFisher model; also equations 6 - 7); and generating, based on the loss and based on the corresponding EWC loss term for each of the global weights, the corresponding server update (Page 19, paragraph starting with [FedFisher (Diag) and FedFisher (K-FAC).…]; We measure the validation performance after every 100 steps and use the model which achieves the best validation performance as the final FedFisher model; also equations 6 - 7).

Regarding claim 14, as mentioned above in the discussion of claim 13, Jhunjhunwala Divyansh et al. teaches all of the limitations of the parent claim. Additionally, Jhunjhunwala Divyansh et al. teaches wherein determining the loss based on the predicted output is using a supervised learning technique (Page 1, paragraph starting with [Here, M is the number.…]; The loss function ℓ(·, ·) penalizes the difference between the prediction of the model f(w, x) and the true label y. We use D = {D_i} to denote the collection of data across all clients and N to denote the total number of data samples across clients.).

Regarding claim 15, as mentioned above in the discussion of claim 13, Jhunjhunwala Divyansh et al. teaches all of the limitations of the parent claim. Additionally, Jhunjhunwala Divyansh et al. teaches wherein determining the loss based on the predicted output is using an unsupervised or semi-supervised learning technique (Page 3, paragraph starting with [Computing Mode of Global Posterior.…]; Assuming clients compute and send back w_i and F_i to the server, we can use Proposition 2, Equation (3) and Equation (5) to approximate the logarithm of the global posterior).

Regarding claim 19, Jhunjhunwala Divyansh et al.
teaches a method implemented by one or more processors of a client device, the method comprising: receiving, from a remote system, global weights of a global machine learning (ML) model (Page 1, paragraph starting with [Most standard FL algorithms…]; between clients and server in order to train a global model in federated settings; also Algorithm 1 lines 2 - 3); obtaining a client data set that is accessible locally at the client device and that is not accessible by the remote system (Page 1, paragraph starting with [Data collection and storage is becoming…]; decentralized data that is distributed across a network of clients under the supervision of a central server; also, Page 1, paragraph starting with [Here, M is the number…]; M is the number of clients, D_i is the i-th client's local dataset consisting of input-label pairs); determining, based on the global weights of the global ML model, a Fisher information matrix for the client data set (Abstract, novel algorithm for one-shot FL that makes use of the Fisher information matrices computed at the local models of clients); and transmitting, to the remote system, the Fisher information matrix for the client data set (Page 2, paragraph starting with [Our Contributions. In this…]; using the local models at each client and the Fisher information matrices computed at these local models; also equation 4 and Algorithm 1 lines 9 - 10), wherein transmitting the Fisher information matrix for the client data set to the remote system causes the remote system to: determine, based on the Fisher information matrix received from the client device and based on a plurality of additional Fisher information matrices received from corresponding additional client devices, a corresponding elastic weight consolidation (EWC) loss term for each of the global weights (Page 3, paragraph starting with [Computing Mode of Global Posterior.…]; clients compute and send back w_i and F_i to the server; also Algorithm 1 lines 9 - 10); generate, based on processing corresponding server data remotely at the remote system and using the global ML model (Page 3, paragraph starting with [Computing Mode of Global Posterior.…]; clients compute and send back w_i and F_i to the server; also equations 6 – 7), and based on the corresponding EWC loss term for each of the global weights, a server update for the global ML model (Page 3, paragraph starting with [Computing Mode of Global Posterior.…]; clients compute and send back w_i and F_i to the server; also equations 6 – 7); and update, based on the server update, the global weights of the global ML model to generate an updated global ML model (Page 3, paragraph starting with [Computing Mode of Global Posterior.…]; clients compute and send back w_i and F_i to the server; also equations 6 – 7).

Regarding claim 20, Jhunjhunwala Divyansh et al.
teaches a method implemented by one or more processors of a remote system, the method comprising: receiving, from a client device, a Fisher information matrix, the Fisher information matrix being generated locally at the client device based on global weights, of a global machine learning (ML) model (Page 1, paragraph starting with [Most standard FL algorithms…]; between clients and server in order to train a global model in federated settings; also Algorithm 1 lines 2 - 3), and for a client data set that is accessible locally at the client device and that is not accessible by the remote system (Page 1, paragraph starting with [Data collection and storage is becoming…]; decentralized data that is distributed across a network of clients under the supervision of a central server; also, Page 1, paragraph starting with [Here, M is the number…]; M is the number of clients, D_i is the i-th client's local dataset consisting of input-label pairs); determining, based on the Fisher information matrix received from the client device and based on a plurality of additional Fisher information matrices received from corresponding additional client devices, a corresponding elastic weight consolidation (EWC) loss term for each of the global weights (Page 3, paragraph starting with [Computing Mode of Global Posterior.…]; clients compute and send back w_i and F_i to the server; also Algorithm 1 lines 9 - 10); generating, based on processing corresponding server data remotely at the remote system and using the global ML model, and based on the corresponding EWC loss term for each of the global weights (Page 3, paragraph starting with [Computing Mode of Global Posterior.…]; clients compute and send back w_i and F_i to the server; also equations 6 – 7), a server update for the global ML model (Page 3, paragraph starting with [Computing Mode of Global Posterior.…]; clients compute and send back w_i and F_i to the server; also equations 6 – 7); and updating, based on the server update, the global weights of the global ML model to generate an updated global ML model (Page 3, paragraph starting with [Computing Mode of Global Posterior.…]; clients compute and send back w_i and F_i to the server; also equations 6 – 7).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 16 – 18 are rejected under 35 U.S.C. 103 as being unpatentable over Jhunjhunwala Divyansh et al. (Towards a Theoretical and Practical Understanding of One-Shot Federated Learning with Fisher Information [Reference provided by applicant in IDS dated 11/15/2024]) in view of Beaufays (US PgPub No. 2023/0177382).

Regarding claim 16, as mentioned above in the discussion of claim 1, Jhunjhunwala Divyansh et al. teaches all of the limitations of the parent claim. Additionally, Jhunjhunwala Divyansh et al.
teaches wherein the global ML model is an based global ML model (Page 4, paragraph starting with [Two of the most popular approximations.…]; The datasets that we use are (i) MNIST (30), (ii) FashionMNIST (3 I) (iii) SVHN (32) (iv) CIFAR-10 (33) and (v) CINIC-10 (34).). However, Jhunjhunwala Divyansh et al. fails to teach audio based global ML model that is utilized in processing audio data. Beaufays, on the other hand teaches audio based global ML model that is utilized in processing audio data. More specifically, Beaufays teaches audio based global ML model that is utilized in processing audio data (paragraph 0007). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention (AIA ) to incorporate the teachings of Beaufays with the teachings of Jhunjhunwala Divyansh et al. because in at least paragraph 0009 – 0010, 0029, 0033, 0071, and 0075 Beaufays teaches that using the invention resulting in improved efficiency in federated learning, thereby improving the system of Jhunjhunwala Divyansh et al. Regarding claim 17, as mentioned above in the discussion of claim 1, Jhunjhunwala Divyansh et al. teaches all of the limitations of the parent claim. Additionally, Jhunjhunwala Divyansh et al. teaches wherein the global ML model is an based global ML model (Page 4, paragraph starting with [Two of the most popular approximations.…]; The datasets that we use are (i) MNIST (30), (ii) FashionMNIST (3 I) (iii) SVHN (32) (iv) CIFAR-10 (33) and (v) CINIC-10 (34).). However, Jhunjhunwala Divyansh et al. fails to teach vision-based global ML model that is utilized in processing vision data. Beaufays, on the other hand teaches vision-based global ML model that is utilized in processing vision data. More specifically, Beaufays teaches vision-based global ML model that is utilized in processing vision data (paragraph 0007). 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention (AIA) to incorporate the teachings of Beaufays with the teachings of Jhunjhunwala Divyansh et al. because in at least paragraphs 0009 – 0010, 0029, 0033, 0071, and 0075 Beaufays teaches that using the invention results in improved efficiency in federated learning, thereby improving the system of Jhunjhunwala Divyansh et al.

Regarding claim 18, as mentioned above in the discussion of claim 1, Jhunjhunwala Divyansh et al. teaches all of the limitations of the parent claim. Additionally, Jhunjhunwala Divyansh et al. teaches wherein the global ML model is an based global ML model (Page 4, paragraph starting with [Two of the most popular approximations…]; The datasets that we use are (i) MNIST (30), (ii) FashionMNIST (31), (iii) SVHN (32), (iv) CIFAR-10 (33), and (v) CINIC-10 (34).). However, Jhunjhunwala Divyansh et al. fails to teach a text-based global ML model that is utilized in processing textual data. Beaufays, on the other hand, teaches a text-based global ML model that is utilized in processing textual data. More specifically, Beaufays teaches a text-based global ML model that is utilized in processing textual data (paragraph 0007).

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention (AIA) to incorporate the teachings of Beaufays with the teachings of Jhunjhunwala Divyansh et al. because in at least paragraphs 0009 – 0010, 0029, 0033, 0071, and 0075 Beaufays teaches that using the invention results in improved efficiency in federated learning, thereby improving the system of Jhunjhunwala Divyansh et al.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. SHALOUDEGI (US PgPub No. 2023/0117768) teaches a machine learning system with data processing. Sommer (US PgPub No.
2010/0262454) teaches a machine learning system with data processing. Lu (US PgPub No. 2012/0303557) teaches a machine learning system with data processing.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Usman A Khan, whose telephone number is (571) 270-1131. The examiner can normally be reached M - Th 5:30 AM - 2 PM, F 5:30 AM - Noon.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Sinh Tran, can be reached at (571) 272-7564. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

Usman Khan
/USMAN A KHAN/
Primary Examiner, Art Unit 2637
02/24/2026
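For context on the rejected claims: the application concerns an elastic weight consolidation (EWC) loss term, and the cited Jhunjhunwala et al. reference has clients send their local weights wi and Fisher information F to a server, which aggregates them (its equations 6 – 7). The sketch below is a rough illustration of both ideas under common diagonal-Fisher assumptions; it is not the examiner's mapping or either reference's exact formulation, and the function and parameter names (`ewc_loss`, `fisher_weighted_aggregate`, `lam`, `eps`) are invented for illustration:

```python
import numpy as np

def ewc_loss(task_loss, params, old_params, fisher, lam=0.4):
    # EWC regularization: add a quadratic penalty that anchors each
    # parameter to its previous-task value, scaled by that parameter's
    # diagonal Fisher information, to mitigate catastrophic forgetting.
    penalty = sum(
        float(np.sum(f * (p - p_old) ** 2))
        for p, p_old, f in zip(params, old_params, fisher)
    )
    return task_loss + (lam / 2.0) * penalty

def fisher_weighted_aggregate(client_weights, client_fishers, eps=1e-8):
    # One-shot server-side aggregation: with a diagonal-Fisher Gaussian
    # approximation per client, the mode of the approximate global
    # posterior is the Fisher-weighted average of the client weights.
    numerator = sum(f * w for w, f in zip(client_weights, client_fishers))
    denominator = sum(client_fishers) + eps  # eps guards zero Fisher entries
    return numerator / denominator
```

When the current parameters equal the anchored ones, the penalty vanishes and `ewc_loss` reduces to the plain task loss; in the aggregation, a client with larger Fisher values for a coordinate pulls the global weight toward its own value on that coordinate.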

Prosecution Timeline

Aug 04, 2023
Application Filed
Feb 24, 2026
Non-Final Rejection — §102, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12604089
IMAGE CAPTURING APPARATUS HAVING AUDIO RECOGNITION, CONTROL METHOD THEREOF, AND STORAGE MEDIUM
2y 5m to grant Granted Apr 14, 2026
Patent 12604073
DEVICE AND FILTER ARRAY USED IN SYSTEM FOR GENERATING SPECTRAL IMAGE, SYSTEM FOR GENERATING SPECTRAL IMAGE, AND METHOD FOR MANUFACTURING FILTER ARRAY
2y 5m to grant Granted Apr 14, 2026
Patent 12598376
CAMERA SYSTEM, COMMUNICATION METHOD, SIGNAL PROCESSING DEVICE, AND CAMERA FOR COMMUNICATING VIA DIFFERENT TYPES OF WIRELESS COMMUNICATION
2y 5m to grant Granted Apr 07, 2026
Patent 12598384
IMAGING DEVICE WITH FILTER SWITCHING, METHOD FOR CONTROLLING THE SAME, AND STORAGE MEDIUM
2y 5m to grant Granted Apr 07, 2026
Patent 12591169
Remotely controllable mobile video studio with integrated teleprompter, camera, lighting and microphone
2y 5m to grant Granted Mar 31, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

1-2
Expected OA Rounds
75%
Grant Probability
87%
With Interview (+12.5%)
2y 9m
Median Time to Grant
Low
PTA Risk
Based on 866 resolved cases by this examiner. Grant probability derived from career allow rate.
