DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
The amendments filed 12/29/2025 have been entered.
Claims 78-102 remain pending in the application.
The amendments filed 12/29/2025 are sufficient to overcome each and every objection previously set forth in the Non-Final Office Action mailed 07/28/2025. The objections have been withdrawn.
Information Disclosure Statement
The information disclosure statement filed 12/24/2025 fails to comply with 37 CFR 1.98(a)(2)(iii), which requires a copy of a cited pending U.S. application, or the portion of the application that caused it to be cited (e.g., an Office Action, remarks in an amendment paper, etc.), where the cited information is not part of the specification, including the claims, and the drawings. See MPEP 609.04(a). The IDS has been placed in the application file, but the information referred to therein has not been considered.
Allowable Subject Matter
Claims 79, 96, and 100 would be allowable if rewritten to overcome the rejection(s) under 35 U.S.C. 101 set forth in this Office action and to include all of the limitations of the base claim and any intervening claims.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 78-102 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1: Is the claim directed to a process, machine, manufacture, or composition of matter?
Claims 78-94 are directed to an apparatus, hence fall within the statutory category of a machine.
Claims 95-98 are directed to a method, hence fall within the statutory category of a process.
Claims 99-102 are directed to non-transitory computer-readable media, hence fall within the statutory category of a machine.
Thus, each of the claims falls within one of the four statutory categories.
Claim 78 recites:
An apparatus of an edge computing node to be operated in an edge computing network, the apparatus including an interconnect interface to connect the apparatus to one or more components of the edge computing node, and a processor to:
cause an initial set of weights for a global machine learning (ML) model to be transmitted a set of client compute nodes of the edge computing network;
process Hessians computed by each of the client compute nodes based on a dataset stored on the client compute node;
evaluate a gradient expression for the ML model based on a second dataset and an updated set of weights received from the client compute nodes; and
generate a meta-updated set of weights for the global model based on the initial set of weights, the Hessians received, and the evaluated gradient expression.
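For context only, the operations recited in claim 78 can be sketched as follows for a toy linear least-squares model. The model choice, the single-gradient-step client update, the averaging over clients, and all names are assumptions for illustration, not the applicant's disclosed implementation.

```python
import numpy as np

def local_gradient(dataset, w):
    """Mean-squared-error gradient for a linear model y ~ X @ w."""
    X, y = dataset
    return X.T @ (X @ w - y) / len(y)

def local_hessian(dataset, w):
    """MSE Hessian for the same linear model (constant in w)."""
    X, _ = dataset
    return X.T @ X / len(X)

def server_round(w_t, client_datasets, second_dataset, alpha=0.1, beta=0.01):
    """One round: broadcast w_t, gather Hessians, evaluate gradients on a
    second dataset at the clients' updated weights, and meta-update."""
    k = len(client_datasets)
    identity = np.eye(len(w_t))
    meta_grad = np.zeros_like(w_t, dtype=float)
    for data in client_datasets:
        h_k = local_hessian(data, w_t)                # Hessian from client k
        w_k = w_t - beta * local_gradient(data, w_t)  # client's updated weights
        g_k = local_gradient(second_dataset, w_k)     # gradient on second dataset
        meta_grad += (identity - beta * h_k) @ g_k
    return w_t - alpha * meta_grad / k
```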
Step 2A Prong 1: Does the claim recite an abstract idea, law of nature, or natural phenomenon?
The broadest reasonable interpretation of the following limitations falls within the mental process groupings of abstract ideas because they cover concepts performed in the human mind, including observation, evaluation, judgment, and opinion. See MPEP 2106.04(a)(2), subsection III. The claim(s) recite(s) in part:
“evaluate a gradient expression for the ML model based on a second dataset and an updated set of weights received from the client compute nodes”. As drafted and under its broadest reasonable interpretation, this limitation recites an abstract idea of a mental process because evaluating gradient expressions for a machine learning model, recited generically, based on a dataset and received weights encompasses mental evaluations that are practically performed in the human mind, but for the recitation of generic computer components. Even if most humans would use a physical aid, like a pen and paper or a calculator, to make such evaluations, the use of a physical aid would not negate the mental nature of this limitation. See MPEP 2106.04(a)(2), subsection III.B.
“generate a meta-updated set of weights for the global model based on the initial set of weights, the Hessians received, and the evaluated gradient expression”. As drafted and under its broadest reasonable interpretation, this limitation recites an abstract idea of a mental process because generating updated weights for a global model, recited generically, based on initial weights, received Hessians, and the evaluated gradient expression encompasses mental evaluations that are practically performed in the human mind, but for the recitation of generic computer components. Even if most humans would use a physical aid, like a pen and paper or a calculator, to make such evaluations, the use of a physical aid would not negate the mental nature of this limitation. See MPEP 2106.04(a)(2), subsection III.B.
Step 2A Prong 2: Does the claim recite additional elements that integrate the judicial exception into a practical application?
The judicial exception is not integrated into a practical application. In particular, the claim recites in part:
“An apparatus of an edge computing node to be operated in an edge computing network, the apparatus including an interconnect interface to connect the apparatus to one or more components of the edge computing node, and a processor to:”. As drafted and under its broadest reasonable interpretation, this limitation recites additional elements that amount to generic computer components recited at a high level of generality. These elements amount to no more than adding the words “apply it” (or an equivalent) to the judicial exception, merely including instructions to implement the abstract idea on the additional elements, or merely using the additional elements as a tool to perform the abstract idea, as discussed in MPEP 2106.05(f).
“client compute nodes”. As drafted and under its broadest reasonable interpretation, this limitation recites additional elements that amount to generic computer components recited at a high level of generality. These elements amount to no more than adding the words “apply it” (or an equivalent) to the judicial exception, merely including instructions to implement the abstract idea on the additional elements, or merely using the additional elements as a tool to perform the abstract idea, as discussed in MPEP 2106.05(f).
“cause an initial set of weights for a global machine learning (ML) model to be transmitted a set of client compute nodes of the edge computing network”. As drafted and under its broadest reasonable interpretation, this limitation recites transmitting data, which is mere data gathering and output recited at a high level of generality, and thus is insignificant extra-solution activity. See MPEP 2106.05(g) (“whether the limitation is significant”). In addition, all uses of the recited judicial exceptions require such data gathering and output; as such, this limitation does not impose any meaningful limits on the claim and amounts to necessary data gathering and outputting. See MPEP 2106.05.
“set of weights received from the client compute nodes”. As drafted and under its broadest reasonable interpretation, this limitation recites receiving data, which is mere data gathering and output recited at a high level of generality, and thus is insignificant extra-solution activity. See MPEP 2106.05(g) (“whether the limitation is significant”). In addition, all uses of the recited judicial exceptions require such data gathering and output; as such, this limitation does not impose any meaningful limits on the claim and amounts to necessary data gathering and outputting. See MPEP 2106.05.
“process Hessians computed by each of the client compute nodes based on a dataset stored on the client compute node”. As drafted and under its broadest reasonable interpretation, this limitation recites receiving data, which is mere data gathering and output recited at a high level of generality, and thus is insignificant extra-solution activity. See MPEP 2106.05(g) (“whether the limitation is significant”). In addition, all uses of the recited judicial exceptions require such data gathering and output; as such, this limitation does not impose any meaningful limits on the claim and amounts to necessary data gathering and outputting. See MPEP 2106.05.
The additional elements have been considered both individually and as an ordered combination to determine whether they integrate the exception into a practical application. No meaningful limits on practicing the abstract idea are imposed. Accordingly, at Step 2A, Prong Two, the additional elements do not integrate the judicial exception into a practical application.
Step 2B: Does the claim recite additional elements that amount to significantly more than the judicial exception?
In accordance with Step 2B, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed, the claim limitations reciting generic computer elements amount to no more than mere instructions to apply the exception using a generic computer. The additional elements of “receiving” and/or “transmitting” amount to receiving/transmitting information.
“Receiving or transmitting data over a network, e.g., using the Internet to gather data”, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information); TLI Communications LLC v. AV Auto. LLC, 823 F.3d 607, 610, 118 USPQ2d 1744, 1745 (Fed. Cir. 2016) (using a telephone for image transmission); OIP Techs., Inc. v. Amazon.com, Inc., 788 F.3d 1359, 1363, 115 USPQ2d 1090, 1093 (Fed. Cir. 2015) (sending messages over a network); buySAFE, Inc. v. Google, Inc., 765 F.3d 1350, 1355, 112 USPQ2d 1093, 1096 (Fed. Cir. 2014) (computer receives and sends information over a network); but see DDR Holdings, LLC v. Hotels.com, L.P., 773 F.3d 1245, 1258, 113 USPQ2d 1097, 1106 (Fed. Cir. 2014) (“Unlike the claims in Ultramercial, the claims at issue here specify how interactions with the Internet are manipulated to yield a desired result--a result that overrides the routine and conventional sequence of events ordinarily triggered by the click of a hyperlink.” (emphasis added)). See MPEP § 2106.05(d)(II)(i).
The additional elements have been considered both individually and as an ordered combination to determine whether they amount to significantly more than the judicial exception. Thus, the claim does not provide an inventive concept.
The claim is ineligible.
Claim 79, which depends upon claim 78, recites in part:
“wherein the processor is to generate the meta-updated set of weights according to:
[equation reproduced in the application as media_image1.png, greyscale]
where wt+1 represents the meta-updated set of weights, wt represents the initial set of weights, α represents a learning rate for the ML model, I represents an identity matrix, β represents a gradient step size for the ML model, hk represents the Hessian from the k-th client compute node, and gk(wkt+1) represents the evaluated gradient expression of the ML model for the k-th client compute node.” As drafted and under its broadest reasonable interpretation, this limitation recites an abstract idea of mathematical concepts and calculations because it comprises the application of a mathematical formula. See MPEP 2106.04(a)(2), subsection I.
The claim does not integrate the judicial exception into a practical application.
The claim limitations do not recite additional elements that are sufficient to amount to significantly more than the judicial exception.
The claim is ineligible.
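The equation itself appears in the record only as a greyscale image (media_image1.png). Based solely on the symbol definitions recited in claim 79, the update plausibly takes a form such as the following; the summation over clients and the absence of a normalizing factor are assumptions, not a verbatim reconstruction of the image:

```latex
w_{t+1} = w_t - \alpha \sum_{k} \left( I - \beta\, h_k \right) g_k\!\left( w_k^{t+1} \right)
```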
Claim 80, which depends upon claim 78, recites in part:
“wherein the processor is to cause a selection of the set of client compute nodes randomly from a larger set of client compute nodes”. As drafted and under its broadest reasonable interpretation, this limitation recites an abstract idea of a mental process because randomly selecting nodes from a larger set of nodes encompasses mental evaluations that are practically performed in the human mind, but for the recitation of generic computer components. Even if most humans would use a physical aid, like a pen and paper or a calculator, to make such evaluations, the use of a physical aid would not negate the mental nature of this limitation. See MPEP 2106.04(a)(2), subsection III.B.
The claim does not integrate the judicial exception into a practical application.
The claim limitations do not recite additional elements that are sufficient to amount to significantly more than the judicial exception.
The claim is ineligible.
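The random selection recited in claim 80 amounts, in its simplest reading, to uniform sampling without replacement; a minimal sketch follows (names are illustrative, not from the application):

```python
import random

def select_clients(larger_set, k, seed=None):
    """Uniformly sample k distinct client compute nodes from the larger set."""
    return random.Random(seed).sample(larger_set, k)
```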
Claim 81, which depends upon claim 78, recites in part:
“wherein the processor is further to cause a clustering of a larger set of client compute nodes based on their data distributions”. As drafted and under its broadest reasonable interpretation, this limitation recites an abstract idea of a mental process because clustering nodes based on their data distributions encompasses mental evaluations that are practically performed in the human mind, but for the recitation of generic computer components. Even if most humans would use a physical aid, like a pen and paper or a calculator, to make such evaluations, the use of a physical aid would not negate the mental nature of this limitation. See MPEP 2106.04(a)(2), subsection III.B.
“selection of the set of client compute nodes from the larger set of client compute nodes based on the clustering”. As drafted and under its broadest reasonable interpretation, this limitation recites an abstract idea of a mental process because selecting nodes based on clustering encompasses mental evaluations that are practically performed in the human mind, but for the recitation of generic computer components. Even if most humans would use a physical aid, like a pen and paper or a calculator, to make such evaluations, the use of a physical aid would not negate the mental nature of this limitation. See MPEP 2106.04(a)(2), subsection III.B.
The claim does not integrate the judicial exception into a practical application.
The claim limitations do not recite additional elements that are sufficient to amount to significantly more than the judicial exception.
The claim is ineligible.
Claim 82, which depends upon claim 81, recites in part:
“wherein the processor is to cause the clustering based on probability mass function information or a distance metric indicating a distance between data distributions for data on the client compute nodes”. As drafted and under its broadest reasonable interpretation, this limitation recites an abstract idea of a mental process because clustering nodes based on probability mass function information or a distance metric indicating a distance between data distributions encompasses mental evaluations that are practically performed in the human mind, but for the recitation of generic computer components. Even if most humans would use a physical aid, like a pen and paper or a calculator, to make such evaluations, the use of a physical aid would not negate the mental nature of this limitation. See MPEP 2106.04(a)(2), subsection III.B.
The claim does not integrate the judicial exception into a practical application.
The claim limitations do not recite additional elements that are sufficient to amount to significantly more than the judicial exception.
The claim is ineligible.
Claim 83, which depends upon claim 82, recites in part:
“wherein the probability mass function information includes a probability mass function of label data associated with training examples of the client compute nodes”. As drafted and under its broadest reasonable interpretation, this limitation further clarifies the mental evaluation of clustering nodes based on probability mass function information or a distance metric indicating a distance between data distributions recited in claim 82, by introducing a probability mass function of label data associated with training examples of the client compute nodes, and thus falls under the same analysis.
The claim does not integrate the judicial exception into a practical application.
The claim limitations do not recite additional elements that are sufficient to amount to significantly more than the judicial exception.
The claim is ineligible.
Claim 84, which depends upon claim 82, recites in part:
“wherein the distance metric is a KL-divergence metric”. As drafted and under its broadest reasonable interpretation, this limitation further clarifies the mental evaluation of clustering nodes based on probability mass function information or a distance metric indicating a distance between data distributions recited in claim 82, by introducing a KL-divergence metric, and thus falls under the same analysis.
The claim does not integrate the judicial exception into a practical application.
The claim limitations do not recite additional elements that are sufficient to amount to significantly more than the judicial exception.
The claim is ineligible.
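For reference, the KL-divergence distance metric discussed in claims 82-84 can be computed from the label probability mass functions of two client compute nodes as follows; the function names and the smoothing constant are illustrative assumptions:

```python
import math
from collections import Counter

def label_pmf(labels, label_space):
    """Empirical probability mass function of label data over a fixed label space."""
    counts = Counter(labels)
    return [counts[c] / len(labels) for c in label_space]

def kl_divergence(p, q, eps=1e-12):
    """D_KL(p || q); eps guards against zero-probability bins."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))
```

A larger divergence indicates client datasets with more dissimilar label distributions, which is the sense in which the claim uses it as a distance between data distributions.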
Claim 85, which depends upon claim 81, recites in part:
“wherein the processor is to cause the selection of the set of client compute nodes based at least in part on one or more of communication capability or compute ability received from each client compute node from a larger set of client compute nodes”. As drafted and under its broadest reasonable interpretation, this limitation recites an abstract idea of a mental process because selecting nodes based on communication capability or compute ability encompasses mental evaluations that are practically performed in the human mind, but for the recitation of generic computer components. Even if most humans would use a physical aid, like a pen and paper or a calculator, to make such evaluations, the use of a physical aid would not negate the mental nature of this limitation. See MPEP 2106.04(a)(2), subsection III.B.
“compute ability received from each client compute node”. As drafted and under its broadest reasonable interpretation, this limitation recites receiving data, which is mere data gathering and output recited at a high level of generality, and thus is insignificant extra-solution activity. See MPEP 2106.05(g) (“whether the limitation is significant”). In addition, all uses of the recited judicial exceptions require such data gathering and output; as such, this limitation does not impose any meaningful limits on the claim and amounts to necessary data gathering and outputting. See MPEP 2106.05.
The additional elements of mere data gathering do not integrate the judicial exception into a practical application. The additional elements have been considered both individually and as an ordered combination to determine whether they integrate the exception into a practical application.
The additional elements of “receiving” or “transmitting” amount to receiving/transmitting information. The additional elements have been considered both individually and as an ordered combination to determine whether they amount to significantly more than the judicial exception. The claim limitations do not recite additional elements that are sufficient to amount to significantly more than the judicial exception.
The claim is ineligible.
Claim 86, which depends upon claim 81, recites in part:
“wherein the processor is to cause clustering based on Bregman's k-means clustering or affinity propagation analysis”. As drafted and under its broadest reasonable interpretation, this limitation recites an abstract idea of a mental process because clustering based on Bregman's k-means clustering or affinity propagation analysis encompasses mental evaluations that are practically performed in the human mind, but for the recitation of generic computer components. Even if most humans would use a physical aid, like a pen and paper or a calculator, to make such evaluations, the use of a physical aid would not negate the mental nature of this limitation. See MPEP 2106.04(a)(2), subsection III.B.
The claim does not integrate the judicial exception into a practical application.
The claim limitations do not recite additional elements that are sufficient to amount to significantly more than the judicial exception.
The claim is ineligible.
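As a point of reference for claim 86, k-means under a Bregman divergence can be sketched as ordinary k-means with the squared Euclidean distance replaced by a Bregman divergence; KL divergence (the Bregman divergence generated by negative entropy) is used below, and for Bregman divergences the optimal cluster center is the plain cluster mean. This is an illustrative sketch under those assumptions, not the applicant's method:

```python
import math
import random

def kl(p, q, eps=1e-12):
    """KL divergence between two probability mass functions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def bregman_kmeans(pmfs, k, iters=20, seed=0):
    """Cluster client label PMFs using KL divergence as the Bregman divergence."""
    rng = random.Random(seed)
    centers = rng.sample(pmfs, k)
    for _ in range(iters):
        # Assign each PMF to the nearest center under the divergence.
        clusters = [[] for _ in range(k)]
        for p in pmfs:
            j_best = min(range(k), key=lambda j: kl(p, centers[j]))
            clusters[j_best].append(p)
        # For Bregman divergences, the minimizing center is the cluster mean.
        centers = [
            [sum(col) / len(c) for col in zip(*c)] if c else centers[j]
            for j, c in enumerate(clusters)
        ]
    return centers, clusters
```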
Claim 87, which depends upon claim 78, recites in part:
“wherein the dataset stored on the client and the second dataset each include a set of training examples and a set of label values associated with the training examples”. As drafted and under its broadest reasonable interpretation, this limitation further clarifies the mental evaluation of evaluating a gradient expression, as well as the mere data gathering of processing Hessians recited in claim 78, by introducing a set of training examples and a set of label values associated with the training examples, and thus falls under the same analyses.
The additional elements of mere data gathering do not integrate the judicial exception into a practical application. The additional elements have been considered both individually and as an ordered combination to determine whether they integrate the exception into a practical application.
The additional elements of “receiving” or “transmitting” amount to receiving/transmitting information. The additional elements have been considered both individually and as an ordered combination to determine whether they amount to significantly more than the judicial exception. The claim limitations do not recite additional elements that are sufficient to amount to significantly more than the judicial exception.
The claim is ineligible.
Claim 88, which depends upon claim 78, recites in part:
“determine a data batch size for each of a plurality of client compute nodes, wherein the data batch size for each client compute node is based on compute capabilities of the client compute node and indicates a number of training examples to be used by the client compute node in performing a round of federated machine learning training”. As drafted and under its broadest reasonable interpretation, this limitation recites an abstract idea of a mental process because determining a batch size based on compute capabilities encompasses mental evaluations that are practically performed in the human mind, but for the recitation of generic computer components. Even if most humans would use a physical aid, like a pen and paper or a calculator, to make such evaluations, the use of a physical aid would not negate the mental nature of this limitation. See MPEP 2106.04(a)(2), subsection III.B.
“cause the data batch size determined for each client compute node to be transmitted to the corresponding client compute node.” As drafted and under its broadest reasonable interpretation, this limitation recites transmitting data, which is mere data gathering and output recited at a high level of generality, and thus is insignificant extra-solution activity. See MPEP 2106.05(g) (“whether the limitation is significant”). In addition, all uses of the recited judicial exceptions require such data gathering and output; as such, this limitation does not impose any meaningful limits on the claim and amounts to necessary data gathering and outputting. See MPEP 2106.05.
The additional elements of mere data gathering do not integrate the judicial exception into a practical application. The additional elements have been considered both individually and as an ordered combination to determine whether they integrate the exception into a practical application.
The additional elements of “receiving” or “transmitting” amount to receiving/transmitting information. The additional elements have been considered both individually and as an ordered combination to determine whether they amount to significantly more than the judicial exception. The claim limitations do not recite additional elements that are sufficient to amount to significantly more than the judicial exception.
The claim is ineligible.
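The batch-size determination recited in claim 88 reads, at its simplest, on a proportional allocation by compute capability; a minimal sketch under that assumption (names and the proportional rule are hypothetical):

```python
def batch_sizes(capabilities, examples_per_round):
    """capabilities: {client_id: relative compute capability}.
    Returns a per-client batch size proportional to capability,
    with at least one training example per client."""
    total = sum(capabilities.values())
    return {cid: max(1, round(examples_per_round * cap / total))
            for cid, cap in capabilities.items()}
```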
Claim 89, which depends upon claim 78, recites in part:
“determine a reference time indicating an amount of time in which clients are to perform a round of federated machine learning training”. As drafted and under its broadest reasonable interpretation, this limitation recites an abstract idea of a mental process because determining a reference time for performing rounds of federated machine learning training encompasses mental evaluations that are practically performed in the human mind, but for the recitation of generic computer components. Even if most humans would use a physical aid, like a pen and paper or a calculator, to make such evaluations, the use of a physical aid would not negate the mental nature of this limitation. See MPEP 2106.04(a)(2), subsection III.B.
“cause the reference time to be transmitted to each of a plurality of clients of the edge computing network”. As drafted and under its broadest reasonable interpretation, this limitation recites transmitting data, which is mere data gathering and output recited at a high level of generality, and thus is insignificant extra-solution activity. See MPEP 2106.05(g) (“whether the limitation is significant”). In addition, all uses of the recited judicial exceptions require such data gathering and output; as such, this limitation does not impose any meaningful limits on the claim and amounts to necessary data gathering and outputting. See MPEP 2106.05.
“obtain data batch size information from each client indicating a number of training examples to be used by the client to perform a round of federated machine learning training within the reference time.” As drafted and under its broadest reasonable interpretation, this limitation recites receiving data, which is mere data gathering and output recited at a high level of generality, and thus is insignificant extra-solution activity. See MPEP 2106.05(g) (“whether the limitation is significant”). In addition, all uses of the recited judicial exceptions require such data gathering and output; as such, this limitation does not impose any meaningful limits on the claim and amounts to necessary data gathering and outputting. See MPEP 2106.05.
The additional elements of mere data gathering do not integrate the judicial exception into a practical application. The additional elements have been considered both individually and as an ordered combination to determine whether they integrate the exception into a practical application.
The additional elements of “receiving” and/or “transmitting” amount to receiving/transmitting information. The additional elements have been considered both individually and as an ordered combination to determine whether they amount to significantly more than the judicial exception. The claim limitations do not recite additional elements that are sufficient to amount to significantly more than the judicial exception.
The claim is ineligible.
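From the client's side, claim 89's reporting of a batch size achievable within the reference time can be sketched under an assumed linear timing model (per-example compute time plus fixed overhead, both hypothetical stand-ins):

```python
def batch_size_for_reference_time(reference_time_s, time_per_example_s,
                                  fixed_overhead_s=0.0):
    """Number of training examples a client can process within the
    server's reference time, under a linear per-example cost model."""
    budget = reference_time_s - fixed_overhead_s
    return max(0, int(budget // time_per_example_s))
```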
Claim 90, which depends upon claim 78, recites in part:
“wherein the processor is further to perform reinforcement learning to determine hyper-parameters for federated ML training of the global ML model, by performing operations comprising”. As drafted and under its broadest reasonable interpretation, this limitation recites an abstract idea of a mental process because performing reinforcement learning to determine hyper-parameters for federated ML training of a global ML model, recited at a high level of generality, encompasses mental evaluations that are practically performed in the human mind, but for the recitation of generic computer components. Even if most humans would use a physical aid, like a pen and paper or a calculator, to make such evaluations, the use of a physical aid would not negate the mental nature of this limitation. See MPEP 2106.04(a)(2), subsection III.B.
“obtaining state information from clients of the edge computing network”. As drafted and under its broadest reasonable interpretation, this limitation recites receiving data, which is mere data gathering and output recited at a high level of generality, and thus is insignificant extra-solution activity. See MPEP 2106.05(g) (“whether the limitation is significant”). In addition, all uses of the recited judicial exceptions require such data gathering and output; as such, this limitation does not impose any meaningful limits on the claim and amounts to necessary data gathering and outputting. See MPEP 2106.05.
“selecting a set of action vectors corresponding to the hyper-parameters”. As drafted and under its broadest reasonable interpretation, this limitation recites an abstract idea of a mental process because selecting a set of action vectors corresponding to hyper-parameters encompasses mental evaluations that are practically performed in the human mind, but for the recitation of generic computer components. Even if most humans would use a physical aid, like a pen and paper or a calculator, to make such evaluations, the use of a physical aid would not negate the mental nature of this limitation. See MPEP 2106.04(a)(2), subsection III.B.
“performing rounds of a federated ML training within the edge computing network using the action vectors to update the global ML model”. As drafted and under its broadest reasonable interpretation, this limitation recites an abstract idea of a mental process because performing rounds of a federated ML training using the action vectors to update a global ML model encompasses mental evaluations that are practically performed in the human mind, but for the recitation of generic computer components. Even if most humans would use a physical aid, like a pen and paper or a calculator, to make such evaluations, the use of a physical aid would not negate the mental nature of this limitation. See MPEP 2106.04(a)(2), subsection III.B.
“determining a measure of accuracy of the updated global ML model”. As drafted and under its broadest reasonable interpretation, this limitation recites an abstract idea of a mental process because determining accuracy of an updated global ML model encompasses mental evaluations that are practically performed in the human mind, but for the recitation of generic computer components. Even if most humans would use a physical aid, like a pen and paper or a calculator, to make such evaluations, the use of a physical aid would not negate the mental nature of this limitation. See MPEP 2106.04(a)(2), subsection III.B.
The additional elements of mere data gathering do not integrate the judicial exception into a practical application. The additional elements have been considered both individually and as an ordered combination to determine whether they integrate the exception into a practical application.
The additional elements of “receiving” and/or “transmitting” amount to receiving/transmitting information. The additional elements have been considered both individually and as an ordered combination to determine whether they amount to significantly more than the judicial exception. The claim limitations do not recite additional elements that are sufficient to amount to significantly more than the judicial exception.
The claim is ineligible.
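The reinforcement-learning loop recited in claim 90 can be sketched with the federated training round and accuracy measurement abstracted into a callback, and with a simple epsilon-greedy bandit standing in for the otherwise unspecified RL policy; every name here is an assumption for illustration:

```python
import random

def rl_tune(run_round, actions, episodes=100, eps=0.2, seed=0):
    """run_round(action) -> accuracy of the updated global ML model.
    Returns the action (hyper-parameter choice) with the best estimated value."""
    rng = random.Random(seed)
    value = {a: 0.0 for a in actions}   # running value estimate per action
    count = {a: 0 for a in actions}
    for _ in range(episodes):
        # Explore with probability eps; otherwise exploit the best estimate.
        if rng.random() < eps:
            a = rng.choice(actions)
        else:
            a = max(value, key=value.get)
        reward = run_round(a)           # "determining a measure of accuracy"
        count[a] += 1
        value[a] += (reward - value[a]) / count[a]
    return max(value, key=value.get)
```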
Claim 91, which depends upon claim 90, recites in part:
“wherein the state information comprises one or more of statistics of ML parameter updates from each client compute node of the edge computing network, a cosine similarity of ML parameter updates from each client compute node, loss metrics for each client compute node, a learning rate for each client compute node, a number of local federated ML training epochs performed by each client compute node, a number training data samples used by each client compute node, an average data rate supported between the client compute node and the central server, an energy budget of the client compute node, a time to compute a gradient update at each client compute node, and a time to perform a memory access at each client compute node.” As drafted and under its broadest reasonable interpretation, this limitation further clarifies the mere data gathering of obtaining state information recited in claim 90 by specifying the types of state information obtained, and thus falls under the same analysis.
The claim's recitation of the additional elements of mere data gathering does not integrate the judicial exception into a practical application. The additional elements have been considered both individually and as an ordered combination in order to determine whether they integrate the exception into a practical application.
The claim recites the additional elements of “receiving” and/or “transmitting,” which amount to mere receiving/transmitting of information. The additional elements have been considered both individually and as an ordered combination in order to determine whether they warrant significantly more consideration. The claim limitations do not recite additional elements that are sufficient to amount to significantly more than the judicial exception.
The claim is ineligible.
Claim 92, which depends upon claim 90, recites in part:
“wherein the action vectors comprise one or more of a sampling probability for each client compute node, a coding redundancy to be used by each client compute node for coded federated ML training, an uplink transmit power to be used by the client compute node, a bandwidth to be allocated to the client compute node, and a scaling factor to be applied to the hyper-parameters.” As drafted and under its broadest reasonable interpretation, this limitation further clarifies the mental evaluation of selecting a set of action vectors recited in claim 90 by introducing these particular types of action vectors, and thus falls under the same analysis.
The claim does not integrate the judicial exception into practical application.
The claim limitations do not recite additional elements that are sufficient to amount to significantly more than the judicial exception.
The claim is ineligible.
Claim 93, which depends upon claim 90, recites in part:
“wherein the hyper-parameters determined via the reinforcement learning comprise one or more of a learning rate for the federated ML training and a weight regularization coefficient.” As drafted and under its broadest reasonable interpretation, this limitation further clarifies the mental evaluation of performing reinforcement learning to determine hyper-parameters recited in claim 90 by introducing these particular hyper-parameters, and thus falls under the same analysis.
The claim does not integrate the judicial exception into practical application.
The claim limitations do not recite additional elements that are sufficient to amount to significantly more than the judicial exception.
The claim is ineligible.
Claim 94, which depends upon claim 90, recites in part:
“further comprising performing the reinforcement learning across multiple hyper-parameter scenarios using a plurality of trials”. As drafted and under its broadest reasonable interpretation, this limitation recites an abstract idea of a mental process because performing reinforcement learning across multiple hyper-parameter scenarios using a plurality of trials encompasses mental evaluations that are practically performed in the human mind, but for the recitation of generic computer components. Even if most humans would use a physical aid, like a pen and paper or a calculator, to make such evaluations, the use of a physical aid would not negate the mental nature of this limitation. See MPEP 2106.04(a)(2), subsection III.B.
The claim does not integrate the judicial exception into practical application.
The claim limitations do not recite additional elements that are sufficient to amount to significantly more than the judicial exception.
The claim is ineligible.
Claims 95 and 99 are substantially similar to claim 78, and thus are rejected on the same basis as claim 78.
Claims 96 and 100 are substantially similar to claim 79, and thus are rejected on the same basis as claim 79.
Claims 97-98 and 101-102 are substantially similar to claims 81-82, and thus are rejected on the same basis as claims 81-82.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 78, 81, 85, 87, 95, 97, 99, and 101 are rejected under 35 U.S.C. 103 as being unpatentable over Moshe Guttmann (Pub. No.: US 2020/0202243 A1), hereafter Guttmann, in view of Sattler et al. (“Clustered Federated Learning: Model-Agnostic Distributed Multi-Task Optimization under Privacy Constraints”, dated October 2019), as cited in the IDS dated 12/02/2022, hereafter Sattler.
Regarding claim 78, Guttmann discloses:
An apparatus of an edge computing node to be operated in an edge computing network, the apparatus including an interconnect interface to connect the apparatus to one or more components of the edge computing node, and a processor to (Guttmann, ¶[0032] and Fig. 4A and 4B),
… a set of weights for a global machine learning (ML) model to be transmitted to a set of client compute nodes of the edge computing network (Guttmann, ¶[0014] and ¶[0143] teach transmitting global updates, i.e., a set of weights for a global ML model, to client nodes),
process Hessians computed by each of the client compute nodes based on a dataset stored on the client compute node (Guttmann, ¶[0122] teaches receiving an update based on a Hessian corresponding to training examples of dataset stored on the client compute node, i.e., external device),
evaluate a gradient expression for the ML model based on a second dataset and an updated set of weights received from the client compute nodes (Guttmann, Fig. 6, ¶[0016], ¶[0020], ¶[0113] and ¶[0077] teaches evaluating a gradient descent update of the global model based on a validation dataset as a second dataset, through hyperparameter setting, and an updated set of weights received from the client compute nodes),
generate a meta-updated set of weights for the global model based on the initial set of weights, the Hessians received, and the evaluated gradient expression (Guttmann, ¶[0124], ¶[0134] and Fig. 7 teaches determining a global update, i.e., a meta-updated set of weights for the global model, using received update information based on a gradient and/or a Hessian, and the evaluated gradient information).
Guttmann teaches a set of weights for a global machine learning (ML) model to be transmitted to a set of client compute nodes of the edge computing network, but does not teach the transmitted set of weights being an initial set of weights for the global ML model.
Sattler discloses:
an initial set of weights for a global machine learning (ML) model ….client…(Sattler, page 8, Algorithm 5 line 3 teaches initializing the set of weights for a global model for a client).
Guttmann and Sattler are analogous art because they are from the same field of endeavor, distributed learning.
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Guttmann to include initializing the set of weights for a global model for a client, based on the teachings of Sattler. One of ordinary skill in the art would have been motivated to make this modification in order to achieve drastic improvements over the Federated Learning baseline in terms of classification accuracy, as suggested by Sattler (page 11, right column, paragraph 3, lines 2-4).
Regarding claim 81, Guttmann, in view of Sattler, discloses the apparatus of claim 78, wherein the processor…. Sattler further discloses:
cause a clustering of a larger set of client compute nodes based on their data distributions and selection of the set of client compute nodes from the larger set of client compute nodes based on the clustering (Sattler, page 6, Algorithm 3 and page 1, abstract, lines 6-10, “we present Clustered Federated Learning (CFL), a novel Federated Multi-Task Learning (FMTL) framework, which exploits geometric properties of the FL loss surface, to group the client population into clusters with jointly trainable data distributions,” teaches a clustering of a larger set of client compute nodes based on their data distributions and selection of the set of client compute nodes from the larger set based on the clustering).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Guttmann to include causing a clustering of a larger set of client compute nodes based on their data distributions and selecting the set of client compute nodes from the larger set based on the clustering, based on the teachings of Sattler. One of ordinary skill in the art would have been motivated to make this modification in order to achieve drastic improvements over the Federated Learning baseline in terms of classification accuracy, as suggested by Sattler (page 11, right column, paragraph 3, lines 2-4).
Regarding claim 85, Guttmann, in view of Sattler, discloses the apparatus of claim 81. Guttmann further discloses:
wherein the processor is to cause the selection of the set of client compute nodes based at least in part on one or more of communication capability or compute ability received from each client compute node from a larger set of client compute nodes (Guttmann, ¶[0074] teaches load balancing as the selection of the set of client compute nodes based at least in part on one or more of communication capability or compute ability received from each client).
Regarding claim 87, Guttmann, in view of Sattler, discloses the apparatus of claim 78. Guttmann further discloses:
wherein the dataset stored on the client and the second dataset each include a set of training examples and a set of label values associated with the training examples (Guttmann, ¶[0093] teaches the dataset stored on the client and the second dataset each include a set of training examples and a set of label values associated with the training examples).
Claims 95 and 99 are substantially similar to claim 78, and thus are rejected on the same basis as claim 78.
Claims 97 and 101 are substantially similar to claim 81, and thus are rejected on the same basis as claim 81.
Claims 80, 82, 83, 98, and 102 are rejected under 35 U.S.C. 103 as being unpatentable over Moshe Guttmann (Pub. No.: US 2020/0202243 A1), hereafter Guttmann, in view of Sattler et al. (“Clustered Federated Learning: Model-Agnostic Distributed Multi-Task Optimization under Privacy Constraints”, dated October 2019), as cited in the IDS dated 12/02/2022, hereafter Sattler, in further view of Zhu et al. (“Broadband Analog Aggregation for Low-Latency Federated Edge Learning”), hereafter Zhu.
Regarding claim 80, Guttmann, in view of Sattler, discloses the apparatus of claim 78. Sattler further discloses:
wherein the processor is to cause a selection of the set of client compute nodes … from a larger set of client compute nodes (Sattler, page 7, Fig. 4 and its caption teaches a selection of the set of client compute nodes from a larger set of client compute nodes).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Guttmann to include wherein the processor is to cause a selection of the set of client compute nodes … from a larger set of client compute nodes, based on the teachings of Sattler. One of ordinary skill in the art would have been motivated to make this modification in order to achieve drastic improvements over the Federated Learning baseline in terms of classification accuracy, as suggested by Sattler (page 11, right column, paragraph 3, lines 2-4).
While Guttmann, in view of Sattler, discloses wherein the processor is to cause a selection of the set of client compute nodes … from a larger set of client compute nodes, they do not disclose the selection being random.
Zhu discloses:
selection of the set of client compute nodes randomly (Zhu, page 504, right column, Section D, paragraph 1, lines 2-3 “the number of scheduled devices is now a random variable,”).
Guttmann, Sattler, and Zhu are analogous art because they are from the same field of endeavor, machine learning models and weight updates.
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Guttmann, in view of Sattler, to include selection of the set of client compute nodes randomly, based on the teachings of Zhu. One of ordinary skill in the art would have been motivated to make this modification in order to improve the learning performance, as suggested by Zhu (page 496, left column, penultimate paragraph, last 2 lines).
Regarding claim 82, Guttmann, in view of Sattler, discloses the apparatus of claim 81, wherein the processor is to cause the clustering. Sattler further discloses:
cause the clustering based on … a distance metric indicating a distance between data distributions for data on the client compute nodes (Sattler, page 2, right column, paragraph 3, lines 1-2, “provide every client with a model that optimally fits it’s local data distribution”; page 5, right column, the 4 lines above equation 35, “the server separates the clients into two clusters in such a way that the maximum similarity between clients from different clusters is minimized”; and page 3, right column, paragraph below equation 14, lines 2-5, “we can distinguish clients based on their hidden data generating distribution by inspecting the cosine similarity between their gradient updates,” teach clustering based on a distance metric indicating a distance between data distributions for data on the client compute nodes).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Guttmann to include cause the clustering based on probability mass function information or a distance metric indicating a distance between data distributions for data on the client compute nodes, based on the teachings of Sattler. One of ordinary skill in the art would have been motivated to make this modification in order to achieve drastic improvements over the Federated Learning baseline in terms of classification accuracy, as suggested by Sattler (page 11, right column, paragraph 3, lines 2-4).
Guttmann, in view of Sattler, does not explicitly disclose:
cause the clustering based on probability mass function information.
Zhu teaches:
cause the clustering based on probability mass function information (Zhu, Fig. 1 and page 497, right column, Lemma 3 teaches “the number of scheduled users follows a Binomial distribution with the probability mass function (PMF) given by…” and page 504, left column, paragraph 2, lines 4-6 “One direction is to further enhance the aggregation performance of BAA by exploiting the clustering structure in device distribution for scheduling” teaches clustering based on probability mass function information).
Guttmann, Sattler, and Zhu are analogous art because they are from the same field of endeavor, machine learning models and weight updates.
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Guttmann, in view of Sattler, to include cause the clustering based on probability mass function information, based on the teachings of Zhu. One of ordinary skill in the art would have been motivated to make this modification in order to improve the learning performance, as suggested by Zhu (page 496, left column, penultimate paragraph, last 2 lines).
Regarding claim 83, Guttmann, in view of Sattler, in further view of Zhu, discloses the apparatus of claim 82, wherein the processor is to cause the clustering. Zhu further discloses:
wherein the probability mass function information includes a probability mass function of label data associated with training examples of the client compute nodes (Zhu, Fig. 1 and page 494, left column, paragraph 2, lines 4-5, “Each device collects a fraction of labelled training data”; page 497, right column, Lemma 3, “the number of scheduled users follows a Binomial distribution with the probability mass function (PMF) …”; and page 504, Appendix B teach a probability mass function of label data associated with training examples of the client compute nodes).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Guttmann, in view of Sattler, to include wherein the probability mass function information includes a probability mass function of label data associated with training examples of the client compute nodes, based on the teachings of Zhu. One of ordinary skill in the art would have been motivated to make this modification in order to improve the learning performance, as suggested by Zhu (page 496, left column, penultimate paragraph, last 2 lines).
Claims 98 and 102 are substantially similar to claim 82, and thus are rejected on the same basis as claim 82.
Claim 86 is rejected under 35 U.S.C. 103 as being unpatentable over Moshe Guttmann (Pub. No.: US 2020/0202243 A1), hereafter Guttmann, in view of Sattler et al. (“Clustered Federated Learning: Model-Agnostic Distributed Multi-Task Optimization under Privacy Constraints”, dated October 2019), as cited in the IDS dated 12/02/2022, hereafter Sattler, in further view of Mansour et al. (“Three Approaches for Personalization with Applications to Federated Learning”), hereafter Mansour.
Regarding claim 86, Guttmann, in view of Sattler, discloses the apparatus of claim 81, wherein the processor is to cause the clustering. Sattler discloses clustering, but does not disclose clustering based on Bregman's k-means clustering or affinity propagation analysis.
Mansour discloses:
cause clustering based on Bregman's k-means clustering or affinity propagation analysis (Mansour, page 5, paragraph 3, lines 1-2 “a natural approach is to cluster using a Bregman divergence defined over the distributions Dk” teaches clustering based on Bregman's k-means clustering).
Guttmann, Sattler, and Mansour are analogous art because they are from the same field of endeavor, machine learning models and weight updates.
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Guttmann, in view of Sattler, to include clustering based on Bregman's k-means clustering, based on the teachings of Mansour. One of ordinary skill in the art would have been motivated to make this modification in order to provide a trade-off between generalization and distribution mismatch, as suggested by Mansour (page 4 last line to page 5 first line).
Claims 88-89 are rejected under 35 U.S.C. 103 as being unpatentable over Moshe Guttmann (Pub. No.: US 2020/0202243 A1), hereafter Guttmann, in view of Sattler et al. (“Clustered Federated Learning: Model-Agnostic Distributed Multi-Task Optimization under Privacy Constraints”, dated October 2019), as cited in the IDS dated 12/02/2022, hereafter Sattler, in further view of Choudhury et al. (Pub. No.: US 2020/0125926 A1), as cited in the IDS dated 12/02/2022, hereafter Choudhury.
Regarding claim 88, Guttmann, in view of Sattler, discloses the apparatus of claim 78.
Sattler further discloses:
a data batch size for each of a plurality of client compute nodes … indicates a number of training examples to be used by the client compute node in performing a round of federated machine learning training (Sattler, page 1, right column, line 1 “mini-batches sampled from it’s local data Di” and equation (1) teaches a data batch size for each of a plurality of client compute nodes that indicates a number of training examples to be used by the client compute node in performing a round of federated machine learning training).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Guttmann to include a data batch size for each of a plurality of client compute nodes … indicates a number of training examples to be used by the client compute node in performing a round of federated machine learning training, based on the teachings of Sattler. One of ordinary skill in the art would have been motivated to make this modification in order to achieve drastic improvements over the Federated Learning baseline in terms of classification accuracy, as suggested by Sattler (page 11, right column, paragraph 3, lines 2-4).
Guttmann, in view of Sattler, discloses a data batch size for each of a plurality of client compute nodes … indicates a number of training examples to be used by the client compute node in performing a round of federated machine learning training, but does not disclose:
determine a… data batch size for each client compute node … based on compute capabilities of the client compute node,
cause the data batch size determined for each… compute node to be transmitted to the corresponding … compute node.
Choudhury discloses:
determine a… data batch size for each client compute node … based on compute capabilities of the client compute node (Choudhury, Fig. 5 and ¶[0014] teaches determining optimal batch size based on compute capabilities determined in step 504),
cause the data batch size determined for each client compute node to be transmitted to the corresponding … compute node (Choudhury, Fig. 5 and ¶[0030] teaches returning the optimal batch size to its corresponding layer).
Guttmann, Sattler, and Choudhury are analogous art because they are from the same field of endeavor, machine learning models and weight updates.
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Guttmann, in view of Sattler, to include determine a… data batch size for each client compute node … based on compute capabilities of the client compute node, cause the data batch size determined for each… compute node to be transmitted to the corresponding … compute node, based on the teachings of Choudhury. One of ordinary skill in the art would have been motivated to make this modification in order to increase throughput and/or reduce energy or power consumption, as suggested by Choudhury (paragraph [0014]).
Regarding claim 89, Guttmann, in view of Sattler, discloses the apparatus of claim 78, wherein clients perform federated machine learning training. Guttmann, in view of Sattler, does not disclose:
determine a reference time indicating an amount of time in which … to perform a round of … machine learning training,
cause the reference time to be transmitted…,
obtain data batch size information…indicating a number of training examples to be used … to perform a round of … machine learning training within the reference time.
Choudhury discloses:
determine a reference time indicating an amount of time in which … to perform a round of … machine learning training (Choudhury, ¶[0020] teaches optimal time to perform inference computations as a reference time indicating an amount of time in which clients are to perform a round of machine learning training),
cause the reference time to be transmitted … (Choudhury, ¶[0021] and Fig. 3 teaches transmitting the reference time),
obtain data batch size information…indicating a number of training examples to be used … to perform a round of … machine learning training within the reference time (Choudhury, Fig. 3 and Fig. 5 teaches obtaining data batch size information indicating a number of training examples to be used to perform a round of machine learning training within the reference time).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Guttmann, in view of Sattler, to include determine a reference time indicating an amount of time in which … to perform a round of … machine learning training, cause the reference time to be transmitted…, obtain data batch size information…indicating a number of training examples to be used … to perform a round of … machine learning training within the reference time, based on the teachings of Choudhury. One of ordinary skill in the art would have been motivated to make this modification in order to increase throughput and/or reduce energy or power consumption, as suggested by Choudhury (paragraph [0014]).
Claims 90-94 are rejected under 35 U.S.C. 103 as being unpatentable over Moshe Guttmann (Pub. No.: US 2020/0202243 A1), hereafter Guttmann, in view of Sattler et al. (“Clustered Federated Learning: Model-Agnostic Distributed Multi-Task Optimization under Privacy Constraints”, dated October 2019), as cited in the IDS dated 12/02/2022, hereafter Sattler, in further view of Kegel et al. (Pub. No.: US 2019/0235940 A1), as cited in the IDS dated 12/02/2022, hereafter Kegel.
Regarding claim 90, Guttmann, in view of Sattler, discloses the apparatus of claim 78.
Sattler further discloses:
perform… federated ML training of the global ML model (Sattler, page 2, right column, paragraph 4, lines 7-9 “Based on the theoretical insights in section II we present the Clustered Federated Learning Algorithm in section III”),
obtaining … information from clients of the edge computing network (Sattler, page 6, algorithms 1-3 teach obtaining information from a set of clients),
performing rounds of a federated ML training within the edge computing network … to update the global ML model (Sattler, page 6, algorithms 1-3 teach performing rounds of a federated ML training to update the global ML model),
determining a measure of accuracy of the updated global ML model (Sattler, page 10, Fig. 8).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Guttmann to include perform… federated ML training of the global ML model, obtaining … information from clients of the edge computing network, performing rounds of a federated ML training within the edge computing network … to update the global ML model, determining a measure of accuracy of the updated global ML model, based on the teachings of Sattler. One of ordinary skill in the art would have been motivated to make this modification in order to achieve drastic improvements over the Federated Learning baseline in terms of classification accuracy, as suggested by Sattler (page 11, right column, paragraph 3, lines 2-4).
While Guttmann, in view of Sattler, teaches performing federated ML training of the global ML model, obtaining information from clients of the edge computing network, performing rounds of federated ML training within the edge computing network to update the global ML model, and determining a measure of accuracy of the updated global ML model, the combination does not disclose:
perform reinforcement learning to determine hyper-parameters for … training of … ML model, by performing operations comprising: obtaining state information … of … network,
selecting a set of action vectors corresponding to the hyper-parameters… ,
performing rounds of … ML training … using the action vectors to update the … model.
Kegel discloses:
perform reinforcement learning to determine hyper-parameters for … training of … ML model, by performing operations comprising: obtaining state information … of … network (Kegel, Fig. 2, Fig. 3, ¶[0019] and ¶[0024] teaches obtaining state information of networks to determine error rates and confidence levels as hyperparameters for training a machine learning model),
selecting a set of action vectors corresponding to the hyper-parameters… (Kegel, Fig. 2, Fig. 3, ¶[0022] teaches selecting action vectors such as elements 209 and 215 in Fig. 2 or elements 303, 309, and etc. in Fig. 3 corresponding to the hyperparameters),
performing rounds of … ML training … using the action vectors to update the … model (Kegel, Fig. 1, Fig. 2, Fig. 3, ¶[0022] teaches performing rounds of ML training using the action vectors to update the model).
Guttmann, Sattler, and Kegel are analogous art because they are from the same field of endeavor, machine learning models and weight updates.
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Guttmann, in view of Sattler, to include perform reinforcement learning to determine hyper-parameters for … training of … ML model, by performing operations comprising: obtaining state information … of … network, selecting a set of action vectors corresponding to the hyper-parameters… , performing rounds of … ML training … using the action vectors to update the … model, based on the teachings of Kegel. One of ordinary skill in the art would have been motivated to make this modification in order to provide improved power management for neural networks, as suggested by Kegel (paragraph [0004]).
Regarding claim 91, Guttmann, in view of Sattler, in further view of Kegel, discloses the apparatus of claim 90. Kegel further discloses:
wherein the state information comprises one or more of
statistics of ML parameter updates from each client compute node of the edge computing network, a cosine similarity of ML parameter updates from each client compute node, loss metrics for each client compute node, a learning rate for each client compute node, a number of local federated ML training epochs performed by each client compute node, a number training data samples used by each client compute node, an average data rate supported between the client compute node and the central server, an energy budget of the client compute node, a time to compute a gradient update at each client compute node, and a time to perform a memory access at each client compute node (Kegel, ¶[0005-0007] teaches state information to comprise error rates as loss metrics and power settings as energy budgets).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Guttmann, in view of Sattler, to include wherein the state information comprises one or more of statistics of ML parameter updates from each client compute node of the edge computing network, a cosine similarity of ML parameter updates from each client compute node, loss metrics for each client compute node, a learning rate for each client compute node, a number of local federated ML training epochs performed by each client compute node, a number training data samples used by each client compute node, an average data rate supported between the client compute node and the central server, an energy budget of the client compute node, a time to compute a gradient update at each client compute node, and a time to perform a memory access at each client compute node, based on the teachings of Kegel. One of ordinary skill in the art would have been motivated to make this modification in order to provide improved power management for neural networks, as suggested by Kegel (paragraph [0004]).
Regarding claim 92, Guttmann, in view of Sattler, in further view of Kegel, discloses the apparatus of claim 90. Kegel further discloses:
wherein the action vectors comprise one or more of
a sampling probability for each client compute node, a coding redundancy to be used by each client compute node for coded federated ML training, an uplink transmit power to be used by the … compute node, a bandwidth to be allocated to the … compute node, and a scaling factor to be applied to the hyper-parameters (Kegel, ¶[0018] and Fig. 1 teaches controlling voltage/frequency as the power to be used or the bandwidth to be allocated).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Guttmann, in view of Sattler, to include wherein the action vectors comprise one or more of a sampling probability for each client compute node, a coding redundancy to be used by each client compute node for coded federated ML training, an uplink transmit power to be used by the … compute node, a bandwidth to be allocated to the … compute node, and a scaling factor to be applied to the hyper-parameters, based on the teachings of Kegel. One of ordinary skill in the art would have been motivated to make this modification in order to provide improved power management for neural networks, as suggested by Kegel (paragraph [0004]).
Regarding claim 93, Guttmann, in view of Sattler, in further view of Kegel, discloses the apparatus of claim 90. Kegel further discloses:
wherein the hyper-parameters determined via the reinforcement learning comprise one or more of a learning rate for the federated ML training and a weight regularization coefficient (Kegel, ¶[0003] and [0033] teaches learning and adjusting the weights of the neural network as learning rates determined through learning of the hyper-parameters).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Guttmann, in view of Sattler, to include wherein the hyper-parameters determined via the reinforcement learning comprise one or more of a learning rate for the federated ML training and a weight regularization coefficient, based on the teachings of Kegel. One of ordinary skill in the art would have been motivated to make this modification in order to provide improved power management for neural networks, as suggested by Kegel (paragraph [0004]).
Regarding claim 94, Guttmann, in view of Sattler, in further view of Kegel, discloses the apparatus of claim 90. Kegel further discloses:
performing the reinforcement learning across multiple hyper-parameter scenarios using a plurality of trials (Kegel, ¶[0024] teaches performing the reinforcement learning across multiple hyper-parameter scenarios using a plurality of trials).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Guttmann, in view of Sattler, to include performing the reinforcement learning across multiple hyper-parameter scenarios using a plurality of trials, based on the teachings of Kegel. One of ordinary skill in the art would have been motivated to make this modification in order to provide improved power management for neural networks, as suggested by Kegel (paragraph [0004]).
Claim 84 is rejected under 35 U.S.C. 103 as being unpatentable over Moshe Guttmann (Pub. No.: US 2020/0202243 A1), hereafter Guttmann, in view of Sattler et al. (“Clustered Federated Learning: Model-Agnostic Distributed Multi-Task Optimization under Privacy Constraints”, dated October 2019), as cited in the IDS dated 12/02/2022, hereafter Sattler, in further view of Zhu et al. (“Broadband Analog Aggregation for Low-Latency Federated Edge Learning”), hereafter Zhu, in further view of Mansour et al. (“Three Approaches for Personalization with Applications to Federated Learning”), hereafter Mansour.
Regarding claim 84, Guttmann, in view of Sattler, in further view of Zhu, discloses the apparatus of claim 82.
Sattler teaches causing the clustering based on … a distance metric indicating a distance between data distributions for data on the client compute nodes, as recited in claim 84, but does not teach that the distance metric is a KL-divergence metric.
Mansour discloses:
the distance metric is a KL-divergence metric (Mansour, page 3, paragraph 5, lines 1-2: “The divergence between distributions is often measured by a Bregman divergence such as KL-divergence” teaches measuring the distance between data distributions using a KL-divergence metric).
Guttmann, Sattler, Zhu, and Mansour are analogous art because they are from the same field of endeavor, machine learning models and weight updates.
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Guttmann, in view of Sattler, in further view of Zhu, to include that the distance metric is a KL-divergence metric, based on the teachings of Mansour. One of ordinary skill in the art would have been motivated to make this modification in order to obtain better performance by matching the actual underlying client distribution, as suggested by Mansour (page 3, paragraph 4, lines 4-5).
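For context, the KL-divergence referenced by Mansour is the standard divergence between probability distributions; for discrete distributions P and Q it is defined as:

```latex
D_{\mathrm{KL}}(P \,\|\, Q) = \sum_{x} P(x) \log \frac{P(x)}{Q(x)}
```

The KL-divergence is nonnegative and equals zero only when P = Q, but it is asymmetric, which is why it is termed a divergence rather than a true distance metric.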
Regarding claim 79, none of the prior art of record, alone or in combination, fairly teaches or suggests the limitation of "wherein the processor is to generate the meta-updated set of weights according to: where wt+1 represents the meta-updated set of weights, wt represents the initial set of weights, α represents a learning rate for the ML model, I represents an identity matrix, β represents a gradient step size for the ML model, hk represents the Hessian from the k-th client compute node, and gk(wkt+1) represents the evaluated gradient expression of the ML model for the k-th client compute node," in the specific combination as recited in the claim.
Claims 96 and 100 are substantially similar to claim 79, and the same rationale as for claim 79 applies.
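Although the recited equation itself is not reproduced above, the symbol definitions in claim 79 are consistent with a MAML-style meta-update. One plausible arrangement of the recited terms (an assumption for illustration, not the claim's verbatim expression) is:

```latex
w_{t+1} = w_t - \alpha \sum_{k} \left( I - \beta\, h_k \right) g_k\!\left( w^{t+1}_k \right)
```

Under this reading, the server dampens each client's evaluated gradient by a Hessian-dependent correction term before aggregating and applying the learning-rate-scaled update to the initial weights.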
Response to Arguments
Applicant's arguments filed 12/29/2025 have been fully considered with regard to the 35 U.S.C. 101 rejection, but they are not persuasive.
The applicant asserts on page 12 of the remarks: “Independent claim 1 recites an apparatus of an edge computing node in an edge computing network that inter alia transmits initial ML model weights to client compute nodes, processes Hessians computed by those nodes, and generates updated weights for the ML model. These limitations are directed to operations that can only be computed practically by computing nodes, and are not simply observations, evaluations, judgments or opinions; that is, they clearly cannot be practically performed by the human mind.” The examiner respectfully disagrees, as one could reasonably generate updated weights with the aid of pen and paper by performing the calculations required to update existing weight values. Transmitting model weights and processing Hessians, interpreted under the broadest reasonable interpretation (BRI) as recited in the claims, are directed to mere data gathering and output recited at a high level of generality, and thus constitute insignificant extra-solution activity.
The applicant asserts on pages 15-16 of the remarks: “The claims of the present Application are similar to those in the Ex Parte Desjardins decision, in that they are directed to improvements in how a machine learning model operates and learns... Here, the claims recite specific operations related to the specific training/updating of a machine learning model in an edge computing environment. While certain of these operations may involve mathematical concepts, the recited limitations amount to significantly more than merely applying the mathematical concepts to a computer environment.” The Examiner respectfully disagrees, as no such improvements are shown in the claims. MPEP 2106.04(d)(1) sets forth the evaluation of claimed improvements in the functioning of a computer, or improvements to a technical field, under Step 2A, Prong Two. That section states: "if the specification explicitly sets forth an improvement but in a conclusory manner (i.e., a bare assertion of an improvement without the detail necessary to be apparent to a person of ordinary skill in the art), the examiner should not determine the claim improves technology. Second, if the specification sets forth an improvement in technology, the claim must be evaluated to ensure that the claim itself reflects the disclosed improvement. That is, the claim includes the components or steps of the invention that provide the improvement described in the specification ...". The applicant’s argument amounts to a general allegation that the claim defines a patentable invention, without specifically pointing out how the language of the amended claim recites specific and meaningful limitations encompassing improvements in the recited fields.
Applicant's arguments filed 12/29/2025 have been fully considered with regard to the 35 U.S.C. 102/103 rejection, but they are not persuasive.
The applicant asserts on pages 17-18 of the remarks: “the Office Action points to Fig. 7 of Guttman as allegedly disclosing "generating a meta-updated set of weights for the global model based on [an] initial set of weights [for the model], the Hessians received [from client compute nodes], and an evaluated gradient expression [performed using an updated set of weights from the client compute nodes]." However, Fig. 7 of Guttman merely describes using first and second "update information" and "information related to the distribution of [a] plurality of training examples" to determine a global update to an "inference model". Guttman does not describe how a global update is based on each of the recited items, i.e., an initial set of weights for the model, Hessians received from client compute nodes (computed based on datasets stored on the client compute nodes), and an evaluated gradient expression (which is based on an updated set of model weights received from the client compute nodes).” The examiner respectfully disagrees. The Office action mailed 07/28/2025 relies on paragraphs [0124] and [0134], as well as Fig. 7, to teach generating a meta-updated set of weights for the global model based on the initial set of weights, the Hessians received, and the evaluated gradient expression. ¶[0134] discloses the initial set of weights and the evaluated gradient expressions, i.e., gradient descent updates, used for global updates. Furthermore, ¶[0122] teaches the update information to be based on Hessians corresponding to training examples, and ¶[0130] explicitly discloses that this update information is used to determine a global update.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HUMAIRA ZAHIN MAUNI whose telephone number is (703)756-5654. The examiner can normally be reached Monday - Friday, 9 am - 5 pm (ET).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, MATT ELL can be reached at (571) 270-3264. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/H.Z.M./Examiner, Art Unit 2141
/MATTHEW ELL/Supervisory Patent Examiner, Art Unit 2141