Prosecution Insights
Last updated: April 19, 2026
Application No. 18/734,103

FEDERATED LEARNING FOR WIRELESS COMMUNICATIONS SYSTEMS

Status: Non-Final OA (§103)
Filed: Jun 05, 2024
Examiner: TURRIATE GASTULO, JUAN CARLOS
Art Unit: 2446
Tech Center: 2400 — Computer Networks
Assignee: Apple Inc.
OA Round: 1 (Non-Final)

Grant Probability: 72% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 3y 2m
Grant Probability with Interview: 99%

Examiner Intelligence

Career Allow Rate: 72% (270 granted / 376 resolved; +13.8% vs TC avg; above average)
Interview Lift: +35.9% allow rate for resolved cases with an interview
Typical Timeline: 3y 2m average prosecution; 28 applications currently pending
Career History: 404 total applications across all art units

Statute-Specific Performance

§101: 13.8% (-26.2% vs TC avg)
§103: 55.4% (+15.4% vs TC avg)
§102: 14.3% (-25.7% vs TC avg)
§112: 8.4% (-31.6% vs TC avg)

TC averages are estimates. Based on career data from 376 resolved cases.
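For concreteness, the headline figures above are straightforward arithmetic on the career data shown on this page (270 granted of 376 resolved); the sketch below reproduces that arithmetic, with the implied Tech Center average being an inference from the displayed +13.8% delta, not a figure stated on the page:

```python
# Figures taken from this page: 270 granted out of 376 resolved cases.
granted, resolved = 270, 376

career_allow_rate = granted / resolved
print(f"Career allow rate: {career_allow_rate:.1%}")   # 71.8%, displayed as 72%

# The "+13.8% vs TC avg" delta implies a Tech Center average of roughly:
implied_tc_avg = career_allow_rate - 0.138
print(f"Implied TC average: {implied_tc_avg:.1%}")     # ~58%
```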

Office Action

§103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION

This action is in response to the application filed 06/05/2024. Claims 1-20 are pending in this application.

Information Disclosure Statement

The information disclosure statement (IDS) submitted on 10/17/2025 has been placed in the record and considered by the examiner.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1-6, 8-14, and 16-20 are rejected under 35 U.S.C. 103 as being unpatentable over Keshavamurthy et al. (US 2025/0371370 A1) in view of Krouka et al. (US 2025/0265500 A1), and further in view of Park et al. ("FedFwd: Federated Learning without Backpropagation", published June 19, 2023).

Regarding claim 1, Keshavamurthy discloses a method for federated learning for a wireless communication system ([0001]: performing training of a model using federated learning), the method comprising: initializing parameters of models for user equipment (UE) devices ([0149]: a local training result reported by a particular UE may comprise updates for parameters of the local model at the particular UE; in some embodiments, the updates for the parameters of the local model are the values of the parameters of the local model when training of the local model has been completed at the UE); selecting, via a network node, two or more of the UE devices as selected UE devices to participate in federated training ([0174]: the selection of K UEs for local training can in some situations be random or based on any suitable UE selection scheme that takes into account the information included in the received FL reports (provided by the UEs); in the following example one of the selected UEs is UE1 300a… other selected UEs such as for example UE3 300c could also have been selected as a first UE along with UE1 300a); aggregating, at the network node, the local models to generate a global model ([0037]: combining the local training results to generate aggregated training results for the model); iteratively selecting the two or more of the UE devices ([0147]: the FL aggregator 400 can send training configuration to the K selected UEs to enable the selected UEs to perform local training; the FL aggregator 400 is configured to then send a signal to each of the selected UEs (e.g., UE1 and UE3) to perform local training of their local models; for example, for iteration N 451, FL is performed with UE1 and UE3 as shown by the dashed box 411); and aggregating the local models until the global model converges ([0161]: the FL iteration sets can be repeated or continued until the FL model converges, i.e., until the parameters of the global model are optimized).

However, Keshavamurthy does not disclose selecting, at the selected UE devices, one or more model layers as selected model layers to participate in the federated training. In an analogous art, Krouka discloses selecting, at the selected UE devices, one or more model layers as selected model layers to participate in the federated training ([0073]: the selected cut-layer index affects the processing energy (e.g., memory access and computation); moreover, although the devices may share the same ML model architecture, the energy consumption may be different even for devices that select the same cut-layer index). Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Keshavamurthy to comprise "selecting, at the selected UE devices, one or more model layers as selected model layers to participate in the federated training" as taught by Krouka. One of ordinary skill in the art would have been motivated because it would have enabled the devices to process a fraction of the ML model and transmit the output of the partitioning layer to the centralized server at every communication round (Krouka, [0068]).

However, Keshavamurthy-Krouka does not disclose performing, at the selected UE devices, forward-forward learning on the selected model layers to generate local models for the selected model layers of the selected UE devices; and iteratively selecting the one or more of the model layers and performing the forward-forward learning on the selected model layers. In an analogous art, Park discloses performing, at the selected UE devices, forward-forward learning on the selected model layers to generate local models for the selected model layers of the selected UE devices (pg. 2, right column, [0001]-[0002]: a federated learning algorithm, FedFwd, which follows the core steps of FedAvg, encompassing three primary stages: 1) the selection of a client subset at each iteration, 2) execution of local parameter updates, and 3) subsequent aggregation of these updates at the server; the Forward-Forward algorithm, a greedy layer-wise learning technique, adopts an alternative approach to the traditional backpropagation's one forward and one backward pass by employing two forward passes; this algorithm uniquely trains each layer by leveraging a measure of goodness); and iteratively selecting the one or more of the model layers and performing the forward-forward learning on the selected model layers (pg. 3, left column, [0003]: to ensure a fair comparison, we design FedFwd and FedAvg models to have the same number of layers and a similar number of parameters with a marginal parameter difference of approximately 1%). Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Keshavamurthy-Krouka to comprise "performing, at the selected UE devices, forward-forward learning on the selected model layers to generate local models for the selected model layers of the selected UE devices; iteratively selecting the one or more of the model layers, performing the forward-forward learning on the selected model layers" as taught by Park. One of ordinary skill in the art would have been motivated because it would have enabled the forward-forward algorithm to reduce the computational burden on local clients by eliminating the need to store all intermediate activations in memory (Park, pg. 1, right column, [0001]).
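As an editorial illustration of the forward-forward technique quoted from Park above: the sketch below is reconstructed from the quoted description only (two forward passes and a per-layer "goodness" score, trained greedily with no backpropagation through other layers). It is not code from Park or from the application; the goodness function (sum of squared activations), the threshold `theta`, and the learning rate are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_layer_ff(W, x_pos, x_neg, lr=0.03, theta=2.0, steps=200):
    """Forward-forward update for one layer: raise the layer's 'goodness'
    (sum of squared activations) on positive data and lower it on negative
    data, using only a locally computed gradient -- no backprop through
    other layers."""
    for _ in range(steps):
        for x, label in ((x_pos, 1.0), (x_neg, 0.0)):
            h = np.maximum(W @ x, 0.0)            # forward pass (ReLU layer)
            g = np.sum(h ** 2)                    # layer-local goodness score
            p = 1.0 / (1.0 + np.exp(theta - g))   # sigmoid(g - theta)
            # Gradient of the logistic loss w.r.t. W, derived for this layer
            # alone: dL/dg = p - label, dg/dh = 2h, dh/dW uses the ReLU mask.
            dW = (p - label) * np.outer(2.0 * h * (h > 0), x)
            W -= lr * dW
    return W

d_in, d_hidden = 8, 16
W = rng.normal(scale=0.1, size=(d_hidden, d_in))
x_pos = rng.normal(size=d_in)   # stand-in for "real" (positive) data
x_neg = rng.normal(size=d_in)   # stand-in for corrupted (negative) data
W = train_layer_ff(W, x_pos, x_neg)

goodness = lambda x: np.sum(np.maximum(W @ x, 0.0) ** 2)
print(goodness(x_pos) > goodness(x_neg))   # positive data should score higher
```

In a multi-layer model each layer would be trained this way in sequence on the (normalized) output of the layer below, which is what makes the method layer-wise and memory-light.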
Regarding claim 2, Keshavamurthy-Krouka-Park disclose the method for federated learning of claim 1, wherein the parameters are initialized with values that are random, pre-trained, or obtained from the network node (Keshavamurthy, [0138]: the central node provides a global model comprising parameters or data to the distributed nodes and each of the distributed nodes performs local training of a local model (referred to hereinafter as local model training) using a dataset comprising data of the distributed node during an iteration of FL).

Regarding claim 3, Keshavamurthy-Krouka-Park disclose the method for federated learning of claim 1, wherein the UE devices are selected based on their availability, reliability, or randomly (Keshavamurthy, [0145]: the selection of UEs by the FL aggregator 400 can in some situations be random or based on any suitable UE selection scheme that takes into account the obtained FL reports (provided by the UEs)).

Regarding claim 4, Keshavamurthy-Krouka-Park disclose the method for federated learning of claim 1, wherein the one or more model layers are selected based on criteria determined by each of the selected UE devices, wherein a number of one or more model layers selected by the selected UE devices is based on available computational resources, transmission bandwidth, battery power, model evaluation errors, or local dataset size (Krouka, [0073]: the selected cut-layer index affects the processing energy (e.g., memory access and computation); moreover, although the devices may share the same ML model architecture, the energy consumption may be different even for devices that select the same cut-layer index, because the transmission energy E_t depends on the output size of the cut-layer as well as the radio channel conditions of the devices). The same rationale applies as in claim 1.

Regarding claim 5, Keshavamurthy-Krouka-Park disclose the method for federated learning of claim 1, wherein aggregating is performed by averaging layers (Park, pg. 3, left column, [0002]: comparing FedFwd (FF) and FedAvg (BP) on the MNIST dataset by varying the size and depth of the hidden layers, then evaluating the training speed of FedFwd and FedAvg based on the size of the mini-batch; see Park's Table 1). The same rationale applies as in claim 1.

Regarding claim 6, Keshavamurthy-Krouka-Park disclose the method for federated learning of claim 1, wherein model aggregation is a layer-wise operation (Krouka, [0149]: network node 150 may aggregate the gradients received from devices 110-1 and 110-2, and any other devices participating in the collaborative training; aggregation of gradients may comprise any suitable method for combining the gradients of respective instances of the second ML model, for example averaging them; note that in case of the SL mode part of the gradients (up to the cut-layer) are determined by devices 110 and the rest of them are determined by network node 150, while in case of the FL mode all gradients are determined by a device). The same rationale applies as in claim 1.

Regarding claim 8, Keshavamurthy-Krouka-Park disclose the method for federated learning of claim 1, wherein each of the two or more selected UE devices chooses a set of layers for the federated training, wherein a first layer from the set of layers determines an amount of training (Krouka, [0130]: the energy consumption of device 110-1 associated with transmission/reception of data (e.g., transmission of the output of the cut-layer and/or reception of respective gradients) may be estimated based on the transmit energy of device 110-1 and the transmission time, which may be dependent on the amount of the training output data of the cut-layer and the gradients). The same rationale applies as in claim 1.
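The layer-wise averaging aggregation at issue in claims 5 and 6 can be illustrated with a minimal sketch. The dictionary layout and UE names below are hypothetical; the only idea taken from the record is that the global model's layers are element-wise averages over whichever UEs reported that layer:

```python
import numpy as np

def aggregate_layerwise(local_models):
    """Layer-wise federated averaging: each layer of the global model is the
    element-wise mean of that layer across the participating UE models.
    `local_models` maps a UE id to {layer_name: weight_array}; a UE may have
    trained (and reported) only a subset of layers."""
    layer_names = set().union(*(m.keys() for m in local_models.values()))
    global_model = {}
    for name in layer_names:
        updates = [m[name] for m in local_models.values() if name in m]
        global_model[name] = np.mean(updates, axis=0)
    return global_model

# Two UEs report overlapping subsets of layers (partial-layer participation).
ue_updates = {
    "UE1": {"layer0": np.array([1.0, 3.0]), "layer1": np.array([2.0, 2.0])},
    "UE3": {"layer0": np.array([3.0, 1.0])},
}
agg = aggregate_layerwise(ue_updates)
print(agg["layer0"])   # [2. 2.] -- mean of the two UE reports
print(agg["layer1"])   # [2. 2.] -- only UE1 reported this layer
```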
Regarding claim 9, Keshavamurthy-Krouka-Park disclose the method for federated learning of claim 1, wherein the local models that the UE devices send include partial layers of a neural network (Krouka, [0108]: split learning with cut-layer index i (SL^i), federated learning with the index of the cut-layer corresponding to the final layer of the second ML model, or an idle mode indicative of the respective device 110-1, 110-2, 110-3 not participating in the collaborative training of the second ML model; the training modes of devices 110 may therefore be indicative of the respective cut-layers at which devices 110 are configured to provide the training output data of the second ML model). The same rationale applies as in claim 1.

Regarding claim 10, Keshavamurthy discloses a network node apparatus comprising: a processor; and a memory storing instructions that, when executed by the processor, configure the apparatus to: initialize parameters of models for user equipment (UE) devices ([0149]: a local training result reported by a particular UE may comprise updates for parameters of the local model at the particular UE; in some embodiments, the updates for the parameters of the local model are the values of the parameters of the local model when training of the local model has been completed at the UE); select two or more of the UE devices as selected UE devices to participate in federated training ([0174]: the selection of K UEs for local training can in some situations be random or based on any suitable UE selection scheme that takes into account the information included in the received FL reports (provided by the UEs); in the following example one of the selected UEs is UE1 300a… other selected UEs such as for example UE3 300c could also have been selected as a first UE along with UE1 300a); and aggregate, at the network node, the local models to generate a global model ([0037]: combining the local training results to generate aggregated training results for the model).

However, Keshavamurthy does not disclose instructing the selected UE devices to select one or more model layers as selected model layers to participate in the federated training. In an analogous art, Krouka discloses instructing the selected UE devices to select one or more model layers as selected model layers to participate in the federated training ([0106]: a reward (e.g., R=1) may be provided if the SL mode is selected with cut-layer index i that results in the minimum estimated energy consumption for a device among the selectable cut-layer indices and the energy consumption is below a threshold E_max (e.g., maximum allowed energy consumption)). Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Keshavamurthy to comprise "instruct the selected UE devices to select one or more model layers as selected model layers to participate in the federated training" as taught by Krouka. One of ordinary skill in the art would have been motivated because it would have enabled the devices to process a fraction of the ML model and transmit the output of the partitioning layer to the centralized server at every communication round (Krouka, [0068]).

However, Keshavamurthy-Krouka does not disclose receiving local models generated by the selected UE devices performing forward-forward learning on the selected model layers. In an analogous art, Park discloses receiving local models generated by the selected UE devices performing forward-forward learning on the selected model layers (pg. 2, right column, [0001]-[0002]: a federated learning algorithm, FedFwd, which follows the core steps of FedAvg, encompassing three primary stages: 1) the selection of a client subset at each iteration, 2) execution of local parameter updates, and 3) subsequent aggregation of these updates at the server; the Forward-Forward algorithm, a greedy layer-wise learning technique, adopts an alternative approach to the traditional backpropagation's one forward and one backward pass by employing two forward passes; this algorithm uniquely trains each layer by leveraging a measure of goodness; pg. 3, left column, [0003]: to ensure a fair comparison, we design FedFwd and FedAvg models to have the same number of layers and a similar number of parameters with a marginal parameter difference of approximately 1%). Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Keshavamurthy-Krouka to comprise "receive local models generated by the selected UE devices performing forward-forward learning on the selected model layers" as taught by Park. One of ordinary skill in the art would have been motivated because it would have enabled the forward-forward algorithm to reduce the computational burden on local clients by eliminating the need to store all intermediate activations in memory (Park, pg. 1, right column, [0001]).

Regarding claim 11, the claim is interpreted and rejected for the same reasons as set forth in claim 2. Regarding claim 12, the claim is interpreted and rejected for the same reasons as set forth in claim 3. Regarding claim 13, the claim is interpreted and rejected for the same reasons as set forth in claim 5. Regarding claim 14, the claim is interpreted and rejected for the same reasons as set forth in claim 6. Regarding claim 16, the claim is interpreted and rejected for the same reasons as set forth in claim 9.
Regarding claim 17, Keshavamurthy discloses a user equipment (UE) apparatus comprising: a processor; and a memory storing instructions that, when executed by the processor, configure the apparatus to: initialize parameters of models for the UE apparatus ([0149]: a local training result reported by a particular UE may comprise updates for parameters of the local model at the particular UE; in some embodiments, the updates for the parameters of the local model are the values of the parameters of the local model when training of the local model has been completed at the UE); receive an indication from a network node that the UE apparatus is selected to participate in a federated training ([0174]: the selection of K UEs for local training can in some situations be random or based on any suitable UE selection scheme that takes into account the information included in the received FL reports (provided by the UEs); in the following example one of the selected UEs is UE1 300a… other selected UEs such as for example UE3 300c could also have been selected as a first UE along with UE1 300a); and send the local model to the network node for aggregation to generate a global model ([0037]: combining the local training results to generate aggregated training results for the model).

However, Keshavamurthy does not disclose selecting one or more model layers as selected model layers to participate in the federated training. In an analogous art, Krouka discloses selecting one or more model layers as selected model layers to participate in the federated training ([0106]: a reward (e.g., R=1) may be provided if the SL mode is selected with cut-layer index i that results in the minimum estimated energy consumption for a device among the selectable cut-layer indices and the energy consumption is below a threshold E_max (e.g., maximum allowed energy consumption)). Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Keshavamurthy to comprise "select one or more model layers as selected model layers to participate in the federated training" as taught by Krouka. One of ordinary skill in the art would have been motivated because it would have enabled the devices to process a fraction of the ML model and transmit the output of the partitioning layer to the centralized server at every communication round (Krouka, [0068]).

However, Keshavamurthy-Krouka does not disclose performing forward-forward learning on the selected model layers to generate local models for the selected model layers. In an analogous art, Park discloses performing forward-forward learning on the selected model layers to generate local models for the selected model layers (pg. 2, right column, [0001]-[0002]: a federated learning algorithm, FedFwd, which follows the core steps of FedAvg, encompassing three primary stages: 1) the selection of a client subset at each iteration, 2) execution of local parameter updates, and 3) subsequent aggregation of these updates at the server; the Forward-Forward algorithm, a greedy layer-wise learning technique, adopts an alternative approach to the traditional backpropagation's one forward and one backward pass by employing two forward passes; this algorithm uniquely trains each layer by leveraging a measure of goodness; pg. 3, left column, [0003]: to ensure a fair comparison, we design FedFwd and FedAvg models to have the same number of layers and a similar number of parameters with a marginal parameter difference of approximately 1%).
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Keshavamurthy-Krouka to comprise "perform forward-forward learning on the selected model layers to generate local models for the selected model layers" as taught by Park. One of ordinary skill in the art would have been motivated because it would have enabled the forward-forward algorithm to reduce the computational burden on local clients by eliminating the need to store all intermediate activations in memory (Park, pg. 1, right column, [0001]).

Regarding claim 18, the claim is interpreted and rejected for the same reasons as set forth in claim 2. Regarding claim 19, the claim is interpreted and rejected for the same reasons as set forth in claim 3. Regarding claim 20, the claim is interpreted and rejected for the same reasons as set forth in claim 4.

Claims 7 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Keshavamurthy in view of Krouka and Park, as applied to claim 1, and further in view of Mo et al. (US 2024/0144009 A1).

Regarding claim 7, Keshavamurthy-Krouka-Park disclose the method for federated learning of claim 1. However, Keshavamurthy-Krouka-Park does not disclose wherein the selected model layers aggregation can be homogeneous or heterogeneous, wherein homogeneous aggregation comprises averaging neural network coefficients on a same layer among all UE devices, and wherein heterogeneous aggregation comprises averaging neural network coefficients on different layers. In an analogous art, Mo discloses wherein the selected model layers aggregation can be homogeneous or heterogeneous, wherein homogeneous aggregation comprises averaging neural network coefficients on a same layer among all UE devices, and wherein heterogeneous aggregation comprises averaging neural network coefficients on different layers ([0237]: a method of federated learning 330; at stage 232, the method identifies the model part; if the model part is an encoder 12 it is aggregated by averaging 336; if the model part is a predictor 14 it is determined if it is homogeneous or heterogeneous with a predictor 14 that is being averaged at stage 334; [0245]: homogeneous aggregation can occur for predictors 14 associated with the same computational resource class, and heterogeneous aggregation can occur across the aggregated predictors 14 of the different classes). Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Keshavamurthy-Krouka-Park to comprise "wherein the selected model layers aggregation can be homogeneous or heterogeneous, wherein homogeneous aggregation comprises averaging neural network coefficients on a same layer among all UE devices, and wherein heterogeneous aggregation comprises averaging neural network coefficients on different layers" as taught by Mo. One of ordinary skill in the art would have been motivated because it would have enabled performing federated learning for the neural network based on homogeneous or heterogeneous predictors (Mo, [0237]).

Regarding claim 15, the claim is interpreted and rejected for the same reasons as set forth in claim 7.

Additional References

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Balevi et al., US 2023/0316062 A1: Layer by Layer Training for Federated Learning.
Li et al., US 2023/0082173: Data Processing Method, Federated Learning Training Method, and Related Apparatus and Device.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JUAN C TURRIATE GASTULO, whose telephone number is (571) 272-6707. The examiner can normally be reached Monday through Friday, 8 am to 4 pm. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Brian J Gillis, can be reached at 571-272-7952. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/J.C.T/ Examiner, Art Unit 2446
/BRIAN J. GILLIS/ Supervisory Patent Examiner, Art Unit 2446
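As an editorial aside on the rejection of claims 7 and 15: one plausible reading of the "homogeneous" versus "heterogeneous" aggregation recited in claim 7 (averaging coefficients on a same layer among all UE devices versus averaging coefficients on different layers) can be sketched as follows. This is an interpretation of the claim language only, not Mo's disclosed method; the function names and data layout are hypothetical.

```python
import numpy as np

def homogeneous_aggregate(updates):
    """Average coefficients of the SAME layer across all UE devices:
    every array in `updates` is that one layer's report from a different UE."""
    return np.mean(updates, axis=0)

def heterogeneous_aggregate(updates_by_layer):
    """Average coefficients across DIFFERENT layers: first average each
    layer's UE reports, then average the per-layer results together
    (which requires the layers to share a common shape)."""
    per_layer = [np.mean(u, axis=0) for u in updates_by_layer.values()]
    return np.mean(per_layer, axis=0)

layer2_reports = [np.array([1.0, 5.0]), np.array([3.0, 1.0])]
print(homogeneous_aggregate(layer2_reports))        # [2. 3.]

mixed = {"layer2": layer2_reports, "layer3": [np.array([4.0, 1.0])]}
print(heterogeneous_aggregate(mixed))               # [3. 2.]
```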

Prosecution Timeline

Jun 05, 2024: Application Filed
Feb 21, 2026: Non-Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12603795: INFORMATION PROCESSING TERMINAL, INFORMATION PROCESSING DEVICE, AND SYSTEM (granted Apr 14, 2026; 2y 5m to grant)
Patent 12587432: Visual Map for Network Alerts (granted Mar 24, 2026; 2y 5m to grant)
Patent 12574436: BLOCKCHAIN MACHINE BROADCAST PROTOCOL WITH LOSS RECOVERY (granted Mar 10, 2026; 2y 5m to grant)
Patent 12566427: Method and System for Synchronizing Configuration Data in a Plant (granted Mar 03, 2026; 2y 5m to grant)
Patent 12568059: UPDATING COMMUNICATIONS WITH MACHINE LEARNING AND PLATFORM CONTEXT (granted Mar 03, 2026; 2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 72%
With Interview: 99% (+35.9%)
Median Time to Grant: 3y 2m
PTA Risk: Low

Based on 376 resolved cases by this examiner. Grant probability derived from the career allow rate.
