Prosecution Insights
Last updated: April 19, 2026
Application No. 17/975,664

FEDERATED LEARNING SYSTEM FOR PERFORMING INDIVIDUAL DATA CUSTOMIZED FEDERATED LEARNING, METHOD FOR FEDERATED LEARNING, AND CLIENT APPARATUS FOR PERFORMING SAME

Final Rejection §103
Filed: Oct 28, 2022
Examiner: WU, NICHOLAS S
Art Unit: 2148
Tech Center: 2100 — Computer Architecture & Software
Assignee: Korea Advanced Institute Of Science And Technology
OA Round: 2 (Final)
Grant Probability: 47% (Moderate)
Expected OA Rounds: 3-4
Time to Grant: 3y 9m
With Interview: 90%

Examiner Intelligence

Career Allow Rate: 47% (18 granted / 38 resolved; -7.6% vs TC avg). Grants 47% of resolved cases.
Interview Lift: +43.1% in resolved cases with interview (a strong lift).
Typical Timeline: 3y 9m average prosecution; 44 applications currently pending.
Career History: 82 total applications across all art units.

Statute-Specific Performance

§101: 26.7% (-13.3% vs TC avg)
§103: 52.6% (+12.6% vs TC avg)
§102: 3.1% (-36.9% vs TC avg)
§112: 17.4% (-22.6% vs TC avg)
Tech Center averages are estimates. Based on career data from 38 resolved cases.

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments

Applicant's arguments filed 11/05/2025 have been fully considered but they are not fully persuasive.

Regarding the 101 rejections, applicant’s arguments and amendments to the independent claims are persuasive and overcome the previous 101 rejections. Specifically, applicant’s amended limitations of “wherein the federated learning model includes a hidden layer and the federated learning system divides the hidden layer into the extractor and the classifier, wherein the classifier includes a last hidden layer positioned immediately before an output layer among layers included in the hidden layer of the federated learning model, wherein the extractor includes at least one of layers from a first hidden layer positioned immediately after an input layer to a hidden layer positioned immediately before the last hidden layer included in the hidden layer among the layers included in the federated learning model, and wherein the extractor is updated globally through federated learning while the classifier remains fixed, and after the federated learning is completed, the classifier is locally trained at each of the plurality of the client devices” provide a technical improvement because training a classifier portion of a federated learning model separately on local data, apart from the rest of the model, improves data imbalance performance. See pg. 10-11 of “Remarks”:

“Page 10, lines 12-20 of the specification of the present application states the following: ‘Fig. 2 is an exemplary diagram illustrating the structure of the federated learning model used in the federated learning system 10 according to the embodiment of the present disclosure. Referring to Fig. 2, the federated learning model according to the embodiment may include a neural network including an input layer, a hidden layer, and an output layer. The federated learning system 10 according to the embodiment divides the hidden layer of the federated learning model into an extractor and a classifier once more. The extractor may include layers from the frontmost layer in contact with the input layer to a layer just before the layer at the last end of the hidden layer, among the layers constituting the federated learning model....’

Page 15, lines 7-16 of the specification of the present application states the following: ‘According to the above-described embodiment, the federated learning model is divided into an extractor and a classifier, and in the primary training process of training the global model, federated learning is performed by intensively training the extractor of the federated learning model from data held by each client device 200, and thus, the federated learning speed can be improved.
In addition, in the secondary training process of individually training the local model after the completion of training the global model, each client device 200 uses the data held by each client device 200 to individually train the classifier, so that each client device 200 has a federated learning model having a decision boundary customized by the train data set stored in each client device 200, and accordingly, each client device 200 can use a model with greatly improved accuracy for the data it mainly uses.’”

Applicant’s amendments and corresponding arguments that the claimed invention provides a technical improvement to the field of federated learning are persuasive. Therefore, the 101 rejections are withdrawn.

Regarding the 103 rejections, applicant’s arguments about reference(s) Arivazhagan have been fully considered but are not persuasive.

Alleged no teaching of hidden layers and division of hidden layer into extractor and classifier

In Remarks/Arguments pg. 16, applicant contends: “However, Applicant respectfully submits that Arivazhagan does not disclose, teach, or suggest a hidden layer. Arivazhagan does not mention the term "hidden layer." In addition, Arivazhagan does not disclose that the federated learning model includes a hidden layer and that the federated learning system divides the hidden layer into the extractor and the classifier.”

The relevant claim limitations appear to be: “wherein the federated learning model includes a hidden layer and the federated learning system divides the hidden layer into the extractor and the classifier” in claim 1.

Arivazhagan teaches: (Arivazhagan, abstract, “This paper proposes FedPer, a base + personalization layer approach for federated training of deep feed forward neural networks”). (Arivazhagan, pg. 7 col. 1, “base layers capture complex image features through federated training.”). (Arivazhagan, pg. 5 col. 2, “We consider personalization layers to include the classifier layer (final fully connected layer)”).

In other words, Arivazhagan teaches a base and personalized layers approach to training federated deep feed forward neural networks. The mention of the base and personalization layers being applied to a deep feed forward neural network is interpreted as the base and personalization layers being hidden layers because one of ordinary skill in the art knows that deep neural networks are comprised of more than one hidden layer. Additionally, Arivazhagan shows that the hidden layer of the federated learning model is divided into a base, or feature extractor, portion and a personalized, or classifier, portion. Therefore, applicant’s arguments are not persuasive.

Alleged no teaching of claimed model’s extractor and classifier layer order

In Remarks/Arguments pg. 16, applicant contends: “However, Applicant respectfully submits that Arivazhagan does not disclose, "wherein the classifier includes a last hidden layer positioned immediately before an output layer among layers included in the hidden layer of the federated learning model, wherein the extractor includes at least one of layers from a first hidden layer positioned immediately after an input layer to a hidden layer positioned immediately before the last hidden layer included in the hidden layer among the layers included in the federated learning model," as recited in claim 1.”

The relevant claim limitations appear to be: “wherein the classifier includes a last hidden layer positioned immediately before an output layer among layers included in the hidden layer of the federated learning model, wherein the extractor includes at least one of layers from a first hidden layer positioned immediately after an input layer to a hidden layer positioned immediately before the last hidden layer included in the hidden layer among the layers included in the federated learning model” in claim 1.

Arivazhagan teaches: (Arivazhagan, pg. 1, Figure 1, “Figure 1: Pictorial view of proposed federated personalization approach. All user devices share a set of base layers with same weights (colored blue) and have distinct personalization layers that can potentially adapt to individual data. The base layers are shared with the parameter server while the personalization layers are kept private by each device.”).

In other words, Arivazhagan teaches a base layer that is positioned before a personalized layer within the federated model. Arivazhagan shows that the two base layers, the larger layers of Model A/B/N in Figure 1, are positioned before the personalized layer, the smallest layer of Models A/B/N in Figure 1. As mentioned above, the base and personalized layers are interpreted as hidden layers, the base layers are interpreted as the extractor, and the personalized layers are interpreted as the classifier. Therefore, Arivazhagan teaches wherein the classifier includes a last hidden layer positioned immediately before an output layer among layers included in the hidden layer of the federated learning model, wherein the extractor includes at least one of layers from a first hidden layer positioned immediately after an input layer to a hidden layer positioned immediately before the last hidden layer included in the hidden layer among the layers included in the federated learning model. Therefore, applicant’s arguments are not persuasive.

Alleged no teaching of globally training extractor and locally training classifier

In Remarks/Arguments pg. 16-17, applicant contends: “wherein the extractor is updated globally through federated learning while the classifier remains fixed, and after the federated learning is completed, the classifier is locally trained at each of the plurality of the client devices," as recited in claim 1. For example, in the paragraph bridging pages 10 and 11 (second paragraph of section A.3 of Arivazhagan), Arivazhagan discloses "In this experiment, at the start of each global round, we fine tune the personalization layer parameters of the clients for 1 epoch by freezing the base layers. We hypothesize that fine tuning the personalization layers after receiving the base layer parameters from the server will help the local client models to accommodate to the previous round changes in the base layer parameters better."
However, Arivazhagan fails to disclose the limitation of "wherein the extractor is updated globally through federated learning while the classifier remains fixed, and after the federated learning is completed, the classifier is locally trained at each of the plurality of the client devices" of the amended independent claim 1. Watanabe and Hu fail to cure the deficiencies of Arivazhagan.”

The relevant claim limitations appear to be: “wherein the extractor is updated globally through federated learning while the classifier remains fixed, and after the federated learning is completed, the classifier is locally trained at each of the plurality of the client devices” in claim 1.

Arivazhagan teaches: (Arivazhagan, pg. 2 col. 2, “We propose to capture personalization aspects in federated learning by viewing deep learning models as base + personalization layers as illustrated in Figure 1. Our training algorithm comprises of the base layers being trained by federated averaging (or some variant thereof) and personalization layers being trained only from local data with stochastic gradient descent (or some variant thereof). We demonstrate that the personalization layers that are free from the federated averaging (FedAvg) procedure can help combat the ill-effects of statistical heterogeneity”).

In other words, Arivazhagan teaches base layers that are updated globally using some variation of federated averaging across the other devices and the classifier being trained locally with some variation of gradient descent. As mentioned above, the base layers are interpreted as the extractor and the personalized layers are interpreted as the classifier. Arivazhagan states that they purposely exclude the personalized layers from the federated learning step to reduce the effects that different local datasets have on the overall model performance. Therefore, Arivazhagan teaches wherein the extractor is updated globally through federated learning while the classifier remains fixed, and after the federated learning is completed, the classifier is locally trained at each of the plurality of the client devices. Therefore, applicant’s arguments are not persuasive.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 6, and 11-12 are rejected under 35 U.S.C. 103 as being unpatentable over Watanabe, et al., US Pre-Grant Publication 2023/0050708A1 (“Watanabe”) in view of Arivazhagan, et al., Non-Patent Literature “Federated Learning with Personalization Layers” (“Arivazhagan”).

Regarding claim 1, Watanabe discloses:

A federated learning system comprising: a central server configured to (Watanabe, ¶24, “depicting a computing environment 100 for training a federated learning model [A federated learning system] in accordance with an embodiment of the present invention. As depicted, computing environment 100 includes a central training server 105 [comprising: a central server configured to], distributed computing nodes 130A-130N, and a network 160.”).

transmit a first parameter of an extractor in a federated learning model including the extractor and a classifier to each of a plurality of client devices, (Watanabe, ¶28, “In some embodiments, federated learning module 115 initiates a federated learning task by providing copies of an algorithm that represents the pre-trained state of a federated learning model to any computing nodes that are participating in the federated learning task (e.g., distributed computing nodes 130A-130N). Federated learning module 115 may select initial values or settings [transmit a first parameter] for the pre-trained algorithm, such as providing random or other starting values to initialize the algorithm…The type of machine learning model that is implemented by the distributed pre-trained algorithms may vary depending on the federated learning task, and may include any form of machine learning that is suitable for training via any conventional or other federated learning approaches. For example, federated learning algorithms may utilize neural network-based machine learning techniques; extractors and classifiers are interpreted as conventional machine learning techniques (i.e. of an extractor in a federated learning model including the extractor and a classifier to each of a plurality of client devices,).”).

and receive a plurality of first parameters learned from the plurality of client devices to update the federated learning model; (Watanabe, ¶29, “When a participating computing node receives the pre-trained algorithm, the computing node trains the algorithm using a local set of training data (which may be supplemented with additional training data in accordance with present invention embodiments), and once training is complete, the computing node provides the results back to central training server 105 [and receive a plurality of first parameters learned from the plurality of client devices]. Federated learning module 115 may combine the results from multiple computing nodes to generate a global trained model, which can be generated using any conventional or other federated learning techniques [to update the federated learning model;].”).

and the plurality of client devices configured to train each of the plurality of the first parameters of the federated learning model using a training data set stored in each of the plurality of client devices…and to transmit each of the plurality of the trained first parameters to the central server, (Watanabe, ¶29, “When a participating computing node receives the pre-trained algorithm, the computing node trains the algorithm using a local set of training data [and the plurality of client devices configured to train each of the plurality of the first parameters of the federated learning model using a training data set stored in each of the plurality of client devices…] (which may be supplemented with additional training data in accordance with present invention embodiments), and once training is complete, the computing node provides the results back to central training server 105 [and to transmit each of the plurality of the trained first parameters to the central server,].”).
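For orientation, the arrangement at issue (hidden layers split into a globally federated extractor and a locally personalized classifier, with the classifier frozen during federated rounds and fine-tuned on each client's own data afterwards) can be sketched in a few lines of PyTorch. This is a minimal illustrative sketch, not the claimed system, Watanabe's embodiment, or Arivazhagan's FedPer code; names such as SplitModel, local_train, and fedavg_extractor_round are hypothetical.

```python
# Minimal illustrative sketch (hypothetical names, not code from the application or the
# cited references): a small feed-forward network whose hidden layers are split into a
# globally federated "extractor" and a locally personalized "classifier".
import copy
import torch
from torch import nn

class SplitModel(nn.Module):
    def __init__(self, in_dim=32, hidden=64, num_classes=10):
        super().__init__()
        # Extractor: hidden layers shared across clients via federated averaging.
        self.extractor = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # Classifier: last layer before the output, kept local to each client.
        self.classifier = nn.Linear(hidden, num_classes)

    def forward(self, x):
        return self.classifier(self.extractor(x))

def local_train(model, data, target, train_classifier, epochs=1, lr=0.01):
    """Train only one part: the extractor (classifier frozen) or the classifier."""
    part = model.classifier if train_classifier else model.extractor
    frozen = model.extractor if train_classifier else model.classifier
    for p in frozen.parameters():
        p.requires_grad_(False)
    for p in part.parameters():
        p.requires_grad_(True)
    opt = torch.optim.SGD(part.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(data), target).backward()
        opt.step()

def fedavg_extractor_round(global_model, client_data):
    """One federated round: each client trains the extractor locally; the server averages."""
    states = []
    for data, target in client_data:
        client = copy.deepcopy(global_model)
        local_train(client, data, target, train_classifier=False)
        states.append(client.extractor.state_dict())
    averaged = {k: torch.stack([s[k] for s in states]).mean(dim=0) for k in states[0]}
    global_model.extractor.load_state_dict(averaged)

# Toy usage with two clients holding random data.
clients = [(torch.randn(16, 32), torch.randint(0, 10, (16,))) for _ in range(2)]
global_model = SplitModel()
for _ in range(3):                        # primary phase: global extractor training
    fedavg_extractor_round(global_model, clients)
personalized = []
for data, target in clients:              # secondary phase: local classifier fine-tuning
    local = copy.deepcopy(global_model)
    local_train(local, data, target, train_classifier=True, epochs=5)
    personalized.append(local)
```

In this toy loop, only extractor weights travel to the server and are averaged; classifier weights never leave the client, which is the mechanism behind the per-client decision boundaries discussed for claims 2, 6, and 12.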
While Watanabe teaches a federated learning system, Watanabe does not explicitly teach: …while maintaining a value of a second parameter value of the classifier in the federated learning model,… wherein the federated learning model includes a hidden layer and the federated learning system divides the hidden layer into the extractor and the classifier, wherein the classifier includes a last hidden layer positioned immediately before an output layer among layers included in the hidden layer of the federated learning model, wherein the extractor includes at least one of layers from a first hidden layer positioned immediately after an input layer to a hidden layer positioned immediately before the last hidden layer included in the hidden layer among the layers included in the federated learning model, and wherein the extractor is updated globally through federated learning while the classifier remains fixed, and after the federated learning is completed, the classifier is locally trained at each of the plurality of the client devices.

Arivazhagan teaches:

…while maintaining a value of a second parameter of the classifier in the federated learning model,… (Arivazhagan, pg. 2 col. 2, “We propose to capture personalization aspects in federated learning by viewing deep learning models as base + personalization layers as illustrated in Figure 1. Our training algorithm comprises of the base layers being trained by federated averaging (or some variant thereof) and personalization layers being trained only from local data with stochastic gradient descent (or some variant thereof). We demonstrate that the personalization layers that are free from the federated averaging (FedAvg) procedure can help combat the ill-effects of statistical heterogeneity; using personalized layers that are not affected by the federated learning step is interpreted as maintaining values of a second parameter of a model in the federated model (i.e. …while maintaining a value of a second parameter of the classifier in the federated learning model,…).”).

wherein the federated learning model includes a hidden layer and the federated learning system divides the hidden layer into the extractor and the classifier, (Arivazhagan, abstract, “This paper proposes FedPer, a base + personalization layer approach for federated training of deep feed forward neural networks; the base and personalization layers are interpreted as hidden layers of a feed forward neural network (i.e. wherein the federated learning model includes a hidden layer)”, and Arivazhagan, pg. 7 col. 1, “base layers capture complex image features through federated training [and the federated learning system divides the hidden layer into the extractor].”, and Arivazhagan, pg. 5 col. 2, “We consider personalization layers to include the classifier layer (final fully connected layer) [and the classifier,]”).

wherein the classifier includes a last hidden layer positioned immediately before an output layer among layers included in the hidden layer of the federated learning model, wherein the extractor includes at least one of layers from a first hidden layer positioned immediately after an input layer to a hidden layer positioned immediately before the last hidden layer included in the hidden layer among the layers included in the federated learning model, (Arivazhagan, pg. 1, Figure 1, “Figure 1: Pictorial view of proposed federated personalization approach.
All user devices share a set of base layers with same weights (colored blue); the base layers are interpreted as the extractor layers. In Figure 1, the two larger layers of Model A/B/N are the base layers (i.e. wherein the extractor includes at least one of layers from a first hidden layer positioned immediately after an input layer to a hidden layer positioned immediately before the last hidden layer included in the hidden layer among the layers included in the federated learning model,) and have distinct personalization layers that can potentially adapt to individual data. The base layers are shared with the parameter server while the personalization layers are kept private by each device; personalized layers are interpreted as the classifier layers. In Figure 1, the personalized layers are the smallest layer in Models A/B/N (i.e. wherein the classifier includes a last hidden layer positioned immediately before an output layer among layers included in the hidden layer of the federated learning model,).”).

and wherein the extractor is updated globally through federated learning while the classifier remains fixed, and after the federated learning is completed, the classifier is locally trained at each of the plurality of the client devices. (Arivazhagan, pg. 2 col. 2, “We propose to capture personalization aspects in federated learning by viewing deep learning models as base + personalization layers as illustrated in Figure 1. Our training algorithm comprises of the base layers being trained by federated averaging (or some variant thereof) [and wherein the extractor is updated globally through federated learning] and personalization layers being trained only from local data with stochastic gradient descent (or some variant thereof) [and after the federated learning is completed, the classifier is locally trained at each of the plurality of the client devices.]. We demonstrate that the personalization layers that are free from the federated averaging (FedAvg) [while the classifier remains fixed,] procedure can help combat the ill-effects of statistical heterogeneity”).

Watanabe and Arivazhagan are both in the same field of endeavor (i.e. federated learning). It would have been obvious for a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Watanabe and Arivazhagan to teach the above limitation(s). The motivation for doing so is that incorporating personalization in federated learning improves model performance by countering the effects of data heterogeneity in a distributed environment (cf. Arivazhagan, abstract, “Specifically, statistical heterogeneity of data across user devices can severely degrade performance of standard federated averaging for traditional machine learning applications like personalization with deep learning. This paper proposes FedPer, a base + personalization layer approach for federated training of deep feed forward neural networks, which can combat the ill-effects of statistical heterogeneity.”).

Regarding claim 6, the claim is similar to claim 1.
Watanabe teaches the additional limitations of and updating, by the central server, the federated learning model by receiving the plurality of the first parameters trained from each of the plurality of client devices; (Watanabe, ¶29, “When a participating computing node receives the pre-trained algorithm, the computing node trains the algorithm using a local set of training data (which may be supplemented with additional training data in accordance with present invention embodiments), and once training is complete, the computing node provides the results back to central training server 105. Federated learning module 115 may combine the results from multiple computing nodes to generate a global trained model, which can be generated using any conventional or other federated learning techniques [and updating, by the central server, the federated learning model by receiving the plurality of the first parameters trained from each of the plurality of client devices;].”).

Arivazhagan teaches the additional limitation and after the updating of the federated learning model, by each of the plurality of client devices, receiving the federated learning model on which federated learning is completed from the central server, and updating the second parameter of the federated learning model using the training data set stored in each of the plurality of client devices, so that each client device has the federated learning model having a decision boundary customized by the training data set stored in each client device, (Arivazhagan, pg. 2 col. 2, “We propose to capture personalization aspects in federated learning by viewing deep learning models as base + personalization layers as illustrated in Figure 1. Our training algorithm comprises of the base layers being trained by federated averaging (or some variant thereof) [and after the updating of the federated learning model, by each of the plurality of client devices, receiving the federated learning model on which federated learning is completed from the central server,] and personalization layers being trained only from local data with stochastic gradient descent (or some variant thereof) [and updating the second parameter of the federated learning model using the training data set stored in each of the plurality of client devices,]. We demonstrate that the personalization layers that are free from the federated averaging (FedAvg) procedure can help combat the ill-effects of statistical heterogeneity [so that each client device has the federated learning model having a decision boundary customized by the training data set stored in each client device,]”).

Watanabe and Arivazhagan are both in the same field of endeavor (i.e. federated learning). It would have been obvious for a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Watanabe and Arivazhagan to teach the above limitation(s). The motivation for doing so is that incorporating personalization in federated learning improves model performance by countering the effects of data heterogeneity in a distributed environment (cf. Arivazhagan, abstract, “Specifically, statistical heterogeneity of data across user devices can severely degrade performance of standard federated averaging for traditional machine learning applications like personalization with deep learning.
This paper proposes FedPer, a base + personalization layer approach for federated training of deep feed forward neural networks, which can combat the ill-effects of statistical heterogeneity.”).

Regarding claim 11, the claim is similar to claim 1. Watanabe teaches the additional limitations of:

A client device for training a federated learning model, the client device comprising: a communication unit that transmits and receives information to and from a central server; (Watanabe, ¶29, “When a participating computing node [A client device for training a federated learning model,] receives the pre-trained algorithm, the computing node trains the algorithm using a local set of training data (which may be supplemented with additional training data in accordance with present invention embodiments), and once training is complete, the computing node provides the results back to central training server 105 [the client device comprising: a communication unit that transmits and receives information to and from a central server;].”).

a memory; and a processor, wherein the processor is configured to (Watanabe, ¶41, “Distributed computing nodes 130A-130N each include a network interface (I/F) 131, at least one processor 132, memory 135, and a database 155 [a memory; and a processor, wherein the processor is configured to].”).

Regarding claim 2 and analogous claim 12, Watanabe in view of Arivazhagan teaches the system of claim 1. Arivazhagan further teaches wherein each of the plurality of client devices update the second parameter of the federated learning model using the training data set stored in each of the plurality of client devices after each of the plurality of client devices receive the federated learning model on which federated learning is completed from the central server. (Arivazhagan, pg. 2 col. 2, “We propose to capture personalization aspects in federated learning by viewing deep learning models as base + personalization layers as illustrated in Figure 1. Our training algorithm comprises of the base layers being trained by federated averaging (or some variant thereof) and personalization layers being trained only from local data [wherein each of the plurality of client devices update the second parameter of the federated learning model using the training data set stored in each of the plurality of client devices] with stochastic gradient descent (or some variant thereof). We demonstrate that the personalization layers that are free from the federated averaging (FedAvg) procedure can help combat the ill-effects of statistical heterogeneity; using personalized layers that are not affected by the federated learning step is interpreted as updating the personalized layers after receiving the federated trained base layers (i.e. after each of the plurality of client devices receive the federated learning model on which federated learning is completed from the central server.).”). It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Arivazhagan with the teachings of Watanabe for the same reasons disclosed in claim 1.

Claims 3-4, 8-9, and 13-14 are rejected under 35 U.S.C. 103 as being unpatentable over Watanabe, et al., US Pre-Grant Publication 2023/0050708A1 (“Watanabe”) in view of Arivazhagan, et al., Non-Patent Literature “Federated Learning with Personalization Layers” (“Arivazhagan”) and further in view of Hu, et al., Non-Patent Literature “PROVABLE BENEFIT OF ORTHOGONAL INITIALIZATION IN OPTIMIZING DEEP LINEAR NETWORKS” (“Hu”).

Regarding claim 3 and analogous claims 8 and 13, Watanabe in view of Arivazhagan teaches the system of claim 1. Arivazhagan further teaches wherein the second parameter maintains a preset value…in the training process of each of the plurality of the first parameters of each of the plurality of client devices. (Arivazhagan, pg. 2 col. 2, “We propose to capture personalization aspects in federated learning by viewing deep learning models as base + personalization layers as illustrated in Figure 1. Our training algorithm comprises of the base layers being trained by federated averaging (or some variant thereof) and personalization layers being trained only from local data with stochastic gradient descent (or some variant thereof). We demonstrate that the personalization layers that are free from the federated averaging (FedAvg) procedure can help combat the ill-effects of statistical heterogeneity; using personalized layers that are not affected by the federated learning step is interpreted as maintaining values of a second parameter during the training of the first parameter values (i.e. wherein the second parameter maintains a preset value…in the training process of each of the plurality of the first parameters of each of the plurality of client devices.).”). It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Arivazhagan with the teachings of Watanabe for the same reasons disclosed in claim 1.

While Watanabe in view of Arivazhagan teaches a personalized federated learning system by separating the training of a model between personalized and base layers, the combination does not explicitly teach: …according to a predetermined weight initialization algorithm…

Hu teaches …according to a predetermined weight initialization algorithm… (Hu, pg. 1 Section 1, “In this work, we examine the effect of initialization on the rate of convergence of gradient descent in deep linear networks. We provide for the first time a rigorous proof that drawing the initial weights […according to a predetermined weight initialization algorithm…] from the orthogonal group speeds up convergence relative to the standard Gaussian initialization with iid weights.”).

Watanabe in view of Arivazhagan and Hu are in the same field of endeavor (i.e. deep learning). It would have been obvious for a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Watanabe, in view of Arivazhagan, and Hu to teach the above limitation(s). The motivation for doing so is that incorporating orthogonal weight initialization into machine learning models improves performance (cf. Hu, pg. 1 Section 1, “Orthogonal weight initializations have been the subject of a significant amount of prior theoretical and empirical investigation. For example, in a line of work focusing on dynamical isometry, it was found that orthogonal weights can speed up convergence for deep linear networks (Saxe et al., 2014; Advani & Saxe, 2017) and for deep non-linear networks”).
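For reference, drawing initial weights from the orthogonal group, the kind of predetermined weight initialization the rejection attributes to Hu, can be done with PyTorch's built-in initializer. This is an illustrative sketch, not Hu's code:

```python
# Illustrative sketch only (not Hu's code): initialize a layer's weights from the
# orthogonal group, i.e. a predetermined weight initialization algorithm.
import torch
from torch import nn

layer = nn.Linear(64, 64)
nn.init.orthogonal_(layer.weight)   # rows of W are orthonormal for a square layer
nn.init.zeros_(layer.bias)

# Sanity check: W @ W.T should be (approximately) the identity matrix.
with torch.no_grad():
    err = (layer.weight @ layer.weight.T - torch.eye(64)).abs().max().item()
    print(f"max |W W^T - I| = {err:.2e}")
```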
Regarding claim 4 and analogous claims 9 and 14, Watanabe in view of Arivazhagan and Hu teaches the system of claim 3. Arivazhagan teaches wherein the second parameter maintains a preset value…in the training process of each of the plurality of the first parameters of each of the plurality of client devices as seen in claim 3.

Hu further teaches …according to an orthogonal initialization algorithm… (Hu, pg. 1 Section 1, “In this work, we examine the effect of initialization on the rate of convergence of gradient descent in deep linear networks. We provide for the first time a rigorous proof that drawing the initial weights from the orthogonal group […according to an orthogonal initialization algorithm…] speeds up convergence relative to the standard Gaussian initialization with iid weights.”). It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Hu with the teachings of Watanabe and Arivazhagan for the same reasons disclosed in claim 3.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Singhal, et al., US20220398500A1 discloses using a federated reconstruction algorithm with partial local federated learning by partitioning the federated model parameters into global and local variables that are updated at different frequencies.

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to NICHOLAS S WU whose telephone number is (571)270-0939. The examiner can normally be reached Monday - Friday 8:00 am - 4:00 pm EST. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michelle Bechtold, can be reached at 571-431-0762. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov.
Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/N.S.W./
Examiner, Art Unit 2148

/MICHELLE T BECHTOLD/
Supervisory Patent Examiner, Art Unit 2148

Prosecution Timeline

Oct 28, 2022
Application Filed
Aug 01, 2025
Non-Final Rejection — §103
Nov 05, 2025
Response Filed
Feb 17, 2026
Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12488244
APPARATUS AND METHOD FOR DATA GENERATION FOR USER ENGAGEMENT
2y 5m to grant; granted Dec 02, 2025
Patent 12423576
METHOD AND APPARATUS FOR UPDATING PARAMETER OF MULTI-TASK MODEL, AND STORAGE MEDIUM
2y 5m to grant; granted Sep 23, 2025
Patent 12361280
METHOD AND DEVICE FOR TRAINING A MACHINE LEARNING ROUTINE FOR CONTROLLING A TECHNICAL SYSTEM
2y 5m to grant; granted Jul 15, 2025
Patent 12354017
ALIGNING KNOWLEDGE GRAPHS USING SUBGRAPH TYPING
2y 5m to grant; granted Jul 08, 2025
Patent 12333425
HYBRID GRAPH NEURAL NETWORK
2y 5m to grant; granted Jun 17, 2025
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 47%
With Interview: 90% (+43.1%)
Median Time to Grant: 3y 9m
PTA Risk: Moderate
Based on 38 resolved cases by this examiner. Grant probability derived from career allow rate.
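A minimal sketch of how the headline figures appear to be derived; this is our reading of the displayed numbers (simple ratios and percentage-point sums), not a documented formula of the tool.

```python
# Assumed derivation of the displayed figures (simple ratios and percentage-point sums);
# the dashboard's actual methodology is not documented here.
granted, resolved = 18, 38
allow_rate = granted / resolved               # 0.4737 -> displayed as 47%
interview_lift = 0.431                        # +43.1 percentage points with an interview
with_interview = allow_rate + interview_lift  # 0.9047 -> displayed as 90%
print(f"career allow rate: {allow_rate:.1%}, with interview: {with_interview:.1%}")
```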
