Prosecution Insights
Last updated: April 19, 2026
Application No. 18/166,948

FEDERATED LEARNING METHOD AND APPARATUS

Non-Final OA §101 §103 §112
Filed: Feb 09, 2023
Examiner: ALI, NAYMUR RAHMAN
Art Unit: 2123
Tech Center: 2100 — Computer Architecture & Software
Assignee: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
OA Round: 1 (Non-Final)
Grant Probability: Favorable
OA Rounds: 1-2
To Grant: 3y 3m

Examiner Intelligence

Career Allow Rate: 0% (grants only 0% of cases; 0 granted / 0 resolved; -55.0% vs TC avg)
Interview Lift: +0.0% (minimal lift; based on resolved cases with interview)
Avg Prosecution: 3y 3m (typical timeline)
Total Applications: 10 (career history across all art units; 10 currently pending)

Statute-Specific Performance

§101: 30.0% (-10.0% vs TC avg)
§103: 37.5% (-2.5% vs TC avg)
§102: 5.0% (-35.0% vs TC avg)
§112: 22.5% (-17.5% vs TC avg)
Baselines are Tech Center average estimates • Based on career data from 0 resolved cases

Office Action

§101 §103 §112
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement

The information disclosure statement (IDS) submitted on 02/09/2023 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Priority

Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119(a)-(d). The present application claims foreign priority based on Korean Patent Application No. 10-2022-0041901, filed 04/04/2022. The examiner notes that a certified copy (in Korean) of the above-noted application was retrieved on 04/03/2023. Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 2-4, 11, and 13-15 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.

Claim 2 and its dependents 3 and 4 recite the limitation “wherein the client side extracts the feature vector by applying the input data as input of a partially connected network” in the beginning part of the claim. There is insufficient antecedent basis for this limitation in the claim. (Examiner’s note: Claim 2 introduces “the input data” for the first time, without a prior introduction in either claim 1 or claim 2.)

Claim 13 and its dependents 14 and 15 recite the limitation “wherein the client side extracts the feature vector by applying the input data as input of a partially connected network” in the beginning part of the claim. There is insufficient antecedent basis for this limitation in the claim. (Examiner’s note: Claim 13 introduces “the input data” for the first time, without a prior introduction in either claim 12 or claim 13.)

Claim 11 recites the limitation "the output vector string" in the last limitation of the claim. There is insufficient antecedent basis for this limitation in the claim. (Examiner’s note: The claim recites “producing an output vector string” in two different contexts, the first being “producing an output vector string… using an SOFM” and the second being “producing an output vector string… using a fully connected network”. Therefore, when the claim recites “applying the output vector string”, it is unclear which output vector string is being referred to.)

Claim Objections

Claims 1 and 12 are objected to because of the following informalities: both claims recite “training a neural network model by applying… as input of a neural network model”. The second instance of “a neural network model” should be replaced with “the neural network model”. Appropriate correction is required.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-19 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The analysis of the claims follows the 2019 Revised Patent Subject Matter Eligibility Guidance, 84 Fed. Reg. 50 (“2019 PEG”).

Claim 1:

Step 1: The claim recites a method; therefore, it is directed to the statutory category of processes.

Step 2A prong 1: The claim recites the following abstract idea: outputting a feature vector with phase information preserved therein by applying the feature vector as input of a Self-Organizing Feature Map (SOFM). (A person, mentally or with pen and paper, can input a feature vector into an SOFM, which is essentially a set of mathematical equations as seen in paragraphs 116-128, and get a feature vector as the output.)

Step 2A prong 2: This judicial exception is not integrated into a practical application. The claim further recites: receiving a feature vector extracted from a client side and label data corresponding to the feature vector (insignificant extra-solution activity, as the limitation amounts to receiving data (MPEP 2106.05(g)(3))); and training a neural network model by applying both the feature vector with the phase information preserved therein and the label data as input of a neural network model. (This additional element recites a mere instruction to apply an exception with a recitation of the words "apply it" (or an equivalent) as identified in MPEP 2106.05(f), and does not provide integration into a practical application. -- Examiner's note (EN): The claims denote generic training and a generic neural network model, with no additional details or limitations beyond a generic, off-the-shelf neural network model.)

Step 2B: receiving a feature vector extracted from a client side and label data corresponding to the feature vector. (MPEP 2106.05(d)(II) indicates that merely gathering data is a well-understood, routine, conventional function when it is claimed in a merely generic manner, as it is in the present claim. Thereby, a conclusion that the claimed limitation is well-understood, routine, conventional activity is supported under Berkheimer.) And training a neural network model by applying both the feature vector with the phase information preserved therein and the label data as input of a neural network model. (This additional element recites a mere instruction to apply an exception with a recitation of the words "apply it" (or an equivalent) as identified in MPEP 2106.05(f), and does not provide integration into a practical application. -- Examiner's note (EN): The claims denote generic training and a generic neural network model, with no additional details or limitations beyond a generic, off-the-shelf neural network model.)

The additional elements, considered individually or in combination, do not amount to significantly more than the judicial exception. Therefore, the claim is not patent eligible.
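To ground the technical discussion: the “set of mathematical equations” the rejection points to is the classic Kohonen SOFM update, in which each input vector selects a best-matching unit and nearby units on the map lattice are pulled toward that input; the neighborhood coupling is what preserves topological (here, “phase”) structure. The following is a minimal, illustrative sketch of that textbook update only; the 1-D lattice, decay schedule, and parameter values are assumptions for demonstration, not the applicant's claimed method or the exact equations of the cited references.

import numpy as np

def sofm_step(weights, x, lr=0.5, sigma=1.0):
    """One textbook SOFM update: find the best-matching unit (BMU) for
    input x, then pull every unit's weight toward x, scaled by a Gaussian
    neighborhood around the BMU. `weights` has shape (units, dim)."""
    dists = np.linalg.norm(weights - x, axis=1)   # distance of x to each unit
    bmu = int(np.argmin(dists))                   # winning (best-matching) unit
    lattice = np.arange(len(weights))             # 1-D lattice positions (assumed)
    h = np.exp(-((lattice - bmu) ** 2) / (2 * sigma ** 2))  # neighborhood kernel
    weights += lr * h[:, None] * (x - weights)    # topology-preserving pull
    return weights, bmu

rng = np.random.default_rng(0)
w = rng.random((10, 3))                           # 10-unit lattice over 3-D features
for t in range(1000):
    decay = np.exp(-t / 500)                      # basic SOM: fixed decay schedule
    w, _ = sofm_step(w, rng.random(3), lr=0.5 * decay, sigma=2.0 * decay)

After training, neighboring lattice units hold similar weight vectors; that topology preservation is what the §103 mapping below equates with the claimed “phase information preserved therein.”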
Claim 2:

Step 1: A process, as above.

Step 2A prong 1: See the rejection of claim 1 above, on which claim 2 depends. Claim 2 further recites: and classifies the feature vector. (A person, mentally or with pen and paper, can classify vectors with multiple features.)

Step 2A prong 2: This judicial exception is not integrated into a practical application. The claim further recites: wherein the client side extracts the feature vector by applying the input data as input of a partially connected network (this additional element recites a mere instruction to apply an exception with a recitation of the words "apply it" (or an equivalent) as identified in MPEP 2106.05(f), and does not provide integration into a practical application); and by applying the feature vector as input of a fully connected network (this additional element recites a mere instruction to apply an exception with a recitation of the words "apply it" (or an equivalent) as identified in MPEP 2106.05(f), and does not provide integration into a practical application).

Step 2B: wherein the client side extracts the feature vector by applying the input data as input of a partially connected network (this additional element recites a mere instruction to apply an exception with a recitation of the words "apply it" (or an equivalent) as identified in MPEP 2106.05(f), and does not provide integration into a practical application); and by applying the feature vector as input of a fully connected network (this additional element recites a mere instruction to apply an exception with a recitation of the words "apply it" (or an equivalent) as identified in MPEP 2106.05(f), and does not provide integration into a practical application).

The additional elements, considered individually or in combination, do not amount to significantly more than the judicial exception. Therefore, the claim is not patent eligible.

Claim 3:

Step 1: A process, as above.

Step 2A prong 1: See the rejection of claim 2 above, on which claim 3 depends.

Step 2A prong 2: This judicial exception is not integrated into a practical application. The claim further recites: transmitting a rate of change in a weight of the neural network model to the fully connected network on the client side. (This additional element recites a mere instruction to apply an exception with a recitation of the words "apply it" (or an equivalent) as identified in MPEP 2106.05(f)(2), and does not provide integration into a practical application. -- Examiner’s note (EN): This limitation amounts to using a generic neural network model to transmit data, which is merely an instruction to apply the abstract idea using a generic computer component.)

Step 2B: transmitting a rate of change in a weight of the neural network model to the fully connected network on the client side. (This additional element recites a mere instruction to apply an exception with a recitation of the words "apply it" (or an equivalent) as identified in MPEP 2106.05(f)(2), and does not provide integration into a practical application. -- Examiner’s note (EN): This limitation amounts to using a generic neural network model to transmit data, which is merely an instruction to apply the abstract idea using a generic computer component.)

The additional elements, considered individually or in combination, do not amount to significantly more than the judicial exception. Therefore, the claim is not patent eligible.

Claim 4:

Step 1: A process, as above.

Step 2A prong 1: See the rejection of claim 2 above, on which claim 4 depends.

Step 2A prong 2: This judicial exception is not integrated into a practical application. The claim further recites: wherein an architecture of the neural network model corresponds to an architecture of the fully connected network on the client side.
(This additional element recites a mere instruction to apply an exception with a recitation of the words "apply it" (or an equivalent) as identified in MPEP 2106.05(f), and does not provide integration into a practical application. -- Examiner’s note (EN): This limitation amounts to taking an architecture of one model and applying it to another model.)

Step 2B: wherein an architecture of the neural network model corresponds to an architecture of the fully connected network on the client side. (This additional element recites a mere instruction to apply an exception with a recitation of the words "apply it" (or an equivalent) as identified in MPEP 2106.05(f), and does not provide integration into a practical application. -- Examiner’s note (EN): This limitation amounts to taking an architecture of one model and applying it to another model.)

The additional elements, considered individually or in combination, do not amount to significantly more than the judicial exception. Therefore, the claim is not patent eligible.

Claim 5:

Step 1: A process, as above.

Step 2A prong 1: See the rejection of claim 1 above, on which claim 5 depends. Claim 5 further recites: varying a learning time based on an average rate of change in a loss function of the neural network model. (This is interpreted as a mental process and a recitation of mathematical concepts. A person with pen and paper can change a learning time based on an average derived from a mathematical function.)

Step 2A prong 2: This judicial exception is not integrated into a practical application. The claim further recites: thus training the Self-Organizing Feature Map (SOFM). (This additional element recites a mere instruction to apply an exception with a recitation of the words "apply it" (or an equivalent) as identified in MPEP 2106.05(f), and does not provide integration into a practical application. -- Examiner’s note (EN): This limitation amounts to performing an abstract idea (as recited in Step 2A prong 1) and applying it to do training.)

Step 2B: thus training the Self-Organizing Feature Map (SOFM). (This additional element recites a mere instruction to apply an exception with a recitation of the words "apply it" (or an equivalent) as identified in MPEP 2106.05(f), and does not provide integration into a practical application. -- Examiner’s note (EN): This limitation amounts to performing an abstract idea (as recited in Step 2A prong 1) and applying it to do training.)

The additional elements, considered individually or in combination, do not amount to significantly more than the judicial exception. Therefore, the claim is not patent eligible.

Claim 6:

Step 1: A process, as above.

Step 2A prong 1: See the rejection of claim 5 above, on which claim 6 depends. Claim 6 further recites: wherein the average rate of change in the loss function is calculated based on an output vector and the label data. (This limitation falls within the mathematical concepts grouping because it involves calculating a value in a loss function based on two variables.)

Step 2A prong 2: The claim does not recite additional elements; therefore, the judicial exception is not integrated into a practical application.

Step 2B: The claim does not recite additional elements that amount to significantly more than the judicial exception. Therefore, the claim is not patent eligible.

Claim 7:

Step 1: A process, as above.

Step 2A prong 1: See the rejection of claim 1 above, on which claim 7 depends.
Claim 7 further recites: wherein the Self-Organizing Feature Map (SOFM) varies a learning time based on an SOFM learning coefficient, thus learning the feature vector. (This limitation falls within the mathematical concepts grouping because it involves calculating the speed or duration of the learning process using a formula that depends on a specific numerical variable called a learning coefficient.)

Step 2A prong 2: The claim does not recite additional elements; therefore, the judicial exception is not integrated into a practical application.

Step 2B: The claim does not recite additional elements that amount to significantly more than the judicial exception. Therefore, the claim is not patent eligible.

Claim 8:

Step 1: A process, as above.

Step 2A prong 1: See the rejection of claim 1 above, on which claim 8 depends.

Step 2A prong 2: This judicial exception is not integrated into a practical application. The claim further recites: wherein the neural network model is a fully connected network. (This additional element recites a mere instruction to apply an exception with a recitation of the words "apply it" (or an equivalent) as identified in MPEP 2106.05(f), and does not provide integration into a practical application. -- Examiner’s note (EN): This limitation merely amounts to applying a generic neural network model as a fully connected network.)

Step 2B: wherein the neural network model is a fully connected network. (This additional element recites a mere instruction to apply an exception with a recitation of the words "apply it" (or an equivalent) as identified in MPEP 2106.05(f), and does not provide integration into a practical application. -- Examiner’s note (EN): This limitation merely amounts to applying a generic neural network model as a fully connected network.)

The additional elements, considered individually or in combination, do not amount to significantly more than the judicial exception. Therefore, the claim is not patent eligible.

Claim 9:

Step 1: The claim recites a method; therefore, it is directed to the statutory category of processes.

Step 2A prong 1: The claim recites the following abstract ideas: preserving phase information of the feature vector string (a person, mentally or with pen and paper, can maintain an association between a data set (feature vector) and its context (phase information)); and …by applying the feature vector string with the phase information preserved therein … and producing an output vector string (a person, mentally or with pen and paper, can take some data (feature vector) with information attached (phase information) and produce a result (output vector)).

Step 2A prong 2: This judicial exception is not integrated into a practical application. The claim further recites: receiving a feature vector string extracted from multiple client sides and label data corresponding to the feature vector string (adding insignificant extra-solution activity to the judicial exception - see MPEP 2106.05(g)); and training a neural network model … as input of the neural network model (adding the words "apply it" (or an equivalent) to the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea - see MPEP 2106.05(f). -- Examiner's note (EN): The claims denote generic training and a generic neural network model, with no additional details or limitations beyond a generic, off-the-shelf neural network model.)
Step 2B: receiving a feature vector string extracted from multiple client sides and label data corresponding to the feature vector string (MPEP 2106.05(d)(II) indicates that merely “receiving or transmitting data" is a well-understood, routine, conventional function when it is claimed in a merely generic manner, as it is in the present claim; thereby, a conclusion that the claimed receiving steps are well-understood, routine, conventional activity is supported under Berkheimer); and training a neural network model … as input of the neural network model (adding the words "apply it" (or an equivalent) to the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea - see MPEP 2106.05(f). -- Examiner's note (EN): The claims denote generic training and a generic neural network model, with no additional details or limitations beyond a generic, off-the-shelf neural network model.)

The additional elements, considered individually or in combination, do not amount to significantly more than the judicial exception. Therefore, the claim is not patent eligible.

Claim 10:

Step 1: A process, as above.

Step 2A prong 1: See the rejection of claim 9 above, on which claim 10 depends. Claim 10 further recites: calculating a loss function of a server-side neural network model based on an output value, which is produced by…, (this limitation falls within the mental processes and mathematical concepts groupings because it involves evaluating an equation to get a value representing the difference between the model’s prediction and the actual data); calculating a gradient based on the loss function (this limitation falls within the mathematical concepts grouping because it involves performing mathematical calculations to determine the rate of change of the error (the gradient)); and back-propagating the gradient to the server-side neural network model (this limitation falls within the mathematical concepts grouping because it involves applying a specific mathematical algorithm to distribute the calculated error backward through the network to mathematically update the model’s weights).

Step 2A prong 2: This judicial exception is not integrated into a practical application. The claim further recites: receiving the output vector string as input, and the label data. (Insignificant extra-solution activity, as the limitation amounts to receiving data (MPEP 2106.05(g)(3)).)

Step 2B: receiving the output vector string as input, and the label data. (MPEP 2106.05(d)(II) indicates that merely gathering data is a well-understood, routine, conventional function when it is claimed in a merely generic manner, as it is in the present claim. Thereby, a conclusion that the claimed limitation is well-understood, routine, conventional activity is supported under Berkheimer.)

The additional elements, considered individually or in combination, do not amount to significantly more than the judicial exception. Therefore, the claim is not patent eligible.

Claim 11:

Step 1: A process, as above.

Step 2A prong 1: See the rejection of claim 9 above, on which claim 11 depends.
Claim 11 further recites: producing an output vector string with phase information preserved in the feature vector string by applying the feature vector string as input of a self-organizing feature map (SOFM). (A person, mentally or with pen and paper, can input a feature vector string into an SOFM (which is essentially a set of mathematical equations as seen in paragraphs 116-128) and get a feature vector string as the output.)

Step 2A prong 2: This judicial exception is not integrated into a practical application. The claim further recites: performing learning and producing an output vector string by applying the output vector string as input of a fully connected network. (This additional element recites a mere instruction to apply an exception with a recitation of the words "apply it" (or an equivalent) as identified in MPEP 2106.05(f), and does not provide integration into a practical application. -- Examiner’s note (EN): This limitation merely amounts to using a fully connected network to perform training and produce an output.)

Step 2B: performing learning and producing an output vector string by applying the output vector string as input of a fully connected network. (This additional element recites a mere instruction to apply an exception with a recitation of the words "apply it" (or an equivalent) as identified in MPEP 2106.05(f), and does not provide integration into a practical application. -- Examiner’s note (EN): This limitation merely amounts to using a fully connected network to perform training and produce an output.)

The additional elements, considered individually or in combination, do not amount to significantly more than the judicial exception. Therefore, the claim is not patent eligible.

Claim 12:

Step 1: The claim recites an apparatus; therefore, it is directed to the statutory category of machines.

Step 2A prong 1: The claim recites the following abstract idea: output a feature vector with phase information preserved therein by applying the feature vector as input of a Self-Organizing Feature Map (SOFM). (A person, mentally or with pen and paper, can input a feature vector into an SOFM (which is essentially a set of mathematical equations as seen in paragraphs 116-128) and get a feature vector as the output.)

Step 2A prong 2: This judicial exception is not integrated into a practical application. The claim further recites: a memory configured to store a control program for performing federated learning; and a processor configured to execute the control program stored in the memory, wherein the processor is configured to (this additional element recites a mere instruction to apply an exception with a recitation of the words "apply it" (or an equivalent) as identified in MPEP 2106.05(f), and does not provide integration into a practical application); receive a feature vector extracted from a client side and label data corresponding to the feature vector (insignificant extra-solution activity, as the limitation amounts to receiving data (MPEP 2106.05(g)(3))); and train a neural network model by applying both the feature vector with the phase information preserved therein and the label data as input of a neural network model. (This additional element recites a mere instruction to apply an exception with a recitation of the words "apply it" (or an equivalent) as identified in MPEP 2106.05(f), and does not provide integration into a practical application.
-- Examiner's note (EN): The claims denote generic training and a generic neural network model, with no additional details or limitations beyond a generic, off-the-shelf neural network model.)

Step 2B: a memory configured to store a control program for performing federated learning; and a processor configured to execute the control program stored in the memory, wherein the processor is configured to (this additional element recites a mere instruction to apply an exception with a recitation of the words "apply it" (or an equivalent) as identified in MPEP 2106.05(f), and does not provide integration into a practical application); receive a feature vector extracted from a client side and label data corresponding to the feature vector (MPEP 2106.05(d)(II) indicates that merely gathering data is a well-understood, routine, conventional function when it is claimed in a merely generic manner, as it is in the present claim; thereby, a conclusion that the claimed limitation is well-understood, routine, conventional activity is supported under Berkheimer); and train a neural network model by applying both the feature vector with the phase information preserved therein and the label data as input of a neural network model. (This additional element recites a mere instruction to apply an exception with a recitation of the words "apply it" (or an equivalent) as identified in MPEP 2106.05(f), and does not provide integration into a practical application. -- Examiner's note (EN): The claims denote generic training and a generic neural network model, with no additional details or limitations beyond a generic, off-the-shelf neural network model.)

The additional elements, considered individually or in combination, do not amount to significantly more than the judicial exception. Therefore, the claim is not patent eligible.

Claim 13:

Claim 13 is an apparatus claim that recites identical limitations to method claim 2. Therefore, claim 13 is rejected using the same rationale as claim 2.

Claim 14:

Step 1: A machine, as above.

Step 2A prong 1: See the rejection of claim 13 above, on which claim 14 depends.

Step 2A prong 2: This judicial exception is not integrated into a practical application. The claim further recites: wherein the processor is configured to perform control such that (this additional element recites a mere instruction to apply an exception with a recitation of the words "apply it" (or an equivalent) as identified in MPEP 2106.05(f), and does not provide integration into a practical application) transmitting a rate of change in a weight of the neural network model to the fully connected network on the client side. (This additional element recites a mere instruction to apply an exception with a recitation of the words "apply it" (or an equivalent) as identified in MPEP 2106.05(f), and does not provide integration into a practical application. -- Examiner’s note (EN): This limitation amounts to using a generic neural network model to transmit data, which is merely an instruction to apply the abstract idea using a generic computer component.)

Step 2B: wherein the processor is configured to perform control such that (this additional element recites a mere instruction to apply an exception with a recitation of the words "apply it" (or an equivalent) as identified in MPEP 2106.05(f), and does not provide integration into a practical application) transmitting a rate of change in a weight of the neural network model to the fully connected network on the client side.
(This additional element recites a mere instruction to apply an exception with a recitation of the words "apply it" (or an equivalent) as identified in MPEP 2106.05(f)(2), and does not provide integration into a practical application. -- Examiner’s note (EN): This limitation amounts to using a generic neural network model to transmit data, which is merely an instruction to apply the abstract idea using a generic computer component.)

The additional elements, considered individually or in combination, do not amount to significantly more than the judicial exception. Therefore, the claim is not patent eligible.

Claim 15:

Claim 15 is an apparatus claim that recites identical limitations to method claim 4. Therefore, claim 15 is rejected using the same rationale as claim 4.

Claim 16:

Step 1: A machine, as above.

Step 2A prong 1: See the rejection of claim 12 above, on which claim 16 depends. Claim 16 further recites: a learning time varies based on an average rate of change in a loss function of the neural network model. (This is interpreted as a mental process and a recitation of mathematical concepts. A person with pen and paper can change a learning time based on an average derived from a mathematical function.)

Step 2A prong 2: This judicial exception is not integrated into a practical application. The claim further recites: wherein the processor is configured to perform control such that (this additional element recites a mere instruction to apply an exception with a recitation of the words "apply it" (or an equivalent) as identified in MPEP 2106.05(f), and does not provide integration into a practical application) thus training the Self-Organizing Feature Map (SOFM). (This additional element recites a mere instruction to apply an exception with a recitation of the words "apply it" (or an equivalent) as identified in MPEP 2106.05(f), and does not provide integration into a practical application. -- Examiner’s note (EN): This limitation amounts to performing an abstract idea (as recited in Step 2A prong 1) and applying it to do training.)

Step 2B: wherein the processor is configured to perform control such that (this additional element recites a mere instruction to apply an exception with a recitation of the words "apply it" (or an equivalent) as identified in MPEP 2106.05(f), and does not provide integration into a practical application) thus training the Self-Organizing Feature Map (SOFM). (This additional element recites a mere instruction to apply an exception with a recitation of the words "apply it" (or an equivalent) as identified in MPEP 2106.05(f), and does not provide integration into a practical application. -- Examiner’s note (EN): This limitation amounts to performing an abstract idea (as recited in Step 2A prong 1) and applying it to do training.)

The additional elements, considered individually or in combination, do not amount to significantly more than the judicial exception. Therefore, the claim is not patent eligible.

Claim 17:

Step 1: A machine, as above.

Step 2A prong 1: See the rejection of claim 16 above, on which claim 17 depends. Claim 17 further recites: the average rate of change in the loss function is calculated based on the output vector and the label data. (This limitation falls within the mathematical concepts grouping because it involves calculating a value in a loss function based on two variables.)

Step 2A prong 2: This judicial exception is not integrated into a practical application.
The claim further recites: wherein the processor is configured to perform control such that (this additional element recites a mere instruction to apply an exception with a recitation of the words "apply it" (or an equivalent) as identified in MPEP 2106.05(f), and does not provide integration into a practical application).

Step 2B: wherein the processor is configured to perform control such that (this additional element recites a mere instruction to apply an exception with a recitation of the words "apply it" (or an equivalent) as identified in MPEP 2106.05(f), and does not provide integration into a practical application).

The additional elements, considered individually or in combination, do not amount to significantly more than the judicial exception. Therefore, the claim is not patent eligible.

Claim 18:

Claim 18 is an apparatus claim that recites identical limitations to method claim 7. Therefore, claim 18 is rejected using the same rationale as claim 7.

Claim 19:

Claim 19 is an apparatus claim that recites identical limitations to method claim 8. Therefore, claim 19 is rejected using the same rationale as claim 8.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:

1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors. In considering patentability of the claims, the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention, in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1, 2, 5, 7, 9-13, 16, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over non-patent literature He et al.
("Group Knowledge Transfer: Federated Learning of Large CNNs at the Edge", hereinafter "He") in view of Shah-Hosseini et al. ("TASOM: A New Time Adaptive Self-Organizing Map", hereinafter "Shah-Hosseini").

Claim 1: He teaches:

A federated learning method, comprising: (Page 1, “To address the resource-constrained reality of edge devices, we reformulate FL as a group knowledge transfer training algorithm, called FedGKT.”)

[Image: Figure 1(a) of He, reproduced in the office action]

receiving a feature vector extracted from a client side (page 2, Figure 1(a); EN: this denotes the Edge (client) doing f_extractor (feature extraction) and sending it to the server) and label data corresponding to the feature vector; (page 2, Figure 1(a); EN: this denotes LOSS_KD/input to f_server. f_server processes the inputs to calculate LOSS_KD. To calculate loss and train, label data must be present corresponding to the inputs. See quote, page 2: “The larger server model is trained by taking features extracted from the edge-side model as inputs to the model, and then uses KD-based loss function that can minimize the gap between the ground truth and soft label (probabilistic prediction in KD [19, 20, 21, 22]) predicted from the edge-side model”)

(…) and training a neural network model by applying both the feature vector (…) and the label data as input of a neural network model. (page 2, Figure 1(a); EN: Figure 1(a) depicts where the f_server (neural network model) is trained. It depicts feature vectors and label data being fed into the LOSS_KD module. This module uses both of these inputs to calculate the error signal required to train the server model; therefore, it is applying both as input for training.)

He does not disclose “outputting a feature vector with phase information preserved therein by applying the feature vector as input of a Self-Organizing Feature Map (SOFM);” and “with the phase information preserved therein”.

However, Shah-Hosseini teaches: outputting a feature vector with phase information preserved therein by applying the feature vector as input of a Self-Organizing Feature Map (SOFM); (Page 271, “The time adaptive self-organizing map (TASOM) network is a modified self-organizing map (SOM) network with adaptive learning rates and neighborhood sizes as its learning parameters.” Page 276, “The weights of the TASOM network after 15 000 iterations are shown in Fig. 1(a). As it was expected, the weights topologically fill the input distribution space.” Page 272, “The nonuniform scaling defined in Step 7 should be used for preserving the topological ordering of neurons in the lattice when the input distribution is nonsymmetric.” Examiner’s note (EN): this denotes TASOM, a modified SOM that takes in input vectors and generates an output in the form of updated weight vectors. This process is said to “topologically fill the input distribution space” and “preserve the topological ordering of neurons”. The topological ordering is equivalent to the “phase information preserved therein” recited in the instant claim.) with the phase information preserved therein (Page 272, “The nonuniform scaling defined in Step 7 should be used for preserving the topological ordering of neurons in the lattice when the input distribution is nonsymmetric”)

Before the effective filing date of the invention, it would have been obvious to one of ordinary skill in the art to combine the work of He and Shah-Hosseini in order to process feature vectors using a self-organizing map that preserves topological ordering.
The motivation for doing so would be to allow the system to “follow changes of the environment without reinitializing the weight vectors or learning parameters (Page 282).”

Claim 2: He in view of Shah-Hosseini teaches all the limitations of claim 1. He further teaches: wherein the client side extracts the feature vector by applying the input data as input of a partially connected network, and classifies the feature vector by applying the feature vector as input of a fully connected network. (Page 2, “The compact CNN on the edge device consists of a lightweight feature extractor and classifier that can be trained efficiently using its private data (1 - local training).” Page 15, “ResNet-8 is a compact CNN. Its head convolutional layer (including batch normalization and ReLU non-linear activation) is used as the feature extractor. The remaining two Bottlenecks (a classical component in ResNet, each containing 3 convolutional layers) and the last fully-connected layer are used as the classifier.” EN: The “partially connected network” reads on the convolutional layers of the client’s ResNet-8 model, which process input data to extract feature maps (feature vectors) prior to the final layer. The “fully connected network” reads on the client’s final fully connected (dense) layer, which receives the extracted feature vector from the convolutional layers to generate classification results locally.)

Claim 5: He in view of Shah-Hosseini teaches all the limitations of claim 1. Shah-Hosseini further teaches: varying a learning time based on an average rate of change in a loss function of the neural network model, thus training the Self-Organizing Feature Map (SOFM). (Page 271, “The learning parameters (learning rate and neighborhood size)… decrease with time so that the feature map stabilizes… This is, in fact, the reason why the basic SOM algorithm cannot work properly in changing environments… Therefore, adaptive learning parameters must be employed in the SOM algorithm…” Pages 272-273, Step 5, “Adjust the learning-rate parameters…”: f(||x(t)-w_i(t)||_f) gives some sense of a normalized error of neuron i. The total learning rate error… is minimized when each is maximized, and each is minimized; the former is a cross correlation between the normalized learning rate error and the learning rate of the same neuron. Therefore, the total learning rate error is reduced when the cross correlation increases. EN: This denotes varying the learning parameters (the rate and the neighborhood size, which define the “learning time” in SOMs) based on the error or difference between the input and the weights (a loss function). The reference explicitly states that these parameters are adapted (varied) based on the “normalized error” and “total learning rate error”, analogous to the average rate of change in a loss function, to train the SOFM.)

Before the effective filing date of the invention, it would have been obvious to one of ordinary skill in the art to combine the work of He and Shah-Hosseini in order to vary the learning based on a loss function. The motivation for doing so would be to allow the network to “adapt to the new conditions with the least sufficient change in its weight vectors” (Page 274), thereby ensuring “minimum instability is inflicted on the network” (Page 282).
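To make the cited mechanism concrete: where the basic SOM sketch above decays its learning rate on a fixed schedule, Shah-Hosseini's TASOM adapts each unit's learning rate from the current error between the input and that unit's weights. The sketch below shows only the general shape of that idea; the published TASOM update rules, constants, and neighborhood-size adaptation differ, and the tanh normalization and parameter names here are illustrative assumptions, not the paper's equations.

import numpy as np

def tasom_style_step(weights, rates, x, alpha=0.1, scale=1.0):
    """TASOM-flavored update: per-unit learning rates track a normalized
    error ||x - w_i|| instead of following a fixed decay schedule, then the
    usual neighborhood-weighted pull toward x is applied."""
    errs = np.linalg.norm(weights - x, axis=1)        # per-unit error vs. this input
    rates += alpha * (np.tanh(errs / scale) - rates)  # rates follow the error
    bmu = int(np.argmin(errs))                        # best-matching unit
    h = np.exp(-((np.arange(len(weights)) - bmu) ** 2) / 2.0)
    weights += (rates * h)[:, None] * (x - weights)
    return weights, rates

rng = np.random.default_rng(1)
w, r = rng.random((10, 2)), np.full(10, 0.5)
for _ in range(500):                                  # rates never decay to zero,
    w, r = tasom_style_step(w, r, rng.random(2))      # so the map can track drift

Because the per-unit rates stay responsive rather than vanishing over time, the map keeps adapting to a changing input distribution, which is the "follow changes of the environment" behavior the motivation statement quotes.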
Claim 7: He in view of Shah-Hosseini teaches all the limitations of claim 1. Shah-Hosseini further teaches: wherein the Self-Organizing Feature Map (SOFM) varies a learning time based on an SOFM learning coefficient, thus learning the feature vector. (Page 272, equation 4 and quote: “The constant parameters α, β… can have any values between zero and one… β is a constant parameter between zero and one which controls how fast the neighborhood sizes should follow the local neighborhood errors”. EN: this denotes varying the learning (learning rate n) using a learning coefficient (β). This satisfies the “thus learning the feature vector” limitation because the adaptive learning rate n, once modified by the coefficient β, is directly applied in the weight update rule (Equation 5) to adjust the weights in response to the input feature vector x(n). The reference states, page 271, “the weights of a TASOM network are continuously trained by the input vectors.”, thereby learning the feature vector.)

Before the effective filing date of the invention, it would have been obvious to one of ordinary skill in the art to combine the work of He and Shah-Hosseini in order to vary the learning by a learning coefficient. The motivation for doing so would be to ensure that “minimum instability is inflicted on the network” while adapting to new data (Page 282).

Claim 9: He teaches:

receiving a feature vector string extracted from multiple client sides and label data corresponding to the feature vector string; and (…) training a neural network model by applying the feature vector string (…) as input of the neural network model,

[Image: Figure 1(a) of He, reproduced in the office action]

Page 2, Figure 1(a) depicts multiple clients on the edge that transfer information to the server (feature vector string), and this quote: “The larger server model is trained by taking features extracted from the edge-side model as inputs to the model, and then uses KD-based loss function that can minimize the gap between the ground truth and soft label”; and producing an output vector string. (Page 5, “The logit z_s and z_c^(k) are the output of the last fully connected layer in the server model and the client model, respectively).”)

He does not disclose “preserving phase information of the feature vector string,” and “with the phase information preserved therein”.

However, Shah-Hosseini teaches: preserving phase information of the feature vector string, (Page 271, “The time adaptive self-organizing map (TASOM) network is a modified self-organizing map (SOM) network with adaptive learning rates and neighborhood sizes as its learning parameters.” Page 276, “The weights of the TASOM network after 15 000 iterations are shown in Fig. 1(a).
As it was expected, the weights topologically fill the input distribution space.” Page 272, “The nonuniform scaling defined in Step 7 should be used for preserving the topological ordering of neurons in the lattice when the input distribution is nonsymmetric.” EN: this denotes TASOM, a modified SOM that takes in input vectors and generates an output in the form of updated weight vectors. This process is said to “topologically fill the input distribution space” and “preserve the topological ordering of neurons”. The topological ordering is equivalent to the “preserving phase information of the feature vector string” recited in the instant claim.) with the phase information preserved therein (Page 272, “The nonuniform scaling defined in Step 7 should be used for preserving the topological ordering of neurons in the lattice when the input distribution is nonsymmetric”)

Before the effective filing date of the invention, it would have been obvious to one of ordinary skill in the art to combine the work of He and Shah-Hosseini in order to process feature vectors using a self-organizing map that preserves topological ordering. The motivation for doing so would be to allow the system to “follow changes of the environment without reinitializing the weight vectors or learning parameters (Page 282).”

Claim 10: He in view of Shah-Hosseini teaches all the limitations of claim 9. He further teaches:

calculating a loss function of a server-side neural network model based on an output value, which is produced by receiving the output vector string as input, and the label data, (Page 2, “then uses KD-based loss function that can minimize the gap between the ground truth and soft label”; Page 5, equations 6 and 7, and quote: “L_CE is the cross-entropy loss between the predicted values and the ground truth labels.” EN: the server calculates its error (loss) by comparing its own output predictions against the label data.)

calculating a gradient based on the loss function, and back-propagating the gradient to the server-side neural network model. (Page 5, “Moreover, each minimization subproblem can be solved with SGD and its variants”; Page 6, Algorithm 1, line 10. EN: line 10 denotes calculating the gradient of the server’s loss function, and the update equation W_s ← … represents back-propagating that gradient to adjust the server model’s weights (W_s) to minimize error.)
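The server-side loop the claim 10 mapping walks through (compute a loss from the server model's output and the labels, take the gradient, back-propagate, update W_s) can be sketched as follows. This is a hedged, PyTorch-style approximation of a FedGKT-like server step, not He's published code; the function name, the variable names, and the temperature T are illustrative assumptions.

import torch
import torch.nn.functional as F

def server_step(server_model, optimizer, features, labels, client_logits, T=3.0):
    """One FedGKT-style server update on a batch of client-extracted features:
    cross-entropy against the ground-truth labels plus a KD term against the
    client's soft labels, then backprop and a weight update (W_s <- ...)."""
    optimizer.zero_grad()
    z_s = server_model(features)               # server logits from client features
    loss_ce = F.cross_entropy(z_s, labels)     # gap to the ground truth
    loss_kd = F.kl_div(                        # gap to the client's soft labels
        F.log_softmax(z_s / T, dim=1),
        F.softmax(client_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    loss = loss_ce + loss_kd
    loss.backward()                            # gradient of the server loss
    optimizer.step()                           # back-propagated update of W_s
    return loss.item()

# Toy usage with stand-in shapes (64-dim features, 10 classes):
model = torch.nn.Linear(64, 10)
opt = torch.optim.SGD(model.parameters(), lr=0.01)
server_step(model, opt, torch.randn(8, 64), torch.randint(0, 10, (8,)),
            torch.randn(8, 10))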
Claim 11: He in view of Shah-Hosseini teaches all the limitations of claim 9. He further teaches:

and performing learning and producing an output vector string by applying the output vector string as input (page 6, “Training Algorithm. To elaborate, we illustrate the alternating training algorithm FedGKT in Fig. 1(a) and summarize it as Algorithm 1. During each round of training, the client uses local SGD to train several epochs and then sends the extracted feature maps and related logits to the server. When the server receives extracted features and logits from each client, it trains the much larger server-side CNN. The server then sends back its global logits to each client. This process iterates over multiple rounds, and during each round the knowledge of all clients is transferred to the server model and vice-versa.”) of a fully connected network. (Page 5, “The logit z_s and z_c^(k) are the output of the last fully connected layer in the server model and the client model, respectively).”)

He does not disclose “producing an output vector string with phase information preserved in the feature vector string by applying the feature vector string as input of a self-organizing feature map (SOFM);”

However, Shah-Hosseini teaches: producing an output vector string with phase information preserved in the feature vector string by applying the feature vector string as input of a self-organizing feature map (SOFM); (Page 271, “The time adaptive self-organizing map (TASOM) network is a modified self-organizing map (SOM) network with adaptive learning rates and neighborhood sizes as its learning parameters.” Page 276, “The weights of the TASOM network after 15 000 iterations are shown in Fig. 1(a). As it was expected, the weights topologically fill the input distribution space.” Page 272, “The nonuniform scaling defined in Step 7 should be used for preserving the topological ordering of neurons in the lattice when the input distribution is nonsymmetric.” EN: this denotes TASOM, a modified SOM that takes in input vectors and generates an output in the form of updated weight vectors. This process is said to “topologically fill the input distribution space” and “preserve the topological ordering of neurons”. The topological ordering is equivalent to the “phase information preserved therein” recited in the instant claim.)

Before the effective filing date of the invention, it would have been obvious to one of ordinary skill in the art to combine the work of He and Shah-Hosseini in order to process feature vectors using a self-organizing map that preserves topological ordering. The motivation for doing so would be to allow the system to “follow changes of the environment without reinitializing the weight vectors or learning parameters (Page 282).”

Claim 12: He teaches:

A federated learning apparatus, comprising: a memory configured to store a control program for performing federated learning; and a processor configured to execute the control program stored in the memory, wherein the processor is configured to (Page 6, “4.1 Experimental Setup Implementation. We develop the FedGKT training framework based on FedML [23], an open source federated learning research library that simplifies the new algorithm development and deploys it in a distributed computing environment. Our server node has 4 NVIDIA RTX 2080Ti GPUs with sufficient GPU memory for large model training. We use several CPU-based nodes as clients training small CNNs.”)

The rest of claim 12 recites identical limitations to method claim 1. Therefore, claim 12 is rejected using the same rationale as claim 1.

Claim 13: Claim 13 is an apparatus-type claim that recites the same limitations as claim 2. Therefore, claim 13 is rejected using the same rationale as claim 2.

Claim 16: Claim 16 is an apparatus-type claim that recites the same limitations as claim 5. Therefore, claim 16 is rejected using the same rationale as claim 5.

Claim 18: Claim 18 is an apparatus-type claim that recites the same limitations as claim 7. Therefore, claim 18 is rejected using the same rationale as claim 7.

Claims 3, 4, 6, 8, 14, 15, 17, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over non-patent literature He et al. ("Group Knowledge Transfer: Federated Learning of Large CNNs at the Edge", hereinafter "He") in view of Shah-Hosseini et al. ("TASOM: A New Time Adaptive Self-Organizing Map", hereinafter "Shah-Hosseini"), further in view of Thapa et al. (“SplitFed: When Federated Learning Meets Split Learning”, hereinafter “Thapa”).

Claim 3: He in view of Shah-Hosseini teaches all the limitations of claim 2. He further teaches: transmitting (…) of the neural network model to the fully connected network on the client side.
(Page 2, “To boost the edge model’s performance, the server sends its predicted soft labels to the edge, then the edge also trains its local dataset with a KD-based loss function using server-side soft labels (3 - transfer back).” EN: The “fully connected network” reads on the client’s final fully connected (dense) layer, as further elaborated in the EN for claim 2.)

He in view of Shah-Hosseini does not disclose: “a rate of change in a weight”.

However, Thapa teaches: a rate of change in a weight (Page 4, “Next, the server carries out the back-propagation up to the cut layer and sends the gradients of the smashed data to the client. With the gradients, the client carries out its back-propagation on the remaining network.”)

Before the effective filing date of the invention, it would have been obvious to one of ordinary skill in the art to combine the work of He modified by Shah-Hosseini with Thapa in order to transmit the rate of change, or gradient, to the client. The motivation for doing so would be to allow for better efficiency: Page 1, “Moreover, it offers better efficiency than SL by incorporating the parallel ML model update paradigm of FL.”
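The Thapa step relied on here (the server back-propagates to the cut layer, then sends the gradient of the “smashed data” back so the client can finish back-propagation locally) looks roughly like this in code. A hedged sketch under assumed names and stand-in models, not Thapa's implementation; PyTorch is assumed.

import torch

def split_round(client_model, server_model, x, y, loss_fn):
    """One split-learning round: client forward to the cut layer, server
    forward + backward, gradient of the smashed data returned to the client,
    client backward on its remaining layers."""
    smashed = client_model(x)                        # client forward to cut layer
    smashed_srv = smashed.detach().requires_grad_()  # server-side view of activations
    loss = loss_fn(server_model(smashed_srv), y)
    loss.backward()                                  # server backprop to the cut layer
    grad_at_cut = smashed_srv.grad                   # the "rate of change" sent back
    smashed.backward(grad_at_cut)                    # client resumes backprop locally
    return loss.item()

# Toy usage with stand-in linear models and shapes:
client = torch.nn.Linear(4, 8)
server = torch.nn.Linear(8, 3)
split_round(client, server, torch.randn(5, 4), torch.randint(0, 3, (5,)),
            torch.nn.functional.cross_entropy)

In a real deployment the detach/re-attach boundary is a network hop: only the activations travel client-to-server and only their gradient travels back, which is the gradient transfer the rejection maps onto the claimed transmitting step.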
Claim 4: He in view of Shah-Hosseini teaches all the limitations of claim 2. He in view of Shah-Hosseini does not teach: “wherein an architecture of the neural network model corresponds to an architecture of the fully connected network on the client side.”

However, Thapa teaches: wherein an architecture of the neural network model corresponds to an architecture of the fully connected network on the client side. (Page 6, “Then, the clients send the gradients to the fed server, which conducts the federated averaging of the client-side local updates and send that back to all participating clients. This way, the fed server synchronizes the client-side global model…” EN: this denotes using a specific “fed server” that maintains a “client-side global model”, which must be architecturally identical to (i.e., correspond to) the client-side networks in order to perform the cited “federated averaging”.)

Before the effective filing date of the invention, it would have been obvious to one of ordinary skill in the art to combine the work of He modified by Shah-Hosseini with Thapa in order to use similar architectures for both the neural network and the client-side network. The motivation for doing so would be to enable parallel processing, which is only possible if the architectures are similar: Page 1, “Moreover, it offers better efficiency than SL by incorporating the parallel ML model update paradigm of FL. Our empirical results, on uniformly distributed horizontally partitioned HAM10000 and MNIST datasets with multiple clients, show that SFL provides similar communication efficiency and test accuracy as SL, while significantly decreasing - by four to six times - its computation time per global epoch than in SL for both datasets.”

Claim 6: He in view of Shah-Hosseini teaches all the limitations of claim 5. He in view of Shah-Hosseini does not teach: “wherein the average rate of change in the loss function is calculated based on an output vector and the label data.”

However, Thapa teaches: wherein the average rate of change in the loss function is calculated based on an output vector and the label data. (Page 4, Algorithm 3. EN: This denotes calculating the error (loss) by comparing the model’s predictions (output vector) against the label data from the clients, and using this error to compute gradients (rate of change) that guide the model’s updates.)

Before the effective filing date of the invention, it would have been obvious to one of ordinary skill in the art to combine the work of He modified by Shah-Hosseini with Thapa in order to calculate the average rate of change of the loss based on output and label data. The motivation for doing so would be to minimize error and reduce client-side load: Page 3, “The learned parameters of the global model are then sent back to all clients to train for the next round. This process continues until the algorithm converges to a certain level (i.e., runs for multiple global epochs).”

Claim 8: He in view of Shah-Hosseini teaches all the limitations of claim 1. He in view of Shah-Hosseini does not teach: “wherein the neural network model is a fully connected network.”

However, Thapa teaches: wherein the neural network model is a fully connected network. (Page 10, “LeNet [22] is a five-layer CNN consisting of convolution, average pooling, Sigmoid or Tanh, and fully connected layers. It uses the 5 × 5 and 2 × 2 sized kernels in its layers. The input image dimension is 32 × 32 × 1. AlexNet [2] consists of eight network layers, including convolution (five layers), pooling (three layers), and fully connected (two) layers. It uses 11 × 11, 5 × 5, and 3 × 3 sized kernels in its layers. The input image dimension is 227 × 227 × 3. VGG16 [23] consists of sixteen network layers, including convolution (sixteen layers), max pooling, and fully connected layers. It uses 3 × 3 sized kernels in its layers. The input image dimension is 224 × 224 × 3. ResNet18 [24] consists of eighteen network layers, including convolution, max pooling, ReLU, and fully connected layers.” EN: this denotes the usage of LeNet, AlexNet, VGG16, and ResNet18, all of which are defined as containing “fully connected layers.” It also denotes that the split point (the “cut layer”) is variable; therefore, if the network is split after the convolutional (feature extraction) layers, the remaining server-side network has the fully connected layers.)

Before the effective filing date of the invention, it would have been obvious to one of ordinary skill in the art to combine the work of He modified by Shah-Hosseini with Thapa in order to have a server network that is fully connected. The motivation for doing so would be to reduce client-side load: Page 5, “SL limits the client-side network portion down to a few layers. Thus, it enables the reduction in client-side computation compared to other collaborative learning such as FL, where the complete network is trained at the client-side.”

Claim 14: Claim 14 is an apparatus-type claim that recites the same limitations as claim 3. Therefore, claim 14 is rejected using the same rationale as claim 3.

Claim 15: Claim 15 is an apparatus-type claim that recites the same limitations as claim 4. Therefore, claim 15 is rejected using the same rationale as claim 4.

Claim 17: Claim 17 is an apparatus-type claim that recites the same limitations as claim 6. Therefore, claim 17 is rejected using the same rationale as claim 6.

Claim 19: Claim 19 is an apparatus-type claim that recites the same limitations as claim 8. Therefore, claim 19 is rejected using the same rationale as claim 8.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to NAYMUR RAHMAN ALI, whose telephone number is (571) 272-0007. The examiner can normally be reached Mon-Fri, 7:30 am-5:00 pm. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool.
To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alexey Shmatov, can be reached at (571) 270-3428. The fax phone number for the organization where this application or proceeding is assigned is (571) 273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or (571) 272-1000.

/NAYMUR RAHMAN ALI/
Examiner, Art Unit 2123

/ALEXEY SHMATOV/
Supervisory Patent Examiner, Art Unit 2123

Prosecution Timeline

Feb 09, 2023 — Application Filed
Jan 22, 2026 — Non-Final Rejection: §101, §103, §112 (current)


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: Favorable
Median Time to Grant: 3y 3m
PTA Risk: Low
Based on 0 resolved cases by this examiner. Grant probability derived from career allow rate.
