Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. Claims 1-18 are pending in the application.
Priority
Acknowledgment is made of applicant's claim for foreign priority based on application TW111138637 filed in Taiwan on 10/12/2022. Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-18 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The analysis of the claims follows the 2019 Revised Patent Subject Matter Eligibility Guidance, 84 Fed. Reg. 50 ("2019 PEG").
Claims 1 and 10 are analyzed for abstract ideas as follows.
Step 1: Claim 1 is directed to a method (a process) and claim 10 to a system (a machine). The claims are accordingly directed to statutory categories.
Step 2A Prong 1: The claims recite an abstract idea limitation of "dividing the target devices into a plurality of training groups according to a similarity of the performance parameters." This limitation is a mathematical concept (computing a similarity) and a mental process (grouping). See MPEP 2106.04(a)(2).
Step 2A Prong 2: The judicial exceptions recited in these claims are not integrated into a practical application. Merely invoking "models", "client devices", or a "central device" does not confer eligibility. The limitations of claim 1 remain mathematical/mental concepts and are not applied in any specific, practical way. The additional elements are generic models and devices that do not include specialized hardware. See MPEP § 2106.05(f).
Claim 1 does not limit the invention to a particular field of use, and even such a limitation would not be sufficient to overcome the abstract idea rejection; merely applying machine learning to a field or data, without an advance in that field or in machine learning itself, is ineligible. See MPEP § 2106.05(h).
Step 2B: The claims do not contain significantly more than the judicial exceptions. The models and devices are recited in their standard forms in the field; these additional elements are well-understood, routine, and conventional activity, see MPEP 2106.05(d)(II). The claims lack any particular "how" or algorithm that solves a problem in a field in a novel way. To recite significantly more, the claims would require more specific processes that could not be performed with simple mathematics or mental processes, or more substantial structure than conventional devices (i.e., non-textbook implementations).
Claims 2-9 and 11-18 merely narrow the previously recited abstract idea limitations with further abstract concepts and/or routine, fundamental processes. For the reasons described above with respect to claims 1 and 10, these judicial exceptions are not meaningfully integrated into a practical application and do not amount to significantly more than the abstract idea. The analysis under Step 1 and Step 2A, Prongs 1 and 2, remains the same as the independent-claim analysis above. The specification describes more concrete applications, but none are recited in claims 2-9 and 11-18.
With respect to Step 2B, these claims recite limitations similar to those described for the independent claims above and do not provide anything significantly more than mathematical or mental concepts. Claims 2-9 and 11-18 recite the additional elements of "sorting a plurality of values", "calculating an importance ratio", "sorting the performance parameters", "notifying the target devices belonging to a same training group", "calculating a performance ratio", "assigning a plurality of weight values", "a loss value or a gradient value of a local model", and "an inferring duration, an inferring speed of using a local model and connection information". These elements are further abstract concepts, generic applications to a field of use, or well-understood, routine, conventional activity (see MPEP § 2106.05(d)) and cannot simply be appended to qualify as significantly more or as a practical application. What type of application, or what structure of components beyond generic machine learning, is required remains unspecified in these claims. Therefore, claims 2-9 and 11-18 also recite abstract ideas that are not integrated into a practical application and do not amount to significantly more than the judicial exception, and they are rejected under 35 U.S.C. 101.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 5, 7-8, 10, 14 and 16-17 are rejected under 35 U.S.C. 103 as being unpatentable over Krishnaswamy et al. (US 20220414464 A1, hereinafter Krishnaswamy) in view of Choudhary et al. (US 20190385043 A1, hereinafter Choudhary).
As to independent claim 1, Krishnaswamy teaches a federated learning method, comprising: [federated learning ¶81]
providing a plurality of importance parameters and a plurality of performance parameters by a plurality of client devices respectively to a central device; and [sources (clients) have feature quality parameters and label quality parameters (derived from a loss function ¶106) and competence (performance) ¶54, ¶52 "data quality parameter associated with the respective data source comprises at least one of a feature quality parameter associated with the features and a label quality parameter associated with the labels"]
performing a training procedure by the central device, wherein the training procedure comprises: [Fig. 6 illustrates learning procedure that does training updates using a server ¶96 " server may then receive a plurality of training updates (e.g., a difference Δ.sub.LS.sup.t in the example) from the selected subset of data sources, respectively (after the respective data source has generated the respective training update in response to the current global model received"]
selecting a plurality of target devices from the client devices according to a priority order associated with the importance parameters; [Selects subsets based on quality (priority order) Fig. 6 step 4 ¶96 "selecting the subset of data sources from the set of data sources (binned in the plurality of intervals) for federation for the current round t. In other words, the selection of the subset of M data sources may be based on data sources having been binned into a plurality (e.g., K) of quality ranges, which advantageously accounts for the varying quality ranges amongst the data sources."]
dividing the target devices into a plurality of training groups according to a similarity of the performance parameters; [bins (groups) sources based on quality range (similar performance) ¶96 " binning the set of data sources (1 to N) into a plurality of intervals (bins) of K quality ranges, and then selecting the subset of data sources from the set of data sources (binned in the plurality of intervals) for federation for the current round t. In other words, the selection of the subset of M data sources may be based on data sources having been binned into a plurality (e.g., K) of quality ranges, which advantageously accounts for the varying quality ranges amongst the data sources."]
notifying the target devices to perform a plurality of iterations according to the training groups respectively to generate a plurality of trained models, and transmitting the trained models back to the central device; and [server sends global model out to selected subsets receive from sources the updates back ¶81, ¶96 "sends to the selected subset of data sources"…"the most up-to-date (i.e., current) global model"…"The aggregator server may then receive a plurality of training updates (e.g., a difference Δ.sub.LS.sup.t in the example) from the selected subset of data sources, respectively (after the respective data source has generated the respective training update in response to the current global model received)"]
updating a global model based on the trained models; [updates global model ¶81 "aggregate the local models (e.g., corresponding to the “local machine learning model” as described hereinbefore according to various embodiments) and updates a joint global model"]
Krishnaswamy does not specifically teach when a convergence value of the global model does not fall within a default range or a number of times of performing the training procedure does not reach a default number, performing the training procedure again by the central device; and when the convergence value of the global model falls within the default range and the number of times of performing the training procedure reaches the default number, outputting the global model to the client devices by the central device.
However, Choudhary teaches when a convergence value of the global model does not fall within a default range or a number of times of performing the training procedure does not reach a default number, performing the training procedure again by the central device; and [convergence based on the iterations (number of times training) and threshold and repeats or continues iterations accordingly ¶64 "further adjust the adjusted global parameters until a point of convergence. For instance, the asynchronous training system 106 may continue training iterations until adjustments to global parameters fall below a threshold value in a consecutive threshold number of training iterations (e.g., multiple training iterations of a weighted average of modified parameter indicators are within a threshold range of one another)"]
when the convergence value of the global model falls within the default range and the number of times of performing the training procedure reaches the default number, outputting the global model to the client devices by the central device. [may continue even with convergence such a time or number of iterations which will continue to send out global parameters ¶64 "asynchronous training system 106 continues to send adjusted global parameters, receive modified parameter indicators from a subset of the client devices"]
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the learning procedure of Krishnaswamy by incorporating the convergence- and iteration-based conditions for repeating the training procedure and for outputting the global model to the client devices, as disclosed by Choudhary, because both techniques address the same field of federated learning, and incorporating Choudhary into Krishnaswamy provides more efficient use of resources, reducing computational and storage costs [Choudhary ¶4].
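For clarity of the record only, the training procedure recited in claim 1 may be sketched as follows. This is the examiner's hypothetical illustration: a scalar stands in for the global model, the local iterations are stubbed, and all names are the examiner's own; nothing here is drawn from the claims, Krishnaswamy, or Choudhary.

```python
def local_train(client, model):
    # Stub for the local iterations: nudge the model toward a local optimum.
    return model + 0.5 * (client["optimum"] - model)

def train_procedure(clients, model, n_targets=4, n_groups=2,
                    default_range=1e-3, default_number=20):
    rounds, convergence = 0, float("inf")
    # Repeat while the convergence value does not fall within the default
    # range OR the number of training rounds has not reached the default number.
    while convergence > default_range or rounds < default_number:
        rounds += 1
        # Select target devices by priority order of the importance parameters.
        targets = sorted(clients, key=lambda c: c["importance"],
                         reverse=True)[:n_targets]
        # Divide the targets into training groups by similar performance.
        targets.sort(key=lambda c: c["performance"])
        size = max(1, len(targets) // n_groups)
        groups = [targets[i:i + size] for i in range(0, len(targets), size)]
        # Each group performs its iterations; trained models come back.
        trained = [local_train(c, model) for group in groups for c in group]
        new_model = sum(trained) / len(trained)  # update the global model
        convergence, model = abs(new_model - model), new_model
    return model  # output the global model to the client devices
```

Under this sketch, the procedure exits only when both conditions of the claim are satisfied (convergence within the default range and the default number of rounds reached), mirroring the "and"/"or" structure of the recited limitations.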
As to dependent claim 5, the rejection of claim 1 is incorporated, Krishnaswamy and Choudhary further teach wherein each of the training groups comprises more than one target device among the target devices, and notifying the target devices to perform the iterations according to the training groups respectively to generate the trained models comprises:
notifying the target devices belonging to a same training group to perform the iterations during a same training period. [Krishnaswamy bins get selected for federated learning and sent a model (notified) ¶96 "selecting the subset of data sources from the set of data sources (binned in the plurality of intervals) for federation for the current round t"]
As to dependent claim 7, the rejection of claim 1 is incorporated, Krishnaswamy and Choudhary further teach assigning a plurality of weight values to the trained models respectively according to more than one importance parameters belonging to the target devices among the importance parameters provided by the client devices; and [Krishnaswamy weights based on quality parameters associated with sources ¶43-45 " each training update received may be modified or adjusted (e.g., weighted) based on the data quality parameter associated with the corresponding data source (i.e., the data source which the training update is received from)"]
updating the global model according to the weight values and the trained models. [Krishnaswamy ¶43-45 "transmission from each participating data source for updating the global machine learning model "]
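The weighting recited in claim 7 amounts to the following computation. This is a hypothetical sketch by the examiner (trained models represented as scalars, names illustrative), not an implementation taken from Krishnaswamy or Choudhary.

```python
def weighted_update(trained_models, importances):
    # Assign weight values to the trained models in proportion to the
    # importance parameters of their target devices, then update the
    # global model as the weighted combination of the trained models.
    total = sum(importances)
    weights = [imp / total for imp in importances]
    return sum(w * m for w, m in zip(weights, trained_models))
```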
As to dependent claim 8, the rejection of claim 1 is incorporated, Krishnaswamy and Choudhary further teach wherein each of the importance parameters comprises a loss value or a gradient value of a local model of a respective one of the client devices. [Krishnaswamy quality learned from loss function ¶106 "With supervision of the classification task, the data quality index (capturing both feature quality and label quality) may be learned implicitly from the loss function."]
As to independent claim 10, Krishnaswamy teaches a federated learning system, comprising: [federated learning ¶81]
a plurality of client devices [data sources as devices ¶42] having a plurality of importance parameters and a plurality of performance parameters, respectively; and [sources (clients) have feature quality parameters and label quality parameters (derived from a loss function ¶106) and competence (performance) ¶54, ¶52 "data quality parameter associated with the respective data source comprises at least one of a feature quality parameter associated with the features and a label quality parameter associated with the labels"]
a central device [server ¶42] connected to the client devices, configured to obtain the importance parameters and the performance parameters, and perform a training procedure [Fig. 6 illustrates learning procedure that does training updates using a server ¶96 " server may then receive a plurality of training updates (e.g., a difference Δ.sub.LS.sup.t in the example) from the selected subset of data sources, respectively (after the respective data source has generated the respective training update in response to the current global model received"]
selecting a plurality of target devices from the client devices according to a priority order associated with the importance parameters; [Selects subsets based on quality (priority order) Fig. 6 step 4 ¶96 "selecting the subset of data sources from the set of data sources (binned in the plurality of intervals) for federation for the current round t. In other words, the selection of the subset of M data sources may be based on data sources having been binned into a plurality (e.g., K) of quality ranges, which advantageously accounts for the varying quality ranges amongst the data sources."]
dividing the target devices into a plurality of training groups according to a similarity of the performance parameters; [bins (groups) sources based on quality range (similar performance) ¶96 " binning the set of data sources (1 to N) into a plurality of intervals (bins) of K quality ranges, and then selecting the subset of data sources from the set of data sources (binned in the plurality of intervals) for federation for the current round t. In other words, the selection of the subset of M data sources may be based on data sources having been binned into a plurality (e.g., K) of quality ranges, which advantageously accounts for the varying quality ranges amongst the data sources."]
notifying the target devices to perform a plurality of iterations according to the training groups respectively to generate a plurality of trained models, and transmitting the trained models back to the central device; and [server sends global model out to selected subsets receive from sources the updates back ¶81, ¶96 "sends to the selected subset of data sources"…"the most up-to-date (i.e., current) global model"…"The aggregator server may then receive a plurality of training updates (e.g., a difference Δ.sub.LS.sup.t in the example) from the selected subset of data sources, respectively (after the respective data source has generated the respective training update in response to the current global model received)"]
updating a global model based on the trained models; [updates global model ¶81 "aggregate the local models (e.g., corresponding to the “local machine learning model” as described hereinbefore according to various embodiments) and updates a joint global model"]
Krishnaswamy does not specifically teach when a convergence value of the global model does not fall within a default range or a number of times of performing the training procedure does not reach a default number, performing the training procedure again by the central device; and when the convergence value of the global model falls within the default range and the number of times of performing the training procedure reaches the default number, outputting the global model to the client devices by the central device.
However, Choudhary teaches when a convergence value of the global model does not fall within a default range or a number of times of performing the training procedure does not reach a default number, performing the training procedure again by the central device; and [convergence based on the iterations (number of times training) and threshold and repeats or continues iterations accordingly ¶64 "further adjust the adjusted global parameters until a point of convergence. For instance, the asynchronous training system 106 may continue training iterations until adjustments to global parameters fall below a threshold value in a consecutive threshold number of training iterations (e.g., multiple training iterations of a weighted average of modified parameter indicators are within a threshold range of one another)"]
when the convergence value of the global model falls within the default range and the number of times of performing the training procedure reaches the default number, outputting the global model to the client devices by the central device. [may continue even with convergence such a time or number of iterations which will continue to send out global parameters ¶64 "asynchronous training system 106 continues to send adjusted global parameters, receive modified parameter indicators from a subset of the client devices"]
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filling date of the claimed invention to modify the learning procedure by Krishnaswamy by incorporating the when the convergence value of the global model falls within the default range and the number of times of performing the training procedure reaches the default number, outputting the global model to the client devices by the central device disclosed by Choudhary because both techniques address the same field of federated learning and by incorporating Choudhary into Krishnaswamy provides more efficient use of resources reducing the computational and storage costs [Choudhary ¶4]
As to dependent claim 14, the rejection of claim 10 is incorporated, Krishnaswamy and Choudhary further teach wherein each of the training groups comprises more than one target device among the target devices, and the central device notifies the target devices belonging to a same training group to perform the iterations during a same training period. [Krishnaswamy bins get selected for federated learning and sent a model (notified) ¶96 "selecting the subset of data sources from the set of data sources (binned in the plurality of intervals) for federation for the current round t"]
As to dependent claim 16, the rejection of claim 10 is incorporated, Krishnaswamy and Choudhary further teach assigning a plurality of weight values to the trained models respectively according to more than one importance parameters belonging to the target devices among the importance parameters provided by the client devices; and [Krishnaswamy weights based on quality parameters associated with sources ¶43-45 " each training update received may be modified or adjusted (e.g., weighted) based on the data quality parameter associated with the corresponding data source (i.e., the data source which the training update is received from)"]
updating the global model according to the weight values and the trained models. [Krishnaswamy ¶43-45 "transmission from each participating data source for updating the global machine learning model "]
As to dependent claim 17, the rejection of claim 10 is incorporated, Krishnaswamy and Choudhary further teach wherein each of the importance parameters comprises a loss value or a gradient value of a local model of a respective one of the client devices. [Krishnaswamy quality learned from loss function ¶106 "With supervision of the classification task, the data quality index (capturing both feature quality and label quality) may be learned implicitly from the loss function."]
Claims 2, 4, 11 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Krishnaswamy in view of Choudhary, as applied in the rejection of claims 1 and 10 above, and further in view of Liu et al. (US 20220366320 A1, hereinafter Liu).
As to dependent claim 2, the rejection of claim 1 is incorporated.
Krishnaswamy and Choudhary do not specifically teach sorting a plurality of values associated with the importance parameters respectively from high to low; and using N client devices corresponding to a first value to a Nth value among the values sorted from high to low as the target devices, wherein N is a positive integer that is equal to or greater than 2.
However, Liu teaches sorting a plurality of values associated with the importance parameters respectively from high to low; and [devices are sorted or ranked (high to low) ¶69 "For each task, the terminal devices may be sorted according to the probability that this task corresponds to each terminal device, for example, may be sorted according to a sequence from high to low or from low to high. If sorting is performed according to the sequence from high to low, a preset number of terminal devices ranked ahead is selected to serve as the target terminal device corresponding to the task. If sorting is performed according to the sequence from low to high, a preset number of terminal devices ranked behind is selected as the target terminal device corresponding to the task."]
using N client devices corresponding to a first value to a Nth value among the values sorted from high to low as the target devices, wherein N is a positive integer that is equal to or greater than 2. [uses a preset number from the ranking ¶69 "a preset number of terminal devices ranked ahead is selected to serve as the target terminal device corresponding to the task"]
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the learning techniques disclosed by Krishnaswamy and Choudhary by incorporating the sorting of a plurality of values associated with the importance parameters from high to low, and the use of N client devices corresponding to a first value to a Nth value among the sorted values as the target devices (wherein N is a positive integer equal to or greater than 2), as disclosed by Liu, because all of the techniques address the same field of federated learning systems, and incorporating Liu into Krishnaswamy and Choudhary better ensures the performance of the models while preserving privacy [Liu ¶3].
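The selection recited in claim 2 reduces to a sort and a top-N cut, as in the following hypothetical sketch by the examiner (all names illustrative, not drawn from Liu):

```python
def select_targets(clients, values, n):
    # Sort the values associated with the importance parameters from high
    # to low, then use the client devices corresponding to the first value
    # through the Nth value as the target devices, where N >= 2.
    assert n >= 2
    ranked = sorted(zip(values, clients), key=lambda pair: pair[0],
                    reverse=True)
    return [client for _, client in ranked[:n]]
```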
As to dependent claim 4, the rejection of claim 1 is incorporated.
Krishnaswamy and Choudhary further teach grouping the performance parameters that are sorted to form the training groups, wherein said grouping comprises putting adjacent ones of the sorted performance parameters into one group for multiple times. [Krishnaswamy bins (groups) sources based on quality range (similar performance) ¶96 " binning the set of data sources (1 to N) into a plurality of intervals (bins) of K quality ranges, and then selecting the subset of data sources from the set of data sources (binned in the plurality of intervals) for federation for the current round t. In other words, the selection of the subset of M data sources may be based on data sources having been binned into a plurality (e.g., K) of quality ranges, which advantageously accounts for the varying quality ranges amongst the data sources."]
Krishnaswamy and Choudhary do not specifically teach sorting the performance parameters from high to low or from low to high;
However, Liu teaches sorting the performance parameters from high to low or from low to high; and [devices are sorted or ranked (high to low) ¶69 "For each task, the terminal devices may be sorted according to the probability that this task corresponds to each terminal device, for example, may be sorted according to a sequence from high to low or from low to high. If sorting is performed according to the sequence from high to low, a preset number of terminal devices ranked ahead is selected to serve as the target terminal device corresponding to the task. If sorting is performed according to the sequence from low to high, a preset number of terminal devices ranked behind is selected as the target terminal device corresponding to the task."]
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the learning techniques disclosed by Krishnaswamy and Choudhary by incorporating the sorting of the performance parameters from high to low or from low to high, as disclosed by Liu, because all of the techniques address the same field of federated learning systems, and incorporating Liu into Krishnaswamy and Choudhary better ensures the performance of the models while preserving privacy [Liu ¶3].
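The grouping recited in claim 4 (sorting the performance parameters, then repeatedly putting adjacent sorted parameters into one group) may be sketched as follows. This is the examiner's hypothetical illustration; the names are not taken from the references.

```python
def group_adjacent(perf_params, group_size):
    # Sort the performance parameters (low to high here; high to low works
    # the same way), then repeatedly put adjacent sorted parameters into
    # one group to form the training groups.
    ordered = sorted(perf_params)
    return [ordered[i:i + group_size]
            for i in range(0, len(ordered), group_size)]
```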
As to dependent claim 11, the rejection of claim 10 is incorporated.
Krishnaswamy and Choudhary do not specifically teach sorting a plurality of values associated with the importance parameters respectively from high to low; and using N client devices corresponding to a first value to a Nth value among the values sorted from high to low as the target devices, wherein N is a positive integer that is equal to or greater than 2.
However, Liu teaches sorting a plurality of values associated with the importance parameters respectively from high to low; and [devices are sorted or ranked (high to low) ¶69 "For each task, the terminal devices may be sorted according to the probability that this task corresponds to each terminal device, for example, may be sorted according to a sequence from high to low or from low to high. If sorting is performed according to the sequence from high to low, a preset number of terminal devices ranked ahead is selected to serve as the target terminal device corresponding to the task. If sorting is performed according to the sequence from low to high, a preset number of terminal devices ranked behind is selected as the target terminal device corresponding to the task."]
using N client devices corresponding to a first value to a Nth value among the values sorted from high to low as the target devices, wherein N is a positive integer that is equal to or greater than 2. [uses a preset number from the ranking ¶69 "a preset number of terminal devices ranked ahead is selected to serve as the target terminal device corresponding to the task"]
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the learning techniques disclosed by Krishnaswamy and Choudhary by incorporating the sorting of a plurality of values associated with the importance parameters from high to low, and the use of N client devices corresponding to a first value to a Nth value among the sorted values as the target devices (wherein N is a positive integer equal to or greater than 2), as disclosed by Liu, because all of the techniques address the same field of federated learning systems, and incorporating Liu into Krishnaswamy and Choudhary better ensures the performance of the models while preserving privacy [Liu ¶3].
As to dependent claim 13, the rejection of claim 10 is incorporated.
Krishnaswamy and Choudhary further teach grouping the performance parameters that are sorted to form the training groups, wherein said grouping comprises putting adjacent ones of the sorted performance parameters into one group for multiple times. [Krishnaswamy bins (groups) sources based on quality range (similar performance) ¶96 " binning the set of data sources (1 to N) into a plurality of intervals (bins) of K quality ranges, and then selecting the subset of data sources from the set of data sources (binned in the plurality of intervals) for federation for the current round t. In other words, the selection of the subset of M data sources may be based on data sources having been binned into a plurality (e.g., K) of quality ranges, which advantageously accounts for the varying quality ranges amongst the data sources."]
Krishnaswamy and Choudhary do not specifically teach sorting the performance parameters from high to low or from low to high;
However, Liu teaches sorting the performance parameters from high to low or from low to high; and [devices are sorted or ranked (high to low) ¶69 "For each task, the terminal devices may be sorted according to the probability that this task corresponds to each terminal device, for example, may be sorted according to a sequence from high to low or from low to high. If sorting is performed according to the sequence from high to low, a preset number of terminal devices ranked ahead is selected to serve as the target terminal device corresponding to the task. If sorting is performed according to the sequence from low to high, a preset number of terminal devices ranked behind is selected as the target terminal device corresponding to the task."]
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the learning techniques disclosed by Krishnaswamy and Choudhary by incorporating the sorting of the performance parameters from high to low or from low to high, as disclosed by Liu, because all of the techniques address the same field of federated learning systems, and incorporating Liu into Krishnaswamy and Choudhary better ensures the performance of the models while preserving privacy [Liu ¶3].
Claims 3, 6, 12 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Krishnaswamy in view of Choudhary and, optionally, Liu, as applied in the rejection of claims 1, 2, 10 and 11 above, and further in view of Fraboni et al. (US 20240037234 A1, hereinafter Fraboni).
As to dependent claim 3, the rejection of claim 2 is incorporated.
Krishnaswamy, Choudhary and Liu do not specifically teach using each of the client devices as a candidate device and performing: calculating an importance ratio between one of the importance parameters, which belongs to the candidate device, and a sum of the importance parameters; and using the importance ratio as one of the values.
However, Fraboni teaches using each of the client devices as a candidate device and performing: [scores devices as candidates ¶7]
calculating an importance ratio between one of the importance parameters, which belongs to the candidate device, and a sum of the importance parameters; and [percentage of incentive for each device (ratio) ¶40 "As shown in FIG. 3A, using each client contribution value and relative amount of data, the processor 102 may estimate the percentage of the incentive for each client device. In another example, as shown in FIG. 3B, with the summary of the importance score of the plurality of client devices in the global model, the processor 102 may provide client device-specific bandwidth for a faster collaborative learning process. For example, the processor 102 may provide more bandwidth for contributors with data of good quality, and provide less bandwidth for the ones with data of poor quality."]
using the importance ratio as one of the values. [uses scores to eliminate candidates ¶7]
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the learning techniques disclosed by Krishnaswamy, Choudhary and Liu by incorporating the using of each of the client devices as a candidate device and performing: calculating an importance ratio between one of the importance parameters, which belongs to the candidate device, and a sum of the importance parameters; and using the importance ratio as one of the values, as disclosed by Fraboni, because all techniques address the same field of federated learning systems, and incorporating Fraboni into Krishnaswamy, Choudhary and Liu addresses problems with trusting clients and reduces the complexity of tasks [Fraboni ¶5].
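For illustration only (device names and values are hypothetical, not drawn from any cited reference), the claimed importance-ratio computation addressed above, dividing each candidate device's importance parameter by the sum of all importance parameters, can be sketched as:

```python
# Illustrative sketch of the claimed importance-ratio computation:
# each candidate's ratio is its importance parameter divided by the
# sum of the importance parameters across all client devices.

def importance_ratios(importance):
    total = sum(importance.values())
    return {dev: value / total for dev, value in importance.items()}

# Hypothetical importance parameters keyed by device id.
ratios = importance_ratios({"dev_a": 2.0, "dev_b": 1.0, "dev_c": 1.0})
print(ratios)  # dev_a accounts for half of the total importance
```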
As to dependent claim 6, Krishnaswamy, Choudhary and Liu teach the rejection of claim 1 that is incorporated.
Krishnaswamy and Choudhary further teach before selecting the target devices from the client devices according to the priority order associated with the importance parameters, by the central device, using each of the client devices as a candidate device and performing: calculating an importance average between one of the importance parameters, which belongs to the candidate device, and a sum of the importance parameters; [Krishnaswamy weighted average of quality (importance) ¶51]
Krishnaswamy and Choudhary do not specifically teach before selecting the target devices from the client devices according to the priority order associated with the importance parameters, by the central device, using each of the client devices as a candidate device and performing: calculating an importance ratio between one of the importance parameters, which belongs to the candidate device, and a sum of the importance parameters; calculating a performance ratio between one of the performance parameters, which belongs to the candidate device, and a sum of the performance parameters; and removing the candidate device from the client devices when a difference between the importance ratio and the performance ratio is greater than a default value.
However, Fraboni teaches calculating an importance ratio between one of the importance parameters, which belongs to the candidate device, and a sum of the importance parameters; [percentage of incentive for each device (ratio) ¶40 "As shown in FIG. 3A, using each client contribution value and relative amount of data, the processor 102 may estimate the percentage of the incentive for each client device. In another example, as shown in FIG. 3B, with the summary of the importance score of the plurality of client devices in the global model, the processor 102 may provide client device-specific bandwidth for a faster collaborative learning process. For example, the processor 102 may provide more bandwidth for contributors with data of good quality, and provide less bandwidth for the ones with data of poor quality."]
calculating a performance ratio between one of the performance parameters, which belongs to the candidate device, and a sum of the performance parameters; and [optimum score is based on performance and finds best performing models/devices (sum) ¶21, ¶26 "optimum score may be a score of the local model which is performing the best for client data"]
removing the candidate device from the client devices when a difference between the importance ratio and the performance ratio is greater than a default value. [eliminates (removes) devices according to combined scores ¶7 "system eliminates from the set of selected devices for the collaborative machine learning, each client device which is clustered as the attacker class, by setting an importance score to zero for each client device clustered in the attacker class based on the grading score."]
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the learning techniques disclosed by Krishnaswamy and Choudhary by incorporating the limitation of, before selecting the target devices from the client devices according to the priority order associated with the importance parameters, by the central device, using each of the client devices as a candidate device and performing: calculating an importance ratio between one of the importance parameters, which belongs to the candidate device, and a sum of the importance parameters; calculating a performance ratio between one of the performance parameters, which belongs to the candidate device, and a sum of the performance parameters; and removing the candidate device from the client devices when a difference between the importance ratio and the performance ratio is greater than a default value, as disclosed by Fraboni, because all techniques address the same field of federated learning systems, and incorporating Fraboni into Krishnaswamy and Choudhary addresses problems with trusting clients and reduces the complexity of tasks [Fraboni ¶5].
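For illustration only (all names, values, and the threshold below are hypothetical, not drawn from any cited reference), the claim 6 limitation mapped above (computing each candidate's importance ratio and performance ratio, then removing candidates whose ratio difference exceeds a default value) can be sketched as:

```python
# Illustrative sketch of the claim 6 filtering limitation: a candidate
# is removed when the absolute difference between its importance ratio
# and its performance ratio exceeds a default threshold value.

def filter_candidates(importance, performance, default_value):
    imp_total = sum(importance.values())
    perf_total = sum(performance.values())
    kept = []
    for dev in importance:
        imp_ratio = importance[dev] / imp_total
        perf_ratio = performance[dev] / perf_total
        if abs(imp_ratio - perf_ratio) <= default_value:
            kept.append(dev)
    return kept

# Hypothetical parameters: importance ratios 0.8/0.2, performance ratios 0.5/0.5,
# so both devices differ from their performance ratio by 0.3.
importance = {"dev_a": 4.0, "dev_b": 1.0}
performance = {"dev_a": 1.0, "dev_b": 1.0}
print(filter_candidates(importance, performance, 0.25))  # both removed
```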
As to dependent claim 12, Krishnaswamy, Choudhary and Liu teach the rejection of claim 11 that is incorporated.
Krishnaswamy, Choudhary and Liu do not specifically teach using each of the client devices as a candidate device and performing: calculating an importance ratio between one of the importance parameters, which belongs to the candidate device, and a sum of the importance parameters; and using the importance ratio as one of the values.
However, Fraboni teaches using each of the client devices as a candidate device and performing: [scores devices as candidates ¶7]
calculating an importance ratio between one of the importance parameters, which belongs to the candidate device, and a sum of the importance parameters; and [percentage of incentive for each device (ratio) ¶40 "As shown in FIG. 3A, using each client contribution value and relative amount of data, the processor 102 may estimate the percentage of the incentive for each client device. In another example, as shown in FIG. 3B, with the summary of the importance score of the plurality of client devices in the global model, the processor 102 may provide client device-specific bandwidth for a faster collaborative learning process. For example, the processor 102 may provide more bandwidth for contributors with data of good quality, and provide less bandwidth for the ones with data of poor quality."]
using the importance ratio as one of the values. [uses scores to eliminate candidates ¶7]
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the learning techniques disclosed by Krishnaswamy, Choudhary and Liu by incorporating the using of each of the client devices as a candidate device and performing: calculating an importance ratio between one of the importance parameters, which belongs to the candidate device, and a sum of the importance parameters; and using the importance ratio as one of the values, as disclosed by Fraboni, because all techniques address the same field of federated learning systems, and incorporating Fraboni into Krishnaswamy, Choudhary and Liu addresses problems with trusting clients and reduces the complexity of tasks [Fraboni ¶5].
As to dependent claim 15, Krishnaswamy, Choudhary and Liu teach the rejection of claim 10 that is incorporated.
Krishnaswamy and Choudhary further teach before selecting the target devices from the client devices according to the priority order associated with the importance parameters, by the central device, using each of the client devices as a candidate device and performing: calculating an importance average between one of the importance parameters, which belongs to the candidate device, and a sum of the importance parameters; [Krishnaswamy weighted average of quality (importance) ¶51]
Krishnaswamy and Choudhary do not specifically teach before selecting the target devices from the client devices according to the priority order associated with the importance parameters, by the central device, using each of the client devices as a candidate device and performing: calculating an importance ratio between one of the importance parameters, which belongs to the candidate device, and a sum of the importance parameters; calculating a performance ratio between one of the performance parameters, which belongs to the candidate device, and a sum of the performance parameters; and removing the candidate device from the client devices when a difference between the importance ratio and the performance ratio is greater than a default value.
However, Fraboni teaches calculating an importance ratio between one of the importance parameters, which belongs to the candidate device, and a sum of the importance parameters; [percentage of incentive for each device (ratio) ¶40 "As shown in FIG. 3A, using each client contribution value and relative amount of data, the processor 102 may estimate the percentage of the incentive for each client device. In another example, as shown in FIG. 3B, with the summary of the importance score of the plurality of client devices in the global model, the processor 102 may provide client device-specific bandwidth for a faster collaborative learning process. For example, the processor 102 may provide more bandwidth for contributors with data of good quality, and provide less bandwidth for the ones with data of poor quality."]
calculating a performance ratio between one of the performance parameters, which belongs to the candidate device, and a sum of the performance parameters; and [optimum score is based on performance and finds best performing models/devices (sum) ¶21, ¶26 "optimum score may be a score of the local model which is performing the best for client data"]
removing the candidate device from the client devices when a difference between the importance ratio and the performance ratio is greater than a default value. [eliminates (removes) devices according to combined scores ¶7 "system eliminates from the set of selected devices for the collaborative machine learning, each client device which is clustered as the attacker class, by setting an importance score to zero for each client device clustered in the attacker class based on the grading score."]
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the learning techniques disclosed by Krishnaswamy and Choudhary by incorporating the limitation of, before selecting the target devices from the client devices according to the priority order associated with the importance parameters, by the central device, using each of the client devices as a candidate device and performing: calculating an importance ratio between one of the importance parameters, which belongs to the candidate device, and a sum of the importance parameters; calculating a performance ratio between one of the performance parameters, which belongs to the candidate device, and a sum of the performance parameters; and removing the candidate device from the client devices when a difference between the importance ratio and the performance ratio is greater than a default value, as disclosed by Fraboni, because all techniques address the same field of federated learning systems, and incorporating Fraboni into Krishnaswamy and Choudhary addresses problems with trusting clients and reduces the complexity of tasks [Fraboni ¶5].
Claims 9 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Krishnaswamy in view of Choudhary, as applied in the rejection of claims 1 and 10 above, and further in view of SUN et al. (US 20230385688 A1, hereinafter Sun).
As to dependent claim 9, Krishnaswamy and Choudhary teach the rejection of claim 1 that is incorporated.
Krishnaswamy and Choudhary do not specifically teach wherein each of the performance parameters comprises at least one of an inferring duration, an inferring speed of using a local model and connection information of a respective one of the client devices.
However, Sun teaches wherein each of the performance parameters comprises at least one of an inferring duration, an inferring speed of using a local model and connection information of a respective one of the client devices. [time interval, collection time, distance threshold and channel bandwidth concern the connection and relate to duration/speed ¶46-47, ¶60 "The network performance parameter of the distributed node UE.sub.1 indicates the network performance of the distributed node UE.sub.1, which may include, for example, one or more of the channel bandwidth, degree of interference, and channel quality of the distributed node UE.sub.1."]
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the learning techniques disclosed by Krishnaswamy and Choudhary by incorporating the limitation wherein each of the performance parameters comprises at least one of an inferring duration, an inferring speed of using a local model and connection information of a respective one of the client devices, as disclosed by Sun, because all techniques address the same field of federated learning systems, and incorporating Sun into Krishnaswamy and Choudhary provides more reliable channels for aggregation in learning, utilizing correlation analysis to ensure rapid convergence [Sun ¶5-6].
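For illustration only (the field names and values below are hypothetical, not drawn from the claims or cited references), a performance-parameter record of the kind addressed above, comprising an inferring duration, an inferring speed of using a local model, and connection information, can be sketched as:

```python
# Illustrative sketch of a per-device performance-parameter record
# combining inference timing with connection information (cf. Sun ¶60,
# which lists channel bandwidth among network performance parameters).
from dataclasses import dataclass

@dataclass
class PerformanceParameters:
    inferring_duration_ms: float   # time for one inference with the local model
    inferring_speed: float         # inferences per second using the local model
    channel_bandwidth_mbps: float  # connection information for the client device

p = PerformanceParameters(inferring_duration_ms=12.5,
                          inferring_speed=80.0,
                          channel_bandwidth_mbps=100.0)
print(p.inferring_speed)
```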
As to dependent claim 18, Krishnaswamy and Choudhary teach the rejection of claim 10 that is incorporated.
Krishnaswamy and Choudhary do not specifically teach wherein each of the performance parameters comprises at least one of an inferring duration, an inferring speed of using a local model and connection information of a respective one of the client devices.
However, Sun teaches wherein each of the performance parameters comprises at least one of an inferring duration, an inferring speed of using a local model and connection information of a respective one of the client devices. [time interval, collection time, distance threshold and channel bandwidth concern the connection and relate to duration/speed ¶46-47, ¶60 "The network performance parameter of the distributed node UE.sub.1 indicates the network performance of the distributed node UE.sub.1, which may include, for example, one or more of the channel bandwidth, degree of interference, and channel quality of the distributed node UE.sub.1."]
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the learning techniques disclosed by Krishnaswamy and Choudhary by incorporating the limitation wherein each of the performance parameters comprises at least one of an inferring duration, an inferring speed of using a local model and connection information of a respective one of the client devices, as disclosed by Sun, because all techniques address the same field of federated learning systems, and incorporating Sun into Krishnaswamy and Choudhary provides more reliable channels for aggregation in learning, utilizing correlation analysis to ensure rapid convergence [Sun ¶5-6].
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Applicant is required under 37 C.F.R. § 1.111(c) to consider these references fully when responding to this action.
Wang et al. (US 11574254 B2) teaches a check for convergence and, if the model has not converged, updating the model parameters (see Fig. 3A, 305 (train again), Col. 6, ln. 22-35).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BEAU SPRATT whose telephone number is (571)272-9919. The examiner can normally be reached M-F 8:30-5 PST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Jennifer Welch, can be reached at 571-212-7212. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact 1-800-786-9199 (IN USA OR CANADA) or 571-272-1000.