Prosecution Insights
Last updated: April 19, 2026
Application No. 18/227,535

GROUP BIAS MITIGATION IN FEDERATED LEARNING SYSTEMS

Non-Final OA — §102, §103
Filed
Jul 28, 2023
Examiner
KATZ, DYLAN MICHAEL
Art Unit
3657
Tech Center
3600 — Transportation & Electronic Commerce
Assignee
Cisco Technology Inc.
OA Round
1 (Non-Final)
Grant Probability: 87% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 2y 7m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 87% — above average (242 granted / 279 resolved; +34.7% vs TC avg)
Interview Lift: +20.8% across resolved cases with interview
Typical Timeline: 2y 7m avg prosecution; 45 applications currently pending
Career History: 324 total applications across all art units
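The headline figures above follow from simple arithmetic on the granted/resolved counts. A quick check in Python (illustrative only, not the dashboard's actual code; it assumes the "+34.7% vs TC avg" delta is in percentage points, which is an assumption about the dashboard's convention):

```python
# Reproducing the examiner stats shown above (an illustrative check).
# Assumes "+34.7% vs TC avg" means percentage points above the
# Tech Center average allow rate.
granted, resolved = 242, 279

allow_rate = granted / resolved          # career allow rate, about 0.867
implied_tc_avg = allow_rate - 0.347      # implied TC average, about 0.520

print(f"Career allow rate: {allow_rate:.1%}")    # shown rounded to 87%
print(f"Implied TC average: {implied_tc_avg:.1%}")
```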

Statute-Specific Performance

§101: 7.7% (-32.3% vs TC avg)
§103: 50.0% (+10.0% vs TC avg)
§102: 20.3% (-19.7% vs TC avg)
§112: 16.5% (-23.5% vs TC avg)
Black line = Tech Center average estimate • Based on career data from 279 resolved cases

Office Action

Rejection bases: §102, §103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-4, 6-14, and 16-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Zhou et al. (US 20230084507, hereinafter Zhou).

Regarding Claim 1, Zhou teaches: 1. A method (see at least "a method for training a primary machine learning model using vertical federated learning." in par. 0022) comprising: generating, by a supervisory device in a federated learning system, an aggregated model that aggregates a plurality of machine learning models trained by trainer nodes in the federated learning system during a training round (see at least "The server 110 may be used to train a centralized global model (referred to hereinafter as a global model) using FL." in par. 0061 and "Instead, the federated learning module 125 executed by the server 110 aggregates the predictions of the local models 136 and propagates an aggregated prediction 658 to the computing systems 102." in par. 0155); computing, by the supervisory device, an accuracy loss metric for the aggregated model (see at least "At 404, the global model 126 is trained. The computing systems 102 of all data owners 1 through k compute their model outputs 360, o.sub.1, . . . , o.sub.k, based on their respective local datasets 140, and send the outputs 360 to the server 110. Based on the outputs 360 received from the computing systems 102 of the data owners 1 through k, the server 110 computes its global prediction 358, o.sub.0=f.sub.0(o.sub.1, . . . , o.sub.k|θ.sub.0) (i.e. the output of the global model 126), and uses its global prediction 358 (i.e. predicted labels) to compute: [0132] The loss related to the task T, using the labels 302;" in par. 0131); computing, by the supervisory device, a fairness loss metric for the aggregated model based on fairness-related metrics associated with the plurality of machine learning models trained by the trainer nodes (see at least "The computing system 102 of an active party can calculate both the loss function with respect to the task T, and the fairness, locally." in par. 0098 and "The DEO (i.e. |{circumflex over (l)}.sup.a(θ)−{circumflex over (l)}.sup.b(θ)|) for each pair of protected classes a and b relative to each other, using the protected class information 304;" in par. 0131 and "Thus, the server 110 has to communicate with the computing system 102a of the task owner to compute the loss and the fairness constraint function, and to generate partitions 354. Thus, those calculations are instead performed by the computing system 102a of the task owner, and the server 110 acts only to distribute this information to the other computing systems 102 and aggregate the outputs 360 of the local models 136." in par. 0176); and initiating, by the supervisory device, an additional training round during which the trainer nodes retrain their machine learning models for aggregation by the supervisory device, in accordance with a constrained optimization problem that seeks to optimize a tradeoff between accuracy and fairness associated with the aggregated model (see at least "At 404, the global model 126 is trained." in par. 0131, "Based on the calculated DEO (i.e. |{circumflex over (l)}.sup.a(θ)−{circumflex over (l)}.sup.b(θ)|) for a given fairness constraint, the server 110 updates the variable λ associated with the constraint. The server 110 then broadcasts λ and respective local gradients 352" in par. 0137 and "As shown in FIG. 4, step 404 is iterated until a convergence condition is satisfied at step 414." in par. 0141).

Regarding Claim 2, Zhou teaches: The method as in claim 1, wherein the supervisory device generates the aggregated model based on model parameters associated with the plurality of machine learning models trained by the trainer nodes (see at least "steps of method 400 shown in FIG. 4 may be performed in parallel, and their sub-steps as described below may include overlapping operations: for example, the training of the global model at 404 may be performed in parallel with training of each local model at 408, wherein each global model and local model is trained during each round of mutual communication. Thus, the sub-steps of 404 and each iteration of 408 described below may refer to the same sub-steps performed in each of the other training steps for other models (i.e., 404 or another iteration of 408)." in par. 0147).

Regarding Claim 3, Zhou teaches: The method as in claim 2, wherein the trainer nodes do not share their training data on which they trained the plurality of machine learning models with the supervisory device (see at least "In contrast, examples described herein may preserve the privacy of the local datasets when training a model using vertically partitioned data 230. In some examples described herein, none of the computing systems 102 of data owners custom-character.sub.1, . . . , custom-character.sub.N exposes its respective private data custom-character.sub.1, . . . , custom-character.sub.N or model parameters, but the computing systems 102 of all the data owners collaboratively use their private data custom-character.sub.1, . . . , custom-character.sub.N to train a model custom-character.sub.FED which has comparable performance to a hypothetical model custom-character.sub.SUM which had been trained using data collected from the computing systems of the data owners." in par. 0075).

Regarding Claim 4, Zhou teaches: 4. The method as in claim 1, further comprising: receiving, at the supervisory device, the fairness-related metrics from the trainer nodes (see at least "The computing system 102 of an active party can calculate both the loss function with respect to the task T, and the fairness, locally." in par. 0098).

Regarding Claim 6, Zhou teaches: 6. The method as in claim 1, further comprising: determining, by the supervisory device, that the additional training round resulting in an optimized aggregated model (see at least "As shown in FIG. 4, step 404 is iterated until a convergence condition is satisfied at step 414." in par. 0141).

Regarding Claim 7, Zhou teaches: 7. The method as in claim 1, wherein the trainer nodes are geographically distributed (see at least "The system 100 includes a plurality of computing systems 102 wherein each computing system 102 is controlled by one of a plurality of different data owners. The computing system 102 of each data owner collects and stores a respective set of private data (also referred to as a local dataset or private dataset) … A computing system 102 may be a server, a collection of servers, an edge device, an end user device (which may include such devices (or may be referred to) as a client device/terminal, user equipment/device (UE), wireless transmit/receive unit (WTRU), mobile station, fixed or mobile subscriber unit, cellular telephone, station (STA), personal digital assistant (PDA), smartphone, laptop, computer, tablet, wireless sensor, wearable device, smart device, machine type communications device, smart (or connected) vehicles, or consumer electronics device, among other possibilities), or may be a network device (which may include (or may be referred to as) a base station (BS), router, access point (AP), personal basic service set (PBSS) coordinate point (PCP), eNodeB, or gNodeB, among other possibilities)." in par. 0059).

Regarding Claim 8, Zhou teaches: 8. The method as in claim 1, further comprising: determining, by the supervisory device, whether a further training round is needed after the additional training round to optimize a tradeoff between accuracy and fairness associated with the aggregated model (see at least "As shown in FIG. 4, step 404 is iterated until a convergence condition is satisfied at step 414." in par. 0141).

Regarding Claim 9, Zhou teaches: 9. The method as in claim 1, wherein the aggregated model is configured to classify sensitive or confidential information (see at least "ML models trained on datasets of personal and financial data can often become biased with respect to certain sensitive attributes such as gender, age etc. This may be the result of strong correlation of such sensitive attributes with other non-sensitive attributes such as salary and education." in par. 0077).

Regarding Claim 10, Zhou teaches: 10. The method as in claim 1, wherein the aggregated model is configured to classify image data (see at least "As used herein, an “input sample” may refer to any data sample used as an input to a machine learning model, such as image data. It may refer to a training data sample used to train a machine learning model, or to a data sample provided to a trained machine learning model which will infer (i.e. predict) an output based on the data sample for the task for which the machine learning model has been trained. Thus, for a machine learning model that performs a task of image classification, an input sample may be a single digital image." in par. 0014).

Regarding Claim 11, Zhou also teaches: An apparatus (see at least "server 110" in par. 0064), comprising: one or more network interfaces (see at least "The server 110 may include one or more network interfaces 122 for wired or wireless communication with the network 104, the computing systems 102, or other entity in the system 100." in par. 0064); a processor coupled to the one or more network interfaces and configured to execute one or more processes (see at least "The server 110 may include one or more processing devices 114, such as a processor, a microprocessor, a digital signal processor, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a dedicated logic circuitry, a dedicated artificial intelligence processor unit, a tensor processing unit, a neural processing unit, a hardware accelerator, or combinations thereof. The one or more processing devices 114 may be jointly referred to herein as a processor 114, processor device 114, or processing device 114." in par. 0063); and a memory configured to store a process that is executable by the processor, the process when executed configured to (see at least "The server 110 may include one or more memories 128, which may include a volatile or non-volatile memory (e.g., a flash memory, a random access memory (RAM), and/or a read-only memory (ROM))." in par. 0066): implement the method of Claim 1 (see Claim 1 analysis for rejection of the method).

Regarding Claim 12, Zhou also teaches: An apparatus for implementing the method of Claim 2 (see Claim 2 analysis for rejection of the method).

Regarding Claim 13, Zhou also teaches: An apparatus for implementing the method of Claim 3 (see Claim 3 analysis for rejection of the method).

Regarding Claim 14, Zhou also teaches: An apparatus for implementing the method of Claim 4 (see Claim 4 analysis for rejection of the method).

Regarding Claim 16, Zhou also teaches: An apparatus for implementing the method of Claim 6 (see Claim 6 analysis for rejection of the method).

Regarding Claim 17, Zhou also teaches: An apparatus for implementing the method of Claim 7 (see Claim 7 analysis for rejection of the method).

Regarding Claim 18, Zhou also teaches: An apparatus for implementing the method of Claim 8 (see Claim 8 analysis for rejection of the method).

Regarding Claim 19, Zhou also teaches: An apparatus for implementing the method of Claim 9 (see Claim 9 analysis for rejection of the method).

Regarding Claim 20, Zhou also teaches: A tangible, non-transitory, computer-readable medium storing program instructions that cause a supervisory device in a federated learning system to execute a process comprising (see at least "The server 110 may include one or more memories 128, which may include a volatile or non-volatile memory (e.g., a flash memory, a random access memory (RAM), and/or a read-only memory (ROM)). The one or more non-transitory memories 128 may be jointly referred to herein as a memory 128 for simplicity. The memory 128 may store processor executable instructions 129 for execution by the processing device(s) 114, such as to carry out examples described in the present disclosure." in par. 0066): The method of Claim 1 (see Claim 1 analysis for rejection of the method).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 5 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Zhou et al. (US 20230084507, hereinafter Zhou) in view of Watanabe et al. (US 20230050708, hereinafter Watanabe).

Regarding Claim 5, Zhou teaches: 5. The method as in claim 1. Zhou does not appear to explicitly teach all of the following, but Watanabe does teach: wherein a particular one of the trainer nodes computes a fairness-related metric for its machine learning model based on a difference in ratios of populations of training data that it used to train that machine learning model to that of global populations of training data used across the trainer nodes (see at least "Node analysis module 120 may analyze statistical data that describes the local training data of distributed computing nodes 130A-130N, rather than analyzing the training data itself." in par. 0031 and "In some embodiments, node analysis module 120 identifies bias in local training data sets by comparing the count of data samples for an overrepresented label to the count of data samples for other labels. For example, one or more outliers may be identified corresponding to labels that have more samples as compared to other labels, and the counts of those one or more outlying labels can be compared to the counts of the other labels to determine whether the training data set is biased. In some embodiments, the label having the highest count of data samples may be selected, and compared with the counts of the other labels; if the ratio of counts of one or more of the other labels to the count of most-represented label does not surpass a threshold value, then the training data set may be considered to be biased." in par. 0032 and "This data sample may initially be used for a first iteration of testing, or a small amount of other data samples may also be included before performing a first iteration of testing. At each iteration, the set of training data is used to train a test model, whose accuracy is tested; additionally, at each iteration, the set of training data becomes less biased, as more data samples corresponding to the underrepresented labels are included in the training data. For example, a first iteration may include a test model that is trained using training data that includes one thousand data samples of one particular label and only ten data samples of each other label, and a second iteration may increase the size of the other labels to include twenty samples each, etc." in par. 0035 and "Threshold selection module 125 may select a threshold value corresponding to the ratio of the count of data samples for the one or more underrepresented data labels to the count of data samples for the overrepresented data label when the model's performance does not improve beyond a threshold amount for a number of iterations." in par. 0037).

It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method taught by Zhou to incorporate the teachings of Watanabe, wherein statistical metrics are computed to assess underrepresented and overrepresented samples in local training datasets of a federated learning system and bias is addressed during the training process by balancing the label representation, in order to arrive at the local fairness metric taught by Zhou being based on how appropriately labels are represented in each local dataset relative to the global dataset. The motivation to incorporate the teachings of Watanabe would be to reduce bias in the global model and reduce the time and computation resources needed for training (see par. 0020).

Regarding Claim 15, Zhou as modified by Watanabe also teaches: An apparatus for implementing the method of Claim 5 (see Claim 5 analysis for rejection of the method).

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to DYLAN M KATZ, whose telephone number is (571) 272-2776. The examiner can normally be reached Mon-Thurs. 8:00-6:00. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Abby Lin, can be reached on (571) 270-3976. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /DYLAN M KATZ/ Primary Examiner, Art Unit 3657
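As the Zhou citations in the Claim 1 analysis describe, the supervisory device's training loop combines a task (accuracy) loss with per-group DEO fairness gaps under Lagrange multipliers λ that are updated each round. A minimal Python sketch of that style of supervisory-side update (illustrative only: the function names and numbers are made up, and this is not code from Zhou or the application):

```python
# Hypothetical sketch of a fairness-constrained federated round of the kind
# the Zhou citations describe: the supervisory device receives per-group
# losses from trainer nodes, computes DEO-style gaps |l_a - l_b|, folds them
# into a Lagrangian accuracy/fairness objective, and raises the multiplier
# lam wherever a fairness constraint is violated before the next round.
from itertools import combinations

def deo_gaps(group_losses):
    """Fairness gap |l_a - l_b| for each pair of protected groups."""
    return {
        (a, b): abs(group_losses[a] - group_losses[b])
        for a, b in combinations(sorted(group_losses), 2)
    }

def lagrangian_objective(task_loss, gaps, lam, eps):
    """Task loss plus multiplier-weighted fairness-constraint violations."""
    return task_loss + sum(lam[p] * max(0.0, g - eps) for p, g in gaps.items())

def update_multipliers(lam, gaps, eps, step=0.1):
    """Dual ascent: increase lam for constraints whose gap exceeds eps."""
    return {p: max(0.0, lam[p] + step * (gaps[p] - eps)) for p in lam}

# One round with made-up per-group losses reported by trainer nodes:
group_losses = {"a": 0.30, "b": 0.55}
gaps = deo_gaps(group_losses)             # {("a", "b"): 0.25}
lam = {p: 0.0 for p in gaps}
eps = 0.05                                # fairness tolerance
obj = lagrangian_objective(0.40, gaps, lam, eps)
lam = update_multipliers(lam, gaps, eps)  # lam grows while the gap > eps
```

In this dual-ascent framing, broadcasting the updated λ to the trainer nodes before they retrain corresponds to the "initiating an additional training round ... in accordance with a constrained optimization problem" step of Claim 1.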

Prosecution Timeline

Jul 28, 2023
Application Filed
Feb 26, 2026
Non-Final Rejection — §102, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12596378
Autonomous Control and Navigation of Unmanned Vehicles
Granted Apr 07, 2026 • 2y 5m to grant
Patent 12594663
ROBOT SYSTEM AND CART
Granted Apr 07, 2026 • 2y 5m to grant
Patent 12589499
Mobile Construction Robot
Granted Mar 31, 2026 • 2y 5m to grant
Patent 12589491
METHODS, SYSTEMS, AND DEVICES FOR MOTION CONTROL OF AT LEAST ONE WORKING HEAD
Granted Mar 31, 2026 • 2y 5m to grant
Patent 12582491
CONTROL OF A SURGICAL INSTRUMENT HAVING BACKLASH, FRICTION, AND COMPLIANCE UNDER EXTERNAL LOAD IN A SURGICAL ROBOTIC SYSTEM
Granted Mar 24, 2026 • 2y 5m to grant
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 87%
With Interview: 99% (+20.8%)
Median Time to Grant: 2y 7m
PTA Risk: Low
Based on 279 resolved cases by this examiner. Grant probability derived from career allow rate.
