Last updated: May 29, 2026
Application No. 17/969,159
DATA PROCESSING METHOD, APPARATUS, AND DEVICE, AND STORAGE MEDIUM

Non-Final OA §103
Filed: Oct 19, 2022
Priority: Dec 02, 2020, CN 202011385627.5 (+1 more)
Examiner: DIEP, DUY T
Art Unit: 2123
Tech Center: 2100 (Computer Architecture & Software)
Assignee: Tencent Technology (Shenzhen) Company Limited
OA Round: 3 (Non-Final)
Interview Optional

An interview has already been conducted in this application's prosecution history. This examiner has a 29% grant rate with a +6.7% interview lift. Since an interview has already been tried, a written response with narrowed claims, based on precedent claim evolution patterns, is recommended.
Based on 24 resolved cases, 2023–2026
Examiner Intelligence

DIEP, DUY T
Career Allowance Rate: 29% (7 granted / 24 resolved; -25.8% vs TC avg)
Interview Lift: +6.7% (moderate; resolved cases with interview)
Avg Prosecution: 4y 3m (typical timeline)
17 currently pending
Statute-Specific Performance

§101: 2.0% (-38.0% vs TC avg)
§103: 98.0% (+58.0% vs TC avg)
Based on career data from 24 resolved cases
Office Action

§103
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 01/15/2026 has been entered.


Response to Amendment
The amendments filed 02/04/2026 have been entered. Claims 1, 3-9, 11-12, 14, 16-19 remain pending in the application.
Applicant’s amendments and arguments filed 12/03/2025, with respect to the rejections of claims 1, 3-9, 11-12, 14, 16-19 under 35 U.S.C. 103, have been considered and are persuasive.
The applicant argues that the amended claims recite additional limitations not taught or suggested by the applied references. In particular, the applicant contends that the cited art fails to teach model parameters being rounded using a multiplier P indicating a degree of preservation of floating-point precision, obtaining and rounding a candidate random mask using that multiplier P, transmitting an encrypted model parameter obtained using the random mask, and removing the multiplier P during weighted averaging. The applicant further argues that Wiedemann only discloses deriving an NN parameter based on a multiplier, bit shift number, and quantization value, but does not teach the claimed removal of the multiplier P for rounding the candidate random mask and obtaining the encrypted model parameter. Accordingly, the applicant asserts that the 103 rejections should be withdrawn.
However, upon further consideration, new ground(s) of rejection have been raised (see below).


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.



Claims 1, 3-4, 7, 12, 14, 16-19 are rejected under 35 U.S.C. 103 as being unpatentable over Zhu et al. (US 20220114475 A1) in view of Hu et al. (NPL: Decentralized Federated Learning: A Segmented Gossip Approach), further in view of Gama et al. (US 20200304293 A1).

Regarding claim 1,
Zhu teaches the limitation “A data processing method in a data processing system, the data processing system comprising at least three devices including a first device and a second device connected to the first device, the at least three devices in the data processing system being connected in a ring architecture, the method being performed by the first device and comprising” (paragraph 52 “The system 100 has been simplified in this example for ease of understanding; generally, there may be more entities and components in the system 100 than that shown in FIG. 3. For the purpose of illustration, FIG. 3 shows a client B and its neighbors client A and client C (all generally referred to as a client 102). It should be understood that there may be greater or fewer numbers of neighbor clients directly connected to a given client 102, and each neighbor client may also have its own respective one or more other neighbor clients. A topology is formed by the direction connections among clients 102. The topology formed by the connections among clients may be any topology (e.g., ring topology, mesh topology, etc.)”. Zhu discloses methods and systems for decentralized federated learning, wherein each client device transmitting their set of local model parameters to each neighbor client computing system for aggregating and updating the set of local parameters at each neighbor client computing system based on the transmitted parameter and each neighbor client computing system’s own parameter. Within the disclosure, Zhu discloses one or more client computing system such as illustrated in figure 3, which comprises of a client computing system A, B and C connected with each other. The topology formed by the connections among clients may be any topology such as a ring topology.)
Zhu teaches the limitation “obtaining first sample data of a target service” (paragraph 53 “The system 100 includes a plurality of clients 102, each of which collect and store respective sets of local data (referred to as local datasets)”. Zhu discloses that each client computing system collects and stores a local dataset.)
Zhu teaches the limitation “training a first service processing model based on the first sample data, to obtain a first model parameter of the first service processing model” (paragraph 53 “The system 100 includes a plurality of clients 102, each of which collect and store respective sets of local data (referred to as local datasets) 106. Each client 102 can run a supervised machine learning algorithm to learn the parameters of a local model 104 using the local dataset.”. Zhu discloses that each client computing system performs model training by running a supervised machine learning algorithm to learn the parameters of a local model using its local dataset.)
Zhu teaches the limitation “transmitting, to the second device, the first model parameter, based on which and based on a second model parameter determined by the second device, a first fusion parameter is determined at the second device” (paragraph 85 “At 510, each given client 102 transmits its respective set of local model parameters and its respective weighting coefficient to each neighbor client 102 identified at step 508 (e.g., via the respective agents 108)”, paragraph 86 “At 512, each given client 102 receives a set of local model parameters and a weighting coefficient from at least one neighbor”, paragraph 88 “At 514, each given client 102 aggregates the set(s) of local model parameters received at 512. For example, the set(s) of local model parameters may be aggregated by computing a weighted sum or weighted average”, and paragraph 91 “The given client 102 updates its respective set of local model parameters ... using the computed aggregations ... an updating procedure, such as adding the aggregation to the existing set of local model parameters”. Zhu discloses each client transmits its respective set of local model parameters to a neighbor client, wherein the client neighbor obtains the transmitted parameter and perform a weighted sum or weighted average computation to obtain an aggregated parameter, wherein this aggregated parameter is used to update the local model parameter of each client. The updating process based on the aggregated parameter to obtain an updated local parameter is analogous to the determination process to obtain a fusion parameter within the claim.)
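The aggregation step that Zhu describes at paragraph 88 (a weighted average of parameter sets received from neighbors) can be illustrated with a minimal sketch. The function name `aggregate`, the flat-list representation of a parameter set, and the example weights are assumptions made for illustration only; they are not taken from Zhu or from the claims.

```python
from typing import List, Sequence


def aggregate(param_sets: Sequence[Sequence[float]],
              weights: Sequence[float]) -> List[float]:
    """Weighted average of equally sized parameter vectors (hypothetical helper)."""
    total = sum(weights)
    length = len(param_sets[0])
    return [
        sum(w * params[i] for params, w in zip(param_sets, weights)) / total
        for i in range(length)
    ]


# Hypothetical values: a client's own parameters plus one neighbor's,
# each paired with a weighting coefficient.
own = [1.0, 2.0]
neighbor = [3.0, 4.0]
fused = aggregate([own, neighbor], weights=[1.0, 3.0])
print(fused)
```

The sketch covers only the averaging; how a client then folds the result into its local model (paragraph 91) is left out.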
Zhu teaches the limitation “obtaining a second fusion parameter, the second fusion parameter comprising model parameters respectively determined by the at least three devices in the data processing system, ...” (figure 2, paragraph 42 “FIG. 2 illustrates an example DDML system 20. The DDML system 20 is simplified for the purpose of illustration. In this example, the DDML system 20 has four clients, namely client A 22A, client B 22B, client C 22C and client D 22D (generally referred to as clients 22), which are connected in a ring topology. Each client 22 only interacts with its neighbor. A neighbor of a given client is defined as another client that has a direct connection with the given client”, paragraph 93 “The determination whether the convergence condition is satisfied may be performed by the task owner client 110, or may be performed by individual clients 102. If the convergence condition is not satisfied, the method 500 returns to step 508 to perform another round of training.” Zhu discloses perform another round of training until a convergence condition is satisfied, wherein another round of training comprises continuing transmitting local parameters from one client computing system to other three clients as illustrated in figure 2 to further update the local parameter at each client system, such that an aggregated parameter can be computed at each client. According to the teaching of ring topology by Zhu and the illustration in figure 2, one of ordinary skilled in the art would understand that the aggregated parameter obtained by each client after propagation through at least three neighboring client devices in the ring topology constitutes the aggregated parameter corresponding to the claimed “second fusion parameter”.)
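For orientation, the neighbor relationship in a ring of the kind Zhu describes at paragraph 42 can be sketched as follows. The ordering A, B, C, D around the ring and the helper name `ring_neighbors` are assumptions, since figure 2 is not reproduced here.

```python
from typing import Sequence, Tuple


def ring_neighbors(clients: Sequence[str], index: int) -> Tuple[str, str]:
    """Return the two clients directly connected to clients[index] in a ring."""
    n = len(clients)
    return clients[(index - 1) % n], clients[(index + 1) % n]


# Assumed ring order; each client interacts only with these two neighbors.
clients = ["A", "B", "C", "D"]
print(ring_neighbors(clients, 0))
```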
Zhu teaches the limitation “updating the first model parameter of the first service processing model to a target model parameter determined based on the second fusion parameter” (paragraph 86 “At 512, each given client 102 receives a set of local model parameters ... from at least one neighbor”, paragraph 88 “At 514, each given client 102 aggregates the set(s) of local model parameters received at 512. For example, the set(s) of local model parameters may be aggregated by computing a weighted sum or weighted average”, and paragraph 91 “The given client 102 updates its respective set of local model parameters”. Zhu discloses that each client updates its respective set of local model parameters using the aggregated parameters computed from the parameters received from other clients, so the updated local model parameter is analogous to the claimed “target model parameter”.)
Zhu teaches a part of the limitation “a quantity of the at least three devices, N number of devices are elected among the at least three devices ... the quantity of the at least three device ... a number of the at least three devices” (paragraph 42 “FIG. 2 illustrates an example DDML system 20. The DDML system 20 is simplified for the purpose of illustration. In this example, the DDML system 20 has four clients, namely client A 22A, client B 22B, client C 22C and client D 22D (generally referred to as clients 22), which are connected in a ring topology. Each client 22 only interacts with its neighbor”, and paragraph 68 “After the task owner client 110 receives consent from a sufficient number of other clients 102 (e.g., the task owner client 110 may define a minimum number of participating clients 102 required for the training a local model related to the task to proceed)”. Zhu discloses that the task owner may define a minimum number of participating clients; for example, figure 2 illustrates four clients connected in a ring topology.)

Zhu does not teach the limitation “wherein each of the first model parameter and the target model parameter comprises N segments, N being an integer greater than 1” However, Hu teaches the limitation (page 2 section 3 “Let W denote the model parameters. The worker firstly breaks the structure of W into S segments without overlapping such that W = (W[1], W[2], . . . , W[S])” Hu discloses a machine learning technique of Decentralized Federated Learning with a segmented gossip approach. Within the disclosure, Hu discloses model parameters are broken into S segments without overlapping such as segment W1, W2, thus indicate there are more than 1 segment of parameters.)
Zhu does not teach the limitation “wherein the transmitting the first model parameter to the second device comprises transmitting, to the second device, a first segment in the first model parameter, based on which and based on and a first segment in the second model parameter determined by the second device, a first fusion segment is determined” However, Hu teaches the limitation (page 2 section 3 “segmented pulling allows the worker to pull different parts of the model parameters from different workers and rebuild a mixed model for aggregation ... For each segment l, the worker chooses a peer worker which we denote it as jl and then actively pulls the corresponding segment Wjl[l] from it.”, and page 3 section 3 “Combine all the aggregated segments, and we can rebuild the final aggregation result” Hu discloses one worker may pull the parameter segment from different workers and rebuild a mixed model for aggregation, which is analogous to the claimed feature of transmitting the parameter segment from one device to another device and obtain the fusion segment. The aggregated segments pulled from another worker by Hu is analogous to the fusion segment within the claim.)
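The segmentation and segmented pulling that Hu describes in section 3 can be sketched roughly as below. The contiguous, near-equal split, the uniformly random peer choice, and the names `split_segments` and `pull_mixed_model` are assumptions for illustration; Hu's text as quoted only states that W is broken into S non-overlapping segments and that a peer is chosen for each segment.

```python
import random
from typing import Dict, List


def split_segments(params: List[float], s: int) -> List[List[float]]:
    """Split a flat parameter list into s contiguous, non-overlapping segments."""
    base, extra = divmod(len(params), s)
    segments, start = [], 0
    for k in range(s):
        end = start + base + (1 if k < extra else 0)
        segments.append(params[start:end])
        start = end
    return segments


def pull_mixed_model(peers: Dict[str, List[float]], s: int,
                     rng: random.Random) -> List[float]:
    """For each segment index, pull that segment from one chosen peer
    (random choice is an assumption) and patch the pieces together."""
    mixed: List[float] = []
    for seg_index in range(s):
        peer = rng.choice(sorted(peers))
        mixed.extend(split_segments(peers[peer], s)[seg_index])
    return mixed


# Hypothetical peers, each holding a six-value parameter vector.
peers = {"B": [1.0] * 6, "C": [2.0] * 6, "D": [3.0] * 6}
mixed = pull_mixed_model(peers, s=3, rng=random.Random(0))
print(mixed)
```

The resulting mixed model has the same length as each peer's vector, with each segment drawn from a single peer.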
Zhu does not teach the limitation “wherein the obtaining the second fusion parameter comprises obtaining the second fusion parameter based on a second fusion segment, the second fusion segment comprising first segments in the model parameters respectively determined by the at least three devices in the data processing system” However, Hu teaches the limitation (page 3 section 3 “Combine all the aggregated segments, and we can rebuild the final aggregation result”, and page 3 section 4.1 “At each iteration, the workers (1) update the model with local dataset and meanwhile, (2) send the segment pulling requests to other workers, once the update is finished, they (3) send the segments to the requestors as a response of the pulling requests and when all the pulling requests are satisfied, the workers (4) aggregate the model segments and start next iteration ... The worker takes the aggregation result of the last iteration as the input model and updates it using stochastic gradient descent(SGD) with the local data.” Hu discloses the pulling of parameter segment from one worker to be received by another worker, and after receive the parameter segment, each worker combines all the aggregated segments, then finally update their model using the aggregated segments using technique such as SGD, thus obtaining updated parameter at each worker. This updated parameter is analogous to the claimed second fusion parameter. In view of the teaching by Zhu of the ring topology and parameter exchange between clients above with the motivation to combine the teachings by Zhu with Hu below, each worker by Hu may be configured within the ring topology by Zhu such that the iterative parameter segment transmit process allow one worker to receive parameter segment from the at least three other workers, thus providing the claimed second fusion segment comprising first segments determined by the at least three devices.)
Zhu does not teach the limitation “wherein the updating comprises” However, Hu teaches the limitation (page 3 section 4.1 “(1) Local Update. The learning process starts with the worker updating the model with the local dataset. The worker takes the aggregation result of the last iteration as the input model and updates it using stochastic gradient descent(SGD) with the local data” Hu discloses the updating the worker by taking the aggregation result and use the SGD technique to obtain updated parameter.)
Zhu does not teach the limitation “determining a target first segment in the target model parameter of the first service processing model based on the second fusion parameter” However, Hu teaches the limitation (page 3 section 4.1 “At each iteration, the workers (1) update the model with local dataset and meanwhile, (2) send the segment pulling requests to other workers, once the update is finished, they (3) send the segments to the requestors as a response of the pulling requests and when all the pulling requests are satisfied, the workers (4) aggregate the model segments and start next iteration ... The worker takes the aggregation result of the last iteration as the input model and updates it using stochastic gradient descent(SGD) with the local data.”, and page 4 section 4.3 “Generally, the deep learning uses the gradient descent algorithms to find the model parameters that minimize a user-defined loss function” Hu discloses an iterative process in which each worker receives parameter segments from other workers, aggregates them, updates its own model parameters using SGD, and starts the next iteration by sending segments to other workers. In the next iteration, therefore, the parameter segment transmitted between workers represents the worker’s updated model parameter after aggregating the received segments. This further transmitted parameter segment functions as the model portion used by subsequent workers to update their own models. Accordingly, this transmitted updated segment corresponds to the claimed target first segment, as it is the specific model portion determined based on the updated parameter (second fusion parameter).)
Zhu does not teach the limitation “obtaining the target model parameter of the first service processing model based on the target first segment” However, Hu teaches the limitation (page 3 section 4.1 “(1) Local Update. The learning process starts with the worker updating the model with the local dataset. The worker takes the aggregation result of the last iteration as the input model and updates it using stochastic gradient descent(SGD) with the local data” Hu discloses another worker may pull the parameter segment from another worker, which may be the further transmitted parameter segment as explained above, thus the received worker may aggregate such further transmitted parameter segment and update their own model to obtain further updated parameter, which is analogous to the claimed “target model parameter”)
Zhu does not teach the limitation “wherein while the first device determines the target first segment in the target model parameter, the second device determines a target second segment, which is a segment different from the target first segment in the target model parameter in parallel, and” However, Hu teaches the limitation (page 2 section 3 “Let W denote the model parameters. The worker firstly breaks the structure of W into S segments without overlapping such that W = (W[1], W[2], . . . , W[S]) (1) For each segment l, the worker chooses a peer worker ... and then actively pulls the corresponding segment Wjl[l] from it. Note that this step is parallelized to make full use of the bandwidth” Hu discloses the parallel process of each worker may break the structure of their parameter into segments to be transmitted to each other.)
Zhu does not teach a part of the limitation “wherein when N is equal to or less than a quantity of ... devices, N number of devices are elected ... to respectively determine the N segments of the target model parameter by performing weighted averaging on the N segments, and when N is greater than the quantity of ... devices, ... determine corresponding segments of the target model parameter by performing weighted averaging on segments corresponding to a number of the ... devices in a plurality of times” However, Hu teaches the limitation (page 2 section 3 “Let W denote the model parameters. The worker firstly breaks the structure of W into S segments without overlapping such that W = (W[1], W[2], . . . , W[S]) (1) For each segment l, the worker chooses a peer worker ... and then actively pulls the corresponding segment Wjl[l] from it”, page 3 section 3 equation 3 “Typically the model aggregation uses weighted averaging of the received model parameters with the worker’s dataset size as weight. But in segmented gossip aggregation, the mixed models are patched together from different workers, so it is hard to set a reasonable weight for the mixed model as a whole. For such case, we use a segment-wise model aggregation.” and page 5 section 5.1 “The communication behavior of Combo is controlled by two parameters: model segments as S and model replica as R. In our following experiments, we set S = 10 ..., that is in the synchronization phase, the model parameters are flattened and then divided into ten segments equally... 
The gossip approach is the special case of Combo when S = 1,” Hu discloses the segment of parameter at each worker can be divided into 10 segments or 1 segments, and given the number of devices can be configured as three or more as disclosed by Zhu above, a person ordinary skilled in the art would have understood that the number of segments S may be selected to be either less than the number of device as defined by Zhu above (e.g., S = 1 when clients number is 4), and the number of segments S may be greater than the number of devices (e.g., S = 10 when clients number is 4). When the number of segments is less than the device count, each worker (client device) is capable of determining the single segment of the model parameter and when the number of segments is greater than the device count, each worker (client device) is capable of determining corresponding segments in multiple rounds, which is analogous to the claimed requirement that when the number of segments exceed the number of devices, the segments are determined in a plurality of times and selecting the number of devices to determine the number of segments if the number of segments is less than the number of device. Furthermore, Hu suggests the claimed performing weighted averaging on segments corresponding to participating devices, wherein Hu discloses a segment-wise model aggregation in which each segment of the model parameter is aggregated based on corresponding segments provided by participating workers/devices. As shown in equation 3, the weight assigned to each worker/device represents that worker’s contribution to the segment aggregation based on the size of the worker’s dataset as understood by one of ordinary skill in the art.)
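The two cases discussed above (segment count not exceeding the device count, and segment count exceeding it) can be illustrated with a small scheduling sketch. The policy of electing the lowest-indexed devices and processing segments in index order is an assumption; neither the claim language as quoted nor Hu specifies it, and `schedule_segments` is a hypothetical helper.

```python
from typing import List, Tuple


def schedule_segments(n_segments: int, n_devices: int) -> List[List[Tuple[int, int]]]:
    """Return a list of rounds; each round is a list of (device, segment) pairs."""
    rounds = []
    for start in range(0, n_segments, n_devices):
        batch = range(start, min(start + n_devices, n_segments))
        # Assumed policy: device k handles the k-th segment of each batch.
        rounds.append([(seg - start, seg) for seg in batch])
    return rounds


# Fewer segments than devices: one round, three of four devices elected.
print(schedule_segments(3, 4))
# More segments than devices: segments are handled over several rounds.
print(schedule_segments(10, 4))
```

With S = 10 and four clients, the example values used in the rejection, this policy yields three rounds of four, four, and two segments.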
Before the effective filing date, it would have been obvious to a person ordinary skilled in the art to combine the teaching of methods and systems for decentralized federated learning with one or more client devices transmits and aggregate parameter at each client devices by Zhu, with the teaching of transmitting model parameter segments in a decentralized federate learning by Hu. The motivation to do so is referred to in Hu’s disclosure (page 1 column 1 “It is of great challenges for conventional federated learning approaches to efficiently utilize network capacities between nodes. In this paper, we propose a model segment level decentralized federated learning to tackle this problem. In particular, we propose a segmented gossip approach, which not only makes full utilization of node-to-node bandwidth, but also has good training convergence.”, and page 1 column 2 “An intuitive question is then, is it possible for workers to synchronize the model partially, from/to only a part of the workers, and still achieve good training results? Our answer to this question is a novel decentralized federated learning design, introducing a segmented gossip approach, which not only makes full utilization of node-to-node bandwidth by transmitting model segmentations in a peer-to-peer manner, but also has good training convergence by carefully forming dynamical synchronization gossiping groups.”, and page 2 section 3 “in the federated learning context where the workers are geo-distributed, the real bandwidth between the workers is typically small due to the potential bottleneck of WAN. Thus the traditional gossip-based schemes can not make full use of the worker’s bandwidth because the transmissions are limited in one or few links. 
We propose the Segmented Gossip Aggregation to solve this problem by “splitting” the transmission task and feeding them into more links” Hu discloses a decentralized federated learning system design similar to the teaching by Zhu that all workers/clients communicate with each other instead of a central server. Hu further discloses the benefit of the segmented gossip approach, which make full utilization of node-to-node bandwidth by transmitting model segmentations in a peer-to-peer manner since the real bandwidth between the workers/nodes is typically small due to the potential bottleneck of WAN, thus segmentation of model’s parameters can help transmit the information more efficiently. Therefore, given that Zhu also disclose a similar decentralized federated learning system and transmission of parameters among clients, the teaching by Zhu may be further improved upon adopting the teaching by Hu by segment the parameter to be sent to other clients to account for the limited bandwidth in connection between client devices.)

Hu/Zhu does not teach part of the limitation “... , the model parameters being rounded using a multiplier P, a value of the multiplier P indicating a degree of preservation of floating-point number precision” However, Gama teaches this part of the limitation (paragraph 72 “In particular, we show that to achieve p bits of numerical precision in MPC, it suffices to have p+2τ-bit floating points where τ is a fixed security parameter”, paragraph 73 “The secret shares we consider are real numbers. We would like to mask these shares using floating point numbers”, paragraph 80 “Rounding: ... that maps x to its nearest element”, paragraph 81 “fixed precision operations maintain p constant by applying a final rounding.” Gama discloses a method for enabling multiple parties to collaborate to produce a shared result while preserving the privacy of input data contributed by individual parties with a specified high degree of precision using masking secret sharing. Within the disclosure, Gama discloses p bits of numerical precision, masking secret shares using floating point numbers, and finally a rounding operation that maps a value to its nearest representable element while maintaining a fixed precision based on the p bits of numerical precision. Thus, in Gama, the rounding is performed using p as the precision setting that controls the target rounded representation, corresponding to the claimed rounding of model parameters using a multiplier P. Furthermore, the secret-shared values correspond to the claimed model parameters, the disclosed precision value p corresponds to the claimed multiplier P, and Gama’s disclosure that p sets the numerical precision retained in the floating-point representation corresponds to the claimed value of the multiplier P indicating a degree of preservation of floating-point number precision.)
Hu/Zhu does not teach limitations “wherein the transmitting the first model parameter comprises:”, “obtaining a candidate random mask”, “rounding the candidate random mask using the multiplier P, to obtain a random mask”, and “transmitting an encrypted model parameter obtained using the random mask” However, Gama teaches or at least suggests these limitations under the broadest reasonable interpretation (paragraph 59 “we would like to compute secret shares for the element ... the players must employ precomputed single-use random numerical masking data”, paragraph 72 “In particular, we show that to achieve p bits of numerical precision in MPC, it suffices to have p+2τ-bit floating points where τ is a fixed security parameter”, paragraph 73 “The secret shares we consider are real numbers. We would like to mask these shares using floating point numbers”, paragraph 80 “Rounding: ... that maps x to its nearest element”, paragraph 81 “fixed precision operations maintain p constant by applying a final rounding”, and figure 16 “receive a respective secret share of each component of each set of the plurality of sets of numerical masking data components from the trusted dealer ... received secret shares of numerical masking data components are used to mask data communicated during the computations, ... transmits a secret share of an instance of computed secret shared data to one or more others of the plurality of party computing systems” Gama discloses a secure computation in which secret-shared values, corresponding to the claimed model parameters under the broadest reasonable interpretation, are handled as floating-points values with p bits of numerical precision, are masked with random numerical masking data within that floating-point framework, and masked secret shares are communicated among multiple parties. Gama further discloses that a value in that framework is rounded to its nearest representable element while fixed precision is maintained based on p by rounding. 
Accordingly, the disclosed secret-shared values correspond to the claimed model parameters, the disclosed rounding corresponds to the claimed model parameter being rounded, the disclosed precision value p corresponds to the claimed multiplier P, and Gama’s disclosure that p sets the numerical precision retained in the floating-point representation corresponds to the claimed use of the multiplier P indicating a degree of preservation of floating-point number precision.)
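To make the quoted claim language concrete, the following sketch shows one possible reading of "rounding using a multiplier P" and of applying a rounded random mask. Everything specific here is an assumption: the value of `P`, the additive form of the masking, the range of the candidate mask, and the helper name `quantize`. The sketch is not drawn from the application's specification and does not represent Gama's secret-sharing scheme.

```python
import random

P = 10 ** 6  # hypothetical multiplier: a larger P preserves more decimal digits


def quantize(value: float, p: int = P) -> int:
    """Scale by p and round to the nearest integer
    (Python's round sends exact ties to the even integer)."""
    return round(value * p)


rng = random.Random(42)
candidate_mask = rng.uniform(-1.0, 1.0)  # candidate random mask (a float)
mask = quantize(candidate_mask)          # rounded with the same multiplier

param = 0.123456789                      # hypothetical model parameter
masked = quantize(param) + mask          # value that would be transmitted (assumed additive)

# A party that knows the mask can strip it and divide P back out.
recovered = (masked - mask) / P
print(recovered)
```

Under this reading, the recovered value differs from the original parameter by less than 1/P, which is the sense in which P controls how much floating-point precision survives.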
Hu/Zhu does not teach the limitation “wherein, in performing the weighted averaging, the multiplier P multiplied with each model parameter for rounding is remove” However, Gama teaches or at least suggests this limitation under the broadest reasonable interpretation (figure 16 step b “a secret share reduction that transforms an instance of computed secret shared data stored in floating point representation into an equivalent, equivalently precise, and equivalently secure instanced of computed secret shared data having a reduced memory storage requirement” Gama discloses reducing computed secret-shared data stored in floating point representation into an equivalent secure precise reduced form. Under the broadest reasonable interpretation, the computed secret-shared data correspond to the weighted-average parameter segment values of Zhu/Hu, and Gama’s reduction corresponds to the claimed removal of the multiplier/rounding burden from those parameter values. A person ordinary skill in the art would have recognized that applying Gama’s reduction to Zhu/Hu weighted averaged parameter segments is a predictable use of known precision-control techniques so that the transmitted segment parameter values remain in the intended precision format while reducing the added floating-point rounding burden during the aggregation of Zhu/Hu.)
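The claimed removal of the multiplier during weighted averaging can be read, under one simple interpretation, as dividing P back out of the averaged scaled values. The sketch below assumes integer-scaled parameters and integer weights; the function name and the example numbers are hypothetical and are not taken from the application or from the cited references.

```python
from typing import Sequence


def weighted_average_removing_p(scaled: Sequence[int],
                                weights: Sequence[int],
                                p: int) -> float:
    """Average values that were previously scaled by p, dividing p back out."""
    total_weight = sum(weights)
    return sum(w * v for w, v in zip(weights, scaled)) / (p * total_weight)


P = 1000  # hypothetical multiplier
scaled_params = [round(0.5 * P), round(0.25 * P)]  # [500, 250]
avg = weighted_average_removing_p(scaled_params, weights=[1, 3], p=P)
print(avg)
```

The result equals the weighted average of the unscaled values 0.5 and 0.25 with weights 1 and 3.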
Before the effective filing date, it would have been obvious to a person of ordinary skill in the art to combine the teaching of methods and systems for decentralized federated learning in which client devices transmit parameters and aggregate them at each client device by Zhu, and the teaching of transmitting model parameter segments in decentralized federated learning by Hu, with the teaching of a method for enabling multiple parties to collaborate to produce a shared result while preserving the privacy of input data contributed by individual parties with a specified high degree of precision using masking and secret sharing by Gama. The motivation to do so is referred to in Gama’s disclosure (paragraph 3 “A method for performing privacy-preserving or secure multi-party computations enables multiple parties to collaborate to produce a shared result while preserving the privacy of input data contributed by individual parties. The method can produce a result with a specified high degree of precision or accuracy in relation to an exactly accurate plaintext (non-privacy-preserving) computation of the result, without unduly burdensome amounts of inter-party communication”, and paragraph 50 “In this work, we introduce an alternative sharing scheme, where fixed-point values are shared directly using (possibly multibit) floating points, and present a technique to reduce the share sizes after each multiplication. This technique easily extends to an arbitrary number of players.”, and paragraph 51 “numerical masking data can be used to achieve much more than just ring additions or multiplications. In a nutshell, the amount of communications is reduced as a consequence of reusing the same masks, and the number of communication rounds is reduced as a consequence of masking directly matrices and other large structures. Therefore, the total communication time becomes negligible compared to the computing cost”. Gama discloses several benefits and advantages relevant to the current application in sharing information between multiple parties while preserving the privacy of input data contributed by individual parties and maintaining the precision of the computed result. The technique by Gama further comprises sharing information in floating-point format while reducing the share size and maintaining precision through rounding according to a precision value, as well as representing the shared information as masked data to reduce the number of communication rounds and the computing cost. Given these advantages, one of ordinary skill in the art would have been able to incorporate the teaching of Gama into the parameter transmission and aggregation teaching by Zhu/Hu to further implement the privacy procedure during transmission of parameters among clients/workers as taught by Zhu/Hu, and thus obtain an overall improved data processing framework.)

Regarding claim 3, which depends on claim 1, the rejection of claim 1 is incorporated.
Zhu teaches a part of the limitation “(ii) transmitting the target first segment to one of the at least three devices connected to the first device; ...” (figure 2, paragraph 52 “It should be understood that there may be greater or fewer numbers of neighbor clients directly connected to a given client 102, and each neighbor client may also have its own respective one or more other neighbor clients”, and paragraph 80 “In a simple implementation, all clients 102 may transmit information (i.e., respective set of local model parameters and weighting coefficient) to all other clients 102”. Zhu discloses transmitting parameters to one or more neighbor clients, wherein more than three client devices may be connected together in a ring topology as illustrated in figure 2, and wherein the transmitted parameter may be a segment of the local parameters of each client device in view of the teaching by Hu above.)
Hu teaches a part of the limitation “... and obtaining the target model parameter of the first service processing model, that is determined by a device other than the first device, among the at least three devices, by splicing the target first segment and the target second segment” (figure 2, page 2 section 3 “segmented pulling allows the worker to pull different parts of the model parameters from different workers and rebuild a mixed model for aggregation. Let W denote the model parameters. The worker firstly breaks the structure of W into S segments without overlapping such that W = (W[1], W[2], . . . , W[S])”. Hu discloses an iterative process in which each worker segments its parameters for transmission to another worker for aggregation and local update at each worker, such that each worker may then continuously segment its updated local parameters as illustrated in figure 2 and continuously transmit the segmented updated local parameters; thus the parameter segments transmitted at a further iteration correspond to the claimed “target first segment and the target second segment”.)
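For illustration only, the segmentation and splicing discussed for claim 3 can be sketched as follows. This is a minimal sketch and is not taken from Hu, Zhu, or the application: the function names split_segments and splice_segments are hypothetical, and the choice to make the segments contiguous and nearly equal in length is an assumption, since the quoted passage of Hu only states that W is broken into S non-overlapping segments.

```python
from typing import List


def split_segments(params: List[float], s: int) -> List[List[float]]:
    """Break a flat parameter list W into s contiguous, non-overlapping segments.

    Assumption: segments are as equal in length as possible; the quoted text
    only requires that they do not overlap and together cover W.
    """
    if s <= 0 or s > len(params):
        raise ValueError("s must be between 1 and len(params)")
    base, extra = divmod(len(params), s)
    segments: List[List[float]] = []
    start = 0
    for i in range(s):
        # The first `extra` segments take one additional element.
        end = start + base + (1 if i < extra else 0)
        segments.append(params[start:end])
        start = end
    return segments


def splice_segments(segments: List[List[float]]) -> List[float]:
    """Rebuild the full parameter list by concatenating segments in order."""
    return [value for segment in segments for value in segment]
```

Splitting a seven-element list into three segments and splicing the result returns the original list, which is the round trip that the splicing step relies on.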

Regarding claim 4, which depends on claim 1, the rejection of claim 1 is incorporated.
Zhu teaches the limitation “receiving, from the second device, a second segment or a first fusion second segment in a model parameter” (paragraph 086 “At 512, each given client 102 receives a set of local model parameters and a weighting coefficient from at least one neighbor (e.g., via the respective agents 108).” Zhu discloses each client receives a set of local parameters from at least one neighbor, wherein the parameters may be a segment of parameters in view of the teaching by Hu above.)
Zhu teaches the limitation “performing fusion on the second segment or the first fusion second segment and a second segment in the model parameter determined by the first device, to obtain a second fusion second segment” (paragraph 88 “At 514, each given client 102 aggregates the set(s) of local model parameters received at 512. For example, the set(s) of local model parameters may be aggregated by computing a weighted sum or weighted average” and paragraph 91 “The given client 102 updates its respective set of local model parameters and its respective weighting coefficient using the computed aggregations ... an updating procedure, such as adding the aggregation to the existing set of local model parameters”. Zhu discloses the process to aggregate the parameters obtained from other client devices, wherein the parameters may be a segment of parameters that is transmitted and received in view of the teaching by Hu above, thus the aggregated parameters from transmitted parameter segments may correspond to the second fusion second segment within the claim.)
Zhu teaches the limitation “transmitting the second fusion second segment to another device in the at least three devices” (paragraph 85 “At 510, each given client 102 transmits its respective set of local model parameters and its respective weighting coefficient to each neighbor client 102 identified at step 508 (e.g., via the respective agents 108)”. Zhu discloses that each client transmits its set of local model parameters, wherein a person of ordinary skill in the art can configure each client to transmit its updated parameter segment as described above to other devices in a further training round.)

Regarding claim 7, which depends on claim 1, the rejection of claim 1 is incorporated.
Gama teaches the limitation “obtaining a random mask” (paragraph 59 “we would like to compute secret shares for the element ... the players must employ precomputed single-use random numerical masking data”. Gama discloses computing the secret shares for the element to be shared, and further employing random numerical masking data to mask these shares to ensure privacy during information sharing.)
Gama teaches the limitation “performing fusion on the random mask and the first model parameter to obtain an encrypted model parameter” (paragraph 7 “For each instance, the party computing system can compute a secret share of the instance based on at least one secret share of a set of input data or at least one secret share of another instance of computed secret shared data. Received secret shares of numerical masking data components can be used to mask data communicated during the computations.” Gama discloses that the computing system computes a secret share of the instance, wherein the instance corresponds to the model parameter, and further applies masking to these secret share instances using a random mask for further communication, which corresponds to using the random mask with the model parameter to obtain an encrypted model parameter under the broadest reasonable interpretation.)
Zhu teaches the limitation “transmitting, to the second device, the encrypted model parameter, based on which and based on the second model parameter of the second device, the first fusion parameter is determined at the second device” (paragraph 85 “At 510, each given client 102 transmits its respective set of local model parameters and its respective weighting coefficient to each neighbor client 102 identified at step 508 (e.g., via the respective agents 108)”, paragraph 86 “At 512, each given client 102 receives a set of local model parameters and a weighting coefficient from at least one neighbor”, paragraph 88 “At 514, each given client 102 aggregates the set(s) of local model parameters received at 512. For example, the set(s) of local model parameters may be aggregated by computing a weighted sum or weighted average”, and paragraph 91 “The given client 102 updates its respective set of local model parameters ... using the computed aggregations ... an updating procedure, such as adding the aggregation to the existing set of local model parameters”. Zhu discloses that each client transmits its respective set of local model parameters to a neighbor client, wherein the neighbor client obtains the transmitted parameters and performs a weighted sum or weighted average computation to obtain an aggregated parameter, wherein this aggregated parameter is used to update the local model parameters of each client. The updating process based on the aggregated parameter to obtain an updated local parameter suggests the determination process to obtain a fusion parameter within the claim. The transmitted parameter may include encrypted data associated with the parameter based on the teaching combination below.)
Zhu teaches the limitation “receiving a candidate fusion parameter, the candidate fusion parameter comprising the random mask and the model parameters respectively determined by the at least three devices in the data processing system” (paragraph 85 “At 510, each given client 102 transmits its respective set of local model parameters and its respective weighting coefficient to each neighbor client 102 identified at step 508 (e.g., via the respective agents 108)”, paragraph 86 “At 512, each given client 102 receives a set of local model parameters and a weighting coefficient from at least one neighbor”. Zhu discloses that each client transmits and receives parameters from one or more other client devices, wherein the parameters may include masking data associated with the parameter based on the teaching combination below.)
Gama teaches the limitation “removing the random mask from the candidate fusion parameter to obtain the second fusion parameter” (paragraph 135 “Finally, the players, who know the common mask, can independently unmask their secret shares, and obtain their final share of the numerical masking data, which is therefore unknown to the dealer.” Gama discloses that each player, which corresponds to each client/worker as disclosed by Zhu/Hu, can independently unmask their secret shares to obtain their final share of the numerical masking data, which corresponds to the claimed process of removing the random mask from the parameter to obtain the parameter.)
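For illustration only, the mask-then-unmask sequence recited in claim 7 can be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce Gama's secret-sharing protocol or the applicant's implementation: the function name ring_masked_sum is hypothetical, integer values stand in for model parameters, and the devices are assumed to pass a running total around a ring that starts and ends at the first device.

```python
import random
from typing import List


def ring_masked_sum(local_values: List[int], mask: int) -> int:
    """Sum per-device values around a ring while hiding the first device's value.

    local_values[0] belongs to the first device, which is the only device that
    knows `mask`. Every other device sees only a masked running total.
    """
    # First device: fuse the random mask with its own value ("encrypted" value).
    running = local_values[0] + mask
    # Each later device adds its own value to the masked running total.
    for value in local_values[1:]:
        running += value
    # The total returns to the first device (the "candidate" total), which
    # removes the mask to recover the plain sum.
    return running - mask


if __name__ == "__main__":
    mask = random.randrange(10**6)  # hypothetical mask range
    print(ring_masked_sum([3, 5, 7], mask))
```

Because the mask is added once and subtracted once, the result does not depend on which mask value is drawn.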

Regarding claim 12, the applicant is directed to the rejection of claim 1 above, because the claim recites similar limitations and processing method, thus the claim is rejected under similar rationale.

Regarding claim 14, 
Zhu teaches the following limitations: “at least one memory configured to store program code”, “at least one processor configured to read the program code and operate as instructed by the program code” (paragraph 62 “The client 102 (e.g., embodied as a single physical or virtual machine) may include one or more processing devices 114, such as a processor”, paragraph 66 “The client 102 may include one or more memories 128, which may include a volatile or non-volatile memory (e.g., a flash memory, a random access memory (RAM), and/or a read-only memory (ROM)). The non-transitory memory(ies) 128 may store instructions for execution by the processing device”, paragraph 135 “Accordingly, the technical solution of the present disclosure may be embodied in the form of a software product. A suitable software product may be stored in a pre-recorded storage device or other similar non-volatile or non-transitory computer readable medium ... The software product includes instructions tangibly stored thereon that enable a processing device (e.g., a personal computer, a server, or a network device) to execute example embodiments of the methods disclosed herein. The machine-executable instructions may be in the form of code sequences, configuration information, or other data, which, when executed, cause a machine (e.g., a processor or other processing device) to perform steps in a method”. Zhu discloses that the method may be implemented as a software product on each client device, wherein the product comprises one or more non-transitory computer readable media such as a memory, and a processing device with one or more processors to read and execute program instructions in the form of code sequences.)
The applicant is further directed to the rejection of claim 1 above, because the claim recites similar limitations and processing method, thus the claim is rejected under similar rationale.

Regarding claim 16, which depends on claim 1, the rejection of claim 1 is incorporated.
Zhu teaches the limitation “An electronic device, comprising one or more processors and one or more memories storing at least one piece of computer program, the at least one piece of computer program being loaded and executed by the one or more processors to implement the data processing method according to claim 1.” (paragraph 62 “The client 102 (e.g., embodied as a single physical or virtual machine) may include one or more processing devices 114, such as a processor”, paragraph 66 “The client 102 may include one or more memories 128, which may include a volatile or non-volatile memory (e.g., a flash memory, a random access memory (RAM), and/or a read-only memory (ROM)). The non-transitory memory(ies) 128 may store instructions for execution by the processing device”, paragraph 135 “Accordingly, the technical solution of the present disclosure may be embodied in the form of a software product. A suitable software product may be stored in a pre-recorded storage device or other similar non-volatile or non-transitory computer readable medium ... The software product includes instructions tangibly stored thereon that enable a processing device (e.g., a personal computer, a server, or a network device) to execute example embodiments of the methods disclosed herein. The machine-executable instructions may be in the form of code sequences, configuration information, or other data, which, when executed, cause a machine (e.g., a processor or other processing device) to perform steps in a method”. Zhu discloses that the method may be implemented as a software product on each client device, wherein the product comprises one or more non-transitory computer readable media such as a memory, and a processing device with one or more processors to read and execute program instructions in the form of code sequences.)

Regarding claim 17, which depends on claim 12, the rejection of claim 12 is incorporated. The applicant is further directed to the rejection of claim 16 above, because the claim recites similar limitations and a similar processing device, and the claim is therefore rejected under a similar rationale.

Regarding claim 18, which depends on claim 1, the rejection of claim 1 is incorporated.
Zhu teaches the limitation “A non-transitory computer-readable storage medium, storing at least one computer program, the at least one computer program being loaded and executed by a processor to implement the data processing method according to claim 1.” (paragraph 62 “The client 102 (e.g., embodied as a single physical or virtual machine) may include one or more processing devices 114, such as a processor”, paragraph 66 “The client 102 may include one or more memories 128, which may include a volatile or non-volatile memory (e.g., a flash memory, a random access memory (RAM), and/or a read-only memory (ROM)). The non-transitory memory(ies) 128 may store instructions for execution by the processing device”, paragraph 135 “Accordingly, the technical solution of the present disclosure may be embodied in the form of a software product. A suitable software product may be stored in a pre-recorded storage device or other similar non-volatile or non-transitory computer readable medium ... The software product includes instructions tangibly stored thereon that enable a processing device (e.g., a personal computer, a server, or a network device) to execute example embodiments of the methods disclosed herein. The machine-executable instructions may be in the form of code sequences, configuration information, or other data, which, when executed, cause a machine (e.g., a processor or other processing device) to perform steps in a method”. Zhu discloses that the method may be implemented as a software product on each client device, wherein the product comprises one or more non-transitory computer readable media such as a memory, and a processing device with one or more processors to read and execute program instructions in the form of code sequences.)

Regarding claim 19, which depends on claim 12, the rejection of claim 12 is incorporated. The applicant is further directed to the rejection of claim 18 above, because the claim recites similar limitations and a similar processing device, and the claim is therefore rejected under a similar rationale.


Claims 5-6 are rejected under 35 U.S.C. 103 as being unpatentable over Zhu et al. (US 20220114475 A1) in view of Hu et al. (NPL: Decentralized Federated Learning: A Segmented Gossip Approach), further in view of Gama et al. (US 20200304293 A1), further in view of Moradi et al. (US 20230107301 A1).

Regarding claim 5, which depends on claim 1, the rejection of claim 1 is incorporated.
Zhu/Hu does not teach the limitation “The method according to claim 1, wherein the data processing system comprises at least two device groups, each device group comprises a leader and a follower, and devices in each device group are connected in a ring architecture; leaders in the at least two device groups are connected in a ring architecture; the first device is a leader in a target device group and the second device is a follower in the target device group, and the target device group is any one of the at least two device groups”. However, Moradi teaches this limitation (Figure 7, paragraph 33 “The master node 1000 may include network interface circuitry 1007 ... configured to provide communications with other nodes (e.g., with other master nodes ...).”, paragraph 58 “Another embodiment of leader election is in a network with a logical Ring topology”, and paragraph 109 “network nodes may represent any suitable device (or group of devices) capable, configured, arranged, and/or operable to enable and/or provide a wireless device with access to the wireless network”. Moradi discloses a method for dynamic leader selection for distributed machine learning. Within the disclosure, Moradi discloses that a node may represent any suitable device (or group of devices), and Figure 7 illustrates the architecture of master nodes and worker nodes, wherein each master node is associated with one or more worker nodes and each master node represents a device. This suggests one or more device groups, each group comprising a master node and one or more worker nodes, which in turn suggests the claimed two device groups in which each device group comprises a leader and a follower. The one or more master nodes may be configured to be connected via a network environment in a logical ring topology.)
Before the effective filing date, it would have been obvious to a person of ordinary skill in the art to combine the teaching of methods and systems for decentralized federated learning in which client devices transmit parameters and aggregate them at each client device by Zhu, the teaching of transmitting model parameter segments in decentralized federated learning by Hu, and the teaching of a method for enabling multiple parties to collaborate to produce a shared result while preserving the privacy of input data contributed by individual parties with a specified high degree of precision using masking and secret sharing by Gama, with the teaching of a method for dynamic leader selection for distributed machine learning by Moradi. The motivation to do so is referred to in Moradi’s disclosure (paragraph 37 “Described below are embodiments that may dynamically select/change a master node among different devices ... based on local resource status and using a distributed leader election during run time in case of any failure or high load situations, etc. In the description that follows, a master node may also be referred to as a leader computing device. Additionally, a worker node may also be referred to as a non-leader computing device”, and paragraph 60 “The embodiments described above can be beneficial in different scenarios where the state of the system (e.g., system status) can dynamically change. An example of a change is system status is a power outage in a site where the eNodeB/gNB is forced to use battery. In this case, in order to reduce energy consumption, the node should not remain the master node or even participate as a worker node until the power issue is resolved. A master node can also become unavailable due to power outage at a site without battery backup, which should re-enforce a new round of leader election as described above.” Moradi discloses the benefit of a system that utilizes master nodes, worker nodes, and the selection of a master node to provide the best-performing solution in case of any failure or high load situation. The application of the method by Moradi can be beneficial in different scenarios where the state of the system (e.g., system status) can dynamically change. Therefore, a person of ordinary skill in the art can configure the one or more client devices of Zhu in an architecture of master nodes and worker nodes within a ring topology as disclosed by Moradi for further improvement.)

Regarding claim 6, which depends on claim 5, the rejection of claim 5 is incorporated.
Moradi teaches the limitation “The method according to claim 5, wherein the transmitting the first model parameter to the second device comprises transmitting the first model parameter to the follower in the target device group” (paragraph 43 “Turning to FIG. 2, the node selected as the master node may initiate the machine learning model by communicating with all participating worker nodes and exchanging model weights, aggregating them”. Moradi discloses the master node may communicate with worker nodes and exchange model weights, wherein the model weights suggest the model parameter.)
Zhu teaches the limitation “obtaining a third fusion parameter, the third fusion parameter comprising a model parameter determined by the target device group” (paragraph 86 “At 512, each given client 102 receives a set of local model parameters and a weighting coefficient from at least one neighbor”, and paragraph 91 “The given client 102 updates its respective set of local model parameters and its respective weighting coefficient using the computed aggregations ... an updating procedure, such as adding the aggregation to the existing set of local model parameters”. Zhu discloses that each client device receives local model parameters from at least one neighbor client device, wherein the given client updates its local model parameters, such as by adding the aggregation to the existing local model parameters, and wherein this updated parameter suggests the third fusion parameter within the claim. The process may occur within a configuration of a group of master and worker nodes connected with each other, as configured by a person of ordinary skill in the art based on the teaching combination above, such that updated local model parameters may be obtained at the master node device.)
Zhu teaches the limitation “transmitting, to a leader in another device group, the third fusion parameter, based on which and based on a fusion parameter determined by a device in the another device group to which the leader belongs to, a fourth fusion parameter is determined at the leader in the another device group” (paragraph 85 “At 510, each given client 102 transmits its respective set of local model parameters and its respective weighting coefficient to each neighbor client 102 identified at step 508 (e.g., via the respective agents 108)”, paragraph 86 “At 512, each given client 102 receives a set of local model parameters and a weighting coefficient from at least one neighbor”, paragraph 88 “At 514, each given client 102 aggregates the set(s) of local model parameters received at 512. For example, the set(s) of local model parameters may be aggregated by computing a weighted sum or weighted average”, and paragraph 91 “The given client 102 updates its respective set of local model parameters ... using the computed aggregations ... an updating procedure, such as adding the aggregation to the existing set of local model parameters”. Zhu discloses that each client transmits its respective set of local model parameters to a neighbor client, wherein the neighbor client obtains the transmitted parameters and performs a weighted sum or weighted average computation to obtain an aggregated parameter, wherein this aggregated parameter is used to update the local model parameters of each client. A person of ordinary skill in the art may configure the neighbor client to be another master node device in another master-worker node group, wherein the updated parameter from one master node device is transmitted to another master node device for aggregation and update. The updating process based on the aggregated parameter to obtain an updated local parameter suggests the fourth fusion parameter within the claim.)
Zhu teaches the limitation “obtaining the second fusion parameter, the second fusion parameter comprising fusion parameters respectively determined by the at least two device groups in the data processing system” (paragraph 88 “At 514, each given client 102 aggregates the set(s) of local model parameters received at 512. For example, the set(s) of local model parameters may be aggregated by computing a weighted sum or weighted average”. Zhu discloses an aggregating process that aggregates parameters obtained from other neighbor client devices and further updates the local model parameters using the aggregated parameter, wherein a person of ordinary skill in the art may configure another master-worker device group such that this group receives the updated parameters from two other groups and continuously performs parameter aggregation and updates its local model parameters. Transmitting data from one master-worker node device group to another group is possible by implementing the ring topology as disclosed by Zhu, Hu and Moradi above.)
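For illustration only, the two-level aggregation discussed for claims 5 and 6 can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Zhu, Hu, Moradi, or the application: the function names are hypothetical, plain integer sums stand in for the fusion operation, and each group's leader is assumed to hold that group's total before the leaders pass a running total around their own ring.

```python
from typing import List


def group_fusion(group_values: List[int]) -> int:
    """Fuse the values of one device group (leader plus followers).

    Assumption: fusion is a plain sum accumulated around the group's ring.
    """
    running = 0
    for value in group_values:
        running += value
    return running


def hierarchical_fusion(groups: List[List[int]]) -> int:
    """Fuse per-group results around the ring formed by the group leaders."""
    # Step 1: each group produces its own fused value at its leader.
    group_totals = [group_fusion(group) for group in groups]
    # Step 2: leaders pass a running total from one leader to the next.
    running = 0
    for total in group_totals:
        running += total
    return running
```

With three groups holding [1, 2], [3, 4], and [5], the per-group totals are 3, 7, and 5, and the value obtained after the leader ring is 15, the same as a flat sum over all devices.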


Claims 8-9 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Zhu et al. (US 20220114475 A1) in view of Hu et al. (NPL: Decentralized Federated Learning: A Segmented Gossip Approach), further in view of Gama et al. (US 20200304293 A1), further in view of Wiedemann et al. (US 20230075514 A1).

Regarding claim 8, which depends on claim 7, the rejection of claim 7 is incorporated.
Gama teaches the limitation “rounding the first model parameter; ...” (paragraph 73 “The secret shares we consider are real numbers. We would like to mask these shares using floating point numbers”, paragraph 80 “Rounding: ... that maps x to its nearest element”. Gama discloses a rounding operation that maps a value to its nearest representable element while maintaining a fixed precision based on the p bits of numerical precision, wherein the value corresponds to the masked secret share, which corresponds to the parameter in view of Zhu/Hu as disclosed above.)
Gama teaches a part of the limitation “... performing summation on the random mask and a rounded first model parameter” (paragraph 58 “Computing secret shares for a sum ...  can be done non-interactively by each player by adding the corresponding shares of x and y”, and paragraph 59 “Given additive secret shares ... the players must employ precomputed single-use random numerical masking data ... ”. Gama discloses that a sum can be computed by adding corresponding shares of x and y, and further discloses precomputed single-use random numerical masking data. Under the broadest reasonable interpretation, the disclosed random numerical masking data corresponds to the claimed random mask, and the disclosed secret share value corresponds to the rounded first model parameter, such that the disclosed addition of secret share values corresponds to the claimed performing summation on the random mask and the rounded first model parameter.)
Zhu/Hu/Gama does not teach the limitation “performing a modulo operation on a summation result to obtain the encrypted model parameter”. However, Wiedemann teaches or at least suggests this limitation (paragraph 127 “modulo operator % are defined as follows: ... x % y is the modulo operator defined as x−y”. Wiedemann discloses a modulo operator, wherein a person of ordinary skill in the art can configure a programming-language system to apply the modulo operator to the encrypted data comprising the random mask value as disclosed by Zhu/Hu/Gama, based on the teaching combination below, to obtain part of the encrypted data, which may be transmitted to other client devices.)
Before the effective filing date, it would have been obvious to a person of ordinary skill in the art to combine the teaching of methods and systems for decentralized federated learning in which client devices transmit parameters and aggregate them at each client device by Zhu, the teaching of transmitting model parameter segments in decentralized federated learning by Hu, and the teaching of a method for enabling multiple parties to collaborate to produce a shared result while preserving the privacy of input data contributed by individual parties with a specified high degree of precision using masking and secret sharing by Gama, with the teaching of an apparatus for quantizing a NN parameter to support an efficient encoding and/or decoding of such parameters by Wiedemann. The motivation to do so is referred to in Wiedemann’s disclosure (paragraph 15 “it is desired to improve a concept for a representation of neural network parameters to support an efficient encoding and/or decoding of such parameters. It might be desired to reduce a bit stream into which the neural network parameters are encoded and thus reduce a signalization cost. Additionally, or alternatively, it might be desired to reduce a complexity of computational resources to improve a neural network inference, e.g. it might be desired to achieve an efficient implementation for neural network inference.” Wiedemann discloses that the invention provides an improved representation of neural network parameters to support an efficient encoding and/or decoding of such parameters, wherein a parameter may be quantized to reduce a signalization cost and to reduce the complexity of the computational resources required to handle the data. Therefore, the teaching by Zhu/Hu/Gama may further incorporate the teaching by Wiedemann for further improvement.)
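For illustration only, the round, sum, and modulo sequence recited in claim 8 can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Gama, Wiedemann, or the application: the multiplier P follows the claim language quoted in this action, while its value, the modulus N, the signed decoding convention, and the function names encrypt_parameter and decode_total are all hypothetical.

```python
P = 10**4      # hypothetical value of the multiplier P used for rounding
N = 2**61 - 1  # hypothetical modulus; must exceed any total that can occur


def encrypt_parameter(param: float, mask: int) -> int:
    """Round the parameter to an integer, add the random mask, reduce modulo N."""
    # Python's round() rounds exact halves to the nearest even integer.
    rounded = round(param * P)
    return (rounded + mask) % N


def decode_total(masked_total: int, mask: int) -> float:
    """Remove the mask from an accumulated total and undo the multiplier P."""
    total = (masked_total - mask) % N
    # Assumption: residues above N // 2 represent negative totals.
    if total > N // 2:
        total -= N
    return total / P
```

For example, if the first device encrypts 0.25 and two further devices add their rounded values for -0.75 and 1.5 modulo N, decoding with the same mask yields 1.0. Dividing by P at the end is one way to picture the removal of the multiplier during averaging.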

Regarding claim 9 depends on claim 7, thus the rejection of claim 7 is incorporated.
	Gama teaches the limitation “rounding the first model parameter;” (paragraph 73 “The secret shares we consider are real numbers. We would like to mask these shares using floating point numbers”, paragraph 80 “Rounding: ... that maps x to its nearest element” Gama discloses a rounding operation that map a value to its nearest representable element while maintaining a fix precision based on the p bits of numerical precision, wherein the value correspond to the masked secret share which corresponds to the parameter in view of Zhu/Hu as disclosed above.)
Wiedemann teaches the limitation “obtaining a first product of a rounded first model parameter and a weight of the first device” (paragraph 29 “the NN parameter can be derived based on the multiplier, the bit shift number and the quantization value, for which reason it is possible to carry out computations, e.g. a summation of NN parameters and/or a multiplication of a NN parameter with a vector, in integer domain instead of floating point domain. Therefore, an efficient computation of the inference can be achieved by the device”. Wiedemann discloses computations of parameter should be perform with a vector, in integer domain instead of floating point domain, suggesting a rounding of parameter from integer type to a floating type, which may be configured by a person ordinary skilled in the art using a system of programming language. Wiedemann also discloses a multiplication of parameters may be obtained, in which a person ordinary skilled in the art may compute the multiplication between a local parameter of a client device and a weight of the device to scale the importance of the local parameter of each client device.)
	The applicant is further directed to the rejection of claim 8 above; because claim 9 recites similar limitations, it is rejected under a similar rationale.
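
For illustration only, the two steps discussed for claim 9 (rounding a model parameter, then multiplying the rounded value by the device's weight) can be sketched in Python as follows. This is a minimal sketch, not the applicant's implementation and not code from Gama or Wiedemann; the function name, the fixed-point `scale` factor, and the sample values are assumptions introduced here.

```python
def weighted_rounded_parameter(param: float, weight: int, scale: int = 1000) -> int:
    """Round a floating-point parameter to an integer and multiply it by a weight.

    `scale` is a hypothetical fixed-point factor (an assumption of this sketch):
    the parameter is scaled first so that rounding keeps three decimal digits.
    """
    rounded = round(param * scale)   # move from the floating-point to the integer domain
    return rounded * weight          # integer product of rounded parameter and device weight


if __name__ == "__main__":
    # A parameter of 0.5 held by a device whose (hypothetical) weight is 3.
    print(weighted_rounded_parameter(0.5, 3))
```

Because both operands are integers after rounding, the product is exact, which is the property the integer-domain argument relies on.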

Regarding claim 11, which depends on claim 1, the rejection of claim 1 is incorporated.
Wiedemann teaches a part of the limitation “obtaining a quotient of the second fusion parameter and a total weight, and using the quotient as the target model parameter of the first service processing model, ...” (paragraph 20 “remainder of a division between a dividend derived by the quantization parameter and a divisor derived by an accuracy parameter and a bit shift number based on a rounding of the quotient of the division” Wiedemann discloses a division operation that obtains a quotient between parameters, which a person of ordinary skill in the art may configure to divide the updated local parameter of the device by a total weight parameter of the device, wherein the resulting quotient may be transmitted to other client devices to further update their local parameters and continue learning.)
Zhu teaches a part of the limitation “... the total weight being a sum of weights of the at least three devices in the data processing system” (paragraph 88 “At 514, each given client 102 aggregates the set(s) of local model parameters received at 512. For example, the set(s) of local model parameters may be aggregated by computing a weighted sum or weighted average.” Zhu discloses a weighted-sum process that aggregates the local parameters received from other client devices, wherein the parameter may be a weight value of a model employed by each device, as configured by a person of ordinary skill in the art.)
The motivation to combine the teaching of Zhu with the teaching of Wiedemann in claim 11 is similar to the motivation as explained in claim 8 above.
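
For illustration only, the arithmetic discussed for claim 11 (a fused weighted sum divided by the total weight of at least three devices) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation and not code from Zhu or Wiedemann; the function name, the fixed-point `scale` factor, and the sample parameters and weights are hypothetical.

```python
from typing import Sequence


def target_parameter(params: Sequence[float], weights: Sequence[int], scale: int = 1000) -> float:
    """Weighted average of per-device parameters, computed on rounded integers.

    `scale` is a hypothetical fixed-point factor (an assumption of this sketch).
    """
    if len(params) != len(weights) or len(params) < 3:
        raise ValueError("expected one weight per parameter and at least three devices")
    # Fusion parameter: sum of (rounded parameter * device weight) over all devices.
    fusion = sum(round(p * scale) * w for p, w in zip(params, weights))
    # Total weight: sum of the weights of all devices.
    total_weight = sum(weights)
    # Quotient of the fusion parameter and the total weight, scaled back to a float.
    return fusion / total_weight / scale


if __name__ == "__main__":
    # Three devices with hypothetical parameters and weights.
    print(target_parameter([0.5, 0.25, 1.0], [1, 2, 1]))
```

With these sample values the fused sum is 500 + 500 + 1000 = 2000 and the total weight is 4, so the quotient corresponds to 0.5.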


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DUY TU DIEP whose telephone number is (703)756-1738. The examiner can normally be reached M-F 8-4:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alexey Shmatov can be reached at (571) 270-3428. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/DUY T DIEP/Examiner, Art Unit 2123
/ALEXEY SHMATOV/Supervisory Patent Examiner, Art Unit 2123
Prosecution Timeline

Dec 18, 2025: Interview Requested
Jan 21, 2026: Examiner Interview Summary
Jan 21, 2026: Applicant Interview (Telephonic)
Feb 04, 2026: Request for Continued Examination
Feb 14, 2026: Response after Non-Final Action
Mar 31, 2026: Non-Final Rejection mailed, §103
May 21, 2026: Applicant Interview (Telephonic)
May 21, 2026: Examiner Interview Summary
Precedent Cases

Applications granted by this same examiner with similar technology

17/459,157
Patent 12608642
MODEL PARAMETER LEARNING METHOD AND MOVEMENT MODE DETERMINATION METHOD
4y 7m to grant Granted Apr 21, 2026
17/551,821
Patent 12579428
METHOD FOR INJECTING HUMAN KNOWLEDGE INTO AI MODELS
4y 3m to grant Granted Mar 17, 2026
17/557,096
Patent 12488223
FEDERATED LEARNING FOR TRAINING MACHINE LEARNING MODELS
3y 11m to grant Granted Dec 02, 2025
17/317,908
Patent 12412129
DISTRIBUTED SUPPORT VECTOR MACHINE PRIVACY-PRESERVING METHOD, SYSTEM, STORAGE MEDIUM AND APPLICATION
4y 4m to grant Granted Sep 09, 2025
Study what changed to get past this examiner. Based on 4 most recent grants.
Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 29%
With Interview (+6.7%): 36%
Median Time to Grant: 4y 3m (~7m remaining)
PTA Risk: High
Based on 24 resolved cases by this examiner. Grant probability derived from career allowance rate.