DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim(s) 43-45 and 51-53 is/are rejected under 35 U.S.C. 103 as being unpatentable over Schiatti et al., US 20210067339 A1 (hereafter referred to as Schiatti) in view of Arora et al., USPN 11489734 A1 (hereafter referred to as Arora).
Claim 43, Schiatti teaches an apparatus, comprising:
at least one processor; and at least one memory including computer program code; the at least one memory and the computer program code configured to cause the apparatus at least to:
control (the decentralized federated training as programmed provides the control) transmitting, by a first electronic entity of a plurality of electronic entities in a network to a second electronic entity in the network (p. 20, “Each participant node of a distributed ledger network may access to the smart contract, which is validated and replicated across the nodes based on a consensus protocol for a distributed ledger technology. Nodes may participate in federated learning by joining and executing a smart contract.” Participant = first entity; local instance of blockchain = second entity.), a request for first global model coefficients of a first global machine learning model (p. 38, “FIG. 2 illustrates an example of a DFL smart contract 118. The DFL smart contract 118 may include a model link 202. The model link 202 may include a link to access a global model for a current round. The global model may include a model designated for further training. For example, the global model may be stored on the file system 114 (FIG. 1). … participant nodes may access the global model by the link …”);
control receiving, by the first electronic entity from the second electronic entity, the first global model coefficients of the first global machine learning model (p. 38 as cited above, receive “global model,” and p. 30, “The machine learning model may include learned parameters that are adjusted/trained to improve predictive performance. The learned parameters may include, for example, weights and biases. A weight may correspond to a weight assigned to a particular neuron of a neural network and the bias may correspond to a particular layer of a neural network.”);
aggregate, by the first electronic entity, local model coefficients of a local machine learning model and the first global model coefficients to produce, as an aggregation, second global model coefficients of a second global machine learning model (p. 18, “In response to detection of the first transition token, the participant node may receive the first model. The participant node may aggregate the first model with the trained global model to generate a second model.” And p. 30, “The machine learning model may include learned parameters that are adjusted/trained to improve predictive performance. The learned parameters may include, for example, weights and biases. A weight may correspond to a weight assigned to a particular neuron of a neural network and the bias may correspond to a particular layer of a neural network.”);
control transmitting, by the first electronic entity to the second electronic entity, the second global model coefficients of the second global machine learning model (p. 18, “The participant node may store, on the blockchain, a second transition token indicative of the second model.” See also, p. 42, “The particular order of the participant identifiers 208 may represent an order in which the participant nodes are to aggregate the trained models. Participant nodes that are parties to the DFL smart contract 118 may access the aggregation sequence list to determine when aggregation should be executed.”); and
perform, by the first electronic entity, a training operation on the second global machine learning model (p. 43, “An example of the aggregation sequence list 210 may include [A, B, C]. The letters A through C (A-C) may represent identifiers of participant nodes that are parties to the DFL smart contract. Each of the participant nodes may separately and independently train a global model based on private training data.”). Schiatti does not specifically teach perform, by the first electronic entity, a training operation on the second global machine learning model to produce updated local model coefficients of an updated local machine learning model using a local dataset based on data collected by the first electronic entity of signals in the network, the updated local machine learning model being used by the first electronic entity in determining a performance metric for the first electronic entity in the network. However, in the same field of endeavor, Arora teaches perform, by the first electronic entity, a training operation on the second global machine learning model to produce updated local model coefficients of an updated local machine learning model using a local dataset based on data collected by the first electronic entity of signals in the network (column 12, lines 27-33; “each time the global machine learning model is updated, the machine learning parameters of the global machine learning model are sent to the edge devices that are training the local machine learning models. The edge devices can update the local machine learning models using the received parameters and continue training the updated local machine learning models.”),
the updated local machine learning model being used by the first electronic entity in determining a performance metric for the first electronic entity in the network (column 6, lines 51-53; “In some implementations, the machine learning parameters 137a and 137b can include the characteristics of the network traffic and/or statistics of the network traffic used to train the local IoT traffic classifier model.” And column 3, lines 3-6; “(i) a data size of data packets received from a plurality of devices, (ii) a frequency at which the devices transmit data, (iii) an amount of time between successive transmissions by the devices; or (iv) changes in received signal power of data packets received from the devices.”). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Schiatti by incorporating training a local model from the updated global model from Arora to enable the network devices to benefit from the updated global model and thereby enable improved local processing of network traffic.
Claim 51 is a method that comprises steps similar to the operations of claim 43 above. Claim 51 is rejected on a similar rationale.
Claim 44, Schiatti-Arora teaches the apparatus as in claim 43, wherein the at least one memory and the computer program code are configured to cause the apparatus at least to:
control receiving, from a server in the network, a set of configuration parameters (Schiatti, p. 45, “It should be appreciated that multiple aggregation leaders may be selected in a round, for example, the DFL smart contract 118 may include multiple aggregation sequence lists. Each of the aggregation sequence lists may specify a different order for the participant identifiers 208.” And p. 47, “To perform distributed learning and aggregation based on the DFL smart contract 118, the participant nodes may generate one or more tokens, which are stored on the blockchain 106. A token may be indicative of interactions with the DFL Smart Contract 118 or with other participant nodes.”); and
select the second electronic entity of the plurality of electronic entities in the network based on the set of configuration parameters (Schiatti, p. 49, “… an identifier of an aggregation sequence list, identifier(s) of participant node(s) that have previously performed aggregation, and/or identifier(s) of participant nodes that are scheduled to perform aggregation next.” Following the sequence list. And p. 47, “To perform distributed learning and aggregation based on the DFL smart contract 118, the participant nodes may generate one or more tokens, which are stored on the blockchain 106. A token may be indicative of interactions with the DFL Smart Contract 118 or with other participant nodes.”).
Claim 52 is a method that comprises steps similar to the operations of claim 44 above. Claim 52 is rejected on a similar rationale.
Claim 45, Schiatti-Arora teaches the apparatus as in claim 43, wherein the at least one memory and the computer program code are configured to cause the apparatus at least to:
control receiving, from a server in the network, a set of configuration parameters (Schiatti, p. 42, “The DFL smart contract 118 may include an aggregation sequence list 210. The aggregation sequence list 210 may include an ordered set of participant identifiers 208.”);
select the second electronic entity of the plurality of electronic entities in the network based on the set of configuration parameters (Schiatti, p. 47, “To perform distributed learning and aggregation based on the DFL smart contract 118, the participant nodes may generate one or more tokens, which are stored on the blockchain 106. A token may be indicative of interactions with the DFL Smart Contract 118 or with other participant nodes.”), wherein the set of configuration parameters includes an aggregation factor indicating an amount by which the second global model coefficients differ from the first global model coefficients (Arora, column 7, lines 36-50; “The computer system 140 also includes an IoT traffic classifier model generator 143 that generates a global IoT traffic classifier model using the aggregated machine learning parameters received for the local IoT traffic classifier models.” Hence averaging is an aggregation factor.).
Claim 53 is a method that comprises steps similar to the operations of claim 45 above. Claim 53 is rejected on a similar rationale.
Claim(s) 46 and 54 is/are rejected under 35 U.S.C. 103 as being unpatentable over Schiatti and Arora as applied to claims 43 and 51 above, and further in view of Pezeshki et al., US 20230080218 A1 (hereafter referred to as Pezeshki).
Claim 46, Schiatti-Arora teaches the apparatus as in claim 43, wherein the at least one memory and the computer program code are configured to cause the apparatus at least to:
control receiving, from a server in the network, a set of configuration parameters (Schiatti, p. 45, “It should be appreciated that multiple aggregation leaders may be selected in a round, for example, the DFL smart contract 118 may include multiple aggregation sequence lists. Each of the aggregation sequence lists may specify a different order for the participant identifiers 208.” And p. 47, “To perform distributed learning and aggregation based on the DFL smart contract 118, the participant nodes may generate one or more tokens, which are stored on the blockchain 106. A token may be indicative of interactions with the DFL Smart Contract 118 or with other participant nodes.”);
select the second electronic entity of the plurality of electronic entities in the network based on the set of configuration parameters (Schiatti, p. 42, “The particular order of the participant identifiers 208 may represent an order in which the participant nodes are to aggregate the trained models. Participant nodes … may access the aggregation sequence list to determine when aggregation should be executed.”), and
wherein the set of configuration parameters includes a model update schedule (Schiatti, p. 69, “an identifier of an aggregation sequence list, identifier(s) of participant node(s) that have previously performed aggregation, and/or identifier(s) of participant nodes that are scheduled to perform aggregation next.”).
Schiatti-Arora does not specifically teach the model update schedule indicating times at which the first electronic entity controls transmitting requests for updates to the local machine learning model to electronic entities of the plurality of electronic entities, the electronic entities being selected based on the set of configuration parameters. However, in the same field of endeavor, Pezeshki teaches the model update schedule indicating times at which the first electronic entity controls transmitting requests for updates to the local machine learning model to electronic entities of the plurality of electronic entities, the electronic entities being selected based on the set of configuration parameters (p. 78, “Similarly, a future federated learning round may be … a specified future federated learning round (e.g., a future federated learning round that is scheduled to occur during a specified time period … ”). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Schiatti-Arora to incorporate future scheduled times from Pezeshki for the schedule from Schiatti-Arora to enable more predictable resource utilization. The motivation would have been to improve the likelihood that scheduled entities have resources to perform network communications.
Claim 54 is a method that comprises steps similar to the operations of claim 46 above. Claim 54 is rejected on a similar rationale.
Claim(s) 48-49 and 56-57 is/are rejected under 35 U.S.C. 103 as being unpatentable over Schiatti and Arora as applied to claims 43 and 51 above, and further in view of Choudhury et al., US 20210142223 A1 (hereafter referred to as Choudhury).
Claim 48, Schiatti-Arora teaches the apparatus as in claim 43, wherein the at least one memory and the computer program code are configured to cause the apparatus at least to:
control receiving, from a server in the network, a set of configuration parameters (Schiatti, p. 45, “It should be appreciated that multiple aggregation leaders may be selected in a round, for example, the DFL smart contract 118 may include multiple aggregation sequence lists. Each of the aggregation sequence lists may specify a different order for the participant identifiers 208.” And p. 47, “To perform distributed learning and aggregation based on the DFL smart contract 118, the participant nodes may generate one or more tokens, which are stored on the blockchain 106. A token may be indicative of interactions with the DFL Smart Contract 118 or with other participant nodes.”);
select the second electronic entity of the plurality of electronic entities in the network based on the set of configuration parameters (Schiatti, p. 42, “The particular order of the participant identifiers 208 may represent an order in which the participant nodes are to aggregate the trained models. Participant nodes … may access the aggregation sequence list to determine when aggregation should be executed.”).
Schiatti-Arora does not specifically teach wherein the set of configuration parameters includes a network topology indicator indicating whether the second electronic entity is the server or a peer device to the first electronic entity. However, in the same field of endeavor, Choudhury teaches wherein the set of configuration parameters includes a network topology indicator indicating whether the second electronic entity is the server or a peer device to the first electronic entity (p. 99-101, “the process can be implemented by training manager 206 in computer system 204 in FIG. 2.” “The process begins by identifying a hierarchical structure for nodes in which a global machine learning model is located at a primary node in the hierarchical structure (step 700).” “The process trains the machine learning models in the authorized nodes using the local data in the authorized nodes to generate local model updates to weights in the local machine learning models (step 704). The process propagates the local model updates to the weights upward in the hierarchical structure to the global machine learning model in the primary node …” The authorized nodes are considered peer nodes and the primary node is a server node.) It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Schiatti-Arora by incorporating network topology indication from Choudhury to recognize the relationship between nodes and thereby absorb the impact of functionality and change operations according to the relationship. (See Choudhury, p. 27, “many applications such as healthcare and Internet of things (IoT) applications are located on different data processing systems organized in a hierarchical structure. The illustrative embodiments also recognize and take into account that a hierarchical structure is often unknown and may change depending on the particular use case in which permissions may be based on the particular use case.”).
Claim 56 is a method that comprises steps similar to the operations of claim 48 above. Claim 56 is rejected on a similar rationale.
Claim 49, Schiatti-Arora teaches the apparatus as in claim 43, wherein the at least one memory and the computer program code are configured to cause the apparatus at least to:
control receiving, from a server in the network, a set of configuration parameters (Schiatti, p. 45, “It should be appreciated that multiple aggregation leaders may be selected in a round, for example, the DFL smart contract 118 may include multiple aggregation sequence lists. Each of the aggregation sequence lists may specify a different order for the participant identifiers 208.”);
select the second electronic entity of the plurality of electronic entities in the network based on the set of configuration parameters (Schiatti, p. 42, “The particular order of the participant identifiers 208 may represent an order in which the participant nodes are to aggregate the trained models. Participant nodes … may access the aggregation sequence list to determine when aggregation should be executed.”).
Schiatti-Arora does not specifically teach wherein the set of configuration parameters includes a network topology indicator indicating that the second electronic entity is a server to the first electronic entity; and wherein the second global model coefficients of the second global machine learning model are transmitted to the second electronic entity over a physical uplink shared channel. However, in the same field of endeavor, Choudhury teaches wherein the set of configuration parameters includes a network topology indicator indicating that the second electronic entity is a server to the first electronic entity; and wherein the second global model coefficients of the second global machine learning model are transmitted to the second electronic entity over a physical uplink shared channel (p. 101, “The process propagates the local model updates to the weights upward in the hierarchical structure to the global machine learning model in the primary node, wherein a node receiving local model updates to the weights from nodes from a lower level aggregates the weights in the local model updates received from the nodes in the lower level (step 706).” And p. 129, “These signals can be transmitted over connections, such as wireless connections, optical fiber cable, coaxial cable, a wire, or any other suitable type of connection.”). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Schiatti-Arora by incorporating topology information from Choudhury to complement the sequence list of Schiatti-Arora. The motivation would have been to provide an equivalent substitution for the sequence list for variations that include a server/client relationship. For further motivation see claim 48 above.
Claim 57 is a method that comprises steps similar to the operations of claim 49 above. Claim 57 is rejected on a similar rationale.
Claim(s) 47, 50, 55 and 58 is/are rejected under 35 U.S.C. 103 as being unpatentable over Schiatti and Arora as applied to claims 43 and 51 above, and further in view of Yajnanarayana et al., US 20230188430 A1 (hereafter referred to as Yajnanarayana).
Claim 47, Schiatti-Arora teaches the apparatus as in claim 43, wherein the at least one memory and the computer program code are configured to cause the apparatus at least to:
control receiving, from a server in the network, a set of configuration parameters (Schiatti, p. 45, “It should be appreciated that multiple aggregation leaders may be selected in a round, for example, the DFL smart contract 118 may include multiple aggregation sequence lists. Each of the aggregation sequence lists may specify a different order for the participant identifiers 208.”);
select the second electronic entity of the plurality of electronic entities in the network based on the set of configuration parameters, wherein the at least one memory and the computer program code configured to cause the apparatus at least to aggregate the local model coefficients and the first global model coefficients (Schiatti, p. 18, “The participant node may store, on the blockchain, a second transition token indicative of the second model.” See also, p. 42, “The particular order of the participant identifiers 208 may represent an order in which the participant nodes are to aggregate the trained models. Participant nodes that are parties to the DFL smart contract 118 may access the aggregation sequence list to determine when aggregation should be executed.”) is further configured to cause the apparatus at least to:
Schiatti-Arora does not specifically teach wherein each of the local machine learning model and the first machine learning model include a respective origin identifier identifying an initialization scheme used to generate that local machine learning model or global machine learning model, and in response to the origin identifier of the first global machine learning model being different from the origin identifier of the local machine learning model, set the local model coefficients equal to the first model coefficients. However, in the same field of endeavor, Yajnanarayana teaches each of the local machine learning model and the first machine learning model include a respective origin identifier identifying an initialization scheme used to generate that local machine learning model or global machine learning model (p. 27, “…the evolution of the global ML model and the local ML models are maintained in a version control system as updates from local ML models are applied to the global ML model and updates from global ML model are suggested to the local ML models.”), and in response to the origin identifier of the first global machine learning model being different from the origin identifier of the local machine learning model, set the local model coefficients equal to the first model coefficients (p. 32, “Action 305. The first network node 111 may then select an ML model evolved along an evolution branch based on compared data values of different versions of the first ML model of different evolution branches. The data values of the different versions of the first ML model may be compared with a validation set.” Setting to the same branch.). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Schiatti-Arora by incorporating the version management from Yajnanarayana to prevent the catastrophic consequences from the model versions being different.
Claim 55 is a method that comprises steps similar to the operations of claim 47 above. Claim 55 is rejected on a similar rationale.
Claim 50, Schiatti-Arora teaches the apparatus as in claim 43, as cited above, wherein the at least one memory and the computer program code configured to cause the apparatus at least to aggregate the local model coefficients and the first global model coefficients are further configured (Schiatti-Arora does not specifically teach a version number but teaches rounds for generating the next model). Schiatti-Arora does not specifically teach wherein the first global machine learning model also includes a version identifier identifying a version number of the first global machine learning model, and is further configured to cause the apparatus at least to: increment the version number of the first global machine learning model to produce a version identifier of the second global machine learning model. However, in the same field of endeavor, Yajnanarayana teaches wherein the first global machine learning model also includes a version identifier identifying a version number of the first global machine learning model (p. 27, “…the evolution of the global ML model and the local ML models are maintained in a version control system as updates from local ML models are applied to the global ML model and updates from global ML model are suggested to the local ML models.” The branch is interpreted to be the version identifier.) and wherein the at least one memory and the computer program code are further configured to cause the apparatus at least to: increment the version number of the first global machine learning model to produce a version identifier of the second global machine learning model (p. 32, “Action 305. The first network node 111 may then select an ML model evolved along an evolution branch based on compared data values of different versions of the first ML model of different evolution branches. The data values of the different versions of the first ML model may be compared with a validation set.”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Schiatti-Arora by incorporating the branched model versions from Yajnanarayana for the model versions from Schiatti-Arora to implement better version control and thereby prevent a mismatch between the global and local models. The motivation would have been to prevent catastrophic consequences from using incorrect client updates. (See Yajnanarayana, p. 6, “With each local agent (client) update, the global model evolves smoothly towards a better model for estimation and prediction. However, in this federated learning architecture, an incorrect client update can have catastrophic consequences, as it will be carried through for further evolution of the global model.”).
Claim 58 is a method that comprises steps similar to the operations of claim 50 above. Claim 58 is rejected on a similar rationale.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Zhang et al., US 20220043920 A1, teaches pairs of secret agreements between two of the same qualified clients may be removed from the aggregated local model update such that only the local model updates corresponding to the qualified clients remain after aggregating the masked local model updates. The global machine-learning model may be updated based on the aggregated masked local model updates at operations 322, which may be the same or similar to updating the global machine-learning model at the operations 230 of FIG. 2.
Anwar et al., US 20220156633 A1, teaches when the worker receives the full global model 42 it replaces the local model stored at the worker with the global model and performs a training update on the global model 44. This training update adjusts the parameters according to an optimization method. In the present example, gradient descent is used to update the global model parameters based on training data that is available to the worker.
Lin et al., US 20230093067 A1, teaches a client computing device determines a local update to a model based on locally stored data and then communicates the local update to a cloud service (e.g., in a privacy preserving and communication efficient manner) for aggregation to generate a global update to the model. However, some of the client computing devices may not contribute to the training of the global model, which results in an unfairness problem.
Karakoc et al., US 20250150462 A1, teaches the detector node is for monitoring messages sent between the first client node, the second client node and the FL server, as part of a multi-hop FL process in which the first client node trains a copy of the global model on local data to produce a local model, and sends a local model update to the second client node for forwarding to the FL server.
H. Kavalionak et al., "Impact of Network Topology on the Convergence of Decentralized Federated Learning Systems," teaches different types of networks used to build the overlay between the nodes in decentralized federated learning.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PATRICE L WINDER whose telephone number is (571)272-3935. The examiner can normally be reached M-F 10am-6pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, KAMAL B DIVECHA can be reached at (571)272-5863. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Patrice L Winder/Primary Examiner, Art Unit 2453