Prosecution Insights
Last updated: May 29, 2026
Application No. 17/286,287

HANDLING OF MACHINE LEARNING TO IMPROVE PERFORMANCE OF A WIRELESS COMMUNICATIONS NETWORK

Final Rejection §101 §103
Filed: Apr 16, 2021
Priority: Oct 19, 2018 (nonprovisional of PCTSE2018051069)
Examiner: WELCH, JENNIFER N
Art Unit: 2143
Tech Center: 2100 (Computer Architecture & Software)
Assignee: Telefonaktiebolaget Lm Ericsson (Publ)
OA Round: 4 (Final)
Grant Probability: 75% (Favorable)
OA Rounds: 5-6
Est. Remaining: 0m
With Interview: 99%

Examiner Intelligence

Grants 75% (above average)
Career Allowance Rate: 75% (253 granted / 339 resolved; +19.6% vs TC avg)
Interview Lift: +29.0% (strong; resolved cases with interview)
Typical timeline: 4y 4m avg prosecution, 7 currently pending
Career history: 365 total applications across all art units

Statute-Specific Performance

§101: 5.3% (-34.7% vs TC avg)
§103: 73.0% (+33.0% vs TC avg)
§102: 9.8% (-30.2% vs TC avg)
§112: 9.3% (-30.7% vs TC avg)
Based on career data from 339 resolved cases

Office Action

DETAILED ACTION

Remarks

Claims 1-7, 10-14, 16-18, and 25-27 have been examined and rejected. This Office action is responsive to the amendment filed on 09/02/2025, which has been entered in the above identified application.

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 4, 5, 10, 13, 14, 16, and 25 are rejected under 35 U.S.C. 103 as being unpatentable over Johnsson et al. (WO 2018101862 A1, published 06/07/2018), hereinafter Johnsson, in view of Li et al. (US 11075929 B1, published 07/27/2021), hereinafter Li.

Regarding claim 1, Johnsson teaches the claim comprising: A method performed in a wireless communications system for handling of machine learning to improve performance of a wireless communications network operating in the wireless communications system, the wireless communications system comprising a central network node and one or more intermediate network nodes arranged between the central network node and one or more leaf network nodes operating in the wireless communications network, at least one out of: the central network node, the one or more intermediate network nodes or the one or more leaf network nodes comprising a machine learning unit, the method comprising (Johnsson Figs. 1-10; [0043], Briefly described, a master node, a local node, a service assurance system, and a respective method performed thereby for predicting one or more metrics associated with a communication network are provided. Nodes in the distributed machine learning scenario that do not have noticeable contribution in improving the overall accuracy of the distributed learning activity are identified and their participation in the distributed learning is deactivated or alternatively limited, via signalling. This allows sparing there local resources for other important system activities/services or simply reduce their bandwidth/energy consumption; [00088], The master node and the local nodes are comprised in (or are part of) the communication network. As described above, the communication network may comprise system with servers which are interconnected via a network, but the system is a data network within the communication network. Still further, the communication network may be a wired communication network, e.g. a landline communication network such as a Public Switched Telephone Network, PSTN, or a radio/wireless communication network, where e.g. 
the master node may be a base station and the local node(s) may be sensors, stations, wireless devices etc; [0091], many eNodeBs may potentially be or comprising a local node e.g. by locally installing an Analytics Agent, AA. A Central Analytics Agent, CAA, can be installed e.g. on a master node e.g. on Evolved Packet Core, EPC, or virtual EPC, vEPC; [0092], A typical architecture of an Evolved UMTS Terrestrial Radio Access Network (E-UTRAN) and interaction with EPC in an LTE network is shown in the figure 4a. Functional modules of the eNodeB and EPC are shown in figure 4b. An exemplifying implementation of the solution comprising the methods described above is illustrated in figure 4c and 4d. In this exemplifying implementation of the solution, the eNodeB and EPC with installed AA and CAA agents are illustrated in the figures. In these networks (e.g. LTE) an eNodeB works as a base-station responsible for the communication with the mobiles (wireless devices) in one or multiple cells; [0117], the master node 500 comprising a processor 521; [0133], the local node 700 comprising a processor 721): by means of the machine learning unit and a machine learning model relating to at least one network node out of the one or more intermediate network nodes or the one or more leaf network nodes, determining a prediction of a performance of the at least one network node based on input data relating to the at least one network node; based on the determined prediction, performing one or more operations relating to the at least one network node (Johnsson Figs. 1-10; [0049], Figure 1a illustrates the method 100 comprising receiving 120 prediction(s) based on training data from local nodes in the communication network; [0054], The metrics associated with the communication network may relate to performance, anomalies and other information relative to current circumstances of the communication network. The training data may consists of both measurement data (X) from e.g. 
sensors associated with local nodes and actual, true or measured values (Y) that the local model shall be learned to predict later. In the prediction phase there are only X, while the Y will be predicted. The predictions may further relate to Operation, Administration and Management, OAM, data; [0077], Based on the received local reporting policy from the master node, the method comprises building 230 a local model based on locally available data; performing 240 a prediction based on the local model; and transmitting 250 the prediction to the master node in accordance with the received local reporting policy; [0079], Once the local node has built its local model based on the locally available data, the local node may perform 240 the prediction based on the local model. The prediction may comprise an indication of a likely value of one or more metrics based on the part of the communication network represented by the local node. Once the local node has performed the prediction, the local node may send the prediction to the master node in accordance with the received local reporting policy; [0078], The master node will use the prediction from the local node, possible together with prediction(s) from other local node(s), in order to determine one or more metrics associated with the communication network; [0084-0085], The local node may send also the weight parameter to the master node, wherein the master node may adjust the local reporting policy for the local node based on either one of or both the sent prediction and the sent determined weight parameter; [0087], the master node receiving 330 the prediction(s) from local nodes in the communication network, determining weight parameter(s) associated with the local nodes based on the received prediction(s) and previously received predictions, and adjusting a respective local reporting policy for one or more local nodes based on the determined weight parameter(s). 
The method 300 also comprises the one or more local nodes receiving 340 the local reporting policy from the master node in the communication network, the local reporting policy informing the local node of how to send prediction(s) to the master node. Based on the received reporting policy from the master node, the method comprises the one or more local nodes building 350 a local model based on locally available data, performing a prediction based on the local model, and transmitting the prediction to the master node in accordance with the received local reporting policy; [0124], the master node 500, 600 is configured for determining a global model for predicting one or more metrics associated with the communication network based on the received prediction(s) from the local nodes and the determined weight parameter(s); [0126], the master node 500, 600 is configured for adjusting the respective local reporting policy by deactivating a local node having a determined weight parameter not meeting a first threshold, wherein the local node will stop to send predictions to the master node; [0128-0129], adjusting the respective local reporting policy by changing the frequency with which a local node sends predictions to the master node depending on (i) prediction accuracy of the received prediction(s) from the local nodes, and/or (ii) the determined weight parameter(s); [0139], According to an embodiment, the received reporting policy comprises information instructing the local node to activate its predictions, deactivate its predictions or changing the frequency with which the local node transmits predictions to the master node; see also [0096], [0098-0100]). 
by means one of the at least one network node and of another network node comprising the machine learning unit, training the machine learning model by using an input parameter relating to the performance of the at least one network node in order to choose one or more operations relating to the performance of the at least one network node, evaluating the machine learning model after performing the one or more operations relating to the performance of the at least one network node, and updating the machine learning model based on the one or more operations relating to the performance of the at least one network node (Johnsson Figs. 1-10; [0052], Once the master node has received the prediction(s) from the one or more local nodes, the master node may determine 130 weight parameter(s) associated with the local nodes based on the received prediction(s) and previously received predictions; the master node may calculate accuracy values for received predictions, which accuracy values may be based on the extent of which respective received prediction from respective local node differs from those respective local nodes' GT value and optionally also previously received one or more predictions. Then the master node may use these accuracy values when determining the weight parameter(s) for respective local nodes; [0054], The metrics associated with the communication network may relate to performance, anomalies and other information relative to current circumstances of the communication network. The training data may consists of both measurement data (X) from e.g. sensors associated with local nodes and actual, true or measured values (Y) that the local model shall be learned to predict later. In the prediction phase there are only X, while the Y will be predicted. 
The predictions may further relate to Operation, Administration and Management, OAM, data; [0068], overall accuracy at the master node goes below a certain threshold (or some other trigger), wherein the master node may activate a previously deactivated local node in order to achieve higher accuracy of the global prediction done by the master node (global model) based on the local predictions; [0084], Different predictions may be more or less accurate. In an example, the local node may itself determine the accuracy of a newly performed prediction. The local node may itself also determine if a newly performed prediction deviates more than to a certain extent from previously performed predictions. Depending on the determined accuracy, the local node may determine a weight parameter associated with itself; [0085], The local node may send also the weight parameter to the master node, wherein the master node may adjust the local reporting policy for the local node based on either one of or both the sent prediction and the sent determined weight parameter; [0096-0097], At the high level, CAA (at M0) uses the Winnow algorithm to make the predictions and the CAA works as a data fusion module based on the prediction inputs of all AAs (see figure 4e). The algorithm also assigns weights to all participating AA nodes. These weights may have to be calculated/updated e.g. using the training data gathered from local nodes (usually at periodic intervals) participating the distributed learning; [0103], One phase of the solution is model building; If the global model is outdated because of system state changes or concept drift; An update cycle can also be triggered periodically if new training data is available; [0104-0105], During each update cycle collect new training data at the CAA arriving from AAs of different node; Updating the global model at CAA involves updating the weight parameters for the Winnow algorithm. 
Compute/update the weights when the new training data becomes available; [0106], IF there is concept-drift detection or accuracy at CAA falls below a certain threshold value: 1 . Trigger signalling and send all the N local nodes: an ACTIVATE signal; [0124], the master node 500, 600 is configured for determining a global model for predicting one or more metrics associated with the communication network based on the received prediction(s) from the local nodes and the determined weight parameter(s); see also [0078], [0087], [0098-0100], [0126], [0128-0129], [0139]). the training of the machine learning model comprising training the machine learning model by using the received input parameter and a state relating to an environment of the at least one network node to choose one or more actions relating to the performance of the at least one network node; the updating of the machine learning model based on the one or more operations comprising updating the machine learning model based on the one or more operations and based on the state relating to the environment of the at least one network node (Johnsson Figs. 1-10; [0052], Once the master node has received the prediction(s) from the one or more local nodes, the master node may determine 130 weight parameter(s) associated with the local nodes based on the received prediction(s) and previously received predictions; the master node may calculate accuracy values for received predictions, which accuracy values may be based on the extent of which respective received prediction from respective local node differs from those respective local nodes' GT value and optionally also previously received one or more predictions. Then the master node may use these accuracy values when determining the weight parameter(s) for respective local nodes; [0054], The metrics associated with the communication network may relate to performance, anomalies and other information relative to current circumstances of the communication network. 
The training data may consists of both measurement data (X) from e.g. sensors associated with local nodes and actual, true or measured values (Y) that the local model shall be learned to predict later. In the prediction phase there are only X, while the Y will be predicted; [0068], overall accuracy at the master node goes below a certain threshold (or some other trigger), wherein the master node may activate a previously deactivated local node in order to achieve higher accuracy of the global prediction done by the master node (global model) based on the local predictions; [0084], Different predictions may be more or less accurate. In an example, the local node may itself determine the accuracy of a newly performed prediction. The local node may itself also determine if a newly performed prediction deviates more than to a certain extent from previously performed predictions. Depending on the determined accuracy, the local node may determine a weight parameter associated with itself; [0085], The local node may send also the weight parameter to the master node, wherein the master node may adjust the local reporting policy for the local node based on either one of or both the sent prediction and the sent determined weight parameter; [0096-0097], At the high level, CAA (at M0) uses the Winnow algorithm to make the predictions and the CAA works as a data fusion module based on the prediction inputs of all AAs (see figure 4e). The algorithm also assigns weights to all participating AA nodes. These weights may have to be calculated/updated e.g. 
using the training data gathered from local nodes (usually at periodic intervals) participating the distributed learning; [0103], One phase of the solution is model building; If the global model is outdated because of system state changes or concept drift; An update cycle can also be triggered periodically if new training data is available; [0104-0105], During each update cycle collect new training data at the CAA arriving from AAs of different node; Updating the global model at CAA involves updating the weight parameters for the Winnow algorithm. Compute/update the weights when the new training data becomes available; [0106], IF there is concept-drift detection or accuracy at CAA falls below a certain threshold value: 1 . Trigger signalling and send all the N local nodes: an ACTIVATE signal; [0124], the master node 500, 600 is configured for determining a global model for predicting one or more metrics associated with the communication network based on the received prediction(s) from the local nodes and the determined weight parameter(s); see also [0078], [0087], [0098-0100], [0126], [0128-0129], [0139]). and transmitting information relating to the machine learning model to one or more other network nodes (Johnsson Figs. 1-10; [0078], The master node will use the prediction from the local node, possible together with prediction(s) from other local node(s), in order to determine one or more metrics associated with the communication network; [0084], Different predictions may be more or less accurate. In an example, the local node may itself determine the accuracy of a newly performed prediction. The local node may itself also determine if a newly performed prediction deviates more than to a certain extent from previously performed predictions. 
Depending on the determined accuracy, the local node may determine a weight parameter associated with itself; [0085], The local node may send also the weight parameter to the master node, wherein the master node may adjust the local reporting policy for the local node based on either one of or both the sent prediction and the sent determined weight parameter; [0087], the master node receiving 330 the prediction(s) from local nodes in the communication network, determining weight parameter(s) associated with the local nodes based on the received prediction(s) and previously received predictions, and adjusting a respective local reporting policy for one or more local nodes based on the determined weight parameter(s). The method 300 also comprises the one or more local nodes receiving 340 the local reporting policy from the master node in the communication network, the local reporting policy informing the local node of how to send prediction(s) to the master node. Based on the received reporting policy from the master node, the method comprises the one or more local nodes building 350 a local model based on locally available data, performing a prediction based on the local model, and transmitting the prediction to the master node in accordance with the received local reporting policy; [0096], The way the typical distributed learning algorithm operates (e.g. Winnow) will now be described. From figure 4e, local service predictions from AAs at these machines are first executed and local prediction results are then sent to the CAA (at Mo). CAA updates the associated weights for the different server machines or nodes based on the prediction results (previous steps need to be executed whenever weights needs to be updated at CAA). After the previous steps, the fusion step is executed by the CAA (e.g. 
weighted majority algorithm) so as to compute the final prediction; [0098-0100], Defined control signals are sent from the CAA (at the master node) to the AAs (at the local nodes) to control their participation behaviour in the distributed learning algorithm dynamically under changing workload and resource availability; [0124], the master node 500, 600 is configured for determining a global model for predicting one or more metrics associated with the communication network based on the received prediction(s) from the local nodes and the determined weight parameter(s); [0126], According to a further embodiment, the master node 500, 600 is configured for adjusting the respective local reporting policy by deactivating a local node having a determined weight parameter not meeting a first threshold, wherein the local node will stop to send predictions to the master node; [0128], According to an embodiment, the master node 500, 600 is configured for adjusting the respective local reporting policy by changing the frequency with which a local node sends predictions to the master node depending on the determined weight parameter(s) of the local node; [0129], According to yet an embodiment, the master node 500, 600 is configured for adjusting the respective local reporting policy by changing the frequency with which a local node sends predictions to the master node depending on (i) prediction accuracy of the received prediction(s) from the local nodes, and/or (ii) the determined weight parameter(s); [0139], According to an embodiment, the received reporting policy comprises information instructing the local node to activate its predictions, deactivate its predictions or changing the frequency with which the local node transmits predictions to the master node) However, Johnsson fails to expressly disclose the prediction comprising one or more of a modulation and coding scheme (MCS) to use, which transmitter beam to use, and which receiver beam to use; the one or more 
operations comprising at least one of a change of transmit beam, a change of receive beam, and a change of MCS selection operation, based on the determined prediction of the performance of the at least one network node. In the same field of endeavor, Li teaches: the prediction comprising one or more of a modulation and coding scheme (MCS) to use, which transmitter beam to use, and which receiver beam to use; the one or more operations comprising at least one of a change of transmit beam, a change of receive beam, and a change of MCS selection operation, based on the determined prediction of the performance of the at least one network node (Li Figs. 1-19; abs. inputting one or more features derived from the time series data into a machine learning module; generating a report comprising an indication that the anomaly exists and a description of the anomaly type, and determining one or more treatments for the determined anomaly; col. 1 [line 53], An anomaly may be any problem that occurs within the network; col. 7 [line 52], transmission beam index indicator, receiving beam index indicator; col. 8 [line 10], The treatment may include one or more of the following: fast link adaptation (e.g., lower the initial modulation and coding scheme (MCS) configuration, increase the initial transmitting power configuration, increase step sizes related to link recovery, increase the transmitting power modification step sizes, increase MCS modification step sizes, and increase the rate of link adaptation. The treatments may further include fast link recovery, such as switching to other beam pairs quickly; a treatment may be to fast-switch beam pairs; Table 1 lower the initial MCS unvarying configuration, increase MCS modification step sizes, increase MCS modification step sizes; col. 
17 [line 17], the anomaly detection system may recommend using special adaptation (e.g., starting MCS being lower than in general), or very fast link recovery (e.g., large step size), or even skip certain adaptation and jump to the process of using alternative links. If these treatments are labeled, then the machine learning algorithm may be used to learn the model where the characteristics of the channel related measurements as the input of the model, where the output can be the way of link adaption, or the treatment) It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have incorporated the prediction comprising one or more of a modulation and coding scheme (MCS) to use, which transmitter beam to use, and which receiver beam to use; the one or more operations comprising at least one of a change of transmit beam, a change of receive beam, and a change of MCS selection operation, based on the determined prediction of the performance of the at least one network node as suggested in Li into Johnsson. Doing so would be desirable because the anomaly detection system may use data analytics and machine learning to automatically detect various anomalies that occur on the millimeter wave communication network. To detect anomalies automatically, machine learning algorithms can help enhance the accuracy of determining when an anomaly exists and determining a categorization for the anomaly. The anomaly detection system may enable automatic detection and treatment of anomalies, which may reduce the need for engineers to make field trips and may also lower operational costs. Furthermore, automatically detecting and eliminating anomalies may improve the reliability and speed of the communication network (see Li col. 6 [line 23]). The anomaly detection system may not need to rely on interference measurement which may need to take resources to do that and which may hurt the system's efficiency. 
Rather, the machine-learning based approach can tell whether a link suffers interference via the waveform of the dynamic signal such as path loss, signal to interference and noise ratio, etc. One of the advantages is to increase the system efficiency by reducing the interference measurement necessity (see Li col. 20 [line 52]). As disclosed in Johnsson, systems are needed to detect and resolve problems that impact service quality (see Johnsson background). Li would improve the system of Johnsson by enhancing the ability to detect and eradicate anomalies, which can include any problem that occurs within the network (see Li col. 1 [line 53]). Li contemplates using any suitable network environment including any suitable number of any suitable systems and components arranged in any suitable manner (see Li col. 2 [line 40]), using any suitable frequency (see Li col. 1 [line 53]).

Regarding claims 10, 16, and 25, claims 10, 16, and 25 contain substantially similar limitations to those found in claim 1. Claims 16 and 25 further recite by means of the at least one network node to which the machine learning model relates, receive an input parameter relating to a performance of the at least one network node (Johnsson Figs. 1-10; [0054], The metrics associated with the communication network may relate to performance, anomalies and other information relative to current circumstances of the communication network. The training data may consists of both measurement data (X) from e.g. sensors associated with local nodes and actual, true or measured values (Y) that the local model shall be learned to predict later. In the prediction phase there are only X, while the Y will be predicted. 
The predictions may further relate to Operation, Administration and Management, OAM, data; [0077], Based on the received local reporting policy from the master node, the method comprises building 230 a local model based on locally available data; performing 240 a prediction based on the local model; and transmitting 250 the prediction to the master node in accordance with the received local reporting policy; [0079], Once the local node has built its local model based on the locally available data, the local node may perform 240 the prediction based on the local model. The prediction may comprise an indication of a likely value of one or more metrics based on the part of the communication network represented by the local node. Once the local node has performed the prediction, the local node may send the prediction to the master node in accordance with the received local reporting policy; [0078], The master node will use the prediction from the local node, possible together with prediction(s) from other local node(s), in order to determine one or more metrics associated with the communication network; [0084-0085], The local node may send also the weight parameter to the master node, wherein the master node may adjust the local reporting policy for the local node based on either one of or both the sent prediction and the sent determined weight parameter; [0087], the master node receiving 330 the prediction(s) from local nodes in the communication network, determining weight parameter(s) associated with the local nodes based on the received prediction(s) and previously received predictions, and adjusting a respective local reporting policy for one or more local nodes based on the determined weight parameter(s). The method 300 also comprises the one or more local nodes receiving 340 the local reporting policy from the master node in the communication network, the local reporting policy informing the local node of how to send prediction(s) to the master node. 
Based on the received reporting policy from the master node, the method comprises the one or more local nodes building 350 a local model based on locally available data, performing a prediction based on the local model, and transmitting the prediction to the master node in accordance with the received local reporting policy; see also [0049], [0096], [0098-0100], [0124-0129], [0139]) Consequently, claims 10, 16, and 25 are rejected for the same reasons. Regarding claim 4, Johnsson in view of Li teaches all the limitations of claim 1, further comprising: wherein the determining of the prediction of the performance of the at least one network node comprises: by means of the at least one network node, performing one or more measurements; and by means of the machine learning unit, using information relating to the performed one or more measurements as input data to the machine learning model in order to determine the prediction of the performance of the at least one network node, wherein the prediction is based on output data from the machine learning model (Johnsson Figs. 1-10; [0078], The master node will use the prediction from the local node, possible together with prediction(s) from other local node(s), in order to determine one or more metrics associated with the communication network; [0084], Different predictions may be more or less accurate. In an example, the local node may itself determine the accuracy of a newly performed prediction. The local node may itself also determine if a newly performed prediction deviates more than to a certain extent from previously performed predictions. 
Depending on the determined accuracy, the local node may determine a weight parameter associated with itself; [0085], The local node may send also the weight parameter to the master node, wherein the master node may adjust the local reporting policy for the local node based on either one of or both the sent prediction and the sent determined weight parameter; [0087], the master node receiving 330 the prediction(s) from local nodes in the communication network, determining weight parameter(s) associated with the local nodes based on the received prediction(s) and previously received predictions, and adjusting a respective local reporting policy for one or more local nodes based on the determined weight parameter(s); [0096], The way the typical distributed learning algorithm operates (e.g. Winnow) will now be described. From figure 4e, local service predictions from AAs at these machines are first executed and local prediction results are then sent to the CAA (at Mo). CAA updates the associated weights for the different server machines or nodes based on the prediction results (previous steps need to be executed whenever weights needs to be updated at CAA). After the previous steps, the fusion step is executed by the CAA (e.g. weighted majority algorithm) so as to compute the final prediction; [0124], the master node 500, 600 is configured for determining a global model for predicting one or more metrics associated with the communication network based on the received prediction(s) from the local nodes and the determined weight parameter(s); see also [0098-0100], [0126], [0128-0129], [0139]). Regarding claim 13, claim 13 contains substantially similar limitations to those found in claim 4. Consequently, claim 13 is rejected for the same reasons. 
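For illustration only, the weight update and fusion step quoted from Johnsson [0096] can be sketched as a classic weighted majority vote. This is a minimal sketch under stated assumptions, not code from Johnsson: the function names `fuse` and `update_weights`, the binary prediction values, and the penalty factor `beta` are hypothetical.

```python
from typing import Dict


def fuse(predictions: Dict[str, int], weights: Dict[str, float]) -> int:
    """Weighted majority vote over binary local predictions (0 or 1).

    Assumption: each local node reports a single binary prediction.
    Ties are resolved in favour of 1.
    """
    vote_one = sum(weights[n] for n, p in predictions.items() if p == 1)
    vote_zero = sum(weights[n] for n, p in predictions.items() if p == 0)
    return 1 if vote_one >= vote_zero else 0


def update_weights(
    predictions: Dict[str, int],
    weights: Dict[str, float],
    truth: int,
    beta: float = 0.5,  # hypothetical penalty factor for a wrong prediction
) -> Dict[str, float]:
    """Scale down the weight of every node whose prediction was wrong."""
    return {
        n: (w * beta if predictions[n] != truth else w)
        for n, w in weights.items()
    }


local_predictions = {"a": 1, "b": 0, "c": 0}
node_weights = {"a": 1.0, "b": 1.0, "c": 1.0}
first_result = fuse(local_predictions, node_weights)
node_weights = update_weights(local_predictions, node_weights, truth=1)
second_result = fuse(local_predictions, node_weights)
```

In this sketch the central node first fuses the local predictions, then lowers the weight of the nodes that were wrong, so that the same local predictions can yield a different fused result on the next round.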
Regarding claim 5, Johnsson in view of Li teaches all the limitations of claim 1, further comprising: evaluating the machine learning model after the performing of the one or more operations relating to the at least one network node based on the determined prediction; and updating the machine learning model based on an evaluation (Johnsson Figs. 1-10; [0052], Once the master node has received the prediction(s) from the one or more local nodes, the master node may determine 130 weight parameter(s) associated with the local nodes based on the received prediction(s) and previously received predictions; the master node may calculate accuracy values for received predictions, which accuracy values may be based on the extent of which respective received prediction from respective local node differs from those respective local nodes' GT value and optionally also previously received one or more predictions. Then the master node may use these accuracy values when determining the weight parameter(s) for respective local nodes; [0068], overall accuracy at the master node goes below a certain threshold (or some other trigger), wherein the master node may activate a previously deactivated local node in order to achieve higher accuracy of the global prediction done by the master node (global model) based on the local predictions; [0084], Different predictions may be more or less accurate. In an example, the local node may itself determine the accuracy of a newly performed prediction. The local node may itself also determine if a newly performed prediction deviates more than to a certain extent from previously performed predictions. 
Depending on the determined accuracy, the local node may determine a weight parameter associated with itself; [0085], The local node may send also the weight parameter to the master node, wherein the master node may adjust the local reporting policy for the local node based on either one of or both the sent prediction and the sent determined weight parameter; [0096-0097], At the high level, CAA (at M0) uses the Winnow algorithm to make the predictions and the CAA works as a data fusion module based on the prediction inputs of all AAs (see figure 4e). The algorithm also assigns weights to all participating AA nodes. These weights may have to be calculated/updated e.g. using the training data gathered from local nodes (usually at periodic intervals) participating the distributed learning; [0103], One phase of the solution is model building; An update cycle can also be triggered periodically if new training data is available; [0104-0105], During each update cycle collect new training data at the CAA arriving from AAs of different node; Updating the global model at CAA involves updating the weight parameters for the Winnow algorithm. Compute/update the weights when the new training data becomes available; [0106], IF there is concept-drift detection or accuracy at CAA falls below a certain threshold value: 1 . Trigger signalling and send all the N local nodes: an ACTIVATE signal; [0124], the master node 500, 600 is configured for determining a global model for predicting one or more metrics associated with the communication network based on the received prediction(s) from the local nodes and the determined weight parameter(s); see also [0078], [0087], [0098-0100], [0126], [0128-0129], [0139]). Regarding claim 14, claim 14 contains substantially similar limitations to those found in claim 5. Consequently, claim 14 is rejected for the same reasons. Claims 2, 3, 11, 12, 17, 18, 26, and 27 are rejected under 35 U.S.C. 
103 as being unpatentable over Johnsson in view of Li in view of Kasaragod et al. (US 20190037040 A1, published 01/31/2019), hereinafter Kasaragod. Regarding claim 26, Johnsson in view of Li teaches all the limitations of claim 25, further comprising: wherein the network node is a radio network node, wherein the processor is further configured to: receive from the communications device, information relating to one or more objectives of the communications device when a leaf network node being a communications device connects to the radio network node; transmit, to the communications device, a request to collect data to be used as input data for training of a machine learning model relating to the communications device; receive, from the communications device, the collected data; based on the received collected data, update the machine learning model suitable for the communications device's one or more objectives (Johnsson Figs. 1-10; [0054], in case a local node is not powerful enough to process its own data, then it can share its data with a neighbouring local node so that neighbouring node may act and process this data on the behalf of the less powerful node (i.e. building local model, doing local predictions and transmitting these predictions to the master node); [0078], The master node will use the prediction from the local node, possible together with prediction(s) from other local node(s), in order to determine one or more metrics associated with the communication network; [0084], Different predictions may be more or less accurate. In an example, the local node may itself determine the accuracy of a newly performed prediction. The local node may itself also determine if a newly performed prediction deviates more than to a certain extent from previously performed predictions. 
Depending on the determined accuracy, the local node may determine a weight parameter associated with itself; [0085], The local node may send also the weight parameter to the master node, wherein the master node may adjust the local reporting policy for the local node based on either one of or both the sent prediction and the sent determined weight parameter; [0087], the master node receiving 330 the prediction(s) from local nodes in the communication network, determining weight parameter(s) associated with the local nodes based on the received prediction(s) and previously received predictions, and adjusting a respective local reporting policy for one or more local nodes based on the determined weight parameter(s). The method 300 also comprises the one or more local nodes receiving 340 the local reporting policy from the master node in the communication network, the local reporting policy informing the local node of how to send prediction(s) to the master node. Based on the received reporting policy from the master node, the method comprises the one or more local nodes building 350 a local model based on locally available data, performing a prediction based on the local model, and transmitting the prediction to the master node in accordance with the received local reporting policy; [0096], The way the typical distributed learning algorithm operates (e.g. Winnow) will now be described. From figure 4e, local service predictions from AAs at these machines are first executed and local prediction results are then sent to the CAA (at Mo). CAA updates the associated weights for the different server machines or nodes based on the prediction results (previous steps need to be executed whenever weights needs to be updated at CAA). After the previous steps, the fusion step is executed by the CAA (e.g. 
weighted majority algorithm) so as to compute the final prediction; [0098-0100], Defined control signals are sent from the CAA (at the master node) to the AAs (at the local nodes) to control their participation behaviour in the distributed learning algorithm dynamically under changing workload and resource availability; [0124], the master node 500, 600 is configured for determining a global model for predicting one or more metrics associated with the communication network based on the received prediction(s) from the local nodes and the determined weight parameter(s); [0126], According to a further embodiment, the master node 500, 600 is configured for adjusting the respective local reporting policy by deactivating a local node having a determined weight parameter not meeting a first threshold, wherein the local node will stop to send predictions to the master node; [0128], According to an embodiment, the master node 500, 600 is configured for adjusting the respective local reporting policy by changing the frequency with which a local node sends predictions to the master node depending on the determined weight parameter(s) of the local node; [0129], According to yet an embodiment, the master node 500, 600 is configured for adjusting the respective local reporting policy by changing the frequency with which a local node sends predictions to the master node depending on (i) prediction accuracy of the received prediction(s) from the local nodes, and/or (ii) the determined weight parameter(s); [0139], According to an embodiment, the received reporting policy comprises information instructing the local node to activate its predictions, deactivate its predictions or changing the frequency with which the local node transmits predictions to the master node; [0117], the master node 500 comprising a processor 521; [0133], the local node 700 comprising a processor 721). 
However, Johnsson in view of Li fails to expressly disclose receive from the communications device, information relating to one or more objectives of the communications device when a leaf network node being a communications device connects to the radio network node; transmit, to the communications device, a machine learning model suitable for the communications device's one or more objectives; transmit, to the communications device, a request to collect data to be used as input data for training of a machine learning model relating to the communications device; receive, from the communications device, the collected data; based on the received collected data, update the machine learning model suitable for the communications device's one or more objectives; and transmit the updated machine learning model to the communications device. In the same field of endeavor, Kasaragod teaches: receive from the communications device, information relating to one or more objectives of the communications device when a leaf network node being a communications device connects to the radio network node; transmit, to the communications device, a machine learning model suitable for the communications device's one or more objectives; transmit, to the communications device, a request to collect data to be used as input data for training of a machine learning model relating to the communications device; receive, from the communications device, the collected data; based on the received collected data, update the machine learning model suitable for the communications device's one or more objectives; and transmit the updated machine learning model to the communications device (Kasaragod Figs. 1-19; [0027], a remote provider network and a local network device (e.g., hub device or edge device) may be used to generate a split prediction. 
For example, a hub device of a local network may receive data from sensors and process the data using a local model (e.g., data processing model) to generate a local prediction. The sensor data may also be transmitted to a remote provider network, which returns another prediction. If the returned prediction from the provider network is more accurate, then the local prediction may be corrected by the returned prediction; [0028] In some embodiments, a provider network and/or a hub device may update local data models of edge devices based on data collected by the edge devices. For example, a provider network and/or a hub device may periodically receive data from edge devices and generate new updates to local models based on the new data. The provider network and/or a hub device may then deploy the updates to the respective edge devices. In embodiments, entirely new versions of the local models are deployed to replace current models of the respective edge devices; [0029-0030], multiple models may be implemented across multiple respective edge devices (e.g., tier devices) of a network; [0035], A hub device 100, a provider network 102, local network 108, edge devices 106; [0036], multiple hub devices may be used as redundant hub devices; [0038], the hub device 100 includes a local model 108 that may receive data from one or more edge devices 106 and process the received data; [0039], the provider network 102 includes a data processing service 116 that includes a model 118 that receives the data from the hub device 112 and processes the received data; [0049], data collector 122 may be a sensor or other device that detects performance and/or other operational aspects of the network (e.g., network bandwidth or traffic, power consumption of a data source device, etc.) and generates data based on the detected performance. Thus, the generated data may indicate performance or other operational aspects of the local network and/or the edge device 106. 
In embodiments, the generated data may be sent to the hub device 100; [0063], FIG. 4 illustrates a system for implementing a split prediction based on a local model of an edge device and a provider network, according to some embodiments. The edge device 400 includes a result manager 112, a local model 108, and one or more data collectors 122; [0065], the data processing service 116 and/or the model 118 receives the data sent from the hub device 112 and processes the received data; [0088], FIG. 6 illustrates a system for updating models for edge devices by a provider network, according to some embodiments. In the depicted embodiment, the edge devices 600 are connected to a local network 104; [0090], The model training service 604 may then generate a local model update 606a to the local model 602a based on the analysis of the data 608a and generate a local model update 606n to the local model 602n based on the analysis of the data 608n; [0091], the local model update 606a is configured to update the local model 602a and the local model update 606n is configured to update the local model 602n; [0092], the model training service 604 may deploy the local model updates 606a, 606n to the local network 104; [0093], instead of just modifying an existing local model, in some cases it is replaced by a different model that is a more recent version; [0095], the edge device 600 may also send the data received from the data collector 122 to the model training service 604. The edge device 600 may then receive another local model update 606 from the model training service, wherein the local model update 606 is based on the data received from the data collector 122; [0114], FIG. 
9 illustrates a system for updating models for edge devices by a hub device; [0115-0116], the model trainer 902 of the hub device 100 may receive the data 608 from one or more of the edge devices 600; [0117-0118], In response to the generating of the local model updates 606a, 606n, the model trainer 902 may deploy the local model updates 606a, 606n to the respective edge device 600a, 600n; [0119], The model trainer may then receive one or more local model updates 606 from the provider network 102. The local model updates 606 may then be deployed to one or more respective edge devices 600; [0121], the model trainer may generate a given local model based on topology data or any other data received from a corresponding edge device that will be implementing the local model; [0130], the model training service of the provider network and/or the model trainer may obtain one or more indications of the state of one or more edge devices. The indications may be used to optimize and/or generate respective local model updates that are sent to the edge device and applied to the local model to update the local model; the indications may include reliability of a connection for an edge device, an amount of free memory available at an edge device, an amount of non-volatile storage available at an edge device, a health of an edge device (e.g., with respect to a previous health state or with respect to other edge devices of the local network), and any other suitable indication of state of an edge device, where the state may affect how the local model update is optimized and/or generated; [0132], the edge device may receive updates to one or more of its local models 602 and/or may receive the local models 602 as deployed models in the same way or similar way as described for the figures above; [0133], the local models 602 are different from each other (e.g., perform one or more different operations than each other for given input data) and the different local models 602 are configured 
to process different data received at different times by the at least one edge device). It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have incorporated receive from the communications device, information relating to one or more objectives of the communications device when a leaf network node being a communications device connects to the radio network node; transmit, to the communications device, a machine learning model suitable for the communications device's one or more objectives; transmit, to the communications device, a request to collect data to be used as input data for training of a machine learning model relating to the communications device; receive, from the communications device, the collected data; based on the received collected data, update the machine learning model suitable for the communications device's one or more objectives; and transmit the updated machine learning model to the communications device as suggested in Kasaragod into Johnsson in view of Li. Doing so would be desirable because some IoT devices are powerful enough to implement a relatively simple data processing model to analyze data and generate a result, such as a prediction. However, the reliability of such a prediction may not be as good as the reliability of a larger model running on a more powerful computing device. For example, a large model implemented by a service provider network or a server computer may use hundreds of millions of parameters, whereas a model running on an IoT device may use only a few hundred thousand. Moreover, the amount and the type of data received by a model at a given IoT device may change over time (see Kasaragod [0003]). 
The systems and methods described herein implement techniques for configuring local networks of internet-connectable devices (e.g., IoT devices) to implement data processing models to rapidly generate local results (e.g., local predictions), while also taking advantage of larger, more accurate data processing models running on more powerful devices (e.g., servers of a provider network) that generate more accurate results based on the same data (see Kasaragod [0026]). By using the above techniques, an edge device may update the local model by applying the update 614 to improve accuracy of the local model. Then, another local model update 606 may be received from the model training service and the edge device may apply the local model update 606 to improve the accuracy of the local model even more (see Kasaragod [0100]). The provider network and/or hub device may obtain topology data from the local network at multiple points in time (e.g., on a periodic basis) and based on the topology data, periodically modify or replace models to improve accuracy of the models, improve confidence levels of the results (e.g. predictions) generated by the models, and/or to improve performance of the local network (see Kasaragod [0185]). Additionally, the system of Kasaragod can be used for many types of predictions (see Kasaragod [0033]), including network conditions and performance (see Kasaragod [0049]). Regarding claims 2, 11, and 17, claims 2, 11, and 17 contain substantially similar limitations to those found in claim 26. Consequently, claims 2, 11, and 17 are rejected for the same reasons. 
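For illustration only, the update cycle relied upon from Kasaragod [0028] (a hub device or provider network receives data collected by edge devices, generates updates to their local models, and deploys those updates) can be sketched as follows. This is a minimal sketch under stated assumptions, not code from Kasaragod: the class `EdgeDevice`, the function `update_cycle`, and the use of a single numeric threshold as the "local model" are hypothetical.

```python
from statistics import fmean
from typing import Dict, List


class EdgeDevice:
    """Hypothetical edge device whose local model is a single threshold."""

    def __init__(self, name: str, threshold: float) -> None:
        self.name = name
        self.threshold = threshold
        self.collected: List[float] = []

    def collect(self, sample: float) -> None:
        """Store one locally collected data sample."""
        self.collected.append(sample)

    def apply_update(self, new_threshold: float) -> None:
        """Apply a deployed model update and discard the data already sent."""
        self.threshold = new_threshold
        self.collected.clear()


def update_cycle(devices: List[EdgeDevice]) -> Dict[str, float]:
    """Hub side: receive collected data, derive an update, deploy it.

    Assumption: the "training" step is just the mean of the collected
    samples; a real system would train an actual model here.
    """
    deployed: Dict[str, float] = {}
    for device in devices:
        if not device.collected:
            continue  # no new data, so the current local model is kept
        new_threshold = fmean(device.collected)
        device.apply_update(new_threshold)
        deployed[device.name] = new_threshold
    return deployed


dev_a = EdgeDevice("a", threshold=1.0)
dev_b = EdgeDevice("b", threshold=5.0)
dev_a.collect(2.0)
dev_a.collect(4.0)
result = update_cycle([dev_a, dev_b])
```

Only the device that reported new data receives an update in this sketch; the other device keeps its current local model.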
Regarding claim 3, Johnsson in view of Li teaches all the limitations of claim 1, further comprising: wherein a respective first and second leaf network node is a respective first and second communications device connected to an intermediate network node being a radio network node, wherein the method further comprises: by means of the radio network node, performing a negotiation process when the respective first and second communications devices have conflicting one or more objectives (Johnsson Figs. 1-10; [0054], in case a local node is not powerful enough to process its own data, then it can share its data with a neighbouring local node so that neighbouring node may act and process this data on the behalf of the less powerful node (i.e. building local model, doing local predictions and transmitting these predictions to the master node); [0078], The master node will use the prediction from the local node, possible together with prediction(s) from other local node(s), in order to determine one or more metrics associated with the communication network; [0084], Different predictions may be more or less accurate. In an example, the local node may itself determine the accuracy of a newly performed prediction. The local node may itself also determine if a newly performed prediction deviates more than to a certain extent from previously performed predictions. 
Depending on the determined accuracy, the local node may determine a weight parameter associated with itself; [0085], The local node may send also the weight parameter to the master node, wherein the master node may adjust the local reporting policy for the local node based on either one of or both the sent prediction and the sent determined weight parameter; [0087], the master node receiving 330 the prediction(s) from local nodes in the communication network, determining weight parameter(s) associated with the local nodes based on the received prediction(s) and previously received predictions, and adjusting a respective local reporting policy for one or more local nodes based on the determined weight parameter(s). The method 300 also comprises the one or more local nodes receiving 340 the local reporting policy from the master node in the communication network, the local reporting policy informing the local node of how to send prediction(s) to the master node. Based on the received reporting policy from the master node, the method comprises the one or more local nodes building 350 a local model based on locally available data, performing a prediction based on the local model, and transmitting the prediction to the master node in accordance with the received local reporting policy; [0096], The way the typical distributed learning algorithm operates (e.g. Winnow) will now be described. From figure 4e, local service predictions from AAs at these machines are first executed and local prediction results are then sent to the CAA (at Mo). CAA updates the associated weights for the different server machines or nodes based on the prediction results (previous steps need to be executed whenever weights needs to be updated at CAA). After the previous steps, the fusion step is executed by the CAA (e.g. 
weighted majority algorithm) so as to compute the final prediction; [0098-0100], Defined control signals are sent from the CAA (at the master node) to the AAs (at the local nodes) to control their participation behaviour in the distributed learning algorithm dynamically under changing workload and resource availability; [0124], the master node 500, 600 is configured for determining a global model for predicting one or more metrics associated with the communication network based on the received prediction(s) from the local nodes and the determined weight parameter(s); [0126], According to a further embodiment, the master node 500, 600 is configured for adjusting the respective local reporting policy by deactivating a local node having a determined weight parameter not meeting a first threshold, wherein the local node will stop to send predictions to the master node; [0128], According to an embodiment, the master node 500, 600 is configured for adjusting the respective local reporting policy by changing the frequency with which a local node sends predictions to the master node depending on the determined weight parameter(s) of the local node; [0129], According to yet an embodiment, the master node 500, 600 is configured for adjusting the respective local reporting policy by changing the frequency with which a local node sends predictions to the master node depending on (i) prediction accuracy of the received prediction(s) from the local nodes, and/or (ii) the determined weight parameter(s); [0139], According to an embodiment, the received reporting policy comprises information instructing the local node to activate its predictions, deactivate its predictions or changing the frequency with which the local node transmits predictions to the master node). 
However, Johnsson in view of Li fails to expressly disclose by means of the radio network node, performing a negotiation process when the respective first and second communications devices have conflicting one or more objectives and updating the respective first and second communications devices' machine learning model based on a result of the negotiation process. In the same field of endeavor, Kasaragod teaches: by means of the radio network node, performing a negotiation process when the respective first and second communications devices have conflicting one or more objectives and updating the respective first and second communications devices' machine learning model based on a result of the negotiation process (Kasaragod Figs. 1-19; [0027], a remote provider network and a local network device (e.g., hub device or edge device) may be used to generate a split prediction. For example, a hub device of a local network may receive data from sensors and process the data using a local model (e.g., data processing model) to generate a local prediction. The sensor data may also be transmitted to a remote provider network, which returns another prediction. If the returned prediction from the provider network is more accurate, then the local prediction may be corrected by the returned prediction; [0028] In some embodiments, a provider network and/or a hub device may update local data models of edge devices based on data collected by the edge devices. For example, a provider network and/or a hub device may periodically receive data from edge devices and generate new updates to local models based on the new data. The provider network and/or a hub device may then deploy the updates to the respective edge devices. 
In embodiments, entirely new versions of the local models are deployed to replace current models of the respective edge devices; [0029-0030], multiple models may be implemented across multiple respective edge devices (e.g., tier devices) of a network; [0035], A hub device 100, a provider network 102, local network 108, edge devices 106; [0036], multiple hub devices may be used as redundant hub devices; [0038], the hub device 100 includes a local model 108 that may receive data from one or more edge devices 106 and process the received data; [0039], the provider network 102 includes a data processing service 116 that includes a model 118 that receives the data from the hub device 112 and processes the received data; [0049], data collector 122 may be a sensor or other device that detects performance and/or other operational aspects of the network (e.g., network bandwidth or traffic, power consumption of a data source device, etc.) and generates data based on the detected performance. Thus, the generated data may indicate performance or other operational aspects of the local network and/or the edge device 106. In embodiments, the generated data may be sent to the hub device 100; [0063], FIG. 4 illustrates a system for implementing a split prediction based on a local model of an edge device and a provider network, according to some embodiments. The edge device 400 includes a result manager 112, a local model 108, and one or more data collectors 122; [0065], the data processing service 116 and/or the model 118 receives the data sent from the hub device 112 and processes the received data; [0088], FIG. 6 illustrates a system for updating models for edge devices by a provider network, according to some embodiments. 
In the depicted embodiment, the edge devices 600 are connected to a local network 104; [0090], The model training service 604 may then generate a local model update 606a to the local model 602a based on the analysis of the data 608a and generate a local model update 606n to the local model 602n based on the analysis of the data 608n; [0091], the local model update 606a is configured to update the local model 602a and the local model update 606n is configured to update the local model 602n; [0092], the model training service 604 may deploy the local model updates 606a, 606n to the local network 104; [0093], instead of just modifying an existing local model, in some cases it is replaced by a different model that is a more recent version; [0095], the edge device 600 may also send the data received from the data collector 122 to the model training service 604. The edge device 600 may then receive another local model update 606 from the model training service, wherein the local model update 606 is based on the data received from the data collector 122; [0114], FIG. 9 illustrates a system for updating models for edge devices by a hub device; [0115-0116], the model trainer 902 of the hub device 100 may receive the data 608 from one or more of the edge devices 600; [0117-0118], In response to the generating of the local model updates 606a, 606n, the model trainer 902 may deploy the local model updates 606a, 606n to the respective edge device 600a, 600n; [0119], The model trainer may then receive one or more local model updates 606 from the provider network 102. 
The local model updates 606 may then be deployed to one or more respective edge devices 600; [0121], the model trainer may generate a given local model based on topology data or any other data received from a corresponding edge device that will be implementing the local model; [0130], the model training service of the provider network and/or the model trainer may obtain one or more indications of the state of one or more edge devices. The indications may be used to optimize and/or generate respective local model updates that are sent to the edge device and applied to the local model to update the local model; the indications may include reliability of a connection for an edge device, an amount of free memory available at an edge device, an amount of non-volatile storage available at an edge device, a health of an edge device (e.g., with respect to a previous health state or with respect to other edge devices of the local network), and any other suitable indication of state of an edge device, where the state may affect how the local model update is optimized and/or generated; [0132], the edge device may receive updates to one or more of its local models 602 and/or may receive the local models 602 as deployed models in the same way or similar way as described for the figures above; [0133], the local models 602 are different from each other (e.g., perform one or more different operations than each other for given input data) and the different local models 602 are configured to process different data received at different times by the at least one edge device; [0141], In some embodiments, the tier device 1302a behaves in a way similar to a hub device as described above; [0151], an edge device or tier device may generate a prediction based on processing of the data 1312 using the model of the edge device or the tier device. The tier manager may then determine whether a confidence level of the prediction is below a threshold confidence level. 
If so, then the tier manager may send the data to a tier device (or another tier device) for processing by a model of the tier device (or other tier device)).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have incorporated, by means of the radio network node, performing a negotiation process when the respective first and second communications devices have conflicting one or more objectives and updating the respective first and second communications devices' machine learning model based on a result of the negotiation process as suggested in Kasaragod into Johnsson in view of Li. Doing so would be desirable because some IoT devices are powerful enough to implement a relatively simple data processing model to analyze data and generate a result, such as a prediction. However, the reliability of such a prediction may not be as good as the reliability of a larger model running on a more powerful computing device. For example, a large model implemented by a service provider network or a server computer may use hundreds of millions of parameters, whereas a model running on an IoT device may use only a few hundred thousand. Moreover, the amount and the type of data received by a model at a given IoT device may change over time (see Kasaragod [0003]). The systems and methods described herein implement techniques for configuring local networks of internet-connectable devices (e.g., IoT devices) to implement data processing models to rapidly generate local results (e.g., local predictions), while also taking advantage of larger, more accurate data processing models running on more powerful devices (e.g., servers of a provider network) that generate more accurate results based on the same data (see Kasaragod [0026]). By using the above techniques, an edge device may update the local model by applying the update 614 to improve accuracy of the local model. 
Then, another local model update 606 may be received from the model training service and the edge device may apply the local model update 606 to improve the accuracy of the local model even more (see Kasaragod [0100]). The provider network and/or hub device may obtain topology data from the local network at multiple points in time (e.g., on a periodic basis) and based on the topology data, periodically modify or replace models to improve accuracy of the models, improve confidence levels of the results (e.g. predictions) generated by the models, and/or to improve performance of the local network (see Kasaragod [0185]). Additionally, the system of Kasaragod can be used for many types of predictions (see Kasaragod [0033]), including network conditions and performance (see Kasaragod [0049]).

Regarding claims 12, 18, and 27, claims 12, 18, and 27 contain substantially similar limitations to those found in claim 3. Consequently, claims 12, 18, and 27 are rejected for the same reasons.

Claims 6 and 7 are rejected under 35 U.S.C. 103 as being unpatentable over Johnsson in view of Li in further view of Kasaragod et al. (US 20190037040 A1, published 01/31/2019), hereinafter Kasaragod, in further view of Burges et al. (US 20070094171 A1, published 04/26/2007), hereinafter Burges.

Regarding claim 6, Johnsson in view of Li teaches all the limitations of claim 1, further comprising: wherein the machine learning model is a representation of the at least one network node and of one or more network nodes communicatively connected to the at least one network node, wherein the method further comprises: by means of the machine learning unit, training the machine learning model based on one or more known input data and on one or more known output data relating to a result of an operation of the at least one network node with the known input data, wherein each one of the one or more known output data corresponds to a respective one of the one or more known input data (Johnsson Figs. 
1-10; [0054], The metrics associated with the communication network may relate to performance, anomalies and other information relative to current circumstances of the communication network. The training data may consists of both measurement data (X) from e.g. sensors associated with local nodes and actual, true or measured values (Y) that the local model shall be learned to predict later. In the prediction phase there are only X, while the Y will be predicted; [0078], The master node will use the prediction from the local node, possible together with prediction(s) from other local node(s), in order to determine one or more metrics associated with the communication network; [0084], Different predictions may be more or less accurate. In an example, the local node may itself determine the accuracy of a newly performed prediction. The local node may itself also determine if a newly performed prediction deviates more than to a certain extent from previously performed predictions. Depending on the determined accuracy, the local node may determine a weight parameter associated with itself; [0085], The local node may send also the weight parameter to the master node, wherein the master node may adjust the local reporting policy for the local node based on either one of or both the sent prediction and the sent determined weight parameter; [0087], the master node receiving 330 the prediction(s) from local nodes in the communication network, determining weight parameter(s) associated with the local nodes based on the received prediction(s) and previously received predictions, and adjusting a respective local reporting policy for one or more local nodes based on the determined weight parameter(s); [0096], The way the typical distributed learning algorithm operates (e.g. Winnow) will now be described. From figure 4e, local service predictions from AAs at these machines are first executed and local prediction results are then sent to the CAA (at Mo). 
CAA updates the associated weights for the different server machines or nodes based on the prediction results (previous steps need to be executed whenever weights needs to be updated at CAA). After the previous steps, the fusion step is executed by the CAA (e.g. weighted majority algorithm) so as to compute the final prediction; [0124], the master node 500, 600 is configured for determining a global model for predicting one or more metrics associated with the communication network based on the received prediction(s) from the local nodes and the determined weight parameter(s); see also [0098-0100], [0126], [0128-0129], [0139]). However, Johnsson in view of Li fails to expressly disclose wherein the machine learning model comprises an input layer, an output layer and one or more hidden layers, each of the input layer, an output layer and one or more hidden layers comprising one or more artificial neurons linked to one or more other artificial neurons of one of a same layer and of another layer. In the same field of endeavor, Kasaragod teaches: wherein the machine learning model comprises an input layer, an output layer and one or more hidden layers, each of the input layer, an output layer and one or more hidden layers comprising one or more artificial neurons linked to one or more other artificial neurons of one of a same layer and of another layer (Kasaragod Figs. 1-19; [0027], a remote provider network and a local network device (e.g., hub device or edge device) may be used to generate a split prediction. For example, a hub device of a local network may receive data from sensors and process the data using a local model (e.g., data processing model) to generate a local prediction. The sensor data may also be transmitted to a remote provider network, which returns another prediction. 
If the returned prediction from the provider network is more accurate, then the local prediction may be corrected by the returned prediction; [0028], a provider network and/or a hub device may update local data models of edge devices based on data collected by the edge devices. For example, a provider network and/or a hub device may periodically receive data from edge devices and generate new updates to local models based on the new data. The provider network and/or a hub device may then deploy the updates to the respective edge devices. In embodiments, entirely new versions of the local models are deployed to replace current models of the respective edge devices; [0029-0030], as used herein, a model may be any data processing model suitable for processing input data to generate one or more results. For example, a model may include a neural network, deep neural network, static or dynamic neural network; multiple models may be implemented across multiple respective edge devices (e.g., tier devices) of a network; [0035], A hub device 100, a provider network 102, local network 108, edge devices 106; [0036], multiple hub devices may be used as redundant hub devices; [0038], the hub device 100 includes a local model 108 that may receive data from one or more edge devices 106 and process the received data; [0039], the provider network 102 includes a data processing service 116 that includes a model 118 that receives the data from the hub device 112 and processes the received data; [0049], data collector 122 may be a sensor or other device that detects performance and/or other operational aspects of the network (e.g., network bandwidth or traffic, power consumption of a data source device, etc.) and generates data based on the detected performance. Thus, the generated data may indicate performance or other operational aspects of the local network and/or the edge device 106. In embodiments, the generated data may be sent to the hub device 100; [0063], FIG. 
4 illustrates a system for implementing a split prediction based on a local model of an edge device and a provider network, according to some embodiments. The edge device 400 includes a result manager 112, a local model 108, and one or more data collectors 122; [0065], the data processing service 116 and/or the model 118 receives the data sent from the hub device 112 and processes the received data; [0088], FIG. 6 illustrates a system for updating models for edge devices by a provider network, according to some embodiments. In the depicted embodiment, the edge devices 600 are connected to a local network 104; [0090], The model training service 604 may then generate a local model update 606a to the local model 602a based on the analysis of the data 608a and generate a local model update 606n to the local model 602n based on the analysis of the data 608n; [0091], the local model update 606a is configured to update the local model 602a and the local model update 606n is configured to update the local model 602n; [0092], the model training service 604 may deploy the local model updates 606a, 606n to the local network 104; [0093], instead of just modifying an existing local model, in some cases it is replaced by a different model that is a more recent version; [0095], the edge device 600 may also send the data received from the data collector 122 to the model training service 604. The edge device 600 may then receive another local model update 606 from the model training service, wherein the local model update 606 is based on the data received from the data collector 122; [0130], the model training service of the provider network and/or the model trainer may obtain one or more indications of the state of one or more edge devices. 
The indications may be used to optimize and/or generate respective local model updates that are sent to the edge device and applied to the local model to update the local model; the indications may include reliability of a connection for an edge device, an amount of free memory available at an edge device, an amount of non-volatile storage available at an edge device, a health of an edge device (e.g., with respect to a previous health state or with respect to other edge devices of the local network), and any other suitable indication of state of an edge device, where the state may affect how the local model update is optimized and/or generated; [0135], a model training service of a provider network, a model trainer of a hub device, and/or a local model trainer of the edge device may use ensemble methods to use different machine learning algorithms to generate different models (e.g., neural network, deep neural network, memory network, etc.); see also [0114-0121], [0132-0133]).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have incorporated wherein the machine learning model comprises an input layer, an output layer and one or more hidden layers, each of the input layer, an output layer and one or more hidden layers comprising one or more artificial neurons linked to one or more other artificial neurons of one of a same layer and of another layer as suggested in Kasaragod into Johnsson in view of Li. Doing so would be desirable because some IoT devices are powerful enough to implement a relatively simple data processing model to analyze data and generate a result, such as a prediction. However, the reliability of such a prediction may not be as good as the reliability of a larger model running on a more powerful computing device. 
For example, a large model implemented by a service provider network or a server computer may use hundreds of millions of parameters, whereas a model running on an IoT device may use only a few hundred thousand. Moreover, the amount and the type of data received by a model at a given IoT device may change over time (see Kasaragod [0003]). The systems and methods described herein implement techniques for configuring local networks of internet-connectable devices (e.g., IoT devices) to implement data processing models to rapidly generate local results (e.g., local predictions), while also taking advantage of larger, more accurate data processing models running on more powerful devices (e.g., servers of a provider network) that generate more accurate results based on the same data (see Kasaragod [0026]). By using the above techniques, an edge device may update the local model by applying the update 614 to improve accuracy of the local model. Then, another local model update 606 may be received from the model training service and the edge device may apply the local model update 606 to improve the accuracy of the local model even more (see Kasaragod [0100]). The provider network and/or hub device may obtain topology data from the local network at multiple points in time (e.g., on a periodic basis) and based on the topology data, periodically modify or replace models to improve accuracy of the models, improve confidence levels of the results (e.g. predictions) generated by the models, and/or to improve performance of the local network (see Kasaragod [0185]). Additionally, the system of Kasaragod can be used for many types of predictions (see Kasaragod [0033]), including network conditions and performance (see Kasaragod [0049]). 
However, Johnsson in view of Li in further view of Kasaragod fails to expressly disclose wherein each of the one or more artificial neurons has an activation function, an input weighting coefficient, a bias and an output weighting coefficient, and wherein weighting coefficients and the bias are changeable during training of the machine learning model. In the same field of endeavor, Burges teaches: wherein each of the one or more artificial neurons has an activation function, an input weighting coefficient, a bias and an output weighting coefficient, and wherein weighting coefficients and the bias are changeable during training of the machine learning model (Burges Figs. 1-12; [0059], The training of a learning system can be further explained by looking at a specific example. For example, the learning component 110 can include a neural network. Neural networks are commonly used for classification and regression tasks. A neural network is commonly organized as a multilayered, hierarchical arrangement of processing elements, also referred to as neurons, nodes or units. For the purposes of this disclosure, the terms neuron, node and unit will be used interchangeably. Each unit typically has one or more inputs and one output. Each input is typically weighted by some coefficient value. Each output of a unit is typically a result of processing its input value(s) in accordance with an activation function and any weight or bias applied; [0060], In a hierarchical arrangement of neurons in a neural network, the neurons are usually arranged into layers. The output of a neuron in one layer can be an input to one or more neurons in a successive layer. Layers may be exposed in the sense that either the inputs of neurons in that layer directly receive input from a data source external to the neural network or the outputs of neurons are the desired result of processing. 
Layers may also be hidden in the sense that the inputs of units in that layer are computed using the outputs of units in a previous or lower layer, and the outputs of units in a hidden layer feed inputs for units in a successive or higher layer. An exemplary neural network can include any suitable number of layers such as an input layer, an intermediate or hidden layer and an output layer; [0061], The use of a neural network typically involves a training phase and a testing phase. During the training phase, one of a preselected group of data patterns called the `training set` is presented to the network for classification. This process is often referred to as forward propagation. An objective of the training step is to minimize the cost function, thereby minimizing errors in the network. Results from the training are then used to adjust parameters of the network, such as weights or biases, in such a way that, if that pattern were presented for forward propagation again, the network would yield a lower cost. This adjustment process is referred to as backward propagation. Forward propagation and backward propagation are usually performed successively until the cost function, averaged over a suitable, second preselected group of data patterns called a `validation set`, is minimized; [0062], A test data set is presented to the network and the results of computation on that test set are evaluated and compared with a known ideal result. If that evaluation yields a result that is within an acceptable margin, the network is accepted for use; [0063], FIG. 5 is a system block diagram of a multi-layer neural network 500 that can be used to implement the learning component 110. The neural network 500 depicted includes an input layer 510, a hidden layer 520 and an output layer 530. 
Each layer includes one or more neurons 541, 542, 543 that each accept an input; process that input with respect to some predefined function and optional weight or bias; and provide an output; [0069], specific activation functions employed are largely a matter of implementation choice in any given application. It is possible for each and every unit in a neural network to have a unique activation function; [0070], Appropriate activation functions and thresholds are created or selected. Input data formats are defined. The number of units and layers is determined, along with interconnection topologies for those units and layers).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have incorporated wherein each of the one or more artificial neurons has an activation function, an input weighting coefficient, a bias and an output weighting coefficient, and wherein weighting coefficients and the bias are changeable during training of the machine learning model as suggested in Burges into Johnsson in view of Li in further view of Kasaragod. Doing so would be desirable because the amount of data available to information seekers has grown astronomically, whether as the result of the proliferation of information sources on the Internet, or as a result of private efforts to organize business information within a company, or any of a variety of other causes. As the amount of available data has grown, so has the need to be able to sort and locate relevant data (see Burges [0002]). Machine learning systems include such systems as neural networks, support vector machines ("SVMs") and perceptrons, among others. These systems can be used for a variety of data processing or analysis tasks, including, but not limited to, optical pattern and object recognition, control and feedback systems and text categorization. 
Other potential uses for machine learning systems include any application that can benefit from data classification or regression. Typically, the machine learning system is trained to improve performance (see Burges [0004]). Such machine learning systems are usually trained using a cost function, which the learning process attempts to minimize. Often, however, the cost functions of interest are not minimized directly, since this has presented too difficult a problem to solve (see Burges [0005]). Additionally, the system of Burges would improve the systems of Johnsson, Li, and Kasaragod by clarifying the composition and training of the machine learning system. Regarding claim 7, Johnsson in view of Li in further view of Kasaragod in further view of Burges teaches all the limitations of claim 6, further comprising: wherein the training of the machine learning model comprises: adjusting until the known output data is given as an output from the machine learning model when the corresponding known input data is given as an input to the machine learning model (Johnsson Figs. 1-10; [0054], The metrics associated with the communication network may relate to performance, anomalies and other information relative to current circumstances of the communication network. The training data may consists of both measurement data (X) from e.g. sensors associated with local nodes and actual, true or measured values (Y) that the local model shall be learned to predict later. In the prediction phase there are only X, while the Y will be predicted; [0078], The master node will use the prediction from the local node, possible together with prediction(s) from other local node(s), in order to determine one or more metrics associated with the communication network; [0084], Different predictions may be more or less accurate. In an example, the local node may itself determine the accuracy of a newly performed prediction. 
The local node may itself also determine if a newly performed prediction deviates more than to a certain extent from previously performed predictions. Depending on the determined accuracy, the local node may determine a weight parameter associated with itself; [0085], The local node may send also the weight parameter to the master node, wherein the master node may adjust the local reporting policy for the local node based on either one of or both the sent prediction and the sent determined weight parameter; [0087], the master node receiving 330 the prediction(s) from local nodes in the communication network, determining weight parameter(s) associated with the local nodes based on the received prediction(s) and previously received predictions, and adjusting a respective local reporting policy for one or more local nodes based on the determined weight parameter(s); [0096], The way the typical distributed learning algorithm operates (e.g. Winnow) will now be described. From figure 4e, local service predictions from AAs at these machines are first executed and local prediction results are then sent to the CAA (at Mo). CAA updates the associated weights for the different server machines or nodes based on the prediction results (previous steps need to be executed whenever weights needs to be updated at CAA). After the previous steps, the fusion step is executed by the CAA (e.g. weighted majority algorithm) so as to compute the final prediction; [0124], the master node 500, 600 is configured for determining a global model for predicting one or more metrics associated with the communication network based on the received prediction(s) from the local nodes and the determined weight parameter(s); see also [0098-0100], [0126], [0128-0129], [0139]). 
Burges further teaches: adjusting weighting coefficients and biases for one or more of the artificial neurons until the known output data is given as an output from the machine learning model when the corresponding known input data is given as an input to the machine learning model (Burges Figs. 1-12; [0030], the metric or cost function measures only if the top-returned answer is correct; [0059], The training of a learning system can be further explained by looking at a specific example. For example, the learning component 110 can include a neural network. Neural networks are commonly used for classification and regression tasks. A neural network is commonly organized as a multilayered, hierarchical arrangement of processing elements, also referred to as neurons, nodes or units. For the purposes of this disclosure, the terms neuron, node and unit will be used interchangeably. Each unit typically has one or more inputs and one output. Each input is typically weighted by some coefficient value. Each output of a unit is typically a result of processing its input value(s) in accordance with an activation function and any weight or bias applied; [0060-0061], The use of a neural network typically involves a training phase and a testing phase. During the training phase, one of a preselected group of data patterns called the `training set` is presented to the network for classification. This process is often referred to as forward propagation. An objective of the training step is to minimize the cost function, thereby minimizing errors in the network. Results from the training are then used to adjust parameters of the network, such as weights or biases, in such a way that, if that pattern were presented for forward propagation again, the network would yield a lower cost. This adjustment process is referred to as backward propagation. 
Forward propagation and backward propagation are usually performed successively until the cost function, averaged over a suitable, second preselected group of data patterns called a `validation set`, is minimized; [0062-0063], A test data set is presented to the network and the results of computation on that test set are evaluated and compared with a known ideal result. If that evaluation yields a result that is within an acceptable margin, the network is accepted for use; [0063], FIG. 5 is a system block diagram of a multi-layer neural network 500 that can be used to implement the learning component 110. The neural network 500 depicted includes an input layer 510, a hidden layer 520 and an output layer 530. Each layer includes one or more neurons 541, 542, 543 that each accept an input; process that input with respect to some predefined function and optional weight or bias; and provide an output; see also [0069-0070]).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have incorporated adjusting weighting coefficients and biases for one or more of the artificial neurons until the known output data is given as an output from the machine learning model when the corresponding known input data is given as an input to the machine learning model as suggested in Burges into Johnsson in view of Li in further view of Kasaragod. Doing so would be desirable because the amount of data available to information seekers has grown astronomically, whether as the result of the proliferation of information sources on the Internet, or as a result of private efforts to organize business information within a company, or any of a variety of other causes. As the amount of available data has grown, so has the need to be able to sort and locate relevant data (see Burges [0002]). Machine learning systems include such systems as neural networks, support vector machines ("SVMs") and perceptrons, among others. 
These systems can be used for a variety of data processing or analysis tasks, including, but not limited to, optical pattern and object recognition, control and feedback systems and text categorization. Other potential uses for machine learning systems include any application that can benefit from data classification or regression. Typically, the machine learning system is trained to improve performance (see Burges [0004]). Such machine learning systems are usually trained using a cost function, which the learning process attempts to minimize. Often, however, the cost functions of interest are not minimized directly, since this has presented too difficult a problem to solve (see Burges [0005]). Additionally, the system of Burges would improve the systems of Johnsson, Li, and Kasaragod by clarifying the composition and training of the machine learning system.

Response to Arguments

The Examiner acknowledges the Applicant’s amendments to claims 1, 10, 16, and 25. The rejection of the claims under 35 U.S.C. 101 for being directed to an abstract idea without significantly more is respectfully withdrawn.

Regarding independent claim 1, Applicant alleges that Johnsson, as described in the previous Office action, does not explicitly teach the prediction comprising one or more of a modulation and coding scheme (MCS) to use, which transmitter beam to use, and which receiver beam to use and the one or more operations comprising at least one of a change of transmit beam, a change of receive beam, and a change of MCS selection operation, based on the determined prediction of the performance of the at least one, as has been amended to the claim. Examiner has therefore rejected independent claim 1 under 35 U.S.C. § 103 as unpatentable over Johnsson in view of Li. Similar arguments have been presented for claims 10, 16, and 25 and thus, Applicant’s arguments are not persuasive for the same reasons. 
Applicant states that the dependent claims recite all the limitations of the independent claims, and thus, are allowable in view of the remarks set forth regarding the independent claims. However, as discussed above, Johnsson is considered to teach the independent claims, and consequently, the dependent claims are rejected.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Veijalainen (US 20220021469 A1): see Figs. 1-27 and [0045-0055], [0095], [0109].

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOHN T REPSHER III whose telephone number is (571)272-7487. The examiner can normally be reached Monday - Friday, 8AM-5PM EST. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jennifer Welch, can be reached at (571) 272-7212. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/JOHN T REPSHER III/
Primary Examiner, Art Unit 2143

Prosecution Timeline

Apr 24, 2024: Non-Final Rejection mailed (§101, §103)
Jul 17, 2024: Response Filed
Aug 13, 2024: Final Rejection mailed (§101, §103)
Nov 13, 2024: Request for Continued Examination
Nov 16, 2024: Response after Non-Final Action
Jun 03, 2025: Non-Final Rejection mailed (§101, §103)
Sep 02, 2025: Response Filed
Oct 01, 2025: Final Rejection mailed (§101, §103; current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12619342: COGNITIVE DETECTION OF USER INTERFACE ERRORS (4y 3m to grant; granted May 05, 2026)
Patent 12585984: POINT-OF-INTEREST RECOMMENDATION (4y 4m to grant; granted Mar 24, 2026)
Patent 12585929: Layered Gradient Accumulation and Modular Pipeline Parallelism for Improved Training of Machine Learning Models (4y 1m to grant; granted Mar 24, 2026)
Patent 12581159: METHOD AND APPARATUS FOR OPTIMIZING VIDEO PLAYBACK START, DEVICE AND STORAGE MEDIUM (4y 9m to grant; granted Mar 17, 2026)
Patent 12541282: LEARNING USER INTERFACE (2y 8m to grant; granted Feb 03, 2026)
Based on the 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 5-6
Grant Probability: 75%
With Interview (+29.0%): 99%
Median Time to Grant: 4y 4m (~0m remaining)
PTA Risk: High
Based on 339 resolved cases by this examiner. Grant probability derived from career allowance rate.
