Last updated: May 29, 2026
Application No. 18/298,949
MACHINE LEARNING TECHNIQUES FOR IMPLEMENTING TREE-BASED NETWORK CONGESTION CONTROL

Final Rejection: §101, §102, §103
Filed: Apr 11, 2023
Priority: Jun 29, 2022 (provisional 63/356,795)
Examiner: HOANG, AMY P
Art Unit: 2143
Tech Center: 2100 (Computer Architecture & Software)
Assignee: Nvidia Corporation
OA Round: 2 (Final)
Interview Optional

Examiner has a relatively high allowance rate (70%) and a +64.2% interview lift. A written response may suffice.
Based on 233 resolved cases, 2023–2026
Examiner Intelligence

HOANG, AMY P
Career Allowance Rate: 70% (164 granted / 233 resolved), above average, +15.4% vs TC avg
Interview Lift: +64.2% (strong), comparing resolved cases with and without an interview
Avg Prosecution (typical timeline): 3y 1m
Currently pending: 16
Total Applications (career history, across all art units): 264
Statute-Specific Performance

§101: 4.1% (-35.9% vs TC avg)
§103: 87.8% (+47.8% vs TC avg)
§102: 5.7% (-34.3% vs TC avg)
§112: 2.0% (-38.0% vs TC avg)
Tech Center averages are estimates. Based on career data from 233 resolved cases.
Office Action

DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. This action is responsive to the application filed on 04/11/2023. Claims 1-20 are presented in the case. Claims 1, 11 and 20 are independent claims.

Priority
Applicant's claim for the benefit of a prior-filed Provisional application No. 63/356,795 filed on June 29, 2022 is acknowledged.

Information Disclosure Statement
The information disclosure statement submitted on 07/17/2023 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 
Step 1: Claims 1-10 are directed to a method, claims 11-19 are directed to a medium and claim 20 is directed to a system. Therefore, the claims are eligible under Step 1 for being directed to a process, a manufacture and a machine respectively.
Independent claims 1, 11 and 20:
Step 2A Prong 1:  
Claims recite:
generating a first trained decision tree model based on an initial loss for an initial model relative to the training dataset - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of evaluating data and generating a model based on judgement, which is observing, evaluating and judging that is practically capable of being performed in the human mind with the assistance of pen and paper;
generating a final tree-based model based on the first trained decision tree model and at least a second trained decision tree model - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of evaluating data and generating a model based on judgement, which is observing, evaluating and judging that is practically capable of being performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2: This judicial exception is not integrated into a practical application. The claims recite the following additional elements:
A computer-implemented method for controlling congestion in data transmission networks; One or more non-transitory computer readable media including instructions that, when executed by one or more processors, cause the one or more processors to automatically control congestion in data transmission networks; A system comprising: one or more memories storing instructions; and one or more processors coupled to the one or more memories that, when executing the instructions - These limitations amount to components of a general purpose computer that applies a judicial exception, by use of conventional computer functions (see MPEP § 2106.05(b)).
executing a first trained neural network in conjunction with a simulated data transmission network to generate a training dataset, wherein the first trained neural network has been trained to control congestion in the simulated data transmission network - the step is recited at a high level of generality and amounts to no more than a recitation of the words "apply it" (or an equivalent), or no more than mere instructions to implement an abstract idea or other exception on a computer (see MPEP § 2106.05(f)).
executing the final tree-based model in conjunction with a first data transmission network to control congestion within the first data transmission network - the step is recited at a high level of generality and amounts to no more than a recitation of the words "apply it" (or an equivalent), or no more than mere instructions to implement an abstract idea or other exception on a computer (see MPEP § 2106.05(f)).
Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claims are thus directed to the abstract idea.
Step 2B:  The claims do not include additional elements that amount to significantly more than the judicial exception.
The additional elements:
A computer-implemented method for controlling congestion in data transmission networks; One or more non-transitory computer readable media including instructions that, when executed by one or more processors, cause the one or more processors to automatically control congestion in data transmission networks; A system comprising: one or more memories storing instructions; and one or more processors coupled to the one or more memories that, when executing the instructions - These limitations amount to components of a general purpose computer that applies a judicial exception, by use of conventional computer functions (see MPEP § 2106.05(b)).
executing a first trained neural network in conjunction with a simulated data transmission network to generate a training dataset, wherein the first trained neural network has been trained to control congestion in the simulated data transmission network - the step is recited at a high level of generality and amounts to no more than a recitation of the words "apply it" (or an equivalent), or no more than mere instructions to implement an abstract idea or other exception on a computer (see MPEP § 2106.05(f)).
executing the final tree-based model in conjunction with a first data transmission network to control congestion within the first data transmission network - the step is recited at a high level of generality and amounts to no more than a recitation of the words "apply it" (or an equivalent), or no more than mere instructions to implement an abstract idea or other exception on a computer (see MPEP § 2106.05(f)).
Accordingly, these additional elements do not amount to significantly more than the judicial exception. As such, the claims are ineligible.
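For illustration only, the data-generation step recited in the independent claims (executing a trained network against a simulated data transmission network to produce a training dataset) might look like the sketch below. Everything in it is an assumption and is not drawn from the application or the cited art: `teacher_policy` is a hand-written stand-in for the trained neural network, `simulated_delay_us` is a toy single-link model, and all constants are arbitrary.

```python
import random
from typing import List, Tuple

# One sample maps a feature vector (delay_us, rate_gbps) to a rate multiplier.
Sample = Tuple[Tuple[float, float], float]


def teacher_policy(delay_us: float) -> float:
    """Hypothetical stand-in for the trained neural network.

    Returns a multiplicative change to the flow's transmission rate,
    clamped to [0.5, 1.5]. A real system would run model inference here.
    """
    target_delay_us = 20.0  # arbitrary illustrative target
    return max(0.5, min(1.5, target_delay_us / max(delay_us, 1.0)))


def simulated_delay_us(rate_gbps: float, capacity_gbps: float = 100.0) -> float:
    """Toy link model: delay rises linearly once the rate exceeds capacity."""
    base_delay_us = 10.0
    return base_delay_us + 0.5 * max(0.0, rate_gbps - capacity_gbps)


def generate_training_dataset(steps: int, seed: int = 0) -> List[Sample]:
    """Run the teacher in the toy simulator and record (features, action) pairs."""
    rng = random.Random(seed)
    rate = 50.0
    dataset: List[Sample] = []
    for _ in range(steps):
        delay = simulated_delay_us(rate) + rng.uniform(0.0, 2.0)  # measurement noise
        action = teacher_policy(delay)
        dataset.append(((delay, rate), action))
        rate = min(400.0, rate * action)  # apply the teacher's decision
    return dataset


dataset = generate_training_dataset(steps=200)
```

The recorded pairs are the kind of feature-to-modification mapping that a tree-based model could later be fitted to.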
Dependent claims 2 and 12:
Step 2A Prong 1:  
Claims recite:
wherein generating the final tree-based model comprises constructing a combination of the first trained decision tree model and the at least the second trained decision tree model - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of evaluating data and generating a model based on judgement, which is observing, evaluating and judging that is practically capable of being performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2 & Step 2B: No additional elements are recited, so the claims do not integrate the judicial exception into a practical application and do not amount to significantly more. As such, the claims are ineligible.
Dependent claims 3 and 13:
Step 2A Prong 1: The claim recites the abstract ideas of claims 1 and 11.
Step 2A Prong 2: This judicial exception is not integrated into a practical application. The claims recite the following additional elements:
executing the initial model on a plurality of feature vectors included in the training dataset to generate a plurality of predicted outputs - the step is recited at a high level of generality and amounts to no more than a recitation of the words "apply it" (or an equivalent), or no more than mere instructions to implement an abstract idea or other exception on a computer (see MPEP § 2106.05(f)).
training a decision tree model to predict a negative gradient of a loss function with respect to the plurality of predicted outputs - the step is recited at a high level of generality and amounts to no more than a recitation of the words "apply it" (or an equivalent), or no more than mere instructions to implement an abstract idea or other exception on a computer (see MPEP § 2106.05(f)).
Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claims are thus directed to the abstract idea.
Step 2B:  The claims do not include additional elements that amount to significantly more than the judicial exception.
The additional elements:
executing the initial model on a plurality of feature vectors included in the training dataset to generate a plurality of predicted outputs - the step is recited at a high level of generality and amounts to no more than a recitation of the words "apply it" (or an equivalent), or no more than mere instructions to implement an abstract idea or other exception on a computer (see MPEP § 2106.05(f)).
training a decision tree model to predict a negative gradient of a loss function with respect to the plurality of predicted outputs - the step is recited at a high level of generality and amounts to no more than a recitation of the words "apply it" (or an equivalent), or no more than mere instructions to implement an abstract idea or other exception on a computer (see MPEP § 2106.05(f)).
Accordingly, these additional elements do not amount to significantly more than the judicial exception. As such, the claims are ineligible.
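As a point of reference for the limitations of claims 3 and 13, one gradient-boosting step can be sketched as follows. This is a minimal illustration that assumes a squared-error loss (the claims do not name a loss function), a single input feature, and a depth-1 tree. The names `negative_gradient` and `fit_stump` and the sample data are hypothetical.

```python
from typing import Callable, List, Sequence


def negative_gradient(targets: Sequence[float], predictions: Sequence[float]) -> List[float]:
    """For L = 0.5 * (y - F)^2, the negative gradient -dL/dF is the residual y - F."""
    return [y - f for y, f in zip(targets, predictions)]


def fit_stump(xs: Sequence[float], values: Sequence[float]) -> Callable[[float], float]:
    """Fit a depth-1 regression tree on one feature by least squares."""
    candidates = sorted(set(xs))
    best = None  # (sse, threshold, left_mean, right_mean)
    for lo, hi in zip(candidates, candidates[1:]):
        threshold = (lo + hi) / 2.0
        left = [v for x, v in zip(xs, values) if x <= threshold]
        right = [v for x, v in zip(xs, values) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((v - left_mean) ** 2 for v in left) + sum((v - right_mean) ** 2 for v in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # every input is identical, so no split is possible
        mean = sum(values) / len(values)
        return lambda x: mean
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


# Hypothetical training data: one feature per sample and a target output.
delays = [1.0, 2.0, 3.0, 4.0]
rate_changes = [1.0, 1.0, 3.0, 3.0]

# The initial model predicts the mean target for every sample.
initial = sum(rate_changes) / len(rate_changes)
initial_predictions = [initial] * len(delays)

# Train a tree on the negative gradient of the loss at those predictions.
residuals = negative_gradient(rate_changes, initial_predictions)
first_tree = fit_stump(delays, residuals)
updated = [p + first_tree(x) for p, x in zip(initial_predictions, delays)]
```

With a different loss function the residual would be replaced by that loss's own negative gradient, while the tree-fitting step stays the same.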
Dependent claims 4 and 14:
Step 2A Prong 1:  
Claims recite:
while training the decision tree model, determining that a tree depth of the decision tree model is equal to a maximum tree depth - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical concept of a mathematical relationship by comparing values to determine if a tree depth of the decision tree model is equal to a maximum tree depth; and
designating the decision tree model as the first trained decision tree model - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of evaluating data and selecting data based on judgement, which is observing, evaluating and judging that is practically capable of being performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2 & Step 2B: No additional elements are recited, so the claims do not integrate the judicial exception into a practical application and do not amount to significantly more. As such, the claims are ineligible.
Dependent claims 5 and 15:
Step 2A Prong 1:  
Claims recite:
determining that a total number of trained decision tree models included in the final tree-based model is equal to a maximum number of trees - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical concept of a mathematical relationship by comparing values to determine if a total number of trained decision tree models is equal to a maximum number of trees;
designating the final tree-based model as a trained tree-based model - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of evaluating data and selecting data based on judgement, which is observing, evaluating and judging that is practically capable of being performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2 & Step 2B: No additional elements are recited, so the claims do not integrate the judicial exception into a practical application and do not amount to significantly more. As such, the claims are ineligible.
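The two stopping checks recited in claims 4-5 and 14-15 (tree depth equal to a maximum tree depth, and tree count equal to a maximum number of trees) can be pictured with the sketch below. It is an assumption-laden illustration, not the applicant's implementation: it uses squared-error residuals, a single feature, and hypothetical names (`build_tree`, `boost`, `max_depth`, `max_trees`).

```python
from typing import List, Sequence, Tuple


def build_tree(xs: Sequence[float], values: Sequence[float], depth: int, max_depth: int) -> tuple:
    """Grow a regression tree; stop splitting once depth equals max_depth."""
    mean = sum(values) / len(values)
    candidates = sorted(set(xs))
    if depth == max_depth or len(candidates) < 2:
        return ("leaf", mean)
    best = None  # (sse, threshold, left_pairs, right_pairs)
    for lo, hi in zip(candidates, candidates[1:]):
        threshold = (lo + hi) / 2.0
        left = [(x, v) for x, v in zip(xs, values) if x <= threshold]
        right = [(x, v) for x, v in zip(xs, values) if x > threshold]
        lm = sum(v for _, v in left) / len(left)
        rm = sum(v for _, v in right) / len(right)
        sse = sum((v - lm) ** 2 for _, v in left) + sum((v - rm) ** 2 for _, v in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left, right)
    _, threshold, left, right = best
    return (
        "split",
        threshold,
        build_tree([x for x, _ in left], [v for _, v in left], depth + 1, max_depth),
        build_tree([x for x, _ in right], [v for _, v in right], depth + 1, max_depth),
    )


def predict_tree(node: tuple, x: float) -> float:
    while node[0] == "split":
        node = node[2] if x <= node[1] else node[3]
    return node[1]


def tree_depth(node: tuple) -> int:
    return 0 if node[0] == "leaf" else 1 + max(tree_depth(node[2]), tree_depth(node[3]))


def boost(xs: Sequence[float], ys: Sequence[float], max_trees: int, max_depth: int) -> Tuple[float, List[tuple]]:
    """Add trees fitted to residuals until the tree count equals max_trees."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(xs)
    trees: List[tuple] = []
    while len(trees) != max_trees:
        residuals = [y - p for y, p in zip(ys, predictions)]
        tree = build_tree(xs, residuals, 0, max_depth)
        trees.append(tree)
        predictions = [p + predict_tree(tree, x) for p, x in zip(predictions, xs)]
    return base, trees


def predict(base: float, trees: List[tuple], x: float) -> float:
    return base + sum(predict_tree(t, x) for t in trees)


base, trees = boost([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 3.0, 3.0], max_trees=3, max_depth=1)
```

Both limits are plain equality comparisons on counters, which is the feature the rejection characterizes as a mathematical relationship.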
Dependent claim 6:
Step 2A Prong 1: The claim recites the abstract ideas of claim 1.
Step 2A Prong 2: This judicial exception is not integrated into a practical application. The claim recites the following additional element:
wherein the first data transmission network comprises a remote direct memory access network - the limitation is recited at a high level of generality and amounts to selecting a particular data source or type of data to be manipulated, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).
Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is thus directed to the abstract idea.
Step 2B:  The claims do not include additional elements that amount to significantly more than the judicial exception.
The additional elements:
wherein the first data transmission network comprises a remote direct memory access network - viewed individually or in combination, describes selecting a particular data source or type of data to be manipulated similar to selecting information, based on types of information and availability of information in a power-grid environment, for collection, analysis and display described in MPEP § 2106.05(g).
Accordingly, these additional elements do not amount to significantly more than the judicial exception. As such, the claims are ineligible.
Dependent claims 7 and 17:
Step 2A Prong 1:  
Claims recite:
wherein executing the final tree-based model in conjunction with the first data transmission network to control congestion comprises: computing, via the final tree-based model, a first modification to be made to a network flow included in the first data transmission network based on at least one of a delay measurement, a latency measurement, or a transmission rate of the network flow; and modifying the network flow based on the first modification - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical concept, namely a mathematical calculation that determines a first modification to be made to a network flow in order to update the network flow.
Step 2A Prong 2 & Step 2B: No additional elements are recited, so the claims do not integrate the judicial exception into a practical application and do not amount to significantly more. As such, the claims are ineligible.
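To make the two sub-steps of claims 7 and 17 concrete (computing a modification from flow measurements, then modifying the flow), a hedged sketch follows. The `Stump` class, the additive ensemble, the multiplicative form of the modification, and the rate bounds are all illustrative assumptions. Neither the claims nor the cited reference specifies them.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Stump:
    """Hypothetical depth-1 tree over a feature vector (delay_us, rate_gbps)."""
    feature: int
    threshold: float
    left: float
    right: float

    def predict(self, features: Sequence[float]) -> float:
        return self.left if features[self.feature] <= self.threshold else self.right


def compute_modification(base: float, stumps: Sequence[Stump], features: Sequence[float]) -> float:
    """Sum the ensemble's outputs to obtain a rate multiplier for the flow."""
    return base + sum(s.predict(features) for s in stumps)


def modify_flow(rate_gbps: float, multiplier: float, min_rate: float = 1.0, max_rate: float = 400.0) -> float:
    """Apply the multiplier to the flow's transmission rate, within assumed bounds."""
    return max(min_rate, min(max_rate, rate_gbps * multiplier))


# Illustrative ensemble: back off on high delay and on high current rate.
ensemble = [
    Stump(feature=0, threshold=20.0, left=0.25, right=-0.25),
    Stump(feature=1, threshold=80.0, left=0.125, right=-0.125),
]

measurement = (35.0, 90.0)  # (delay in microseconds, current rate in Gb/s)
multiplier = compute_modification(1.0, ensemble, measurement)
new_rate = modify_flow(measurement[1], multiplier)
```

In this example the delay and rate both exceed their thresholds, so the ensemble yields a multiplier below one and the flow's rate is reduced.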
Dependent claim 8:
Step 2A Prong 1:  
Claims recite:
wherein the training dataset includes a mapping from a first feature vector associated with a network flow included in the simulated data transmission network to a first modification to be made to the network flow - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of evaluating data and mapping data based on judgement, which is observing, evaluating and judging that is practically capable of being performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2 & Step 2B: No additional elements are recited, so the claim does not integrate the judicial exception into a practical application and does not amount to significantly more. As such, the claim is ineligible.
Dependent claims 9 and 19:
Step 2A Prong 1: The claim recites the abstract ideas of claims 1 and 11.
Step 2A Prong 2: This judicial exception is not integrated into a practical application. The claims recite the following additional element:
further comprising modifying a transmission rate of the network flow in accordance with the first modification - the step is recited at a high level of generality and amounts to mere data modification, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).
Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claims are thus directed to the abstract idea.
Step 2B:  The claims do not include additional elements that amount to significantly more than the judicial exception.
The additional elements:
further comprising modifying a transmission rate of the network flow in accordance with the first modification - the step is recited at a high level of generality and amounts to mere data modification, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).
Accordingly, these additional elements do not amount to significantly more than the judicial exception. As such, the claims are ineligible.
Dependent claim 10:
Step 2A Prong 1: The claim recites the abstract ideas of claim 1.
Step 2A Prong 2: This judicial exception is not integrated into a practical application. The claim recites the following additional element:
wherein the first feature vector comprises at least one of a delay measurement, a latency measurement, or a transmission rate associated with the network flow - the limitation is recited at a high level of generality and amounts to selecting a particular data source or type of data to be manipulated, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).
Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is thus directed to the abstract idea.
Step 2B:  The claims do not include additional elements that amount to significantly more than the judicial exception.
The additional elements:
wherein the first feature vector comprises at least one of a delay measurement, a latency measurement, or a transmission rate associated with the network flow - viewed individually or in combination, describes selecting a particular data source or type of data to be manipulated similar to selecting information, based on types of information and availability of information in a power-grid environment, for collection, analysis and display described in MPEP § 2106.05(g).
Accordingly, these additional elements do not amount to significantly more than the judicial exception. As such, the claims are ineligible.
Dependent claim 18:
Step 2A Prong 1:  
Claims recite:
wherein the training dataset includes a mapping from at least one of a delay measurement, a latency measurement, or a transmission rate of a network flow included in the simulated data transmission network to a modification to be made to the transmission rate - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of evaluating data and mapping data based on judgement, which is observing, evaluating and judging that is practically capable of being performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2 & Step 2B: No additional elements are recited, so the claim does not integrate the judicial exception into a practical application and does not amount to significantly more. As such, the claim is ineligible.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-2, 7-12 and 16-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Ismailsheriff et al. (hereinafter Ismailsheriff), US 10873533 B1.

Regarding independent claim 1, Ismailsheriff teaches a computer-implemented method for controlling congestion in data transmission networks (Col 2, lines 39-41 Systems and methods provide for generating traffic class-specific congestion signatures and other machine learning models for improving network performance), the method comprising:
executing a first trained neural network in conjunction with a simulated data transmission network to generate a training dataset, wherein the first trained neural network has been trained to control congestion in the simulated data transmission network (Col 18, lines 35-38 The traffic data generator 406 can generate simulated traffic that the training data assembler 410 can use to construct training data from which the machine learning model generator 412 can build machine learning models; Fig. 5; Col 30, lines 55-60 The process 500 can begin at step 502 with the network controller receiving one or more traffic shaping policies for one or more traffic classes (including at least one predetermined traffic class) when flows of the one or more traffic classes are in one or more congestion states (including at least one predetermined congestion state); Col 33, lines 1-8 At step 504, the network controller can receive telemetry or traffic data (e.g., NetFlow records; network device statistics, such as CPU and memory utilization, interface counters, etc.; SNMP data; IP SLA performance measurements; system event (syslog) records, SSH CLI data, etc.) captured by the network devices within a period of time the network devices apply the one or more traffic shaping policies received at step 502; Col 36, lines 39-41 At step 506, the network controller can generate one or more training data sets based on the telemetry or traffic data captured at step 504);
generating a first trained decision tree model based on an initial loss for an initial model relative to the training dataset (Col 25, lines 51-64 A decision tree may be created from a data set in which each node of the tree can correspond to one or more features, and a branch or edge from the node to a child node can correspond to the possible values of the features. Each leaf can represent a class label whose feature values satisfy the specified ranges of the path from the root of the tree to the leaf. The partitioning at each level of the tree can be based on a split criterion, such as a condition or rule based on one or more features. Decision trees try to recursively split the training data so as to maximize the discrimination among different classes over different nodes of the tree. Decision tree algorithms may differ on how to select the splitting features and how to prune the tree when it becomes too large; Col 38, lines 30-33 and 60-66 At step 508, the network controller can build one or more machine learning models for evaluating the traffic data observed by the network devices to improve network operations. 
In some embodiments, the network controller can generate the traffic class-specific congestion signatures 426 by applying the training data sets to one or more classifiers (e.g., logistic regression classifier, LDA classifier, QDA classifier, SVM classifier, SGD classifier, K-NN classifier, GPC, Naïve Bayes classifier, decision tree classifier, random forest classifier, boosting classifier, neural network classifier, etc.); Col 4, lines 48-52 For traffic shaping to be effective, it can be important to identify the optimal window size W and/or the optimal congestion threshold T for a flow to achieve transmission as close to network capacity as possible without experiencing traffic loss; Col 5, lines 34-37 generating traffic class-specific congestion signatures, window size and/or congestion threshold estimators, and reinforcement learning agents that network devices can apply to improve network performance);
generating a final tree-based model based on the first trained decision tree model and at least a second trained decision tree model (Col 38, lines 66-67 and Col 39, lines 1-15 the network controller may evaluate multiple permutations of training data sets and classifiers and select the final classifiers the network devices apply based on the size of the training data sets and the precision, recall, accuracy, and/or other performance metrics of each permutation. In some embodiments, the traffic class-specific congestion signatures 426 may utilize different types of classifiers for different sets of (one or more) traffic classes. For example, the congestion signature for a first traffic class may be first type of classifier (e.g., an SVM classifier) because the first type of classifier may perform better in classifying a flow of the first traffic class better than other types of classifiers while the congestion signature for a second traffic class may be a second type of classifier (e.g., a neural network classifier) because the second type of classifier may perform better in classifying flows of the second traffic class than other types of classifiers); and
executing the final tree-based model in conjunction with a first data transmission network to control congestion within the first data transmission network (Col 40, lines 6-35 At step 510, the network controller can distribute the machine learning models generated at step 508 to the network devices for the devices to apply to new traffic data, and the network controller can receive the output of the machine learning models. For example, the network controller can continuously monitor the performance of a machine learning model and when precision, recall, accuracy, and/or other performance metrics (e.g., Table 4) are below certain thresholds or when a new machine learning model improves on the performance of the older machine learning model, the network controller can generate an alert to inform an administrator to update the older machine learning model. As another example, the network controller can receive congestion information from individual network devices when they believe the flows they process experience congestion. The network controller can aggregate the congestion information from the network devices and other sources throughout the network, and identify if throughput is abnormally low compared to expectation for a given location, time of day, device type, and so on, to facilitate root-cause analysis for the abnormal throughput. If the traffic class-specific congestion signatures 426 can distinguish between self-limiting or external congestion, the network controller can generate reports regarding where and when additional capacity can improve network performance. 
In some embodiments, the network controller can use congestion information to identify root causes of potential network outages as they develop, and preemptively remediate these issues before a network outage occurs; Col 41, lines 25-30 The process 500 can conclude at step 512 by the network controller adjusting one or more traffic shaping operations of the network devices based on the output from the traffic class-specific congestion signatures 426, the window size and/or congestion threshold estimators 428, the STACKing reinforcement learning agents 430, and other machine learning models).

Regarding dependent claim 2, Ismailsheriff teaches all the limitations as set forth in the rejection of claim 1 that is incorporated. Ismailsheriff teaches wherein generating the final tree-based model comprises constructing a combination of the first trained decision tree model and the at least the second trained decision tree model (Col 38, lines 66-67 and Col 39, lines 1-15 the network controller may evaluate multiple permutations of training data sets and classifiers and select the final classifiers the network devices apply based on the size of the training data sets and the precision, recall, accuracy, and/or other performance metrics of each permutation. In some embodiments, the traffic class-specific congestion signatures 426 may utilize different types of classifiers for different sets of (one or more) traffic classes. For example, the congestion signature for a first traffic class may be first type of classifier (e.g., an SVM classifier) because the first type of classifier may perform better in classifying a flow of the first traffic class better than other types of classifiers while the congestion signature for a second traffic class may be a second type of classifier (e.g., a neural network classifier) because the second type of classifier may perform better in classifying flows of the second traffic class than other types of classifiers).

Regarding dependent claim 7, Ismailsheriff teaches all the limitations as set forth in the rejection of claim 1 that is incorporated. Ismailsheriff teaches wherein executing the final tree-based model in conjunction with the first data transmission network to control congestion comprises:
computing, via the final tree-based model, a first modification to be made to a network flow included in the first data transmission network based on at least one of a delay measurement, a latency measurement, or a transmission rate of the network flow; and modifying the network flow based on the first modification (Col 40, lines 6-35 At step 510, the network controller can distribute the machine learning models generated at step 508 to the network devices for the devices to apply to new traffic data, and the network controller can receive the output of the machine learning models. For example, the network controller can continuously monitor the performance of a machine learning model and when precision, recall, accuracy, and/or other performance metrics (e.g., Table 4) are below certain thresholds or when a new machine learning model improves on the performance of the older machine learning model, the network controller can generate an alert to inform an administrator to update the older machine learning model. As another example, the network controller can receive congestion information from individual network devices when they believe the flows they process experience congestion. The network controller can aggregate the congestion information from the network devices and other sources throughout the network, and identify if throughput is abnormally low compared to expectation for a given location, time of day, device type, and so on, to facilitate root-cause analysis for the abnormal throughput. If the traffic class-specific congestion signatures 426 can distinguish between self-limiting or external congestion, the network controller can generate reports regarding where and when additional capacity can improve network performance. In some embodiments, the network controller can use congestion information to identify root causes of potential network outages as they develop, and preemptively remediate these issues before a network outage occurs).

Regarding dependent claim 8, Ismailsheriff teaches all the limitations as set forth in the rejection of claim 1 that is incorporated. Ismailsheriff teaches wherein the training dataset includes a mapping from a first feature vector associated with a network flow included in the simulated data transmission network to a first modification to be made to the network flow (Fig. 4; Col 16, lines 66-67 and Col 17, lines 1-4 one or more data stores for storing the output data for the machine learning platform 400, such as a data store for traffic class-specific congestion signatures 426 to determine whether a given flow corresponds to a predetermined traffic class and predetermined congestion state; Col 22, lines 7-18 a network device can apply the traffic class-specific congestion signatures 426 to determine whether a given flow corresponding to a predetermined traffic class is in a predetermined congestion state. To generate a training data set for a traffic class-specific congestion signature 426, the traffic data collector 404 can collect or the traffic data generator 406 can create traffic data for a period of time, the traffic data processor 408 can process the traffic data, and the training data assembler 410 can label a portion of the processed traffic data that correspond to the predetermined traffic class and predetermined congestion state; Col 39, lines 55-67 and Col 40, lines 1-5 In some embodiments, the network controller can periodically re-run machine learning workflows using more recent traffic data or traffic data observed over longer period of times as the traffic data accumulates. The network controller can update the traffic class-specific congestion signatures 426, the window size and/or congestion threshold estimators 428, and/or the STACKing reinforcement learning agents 430. 
For example, if the performance metrics for models generated from more recent traffic data show a significant drift in traffic patterns such that the more recently generated models are likely to perform better than older models, the network controller can update the network devices to apply the more recently generated models. As another example, if the performance metrics for models generated from traffic data occurring over longer periods of time improve upon older models, then the network controller can update the network devices to apply the models generated from traffic data spanning longer periods of time).

Regarding dependent claim 9, Ismailsheriff teaches all the limitations as set forth in the rejection of claim 8 that is incorporated. Ismailsheriff teaches further comprising modifying a transmission rate of the network flow in accordance with the first modification (Col 5, lines 32-50 Various embodiments of the present disclosure can overcome the above and other deficiencies of the prior art by generating traffic class-specific congestion signatures, window size and/or congestion threshold estimators, and reinforcement learning agents that network devices can apply to improve network performance. For example, a network device can apply the traffic class-specific signatures to determine whether a given flow corresponds to a predetermined traffic class and predetermined congestion state. If the flow corresponds to the predetermined traffic class and predetermined congestion state, the network device can calculate the current window size W.sub.LATEST and/or the congestion threshold T.sub.LATEST utilizing one or more window size and/or congestion threshold estimators to evaluate the current extent of congestion in the network. The network device can pace or adjust the transmission rate or throughput TR of the flow according to a target transmission rate TR.sub.TGT for the traffic class and congestion state to which the flow corresponds; Col 30, lines 60-65 and Col 31, lines 1-3 The traffic shaping policies can influence how network devices treat flows corresponding to certain traffic classes. For example, the traffic shaping policies can regulate the flow of traffic (on a per-traffic-class basis) to match the flow to a specified transmission rate or throughput or a transmission rate or throughput derived based on a level of network congestion. The traffic shaping policies can specify average rate or peak rate traffic shaping for better use of available bandwidth by permitting more data than the configured traffic shaping rate to be sent if the bandwidth is available).

Regarding dependent claim 10, Ismailsheriff teaches all the limitations as set forth in the rejection of claim 8 that is incorporated. Ismailsheriff teaches wherein the first feature vector comprises at least one of a delay measurement, a latency measurement, or a transmission rate associated with the network flow (Col 5, lines 32-50 Various embodiments of the present disclosure can overcome the above and other deficiencies of the prior art by generating traffic class-specific congestion signatures, window size and/or congestion threshold estimators, and reinforcement learning agents that network devices can apply to improve network performance. For example, a network device can apply the traffic class-specific signatures to determine whether a given flow corresponds to a predetermined traffic class and predetermined congestion state. If the flow corresponds to the predetermined traffic class and predetermined congestion state, the network device can calculate the current window size W.sub.LATEST and/or the congestion threshold T.sub.LATEST utilizing one or more window size and/or congestion threshold estimators to evaluate the current extent of congestion in the network. The network device can pace or adjust the transmission rate or throughput TR of the flow according to a target transmission rate TR.sub.TGT for the traffic class and congestion state to which the flow corresponds).

Regarding independent claim 11, it is a media claim that corresponds to the method of claim 1. Therefore, it is rejected for the same reason as claim 1 above. Ismailsheriff further teaches one or more non-transitory computer readable media including instructions that, when executed by one or more processors, cause the one or more processors to automatically control congestion in data transmission networks (Col 54, lines 41-50 the storage device 830 can include the software modules 832, 834, 836 for controlling the processor 810. Other hardware or software modules are contemplated. The storage device 830 can be connected to the system bus 805. In some embodiments, a hardware module that performs a particular function can include a software component stored in a computer-readable medium in connection with the necessary hardware components, such as the processor 810, bus 805, output device 835, and so forth, to carry out the function).

Regarding dependent claim 12, it is a media claim that corresponds to the method of claim 2. Therefore, it is rejected for the same reason as claim 2 above.

Regarding dependent claim 16, Ismailsheriff teaches all the limitations as set forth in the rejection of claim 11 that is incorporated. Ismailsheriff teaches wherein executing the final tree-based model in conjunction with the first data transmission network comprises causing a first processor included in a network interface card to execute a first instance of the final tree-based model in order to control a transmission rate of a network flow included in the first data transmission network (Col 40, lines 6-35 At step 510, the network controller can distribute the machine learning models generated at step 508 to the network devices for the devices to apply to new traffic data, and the network controller can receive the output of the machine learning models. For example, the network controller can continuously monitor the performance of a machine learning model and when precision, recall, accuracy, and/or other performance metrics (e.g., Table 4) are below certain thresholds or when a new machine learning model improves on the performance of the older machine learning model, the network controller can generate an alert to inform an administrator to update the older machine learning model. As another example, the network controller can receive congestion information from individual network devices when they believe the flows they process experience congestion. The network controller can aggregate the congestion information from the network devices and other sources throughout the network, and identify if throughput is abnormally low compared to expectation for a given location, time of day, device type, and so on, to facilitate root-cause analysis for the abnormal throughput. If the traffic class-specific congestion signatures 426 can distinguish between self-limiting or external congestion, the network controller can generate reports regarding where and when additional capacity can improve network performance. 
In some embodiments, the network controller can use congestion information to identify root causes of potential network outages as they develop, and preemptively remediate these issues before a network outage occurs; Col 41, lines 25-30 The process 500 can conclude at step 512 by the network controller adjusting one or more traffic shaping operations of the network devices based on the output from the traffic class-specific congestion signatures 426, the window size and/or congestion threshold estimators 428, the STACKing reinforcement learning agents 430, and other machine learning models; Col 5, lines 46-50 The network device can pace or adjust the transmission rate or throughput TR of the flow according to a target transmission rate TR.sub.TGT for the traffic class and congestion state to which the flow corresponds).

Regarding dependent claim 17, it is a media claim that corresponds to the method of claim 7. Therefore, it is rejected for the same reason as claim 7 above.

Regarding dependent claim 18, it is a media claim that corresponds to the method of claims 8 and 10. Therefore, it is rejected for the same reason as claims 8 and 10 above.

Regarding dependent claim 19, Ismailsheriff teaches all the limitations as set forth in the rejection of claim 11 that is incorporated. Ismailsheriff teaches wherein a final loss for the final tree-based model relative to the training dataset is less than the initial loss (Col 4, lines 48-52 For traffic shaping to be effective, it can be important to identify the optimal window size W and/or the optimal congestion threshold T for a flow to achieve transmission as close to network capacity as possible without experiencing traffic loss; Col 5, lines 34-37 generating traffic class-specific congestion signatures, window size and/or congestion threshold estimators, and reinforcement learning agents that network devices can apply to improve network performance).

Regarding independent claim 20, it is a system claim that corresponds to the method of claim 1. Therefore, it is rejected for the same reason as claim 1 above. Ismailsheriff teaches a system comprising: one or more memories storing instructions; and one or more processors coupled to the one or more memories that, when executing the instructions, perform the steps (Figs. 8A&8B; Col 53, lines 57-67 and Col 54, lines 1-19; Col 54, lines 41-50).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 3-4 and 13-14 are rejected under 35 U.S.C. 103 as being unpatentable over Ismailsheriff as applied in claims 1 and 11, in view of Ironside, US 20190213685 A1.

Regarding dependent claim 3, Ismailsheriff teaches all the limitations as set forth in the rejection of claim 1 that is incorporated. Ismailsheriff does not explicitly teach wherein generating the first trained decision tree model comprises:
executing the initial model on a plurality of feature vectors included in the training dataset to generate a plurality of predicted outputs; and
training a decision tree model to predict a negative gradient of a loss function with respect to the plurality of predicted outputs.
However, in the same field of endeavor, Ironside teaches wherein generating the first trained decision tree model comprises:
executing the initial model on a plurality of feature vectors included in the training dataset to generate a plurality of predicted outputs (Fig. 10; [0161] In embodiments, a process 1000 begins with receiving a plurality of data records. In embodiments (operation 1002), each data record of the plurality of records comprises a feature vector comprising a plurality of predictor variables, a plurality of corresponding predictor variable values, a dependent variable, and a corresponding dependent variable value; [0162] In embodiments, process 1000 continues with generating a gradient boosted tree model using the plurality of data records (operation 1004)); and
training a decision tree model to predict a negative gradient of a loss function with respect to the plurality of predicted outputs ([0071] Gradient boosting is a machine learning technique for regression and classification problems which produces a prediction model in the form of an ensemble of weak prediction models, typically decision trees. It builds the model in a stage-wise fashion like other boosting methods do, and it generalizes them by allowing optimization of an arbitrary differentiable loss function. For example, gradient boosting combines weak learners into a single strong learner in an iterative fashion. Gradient boosting tends to aggressively exploit any opportunity to improve predictive accuracy, to the detriment of clarity of interpretation (or, indeed, the feasibility of any interpretation whatsoever)).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of a process of generating a gradient boosted tree model using the plurality of data records as suggested in Ironside into Ismailsheriff’s system because both of these systems are addressing a machine learning technique for regression and classification problems which produces a prediction model. This modification would have been motivated by the desire for developing solutions for a number of deficiencies and problems associated with existing techniques involving generalized linear models and gradient boosting (Ironside, [0004]).
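For illustration only, the two limitations recited in claim 3 can be sketched as a single boosting step in Python. This is a minimal sketch under assumptions that appear in neither the claims nor Ironside: a squared-error loss, a single numeric feature, a depth-one tree, and the hypothetical names `fit_stump`, `mse`, and `boost_once`. With squared-error loss, the negative gradient with respect to a predicted output is simply the residual between the label and the prediction.

```python
# Illustrative sketch only. All names, the loss, and the data are assumptions
# and are not taken from the application or the cited references.

def fit_stump(xs, targets):
    """Fit a depth-1 regression tree (one split, two leaves) by least squares."""
    best = None
    for threshold in sorted(set(xs)):
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        if not left or not right:
            continue  # a split must leave samples on both sides
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((t - left_mean) ** 2 for t in left)
                 + sum((t - right_mean) ** 2 for t in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:
        # No valid split exists (all xs identical): fall back to a constant.
        mean = sum(targets) / len(targets)
        return lambda x: mean
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def mse(model, xs, ys):
    """Mean squared error of a model over the dataset."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)


def boost_once(initial_model, xs, ys, learning_rate=0.5):
    """One boosting step: fit a tree to the negative gradient and add it."""
    # Execute the initial model on every feature vector to get predicted outputs.
    predictions = [initial_model(x) for x in xs]
    # For the loss 0.5 * (y - p)^2, the negative gradient w.r.t. p is (y - p).
    negative_gradient = [y - p for y, p in zip(ys, predictions)]
    # Train a decision tree to predict that negative gradient.
    tree = fit_stump(xs, negative_gradient)
    # The updated model adds the scaled tree output to the initial model.
    return lambda x: initial_model(x) + learning_rate * tree(x)
```

On a toy dataset, a constant initial model followed by one such step yields a lower training loss than the initial model alone, which is the behavior the boosting literature describes.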

Regarding dependent claim 4, the combination of Ismailsheriff and Ironside teaches all the limitations as set forth in the rejection of claim 3 that is incorporated. Ironside further teaches further comprising:
while training the decision tree model, determining that a tree depth of the decision tree model is equal to a maximum tree depth ([0163] In embodiments, generating the gradient boosted tree model comprises forming a first plurality of decision tree structures each having a maximum tree depth of one (1) (operation 1006). In embodiments, each decision tree structure comprises a split node and a pair of leaf nodes. In embodiments, the first plurality of decision tree structures comprises a first number of decision tree structures necessary to exhaust all main effects of the plurality of predictor variables on a dependent variable of the generalized linear model structure definition; [0164] In embodiments, generating the gradient boosted tree model further comprises iteratively forming successive pluralities of decision tree structures each having a maximum tree depth increased by one (1) as compared to its immediately preceding plurality of decision tree structures (operation 1008). In embodiments, each successive plurality of decision tree structures comprises a number of decision tree structures necessary to exhaust all interactions (involving a number of predictor variables equal to the maximum depth of the current plurality of decision trees) between the plurality of predictor variables); and
designating the decision tree model as the first trained decision tree model ([0165] In embodiments, process 1000 continues with separating each decision tree of each of the successive pluralities of decision tree structures into a plurality of indicator variables represented by the leaf nodes of the decision tree (operation 1010), thus creating a generalized linear model (GLM) structure. In embodiments, the indicator variables are defined by a series of split decisions leading up to each leaf node; [0166] In embodiments, process 1000 continues with reducing the plurality of indicator variables into a subset of the plurality of indicator variables by combining those indicator variable definitions that are identical (operation 1012); [0167] In embodiments, process 1000 continues with combining the indicator variables of the subset of the plurality of indicator variables into a generalized linear model structure definition upon which the dependent variable depends (operation 1014)).

Regarding dependent claim 13, it is a media claim that corresponds to the method of claim 3. Therefore, it is rejected for the same reason as claim 3 above.

Regarding dependent claim 14, it is a media claim that corresponds to the method of claim 4. Therefore, it is rejected for the same reason as claim 4 above.

Claims 5 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Ismailsheriff as applied in claims 1 and 11, in view of Zhang et al. (hereinafter Zhang), US 20230084325 A1.

Regarding dependent claim 5, Ismailsheriff teaches all the limitations as set forth in the rejection of claim 1 that is incorporated. Ismailsheriff does not explicitly teach further comprising:
determining that a total number of trained decision tree models included in the final tree-based model is equal to a maximum number of trees; and
designating the final tree-based model as a trained tree-based model.
However, in the same field of endeavor, Zhang teaches
determining that a total number of trained decision tree models included in the final tree-based model is equal to a maximum number of trees ([0050] Step 1: a coordinator setting relevant parameters of a Gradient Boosting Decision Tree model, including a maximum number of decision trees T, a maximum depth of trees L, an initial predicted value base, etc., and sending the relevant parameters to respective participants p.sub.i; [0051] Step 2: letting a tree counter t=1; [0052] Step 3: for each participant p.sub.i, initializing a training target of a k.sup.th tree y.sub.k−y.sub.k-1−ŷ.sub.k-1; wherein y.sub.0=y, ŷ.sub.0=base; [0053] Step 4: letting a tree layer counter l=1; [0054] Step 5: letting a node counter of a current layer n=1;[0055] Step 6: for each participant p.sub.i, determining a segmentation point of a local current node n according to the data of the current node and an optimal segmentation point algorithm and sending the segmentation point information to the coordinator; [0056] Step 7: the coordinator counting the segmentation point information of all participants, and determining a segmentation feature f and a segmentation value v according to an epsilon-greedy algorithm; [0057] Step 8, the coordinator sending the finally determined segmentation information, including the determined segmentation feature f and segmentation value v, to respective participants; [0058] Step 9: each participant segmenting a data set of the current node according to the segmentation feature f and the segmentation value v, and distributing new segmentation data to child nodes; [0059] Step 10: letting n=n+1, and continuing with the Step 3 if n is less than or equal to a maximum number of nodes in the current layer; otherwise, proceeding to a next step; [0060] Step 11: resetting the node information of the current layer according to the child nodes of a node of a l.sup.th layer, so that l=l+1, and continuing with the Step 5 if l is less than or equal to the maximum tree depth L; 
otherwise, proceeding to a next step); and
designating the final tree-based model as a trained tree-based model ([0061] Step 12: letting t=t+1, and continuing with the Step 3 if t is greater than or equal to the maximum number of decision trees T; otherwise, ending.).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of a horizontal federated Gradient Boosting Decision Tree optimization method based on a random greedy algorithm as suggested in Zhang into Ismailsheriff’s system because both of these systems are addressing a machine learning framework, which can effectively help multiple organizations to model data usage and machine learning. This modification would have been motivated by the desire for an efficient training without affecting the network stability (Zhang, [0005]).
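For illustration only, the stopping condition recited in claim 5 can be sketched as a loop that keeps adding trees until the tree count equals a configured maximum. This is a minimal sketch that follows the claim wording rather than any cited implementation; the names `build_ensemble` and `train_next_tree` and the dictionary layout are hypothetical.

```python
# Illustrative sketch only. The function names and the returned structure are
# assumptions and are not taken from the application or the cited references.

def build_ensemble(train_next_tree, max_trees):
    """Add trees until the tree count equals max_trees, then mark the model trained."""
    if max_trees < 1:
        raise ValueError("max_trees must be at least 1")
    trees = []
    while True:
        # train_next_tree receives the number of trees built so far.
        trees.append(train_next_tree(len(trees)))
        # Determine whether the total number of trees equals the maximum.
        if len(trees) == max_trees:
            break
    # Designate the resulting tree-based model as trained.
    return {"trees": trees, "is_trained": True}
```

Any callable that returns a fitted tree can be passed as `train_next_tree`; a placeholder such as `lambda i: ("tree", i)` is enough to exercise the loop.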

Regarding dependent claim 15, it is a media claim that corresponds to the method of claim 5. Therefore, it is rejected for the same reason as claim 5 above.

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Ismailsheriff as applied in claim 1, in view of RAMACHANDRAN et al. (hereinafter RAMACHANDRAN), US 20220245522 A1.

Regarding dependent claim 6, Ismailsheriff teaches all the limitations as set forth in the rejection of claim 1 that is incorporated. Ismailsheriff does not explicitly teach wherein the first data transmission network comprises a remote direct memory access network.
However, in the same field of endeavor, RAMACHANDRAN teaches wherein the first data transmission network comprises a remote direct memory access network ([0056] Data centers may employ additional mechanisms to avoid/prevent congestion, such as flow-control techniques like priority flow control (PFC), DCQCN (Data Center Quantized Congestion Notification) for RoCEv2 (Remote Direct Memory Access (RDMA) over Converged Ethernet, version 2), and DCTCP, which is a modified version of TCP implemented in data centers that leverages Explicit Congestion Notification (ECN) to provide multi-bit feedback to end hosts).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of employing Remote Direct Memory Access in the network as suggested in RAMACHANDRAN into Ismailsheriff’s system because both of these systems are addressing congestion control for Artificial Intelligence (AI) workloads. This modification would have been motivated by the desire for performance improvement by addressing congestion control for Artificial Intelligence workloads (RAMACHANDRAN, [0002]).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Applicant is required under 37 C.F.R. § 1.111(c) to consider these references fully when responding to this action.
ARZANI et al. (US 20210012239 A1) discloses automating the generation of machine learning models for evaluation of computer networks.
It is noted that any citation to specific pages, columns, lines, or figures in the prior art references and any interpretation of the references should not be considered to be limiting in any way.  A reference is relevant for all it contains and may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art. In re Heck, 699 F.2d 1331, 1332-33, 216 U.S.P.Q. 1038, 1039 (Fed. Cir. 1983) (quoting In re Lemelson, 397 F.2d 1006, 1009, 158 U.S.P.Q. 275, 277 (C.C.P.A. 1968)).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AMY P HOANG whose telephone number is (469)295-9134. The examiner can normally be reached M-TH 8:30-5:00PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, JENNIFER WELCH, can be reached at 571-272-7212. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/AMY P HOANG/ Examiner, Art Unit 2143

/JENNIFER N WELCH/ Supervisory Patent Examiner, Art Unit 2143
Prosecution Timeline

Apr 11, 2023: Application Filed
Dec 22, 2025: Non-Final Rejection mailed (§101, §102, §103)
Mar 10, 2026: Response Filed
May 26, 2026: Final Rejection mailed (§101, §102, §103) (current)
Precedent Cases

Applications granted by this same examiner with similar technology

17/699,613 (Patent 12632792): STABLE LOCAL INTERPRETABLE MODEL FOR PREDICTION. 4y 2m to grant; granted May 19, 2026.
18/224,509 (Patent 12619452): INTELLIGENT AUTOMATED ASSISTANT IN A MESSAGING ENVIRONMENT. 2y 9m to grant; granted May 05, 2026.
17/455,325 (Patent 12602596): APPARATUS AND METHOD FOR VALIDATING DATASET BASED ON FEATURE COVERAGE. 4y 4m to grant; granted Apr 14, 2026.
18/525,453 (Patent 12572263): ACCESS CARD WITH CONFIGURABLE RULES. 2y 3m to grant; granted Mar 10, 2026.
17/572,921 (Patent 12536432): PRE-TRAINING METHOD OF NEURAL NETWORK MODEL, ELECTRONIC DEVICE AND MEDIUM. 4y 0m to grant; granted Jan 27, 2026.
Based on the 5 most recent grants.
Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 70%
With Interview (+64.2%): 99%
Median Time to Grant: 3y 1m (~0m remaining)
PTA Risk: Moderate
Based on 233 resolved cases by this examiner. Grant probability derived from career allowance rate.