DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. This action is responsive to the application filed on 08/03/2023. Claims 1-20 are pending in the application. Claims 1, 10, and 19 are independent claims.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1: Claims 1-9 are directed to a system, claims 10-18 are directed to a method, and claims 19-20 are directed to a medium. Therefore, the claims satisfy Step 1, being directed to a machine, a process, and a manufacture, respectively.
Independent claims 1, 10 and 19:
Step 2A Prong 1:
Claims recite:
using the probability distributions to identify a set of distribution cliques of the edge nodes - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical concept, namely a mathematical calculation using an identification algorithm comprising calculating a divergence value, comparing the divergence value, and determining that the two edge nodes are in a clique (an illustrative sketch of such a calculation appears after this list);
selecting one or more representative edge nodes from each clique - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of evaluating data and selecting data based on judgment, i.e., observation, evaluation, and judgment that can practically be performed in the human mind with the assistance of pen and paper;
associating the feature data with the corresponding clique for the edge node at the first time - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of evaluating data and selecting data based on judgment, i.e., observation, evaluation, and judgment that can practically be performed in the human mind with the assistance of pen and paper;
using the probability distributions, cliques, and feature data to obtain episode data for each clique for the first time - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of evaluating data and selecting data based on judgment, i.e., observation, evaluation, and judgment that can practically be performed in the human mind with the assistance of pen and paper; and
training a ML-based divergence model using a portion of the episode data to update a divergence threshold value for the clique for a second time, t, that is different from the first time - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical concept of a mathematical calculation to update a divergence threshold value.
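For illustration only, the following is a minimal sketch of the kind of divergence calculation and threshold comparison encompassed by the limitations characterized above as mathematical concepts. The KL divergence metric, the threshold value, and all identifiers are examiner-supplied assumptions for purposes of illustration and are not taken from the claims or the specification.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL divergence D(p || q) between two discrete probability distributions."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

# Hypothetical normalized histograms reported by two edge nodes.
node_a = [0.10, 0.20, 0.30, 0.40]
node_b = [0.10, 0.25, 0.30, 0.35]

divergence_threshold = 0.05  # assumed value, for illustration only

# Calculating a divergence value, comparing it with the threshold value,
# and determining whether the two edge nodes are in the same clique.
d = kl_divergence(node_a, node_b)
same_clique = d < divergence_threshold
print(d, same_clique)
```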
Step 2A Prong 2: This judicial exception is not integrated into a practical application because the claims recite the following additional elements:
A system comprising: at least one processing device including a processor coupled to a memory; the at least one processing device being configured to implement the following steps; A non-transitory processor-readable storage medium having stored thereon program code of one or more software programs, wherein the program code when executed by at least one processing device causes the at least one processing device to perform the following steps - These limitations amount to components of a general purpose computer that applies a judicial exception, by use of conventional computer functions (see MPEP § 2106.05(b)).
receiving a plurality of probability distributions from a plurality of edge nodes - this step is recited at a high level of generality and amounts to mere data gathering, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g));
receiving feature data from the edge nodes, the feature data comprising resource information that includes a resource availability and a utilization status of the edge node at a first time t−1 - this step is recited at a high level of generality and amounts to mere data gathering, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g));
training a machine learning (ML)-based model using a portion of the feature data - this step is recited at a high level of generality and amounts to no more than a recitation of the words "apply it" (or an equivalent), i.e., mere instructions to implement an abstract idea or other exception on a computer (see MPEP § 2106.05(f)).
Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claims are thus directed to the abstract idea.
Step 2B: The claims do not include additional elements that amount to significantly more than the judicial exception.
The additional elements:
A system comprising: at least one processing device including a processor coupled to a memory; the at least one processing device being configured to implement the following steps; A non-transitory processor-readable storage medium having stored thereon program code of one or more software programs, wherein the program code when executed by at least one processing device causes the at least one processing device to perform the following steps - These limitations amount to components of a general purpose computer that applies a judicial exception, by use of conventional computer functions (see MPEP § 2106.05(b)).
receiving a plurality of probability distributions from a plurality of edge nodes - which is a well-understood, routine, and conventional activity similar to receiving or transmitting data over a network, as described in MPEP § 2106.05(d)(II);
receiving feature data from the edge nodes, the feature data comprising resource information that includes a resource availability and a utilization status of the edge node at a first time t−1 - which is a well-understood, routine, and conventional activity similar to receiving or transmitting data over a network, as described in MPEP § 2106.05(d)(II);
training a machine learning (ML)-based model using a portion of the feature data - this step is recited at a high level of generality and amounts to no more than a recitation of the words "apply it" (or an equivalent), i.e., mere instructions to implement an abstract idea or other exception on a computer (see MPEP § 2106.05(f)).
Accordingly, these additional elements do not amount to significantly more than the judicial exception. As such, the claims are ineligible.
Dependent claims 2 and 11:
Step 2A Prong 1: The claims recite the abstract ideas of claims 1 and 10.
Step 2A Prong 2: This judicial exception is not integrated into a practical application because the claims recite the following additional elements:
wherein the divergence threshold value is updated based on an average of divergence metrics output by the divergence model after the divergence model is deployed in an edge network that includes the plurality of edge nodes - this step is recited at a high level of generality and amounts to mere data modification, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).
Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claims are thus directed to the abstract idea.
Step 2B: The claims do not include additional elements that amount to significantly more than the judicial exception.
The additional elements:
wherein the divergence threshold value is updated based on an average of divergence metrics output by the divergence model after the divergence model is deployed in an edge network that includes the plurality of edge nodes - viewed individually or in combination, this limitation describes mere data gathering similar to "presenting offers to potential customers and gathering statistics generated based on the testing about how potential customers responded to the offers; the statistics are then used to calculate an optimized price" (OIP Technologies, 788 F.3d at 1363, 115 USPQ2d at 1092-93), as described in MPEP § 2106.05(g).
Accordingly, these additional elements do not amount to significantly more than the judicial exception. As such, the claims are ineligible.
Dependent claims 3 and 12:
Step 2A Prong 1: The claims recite the abstract ideas of claims 1 and 10.
Step 2A Prong 2: This judicial exception is not integrated into a practical application because the claims recite the following additional elements:
wherein a number of cliques is updated for a future training cycle of the ML model - this step is recited at a high level of generality and amounts to mere data modification, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).
Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claims are thus directed to the abstract idea.
Step 2B: The claims do not include additional elements that amount to significantly more than the judicial exception.
The additional elements:
wherein a number of cliques is updated for a future training cycle of the ML model - viewed individually or in combination, this limitation describes mere data gathering similar to "presenting offers to potential customers and gathering statistics generated based on the testing about how potential customers responded to the offers; the statistics are then used to calculate an optimized price" (OIP Technologies, 788 F.3d at 1363, 115 USPQ2d at 1092-93), as described in MPEP § 2106.05(g).
Accordingly, these additional elements do not amount to significantly more than the judicial exception. As such, the claims are ineligible.
Dependent claims 4 and 13:
Step 2A Prong 1: The claims recite the abstract ideas of claims 1 and 10.
Step 2A Prong 2: This judicial exception is not integrated into a practical application because the claims recite the following additional elements:
wherein the divergence model comprises a deep Q-learning reinforcement learning model - this step is recited at a high level of generality and amounts to merely indicating a field of use or technological environment in which the judicial exception is performed (see MPEP § 2106.05(h)).
Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claims are thus directed to the abstract idea.
Step 2B: The claims do not include additional elements that amount to significantly more than the judicial exception.
The additional elements:
wherein the divergence model comprises a deep Q-learning reinforcement learning model - this limitation merely generally links the use of the judicial exception to a particular field of use or technological environment (see MPEP § 2106.05(h)).
Accordingly, these additional elements do not amount to significantly more than the judicial exception. As such, the claims are ineligible.
Dependent claims 5 and 14:
Step 2A Prong 1: The claims recite the abstract ideas of claims 1 and 10.
Step 2A Prong 2: This judicial exception is not integrated into a practical application because the claims recite the following additional elements:
wherein the reinforcement learning model is trained using a graph neural network - this step is recited at a high level of generality and amounts to merely indicating a field of use or technological environment in which the judicial exception is performed (see MPEP § 2106.05(h)).
Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claims are thus directed to the abstract idea.
Step 2B: The claims do not include additional elements that amount to significantly more than the judicial exception.
The additional elements:
wherein the reinforcement learning model is trained using a graph neural network - this limitation merely generally links the use of the judicial exception to a particular field of use or technological environment (see MPEP § 2106.05(h)).
Accordingly, these additional elements do not amount to significantly more than the judicial exception. As such, the claims are ineligible.
Dependent claims 6 and 15:
Step 2A Prong 1: The claims recite the abstract ideas of claims 1 and 10.
Step 2A Prong 2: This judicial exception is not integrated into a practical application because the claims recite the following additional elements:
wherein the cliques comprise graphs, the probability distributions and the feature data comprise metadata annotated to nodes of the graphs, and the annotated graphs are used as input to the graph neural network - this step is recited at a high level of generality and amounts to selecting a particular data source or type of data to be manipulated, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).
Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claims are thus directed to the abstract idea.
Step 2B: The claims do not include additional elements that amount to significantly more than the judicial exception.
The additional elements:
wherein the cliques comprise graphs, the probability distributions and the feature data comprise metadata annotated to nodes of the graphs, and the annotated graphs are used as input to the graph neural network - viewed individually or in combination, this limitation describes selecting a particular data source or type of data to be manipulated, similar to selecting information, based on types of information and availability of information in a power-grid environment, for collection, analysis, and display, as described in MPEP § 2106.05(g).
Accordingly, these additional elements do not amount to significantly more than the judicial exception. As such, the claims are ineligible.
Dependent claims 7 and 16:
Step 2A Prong 1:
Claims recite:
wherein the episode data for the second time, t, is obtained without considering the clique for the second time - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of evaluating data and obtaining the clique based on judgment, i.e., observation, evaluation, and judgment that can practically be performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2 & Step 2B: No additional elements are recited, so the judicial exception is not integrated into a practical application and the claims do not amount to significantly more. As such, the claims are ineligible.
Dependent claims 8 and 17:
Step 2A Prong 1:
Claims recite:
wherein the representative edge nodes are selected at random from each clique - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of evaluating data and selecting data based on judgment, i.e., observation, evaluation, and judgment that can practically be performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2 & Step 2B: No additional elements are recited, so the judicial exception is not integrated into a practical application and the claims do not amount to significantly more. As such, the claims are ineligible.
Dependent claims 9 and 18:
Step 2A Prong 1:
Claims recite:
calculating a divergence value between two edge nodes of the plurality of edge nodes; comparing the divergence value with the divergence threshold value to obtain a result; and using the result to determine that the two edge nodes are in a clique - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical concept, namely a mathematical calculation of calculating a divergence value and comparing it with a threshold to determine that the two edge nodes are in a clique.
wherein the representative edge nodes are selected at random from each clique - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of evaluating data and selecting data based on judgment, i.e., observation, evaluation, and judgment that can practically be performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2 & Step 2B: No additional elements are recited, so the judicial exception is not integrated into a practical application and the claims do not amount to significantly more. As such, the claims are ineligible.
Dependent claim 20:
Step 2A Prong 1: The claim recites the abstract ideas of claim 19.
Step 2A Prong 2: This judicial exception is not integrated into a practical application because the claim recites the following additional elements:
wherein the divergence model comprises a deep Q-learning reinforcement learning model - this step is recited at a high level of generality and amounts to merely indicating a field of use or technological environment in which the judicial exception is performed (see MPEP § 2106.05(h));
wherein the reinforcement learning model is trained using a graph neural network - this step is recited at a high level of generality and amounts to merely indicating a field of use or technological environment in which the judicial exception is performed (see MPEP § 2106.05(h));
wherein the cliques comprise graphs, the probability distributions and the feature data comprise metadata annotated to nodes of the graphs, and the annotated graphs are used as input to the graph neural network - this step is recited at a high level of generality and amounts to selecting a particular data source or type of data to be manipulated, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).
Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is thus directed to the abstract idea.
Step 2B: The claim does not include additional elements that amount to significantly more than the judicial exception.
The additional elements:
wherein the divergence model comprises a deep Q-learning reinforcement learning model - this limitation merely generally links the use of the judicial exception to a particular field of use or technological environment (see MPEP § 2106.05(h));
wherein the reinforcement learning model is trained using a graph neural network - this limitation merely generally links the use of the judicial exception to a particular field of use or technological environment (see MPEP § 2106.05(h));
wherein the cliques comprise graphs, the probability distributions and the feature data comprise metadata annotated to nodes of the graphs, and the annotated graphs are used as input to the graph neural network - viewed individually or in combination, this limitation describes selecting a particular data source or type of data to be manipulated, similar to selecting information, based on types of information and availability of information in a power-grid environment, for collection, analysis, and display, as described in MPEP § 2106.05(g).
Accordingly, these additional elements do not amount to significantly more than the judicial exception. As such, the claim is ineligible.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-5, 7-14 and 16-19 are rejected under 35 U.S.C. 103 as being unpatentable over Balakrishnan et al. (hereinafter Balakrishnan), US 20230177349 A1, in view of FAROOQ et al. (hereinafter FAROOQ), US 20230162089 A1.
Regarding independent claim 1, Balakrishnan teaches a system (Fig. 12, 1208; [0123] A central server as described herein may refer to an edge compute node that acts as a server to other edge compute notes of an edge computing environment.) comprising:
at least one processing device including a processor coupled to a memory (Fig. 8, 850, 852; [0076] In a more detailed example, FIG. 8 illustrates a block diagram of an example of components that may be present in an edge computing node 850 for implementing the techniques (e.g., operations, processes, methods, and methodologies) described herein; [0077] The edge computing device 850 may include processing circuitry in the form of a processor 852);
the at least one processing device being configured to implement the following steps:
receiving a plurality of probability distributions from a plurality of edge nodes ([0167] In an initialization phase of a clustering approach, each client may transmit probability distribution information, e.g., a histogram of its data, to the central server; Fig. 16; [0171] At 1602, each client reports the probability mass function (PMF) of its local data to the central server. This may include a PMF of the client's training examples (x) or associated labels (y));
using the probability distributions to identify a set of distribution cliques of the edge nodes ([0167] The central server then normalizes the histogram and clusters the clients having similar distributions/normalized histograms; [0172] At 1604, the central server utilizes a clustering algorithm based on the clients' data probability distributions to determine cluster groups);
selecting one or more representative edge nodes from each clique ([0172] Each cluster group is identified by its nominal data distribution qi. Thus, clients with similar data distributions are likely to belong to the same cluster; [0174] At 1608, for each epoch, the central server draws a random batch of cluster(s) from the distribution p. A fixed fraction of clients is then selected from each of the drawn clusters and the selected clients are notified by the central server);
using the probability distributions, cliques, and feature data to obtain episode data for each clique for the first time ([0217] FIG. 21 illustrates an example reinforcement learning (RL) model that may be used in federated learning embodiments; [0227] Number of training examples at the federated node t (nt): In some embodiments, the number of training examples at each client node may be utilized as a state parameter input to the policy (e.g., 2106). This can be an average estimate at the central server based on historical information from the federated node t or a real-time report of the number of training examples (e.g., images). Each node t can report this information back to the central server at a certain periodicity; [0245] FIG. 22 illustrates an example training architecture for RL based optimization of federated learning. In order to allow an RL policy to learn from diverse conditions, several scenarios may be spawned in parallel. Example scenarios can include, e.g., the data distribution across clients, number of clients in federated learning and dataset, communication/compute capabilities of devices. Different initialization of the policy weights can be utilized, such as, for example, random initialization, Xavier initialization, etc); and
training a ML-based divergence model using a portion of the episode data to update a divergence threshold value for the clique for a second time, t, that is different from the first time ([0246] Each trial progresses independently, where actions are sampled from the RL policy network 2202 given the state of each trial at every global epoch. Even though the trials share the same policy, they will gradually diverge as the policy is stochastic and can yield different actions in each trial. A reward is obtained for every action after each global epoch in every trial in every scenario. After every policy interval (e.g., a number of global epochs after a policy update), the state, action, and rewards for the set of global epochs are collected. When multiple scenarios are executed, the state, action, and reward tuples are collected from all the scenarios for the set of global epochs into a common buffer. The RL policy network 2202 is then updated, utilizing these experiences (state, action, reward). Several approaches can be utilized to update the policy network (e.g., policy gradient, proximal policy optimization (PPO)). After a policy update, the initial states of all the trials within a scenario are set to the same and the next global epoch is executed by repeating the above steps; [0247] The policy network converges after several policy updates. In some embodiments, the policy network could be continuously trained until federated learning scenarios are completed (i.e., federated learning networks are fully trained). In other embodiments, policy network can be trained until their convergence and the policy network parameters frozen for subsequent global epochs. In another embodiment, the policy network can be updated whenever the environment undergoes a change (e.g., when the number of clients changes, i.e., new devices arriving/leaving, or other triggers)).
Balakrishnan does not explicitly disclose
receiving feature data from the edge nodes, the feature data comprising resource information that includes a resource availability and a utilization status of the edge node at a first time t−1;
training a machine learning (ML)-based model using a portion of the feature data;
associating the feature data with the corresponding clique for the edge node at the first time.
However, in the same field of endeavor, FAROOQ teaches
receiving feature data from the edge nodes, the feature data comprising resource information that includes a resource availability and a utilization status of the edge node at a first time t−1 ([0049] FIGS. 3A and 3B describe the operation of the process as implemented by the hyperparameter server and the edge devices respectively. FIG. 3A is a flowchart of one embodiment of the operation of a hyperparameter server. In one embodiment, the hyperparameter server determines a set of groupings for edge devices in the communication network (Block 301) … The hyperparameter server can then send instructions to configure the data feature collection of each edge device in each cluster (Block 303); [0050] After the data feature collection configuration is complete for each edge device, the hyperparameter server will begin to receive hyperparameter performance data from each of the configured edge devices (Block 305). The performance data provides feedback on how tested hyperparameters performed on collected distribution feature data at the edge device. This information can be provided as a unit (e.g., a tuple) of the performance metrics, the hyperparameters tested, and the distribution feature data; [0041] Returning to FIG. 1, the process continues with the hyperparameter server instructing all base stations to utilize a specific scheme for extracting and encoding distribution features. Distribution features, as used herein refer to dataset features that are designed to extract general properties and able to characterize datasets. Examples of such schemes can include finding statistical measures, fitting parametric distribution models, training non-parametric generative models and similar schemes; [0042] The choice of scheme for extracting and encoding distribution features can depend upon the priority of base stations in the cluster. For clusters with resource constrained devices or experiencing high load, extracting statistical features can be more beneficial as compared to training GANs);
training a machine learning (ML)-based model using a portion of the feature data ([0051] The hyperparameter server collects the performance data into a global training database (Block 307). This global training database can have any format or organization. An example is discussed herein above in relation to FIG. 2. The updated global training database is utilized to train a shared hyperparameter model for each cluster (Block 309));
associating the feature data with the corresponding clique for the edge node at the first time ([0051] Each cluster has a separate shared hyperparameter model that is trained on the training data to determine optimal hyperparameters for the associated cluster. Once the shared hyperparameter model has been trained, then the hyperparameter server can service queries from edge devices to provide optimal hyperparameters according to the associated cluster of the requesting edge device);
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of collecting hyperparameter performance data from edge devices to determine hyperparameter values to train a shared hyperparameter model for each cluster as suggested in FAROOQ into Balakrishnan’s system because both of these systems are addressing an automated process for determining parameters for a machine learning (ML) model and/or algorithm. This modification would have been motivated by the desire to automate the process of identifying appropriate parameters for machine learning algorithms (FAROOQ, [0004]).
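For context, the following is a minimal sketch of the per-cluster feature-data collection and model training described in the cited portions of FAROOQ ([0050]-[0051]). The data layout, the choice of model, and all identifiers are hypothetical and are not taken from the reference.

```python
from collections import defaultdict
from sklearn.linear_model import LinearRegression

# Hypothetical feature/performance reports received from edge devices,
# each tagged with the cluster (clique) the reporting device belongs to.
reports = [
    {"cluster": 0, "features": [0.7, 0.2], "performance": 0.91},
    {"cluster": 0, "features": [0.6, 0.3], "performance": 0.88},
    {"cluster": 1, "features": [0.1, 0.9], "performance": 0.65},
    {"cluster": 1, "features": [0.2, 0.8], "performance": 0.70},
]

# Associate the received feature data with the corresponding cluster
# (a simple stand-in for the global training database).
training_db = defaultdict(lambda: ([], []))
for r in reports:
    X, y = training_db[r["cluster"]]
    X.append(r["features"])
    y.append(r["performance"])

# Train a separate shared model per cluster on the collected feature data.
models = {c: LinearRegression().fit(X, y) for c, (X, y) in training_db.items()}
print({c: m.coef_ for c, m in models.items()})
```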
Regarding dependent claim 2, the combination of Balakrishnan and FAROOQ teaches all the limitations as set forth in the rejection of claim 1 that is incorporated. Balakrishnan further teaches wherein the divergence threshold value is updated based on an average of divergence metrics output by the divergence model after the divergence model is deployed in an edge network that includes the plurality of edge nodes ([0178] each of the selected client devices will send statistical information to the central server, such as probability mass function information or a KL-divergence metric relative to the overall probability distribution; [0247] the policy network can be updated whenever the environment undergoes a change (e.g., when the number of clients changes, i.e., new devices arriving/leaving, or other triggers)).
Regarding dependent claim 3, the combination of Balakrishnan and FAROOQ teaches all the limitations as set forth in the rejection of claim 1 that is incorporated. Balakrishnan further teaches wherein a number of cliques is updated for a future training cycle of the ML model ([0169] With affinity propagation, a similarity metric is used to cluster the clients. For instance, let h_i and h_j denote the histograms transmitted by clients i and j. A pairwise similarity metric s_ij between i and j may be used, e.g., where s_ij is denoted by −0.5(d(h_i, h_j) + d(h_j, h_i)), where d is some distance metric between distributions. Some examples for the distance metric d include: KL divergence, Wasserstein metric, Bhattacharyya distance, etc. The set {s_ij} is used as the similarity matrix for affinity propagation. Note that this approach has the potential advantage that the number of clusters does not have to be specified a priori).
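For reference, a minimal sketch of the similarity-based clustering described in the cited Balakrishnan paragraph [0169] follows, assuming KL divergence as the distance metric d. The histogram values and all identifiers are hypothetical and are not taken from the reference.

```python
import numpy as np
from scipy.stats import entropy            # entropy(p, q) computes KL divergence D(p || q)
from sklearn.cluster import AffinityPropagation

# Hypothetical normalized histograms transmitted by the clients.
H = np.array([
    [0.10, 0.20, 0.30, 0.40],
    [0.12, 0.18, 0.32, 0.38],
    [0.40, 0.30, 0.20, 0.10],
])

# Pairwise similarity s_ij = -0.5 * (d(h_i, h_j) + d(h_j, h_i)).
n = len(H)
S = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        S[i, j] = -0.5 * (entropy(H[i], H[j]) + entropy(H[j], H[i]))

# The similarity matrix drives affinity propagation; the number of clusters
# does not have to be specified a priori.
labels = AffinityPropagation(affinity="precomputed", random_state=0).fit_predict(S)
print(labels)
```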
Regarding dependent claim 4, the combination of Balakrishnan and FAROOQ teaches all the limitations as set forth in the rejection of claim 1 that is incorporated. Balakrishnan further teaches wherein the divergence model comprises a deep Q-learning reinforcement learning model ([0248] In some embodiments, Q-learning and/or deep Q-networks (DQN) may be used to train the RL agent).
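For reference, the Q-learning update underlying the deep Q-network approach mentioned in the cited paragraph [0248] can be sketched as follows. The environment, the transition values, and the learning parameters are hypothetical and are not taken from the reference.

```python
import numpy as np

n_states, n_actions = 4, 2
Q = np.zeros((n_states, n_actions))   # action-value table (a deep Q-network replaces this table)
alpha, gamma = 0.1, 0.9               # assumed learning rate and discount factor

# One hypothetical transition (state, action, reward, next_state).
s, a, r, s_next = 0, 1, 1.0, 2

# Standard Q-learning temporal-difference update.
Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
print(Q)
```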
Regarding dependent claim 5, the combination of Balakrishnan and FAROOQ teaches all the limitations as set forth in the rejection of claim 4 that is incorporated. Balakrishnan further teaches wherein the reinforcement learning model is trained using a graph neural network ([0113] In some cases, an ML model may include an artificial neural network (NN), which is based on a collection of connected nodes (“neurons”) and each connection (“edges”) transmit information (a “signal”) from one node to other nodes. A neuron that receives a signal processes the signal using an activation function and then signals other neurons based on the processing. Neurons and edges typically have weights that adjust as learning proceeds. The weights may increase or decrease the strength of a signal at a connection).
Regarding dependent claim 7, the combination of Balakrishnan and FAROOQ teaches all the limitations as set forth in the rejection of claim 1 that is incorporated. Balakrishnan further teaches wherein the episode data for the second time, t, is obtained without considering the clique for the second time ([0246] Each trial progresses independently, where actions are sampled from the RL policy network 2202 given the state of each trial at every global epoch. Even though the trials share the same policy, they will gradually diverge as the policy is stochastic and can yield different actions in each trial. A reward is obtained for every action after each global epoch in every trial in every scenario. After every policy interval (e.g., a number of global epochs after a policy update), the state, action, and rewards for the set of global epochs are collected. When multiple scenarios are executed, the state, action, and reward tuples are collected from all the scenarios for the set of global epochs into a common buffer. The RL policy network 2202 is then updated, utilizing these experiences (state, action, reward). Several approaches can be utilized to update the policy network (e.g., policy gradient, proximal policy optimization (PPO)). After a policy update, the initial states of all the trials within a scenario are set to the same and the next global epoch is executed by repeating the above steps).
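For reference, a minimal sketch of the experience-collection and policy-update loop described in the cited Balakrishnan paragraph [0246] follows, using a plain REINFORCE-style policy-gradient update in place of PPO. The state and action dimensions, the reward signal, and all identifiers are hypothetical and are not taken from the reference.

```python
import numpy as np

rng = np.random.default_rng(0)
state_dim, n_actions, lr = 3, 2, 0.01
theta = np.zeros((state_dim, n_actions))        # linear softmax policy parameters

def policy(state):
    logits = state @ theta
    p = np.exp(logits - logits.max())
    return p / p.sum()

# Collect (state, action, reward) tuples into a common buffer over several epochs.
buffer = []
for _ in range(16):
    state = rng.normal(size=state_dim)          # hypothetical per-epoch state
    probs = policy(state)
    action = rng.choice(n_actions, p=probs)     # action sampled from the stochastic policy
    reward = rng.normal()                       # hypothetical reward after the global epoch
    buffer.append((state, action, reward))

# Policy-gradient update using the collected experiences (state, action, reward).
grad = np.zeros_like(theta)
for state, action, reward in buffer:
    probs = policy(state)
    one_hot = np.eye(n_actions)[action]
    grad += np.outer(state, one_hot - probs) * reward   # gradient of log pi(a|s), scaled by reward
theta += lr * grad / len(buffer)
print(theta)
```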
Regarding dependent claim 8, the combination of Balakrishnan and FAROOQ teaches all the limitations as set forth in the rejection of claim 1 that is incorporated. Balakrishnan further teaches wherein the representative edge nodes are selected at random from each clique ([0138] In the example shown, at 1302, global model weights (e.g., weights for model 1204 of FIG. 12) are sent by a central server (e.g., 1208 of FIG. 12) to a selected set of clients (e.g., 1202 of FIG. 12). The clients may be selected by the central server in any suitable way. For example, in some instances, K clients may be selected randomly from N total clients).
Regarding dependent claim 9, the combination of Balakrishnan and FAROOQ teaches all the limitations as set forth in the rejection of claim 1 that is incorporated. Balakrishnan further teaches wherein the cliques are identified using an identification algorithm comprising:
calculating a divergence value between two edge nodes of the plurality of edge nodes ([0169] With affinity propagation, a similarity metric is used to cluster the clients. For instance, let h_i and h_j denote the histograms transmitted by clients i and j. A pairwise similarity metric s_ij between i and j may be used, e.g., where s_ij is denoted by −0.5(d(h_i, h_j) + d(h_j, h_i)), where d is some distance metric between distributions. Some examples for the distance metric d include: KL divergence, Wasserstein metric, Bhattacharyya distance, etc.);
comparing the divergence value with the divergence threshold value to obtain a result ([0169] The set {sij} is used as the similarity matrix for affinity propagation); and
using the result to determine that the two edge nodes are in a clique ([0167] The central server then normalizes the histogram and clusters the clients having similar distributions/normalized histograms).
Regarding independent claim 10, it is a method claim corresponding to the system of claim 1. Therefore, it is rejected for the same reasons as claim 1 above.
Regarding dependent claim 11, it is a method claim corresponding to the system of claim 2. Therefore, it is rejected for the same reasons as claim 2 above.
Regarding dependent claim 12, it is a method claim corresponding to the system of claim 3. Therefore, it is rejected for the same reasons as claim 3 above.
Regarding dependent claim 13, it is a method claim corresponding to the system of claim 4. Therefore, it is rejected for the same reasons as claim 4 above.
Regarding dependent claim 14, it is a method claim corresponding to the system of claim 5. Therefore, it is rejected for the same reasons as claim 5 above.
Regarding dependent claim 16, it is a method claim corresponding to the system of claim 7. Therefore, it is rejected for the same reasons as claim 7 above.
Regarding dependent claim 17, it is a method claim corresponding to the system of claim 8. Therefore, it is rejected for the same reasons as claim 8 above.
Regarding dependent claim 18, it is a method claim corresponding to the system of claim 9. Therefore, it is rejected for the same reasons as claim 9 above.
Regarding independent claim 19, it is a medium claim corresponding to the system of claim 1. Therefore, it is rejected for the same reasons as claim 1 above. Balakrishnan further teaches a non-transitory processor-readable storage medium having stored thereon program code of one or more software programs, wherein the program code when executed by at least one processing device causes the at least one processing device to perform the steps of claim 1 ([0093]-[0094]).
Claims 6, 15 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Balakrishnan in view of FAROOQ, as applied to claims 5, 14 and 19 above, and further in view of CRISAN et al. (hereinafter CRISAN), US 20250028928 A1.
Regarding dependent claim 6, the combination of Balakrishnan and FAROOQ teaches all the limitations as set forth in the rejection of claim 5 that is incorporated. The combination of Balakrishnan and FAROOQ does not explicitly disclose wherein the cliques comprise graphs, the probability distributions and the feature data comprise metadata annotated to nodes of the graphs, and the annotated graphs are used as input to the graph neural network.
However, in the same field of endeavor, CRISAN teaches wherein the cliques comprise graphs, the probability distributions and the feature data comprise metadata annotated to nodes of the graphs, and the annotated graphs are used as input to the graph neural network ([0030] Described herein is an example method that uses a two staged GNN pipeline, an example of which is shown in FIG. 2. In the initial training phase, node features and graph properties for an input graph 202 are used to compute (204) an embedding representation 206. Features can also be ascribed to edges, but this scenario is not evaluated in great detail here. To derive the embedding, in some implementations, information is aggregated from neighboring nodes of the input graphs, an operation that is sometimes referred to as message passing. Each convolution layer in the GNN aggregates information from more distant neighbors; [0042] Some implementations use a GNN developed using previously described two-stage pipeline (e.g., as in FIG. 2). Ahead of training the GNN model, some implementations derive a set of features for each node. These features included asset type, community clustering, centrality, and node degree. GNNs can treat node features as homogeneous (i.e., all nodes have the same features) or heterogeneously (i.e., sets of nodes have features distinct from others). For the sake of illustration, node features described herein are treated homogeneously; [0043] The input graph and set of node features are then used to train a two-layer GNN).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of using graphs comprising a plurality of nodes annotated with metadata as input to a graph neural network as suggested in CRISAN into the system of Balakrishnan and FAROOQ because both of these systems address providing data inputs to a neural network. This modification would have been motivated by the desire to use graph data in conventional machine learning models (CRISAN, [0003]-[0004]).
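For reference, a minimal sketch of a single message-passing (graph convolution) layer operating on a graph whose nodes are annotated with feature metadata follows, consistent with the cited CRISAN paragraphs [0030] and [0042]-[0043]. The adjacency structure, the feature values, and the weights are hypothetical and are not taken from the reference.

```python
import numpy as np

# Hypothetical clique graph: adjacency matrix with self-loops, and per-node
# metadata (e.g., distribution summaries and feature data) as node features.
A = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 1, 1]], dtype=float)
X = np.array([[0.2, 0.8],
              [0.5, 0.5],
              [0.9, 0.1]])                  # node feature matrix

# One graph-convolution layer: aggregate features from neighboring nodes
# (message passing), then apply a learned weight matrix and a nonlinearity.
D_inv = np.diag(1.0 / A.sum(axis=1))        # normalize the aggregation by node degree
W = np.array([[0.3, -0.2],
              [0.1, 0.4]])                  # hypothetical learned weights
H = np.maximum(D_inv @ A @ X @ W, 0.0)      # ReLU of the normalized aggregation
print(H)                                    # node embeddings for the next layer or readout
```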
Regarding dependent claim 15, it is a method claim corresponding to the system of claim 6. Therefore, it is rejected for the same reasons as claim 6 above.
Regarding dependent claim 20, it is a medium claim corresponding to the systems of claims 4, 5 and 6. Therefore, it is rejected for the same reasons as claims 4, 5 and 6 above.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Applicant is required under 37 C.F.R. § 1.111(c) to consider these references fully when responding to this action.
MORADI et al. (US 20230259744 A1) discloses grouping of nodes in a system, and particularly methods and apparatus for grouping worker nodes in a machine learning system comprising a master node and a plurality of worker nodes.
It is noted that any citation to specific pages, columns, lines, or figures in the prior art references and any interpretation of the references should not be considered to be limiting in any way. A reference is relevant for all it contains and may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art. In re Heck, 699 F.2d 1331, 1332-33, 216 U.S.P.Q. 1038, 1039 (Fed. Cir. 1983) (quoting In re Lemelson, 397 F.2d 1006, 1009, 158 U.S.P.Q. 275, 277 (C.C.P.A. 1968)).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AMY P HOANG whose telephone number is (469)295-9134. The examiner can normally be reached M-TH 8:30-5:00PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, JENNIFER WELCH can be reached at 571-272-7212. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/AMY P HOANG/Examiner, Art Unit 2143
/JENNIFER N WELCH/Supervisory Patent Examiner, Art Unit 2143