DETAILED ACTION
This action is responsive to the Application filed on 03/07/2023. Claims 1-14 are pending in the case. Claims 1 and 13 are independent claims.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 U.S.C. § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1 and 3-13 are rejected under 35 U.S.C. § 102(a)(1) as being anticipated by Qiu, “Dynamic Electronic Toll Collection via Multi-Agent Deep Reinforcement Learning with Edge-Based Graph Convolutional Networks.”
Claim 1
Qiu teaches, A system for adaptive control of a heterogeneous system of systems, comprising: a memory having modules stored thereon; and a processor for performing executable instructions in the modules stored on the memory, the modules comprising… (Section 5 pg 5 “All models are implemented by Python and run on a 64-bit server with 500 GB RAM and 80 Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20GHz processors.”) a graph convolutional network (GCN) (pg 1 “We apply a novel edge-based graph convolutional neural network (eGCN) to extract the spatio-temporal correlations of the state features for the road network”) comprising hidden layers, the GCN configured to: (Section 5 “There are 2 dense layers for all neural networks each with 128 neurons for single-agent models” 2 dense layers is synonymous with having 2 hidden layers) receive a time series of graphs, each graph comprising nodes and edges representing topology of an observed environment at a time moment and state of a system (pg 4 figure 1 caption “The architecture of the proposed MARL-eGCN with three modules, i.e., the traffic dynamics, the state representation, and the tolling agent. …At each time step, state matrix is extracted from the traffic dynamics module and fed into the state representation (eGCN) module together with adjacency matrix of the road network.” pg 2 “We define eGCN as f(s, V) where s ∈ R |E|×|Z| is the state matrix of a given road network” the state matrix is represented as a graph; the state is the observed state of the traffic environment depicted in the figure as nodes and edges. Further, the state is a time series representation of graphs as the state matrix is the representation at each time step.)
extract initial features of each graph; process the initial features to extract embedded features according to a series of aggregations and non-linear transformations performed in the hidden layers, (pg 3 “We first apply an embedding layer to each layer to obtain a latent matrix Y and apply a non-linear activation function to Y. We feed a state matrix s t at time step t to eGCN…
[equation image from Qiu]
” the state matrix is fed into the network to extract latent or embedding features according to the aggregation via the matrix multiplication and nonlinear activation function) wherein the embedded features comprise local information for each node; and divide the embedded features into embedded states grouped according to a defined grouping; (Section 4.3 “A key observation is that traffic and road networks in metropolises can be partitioned by geographical distance, demographic distribution and electoral divisions…As shown in Figure 2, there are 4 partitions and each partition has some nodes (zones). The magenta lines in the figure are tolled roads. Therefore, the pricing scheme in each partition is largely based on the traffic condition within the partition and, correspondingly, the pricing scheme between partitions is dominated by traffic conditions between neighbouring partitions…by dividing the planning areas into N partitions…We define the state of agent i of partition i at time step …where s…denotes the number of vehicles on road e of partition i, which travel to zone j of partition i at time step t” the features in the graph are partitioned according to distance or electoral divisions, i.e., local information, for each node or agent. The features are partitioned or divided into defined zones.) a reinforcement learning module comprising a plurality of reinforcement learning algorithms, each algorithm being assigned to a unique group and having an adaptive control policy respectively linked to the unique group, each algorithm configured to (abstract pg 1 “a novel cooperative multi-agent reinforcement learning (MARL) which divides the whole road network into partitions according to their geographic and economic characteristics and trains a tolling agent for each partition” pg 5-6 Section 4.3 “We use Beta distribution as the policy distribution for the bounded and continuous action space, then we define the set of policies…for each agent as π = {πi}i=1,…,N.
Hence, we can derive the gradient of the performance measure J for policy function of agent i as…
[equation image from Qiu]
each agent has its own control policy, and each agent is assigned a unique partition or group for the respective policy.) learn a control action for a given embedded state according to the adaptive control policy; (figure 1 caption “The actor and critic of the tolling agent are updated according to the reward emitted by the traffic dynamics.” and Algorithm 1, pg 4. The actor, or control action, for a given agent is learned or updated according to the embedded states. The for loop in the algorithm shows the learning or updating of each respective actor of the N partitions
[Algorithm 1 excerpt image from Qiu]
)
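For illustration only, the per-partition policy structure quoted above (one Beta-distributed tolling policy per partition, π = {πi}) can be sketched as follows; the class, names, and parameter values are hypothetical stand-ins and do not reproduce Qiu's implementation:

```python
import random

# Hedged sketch (not Qiu's code): each partition i is assigned its own
# tolling agent/policy pi_i, and actions are drawn from a Beta distribution
# over a bounded, continuous toll range, as quoted from Qiu Section 4.3.
class BetaPolicy:
    """Per-partition policy; alpha/beta would come from a neural network."""
    def __init__(self, alpha, beta, toll_max):
        self.alpha, self.beta, self.toll_max = alpha, beta, toll_max

    def act(self):
        # A Beta sample lies in [0, 1]; scale it to the bounded toll range.
        return random.betavariate(self.alpha, self.beta) * self.toll_max

N = 4  # number of partitions (agents); illustrative value
policies = {i: BetaPolicy(alpha=2.0, beta=2.0, toll_max=5.0) for i in range(N)}
tolls = {i: policies[i].act() for i in range(N)}  # one toll action per partition
```

Each policy is linked to exactly one partition, mirroring the mapping of “a unique group … having an adaptive control policy respectively linked to the unique group.”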
receive reward information from the environment including a local reward related to performance specific to the unique group and a global reward related to performance of the whole graph responsive to the control action; (pg 2 Section 3.2 “Conventionally, an MDP is defined as a tuple (S, A, r, T), which consists of a set of states S, a set of actions A, a reward function r” pg 4 “The reward r t i for each agent i given s at time step t is the number of vehicles that arrive at their destinations” algorithm 1
[Algorithm 1 excerpt image from Qiu]
the reward for each agent in N is received in the algorithm; it is local because it is the reward for a given agent. The global reward corresponds to the accumulation of the local rewards across all agents.) and update parameters of the adaptive control policy using state information, control action information, and reward information; (pg 5 “the parameters of actor and critic are updated in line 11 and 12.” Algorithm 1
[update equation images from Qiu]
as shown in the algorithm, the update equation uses the state information, s, the control action, a, and the reward information, r, to update the parameters.) wherein the state information, the control action information and the reward information are also used to update parameters for the hidden layers of the GCN. (pg 5 “We define loss function of value function as follows:…
[loss function equation image from Qiu]
… where φ is the parameter matrix of L… We call our MARL model MARL-eGCN. Algorithm 1 illustrates the detailed description of our MARL-eGCN where embedded features are output by eGCN in line 4” pg 3 Section 4 “Φl is a trainable weight matrix for the l-th layer, and σ(·) is the activation function. The final output of eGCN is put into the neural network of policy and value function as demonstrated in Figure 1.” The eGCN includes trainable parameters which are trained according to the loss function, which is a function of state, action and reward information, and includes hidden layers.)
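The eGCN operation mapped above (aggregation over the edge-state matrix, a non-linear activation, then division of the embedded features into partition groups) can be illustrated with a minimal sketch. The shapes, weights, and grouping below are assumptions for illustration, not Qiu's code:

```python
# Hedged sketch of one eGCN-style layer: aggregate |E| x |Z| edge-state
# features via an adjacency-weighted sum, apply a trainable weight matrix,
# then a non-linear activation; finally group the embedded edge features
# into partitions. All values are toy/hypothetical.
def relu(row):
    return [max(0.0, v) for v in row]

def egcn_layer(state, adj, weight):
    """state: |E| x |Z|; adj: |E| x |E| edge adjacency; weight: |Z| x d."""
    # Aggregation step: neighbour-weighted sum over edges.
    agg = [[sum(adj[e][k] * state[k][z] for k in range(len(state)))
            for z in range(len(state[0]))] for e in range(len(state))]
    # Linear transform by the layer's trainable weight matrix.
    out = [[sum(agg[e][z] * weight[z][d] for z in range(len(weight)))
            for d in range(len(weight[0]))] for e in range(len(agg))]
    # Non-linear activation, yielding the embedded features.
    return [relu(row) for row in out]

# Toy example: 2 edges, 2 zones, embedding dimension 2.
state = [[1.0, 0.0], [0.0, 2.0]]
adj = [[1.0, 1.0], [1.0, 1.0]]       # fully connected edge graph
weight = [[0.5, -0.5], [0.5, 0.5]]
embedded = egcn_layer(state, adj, weight)

# Divide embedded features into embedded states by partition (illustrative).
partitions = {0: [embedded[0]], 1: [embedded[1]]}
```

The grouping step at the end corresponds to the claimed division of embedded features into embedded states per the defined grouping; in Qiu the partitions come from geographic/demographic divisions rather than the trivial split shown here.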
Claim 3
Qiu teaches claim 1
Further Qiu teaches, wherein the graph is static. (pg 5 Section 5.1 “As shown in Figure 3, we choose the road network of Singapore central region (Central Net) and the road network of Singapore all planning areas (Whole Net) as our experimental scenarios…. We merge some zones which have low population and small acreage into their neighbouring zones, and thus we get 11 zones and 40 edges (roads) for Central Net and 33 zones and 124 edges for Whole Net” the graph has a defined, static number of edges/zones.)
Claim 4
Qiu teaches claim 1
Further Qiu teaches, wherein the graph is dynamic such that connections between nodes change dynamically as the nodes move in the environment. (pg 3 Section 4 “we design a novel edge based graph convolutional neural network (eGCN) on DETC road network to extract the spatio-temporal correlations of the state (edge) features… s ∈ R |E|×|Z| is the state matrix of a given road network, and s t is state matrix at time step t which is defined earlier at section 3.2” Section 3.2 pg 2 “where s t e,j denotes the number of vehicles that travel to zone j at time step t on road e” the graph is dynamic insofar as the edge/state features have spatio-temporal correlations which depend on the state at a given time, t. The state is a function of the number of vehicles in certain zones in the environment; therefore the connections between nodes change as the nodes move in the environment.)
Claim 5
Qiu teaches claim 1
Further Qiu teaches, the grouping is defined according to node type. (pg 4 Section 4.3 “As shown in Figure 2, there are 4 partitions and each partition has some nodes (zones)….” pg 5 Section 5.1 “There are 11 zones (planning areas) in Central Net and over 33 zones in Whole Net. We create an abstract road network for both the Central Net and Whole Net based on the distribution of ETC gantries, the arterial roads and highways network of Singapore. We merge some zones which have low population and small acreage into their neighbouring zones, and thus we get 11 zones” zones are merged according to their population/acreage characteristics, i.e., as defined by the zone/node type.)
Claim 6
Qiu teaches claim 1
Further Qiu teaches, wherein the grouping is defined according to domain. (pg 4 Section 4.3 “As shown in Figure 2, there are 4 partitions and each partition has some nodes (zones)….” pg 5 Section 5.1 “There are 11 zones (planning areas) in Central Net and over 33 zones in Whole Net. We create an abstract road network for both the Central Net and Whole Net based on the distribution of ETC gantries, the arterial roads and highways network of Singapore. We merge some zones which have low population and small acreage into their neighbouring zones, and thus we get 11 zones” zones are grouped according to the distribution of roads and highways, i.e., their domain.)
Claim 7
Qiu teaches claim 1
Further Qiu teaches, wherein the grouping is defined according to graph topology. (pg 4 Section 4.3 “As shown in Figure 2, there are 4 partitions and each partition has some nodes (zones)….” Figure 2
[Figure 2 image from Qiu]
as shown in the figure, zones are partitioned according to the topology of the graph)
Claim 8
Qiu teaches claim 1
Further Qiu teaches, wherein the defined grouping is data-driven. (pg 4 Section 4.3 “As shown in Figure 2, there are 4 partitions and each partition has some nodes (zones)….” Pg 5 Section 5.1 “There are 11 zones (planning areas) in Central Net and over 33 zones in Whole Net. We create an abstract road network for both the Central Net and Whole Net based on the distribution of ETC gantries, the arterial roads and highways network of Singapore. We merge some zones which have low population and small acreage into their neighbouring zones, and thus we get 11 zones” The partitioning of zones into groups is based on a priori data such as population, thus data driven.)
Claim 9
Qiu teaches claim 1
Further Qiu teaches, the defined grouping is function driven. (pg 4 Section 4.3 “As shown in Figure 2, there are 4 partitions and each partition has some nodes (zones)….” pg 5 Section 5.1 “There are 11 zones (planning areas) in Central Net and over 33 zones in Whole Net. We create an abstract road network for both the Central Net and Whole Net based on the distribution of ETC gantries, the arterial roads and highways network of Singapore. We merge some zones which have low population and small acreage into their neighbouring zones, and thus we get 11 zones” a decision based on a distribution of data is based on a function of that data; therefore, grouping according to a distribution is function driven.)
Claim 10
Qiu teaches claim 1
Further Qiu teaches, wherein the defined grouping allows nodes of one type to be in different groups (pg 4 Section 4.3 “As shown in Figure 2, there are 4 partitions and each partition has some nodes (zones)….” pg 5 Section 5.1 “There are 11 zones (planning areas) in Central Net and over 33 zones in Whole Net. We create an abstract road network for both the Central Net and Whole Net based on the distribution of ETC gantries, the arterial roads and highways network of Singapore. We merge some zones which have low population and small acreage into their neighbouring zones, and thus we get 11 zones” each zone or node shares the characteristic type of being a “planning area”; each zone defines a different group, and thus nodes of the one shared type appear in different groups.)
Claim 11
Qiu teaches claim 1
Further Qiu teaches, the defined grouping allows a group to contain nodes of different types. (Section 4.3 “As shown in Figure 2, there are 4 partitions and each partition has some nodes (zones). The magenta lines in the figure are tolled roads.” Figure 2
[Figure 2 image from Qiu]
some of the nodes are connected only to tolled roads, some to a mix of tolled and untolled roads, and some to no tolled roads. The number of tolled roads defines a characteristic of a node, and thus defines a difference in type. Each group, therefore, contains nodes of different types.)
Claim 12
Qiu teaches claim 1
Further Qiu teaches, the defined grouping allows all nodes to be of the same type globally (pg 4 Section 4.3 “As shown in Figure 2, there are 4 partitions and each partition has some nodes (zones)….” pg 5 Section 5.1 “There are 11 zones (planning areas) in Central Net and over 33 zones in Whole Net. We create an abstract road network for both the Central Net and Whole Net based on the distribution of ETC gantries, the arterial roads and highways network of Singapore. We merge some zones which have low population and small acreage into their neighbouring zones, and thus we get 11 zones” the nodes are of the same type because, globally, they all represent in part a traffic pattern of the Central/Whole Net.)
Claim 13
Qiu teaches, A method for adaptive control of a heterogeneous system of systems (Section 5 pg 5 “All models are implemented by Python and run on a 64-bit server with 500 GB RAM and 80 Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20GHz processors.”) receiving, by a graph convolutional network (GCN), a time series of graphs, each graph comprising nodes and edges representing topology of an observed environment at a time moment and state of a system, (pg 1 “We apply a novel edge-based graph convolutional neural network (eGCN) to extract the spatio-temporal correlations of the state features for the road network” pg 4 figure 1 caption “The architecture of the proposed MARL-eGCN with three modules, i.e., the traffic dynamics, the state representation, and the tolling agent. …At each time step, state matrix is extracted from the traffic dynamics module and fed into the state representation (eGCN) module together with adjacency matrix of the road network.” pg 2 “We define eGCN as f(s, V) where s ∈ R |E|×|Z| is the state matrix of a given road network” the state matrix is represented as a graph; the state is the observed state of the traffic environment depicted in the figure as nodes and edges. Further, the state is a time series representation of graphs as the state matrix is the representation at each time step.) extracting, by the GCN, initial features of each graph; processing, by the GCN, the initial features to extract embedded features according to a series of aggregations and non-linear transformations performed in the hidden layers (Section 5 “There are 2 dense layers for all neural networks each with 128 neurons for single-agent models” 2 dense layers is synonymous with having 2 hidden layers. pg 3 “We first apply an embedding layer to each layer to obtain a latent matrix Y and apply a non-linear activation function to Y. We feed a state matrix s t at time step t to eGCN…
[equation image from Qiu]
” the state matrix is fed into the network to extract latent or embedding features according to the aggregation via the matrix multiplication and nonlinear activation function) wherein the embedded features comprise local information for each node; and dividing, by the GCN, the embedded features into embedded states grouped according to a defined grouping; (Section 4.3 “A key observation is that traffic and road networks in metropolises can be partitioned by geographical distance, demographic distribution and electoral divisions…As shown in Figure 2, there are 4 partitions and each partition has some nodes (zones). The magenta lines in the figure are tolled roads. Therefore, the pricing scheme in each partition is largely based on the traffic condition within the partition and, correspondingly, the pricing scheme between partitions is dominated by traffic conditions between neighbouring partitions…by dividing the planning areas into N partitions…We define the state of agent i of partition i at time step …where s…denotes the number of vehicles on road e of partition i, which travel to zone j of partition i at time step t” the features in the graph are partitioned according to distance or electoral divisions, i.e., local information, for each node or agent. The features are partitioned or divided into defined zones.)
learning, by a reinforcement learning module algorithm, a control action for a given embedded state according to an adaptive control policy, wherein the algorithm is assigned to a unique group by the grouping policy and having an adaptive control policy respectively linked to the unique group; (abstract pg 1 “a novel cooperative multi-agent reinforcement learning (MARL) which divides the whole road network into partitions according to their geographic and economic characteristics and trains a tolling agent for each partition” pg 5-6 Section 4.3 “We use Beta distribution as the policy distribution for the bounded and continuous action space, then we define the set of policies…for each agent as π = {πi}i=1,..,N . Hence, we can derive the gradient of the performance measure J for policy function of agent i as…
[equation image from Qiu]
each agent has its own control policy, and each agent is assigned a unique partition or group for the respective policy. Figure 1 caption: “The actor and critic of the tolling agent are updated according to the reward emitted by the traffic dynamics.” and Algorithm 1, pg 4. The actor, or control action, for a given agent is learned or updated according to the embedded states. The for loop in the algorithm shows the learning or updating of each respective actor of the N partitions
[Algorithm 1 excerpt image from Qiu]
)
receiving, by the reinforcement learning module algorithm, reward information from the environment including a local reward related to performance specific to the unique group and a global reward related to performance of the whole graph responsive to the control action; (pg 2 Section 3.2 “Conventionally, an MDP is defined as a tuple (S, A, r, T), which consists of a set of states S, a set of actions A, a reward function r” pg 4 “The reward r t i for each agent i given s at time step t is the number of vehicles that arrive at their destinations” algorithm 1
[Algorithm 1 excerpt image from Qiu]
the reward for each agent in N is received in the algorithm; it is local because it is the reward for a given agent. The global reward corresponds to the accumulation of the local rewards across all agents.) and updating, by the reinforcement learning module algorithm, parameters of the adaptive control policy using state information, control action information, and reward information (pg 5 “the parameters of actor and critic are updated in line 11 and 12.” Algorithm 1
[update equation images from Qiu]
as shown in the algorithm, the update equation uses the state information, s, the control action, a, and the reward information, r, to update the parameters.) wherein the state information, the control action information and the reward information are also used to update parameters for the hidden layers of the GCN. (pg 5 “We define loss function of value function as follows:…
[loss function equation image from Qiu]
… where φ is the parameter matrix of L… We call our MARL model MARL-eGCN. Algorithm 1 illustrates the detailed description of our MARL-eGCN where embedded features are output by eGCN in line 4” pg 3 Section 4 “Φl is a trainable weight matrix for the l-th layer, and σ(·) is the activation function. The final output of eGCN is put into the neural network of policy and value function as demonstrated in Figure 1.” The eGCN includes trainable parameters which are trained according to the loss function, which is a function of state, action and reward information, and includes hidden layers.)
Claim Rejections - 35 U.S.C. § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. §§ 102 and 103 (or as subject to pre-AIA 35 U.S.C. §§ 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. § 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 2 and 14 are rejected under 35 U.S.C. § 103 as being unpatentable over Qiu in view of Chen, “Gated Residual Recurrent Graph Neural Networks for Traffic Prediction.”
Claim 2/14
Qiu teaches claim 1/13
Further Qiu teaches, the GCN …configured to:…capture, in the embedded states, graph dynamics as evolutions of nodes and edges at the feature level, including non-linear interactions between nodes at each time step and across multiple time steps, using a set of previous graphs as input (pg 3 “we design a novel edge based graph convolutional neural network (eGCN) on DETC road network to extract the spatio-temporal correlations of the state (edge) features… We define eGCN as f(s, V) where s ∈ R |E|×|Z| is the state matrix of a given road network, and s t is state matrix at time step t which is defined earlier at section 3.2… We first apply an embedding layer to each layer to obtain a latent matrix Y and apply a non-linear activation function to Y” the function f(s, V) captures temporal dynamics across time steps of the connections between nodes of a time series of graphs, captured in the embedded state, which is a non-linear interaction as a result of the non-linear activation function.) and wherein the reinforcement learning module is configured to use the embedded states to anticipate adjustment of group control policies based on functional properties of the nodes and edges. (Section 3.2 pg 2 “The goal of the agent for each time step t is to maximize the expected accumulated reward” algorithm 1 pg 4
[Algorithm 1 excerpt image from Qiu]
the reinforcement learning policy/module, π, uses the embedded state, ρ, to anticipate adjustment of control policies by learning a suitable policy which maximizes reward. This is based on the functional/embedded properties of the nodes and edges of the graph. As described in the specification at paragraph [0018], the anticipation is the intended result achieved by capturing temporal dynamics of the graph in the embedded state, which, as noted in the rejection above, is captured because the graph is a spatio-temporal representation of the state.)
Qiu does not explicitly teach, the GCN further comprises a plurality of recurrent layers
Chen, however, when addressing the use of both convolutional and recurrent layers together in graph neural networks, teaches, the GCN further comprises a plurality of recurrent layers (pg 4-5 “In MRes-RGNN, we utilize graph convolutional, recurrent, and residual neural networks together to extract the spatial-temporal features” pg 6 “In our implementation, two Res-RGNN layers are utilized, and the graph convolution kernel size is set to 64” pg 6 “The proposed framework utilizes one Res-RGNN branch for near time, and one hop1 Res-RGNN branch for daily period.” Each branch is a recurrent layer)
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the graph neural network of Qiu to incorporate recurrent layers as addressed by Chen. One would have been motivated to make such a combination because, as noted by Chen, naïve combinations of CNNs and RNNs “cannot capture the connectivity and globality of traffic networks. In this paper, we first propose to adopt residual recurrent graph neural networks (Res-RGNN) that can capture graph-based spatial dependencies and temporal dynamics jointly” (Chen abstract) and because the combination ensures “that [the system] can discover the periodic patterns among the time series signals.” (Chen pg 2)
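For illustration only, the proposed combination (a graph-convolution step per time step feeding a recurrent layer that carries temporal dynamics, in the manner Chen describes) can be sketched as follows; the simple recurrent cell and toy graph are hypothetical stand-ins, not Chen's actual Res-RGNN:

```python
import math

# Hedged sketch of graph convolution + recurrence: spatial features are
# extracted per time step, and temporal dynamics accumulate in hidden
# state h. All structures and values here are illustrative assumptions.
def graph_conv(x, adj):
    """One scalar feature per node; aggregate neighbours via adjacency adj."""
    n = len(x)
    return [sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]

def recurrent_step(h, z, w=0.5, u=0.5):
    """Minimal recurrent cell: h_t = tanh(w*z_t + u*h_{t-1}), element-wise."""
    return [math.tanh(w * zi + u * hi) for zi, hi in zip(z, h)]

adj = [[0, 1], [1, 0]]               # two-node toy graph
series = [[1.0, 0.0], [0.0, 1.0]]    # node features at t = 0 and t = 1
h = [0.0, 0.0]
for x_t in series:
    # Spatial aggregation first, then the recurrent (temporal) update.
    h = recurrent_step(h, graph_conv(x_t, adj))
# h now reflects both the graph structure and the order of the inputs
```

A genuine Res-RGNN would use gated recurrent units, residual connections, and multiple branches for different periodicities; the sketch only shows the joint spatial-temporal structure the motivation-to-combine relies on.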
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOHNATHAN R GERMICK whose telephone number is (571)272-8363. The examiner can normally be reached M-F 9:30-4:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki can be reached on 571-272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/J.R.G./
Examiner, Art Unit 2122
/KAKALI CHAKI/Supervisory Patent Examiner, Art Unit 2122