Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 2/10/2026 has been entered.
Remarks
This Office Action is responsive to Applicant's Amendment filed on February 10, 2026, in which claims 1, 5, 11, and 15 are currently amended. Claims 2-3, 6-7, and 12-13 are cancelled. Claims 1, 4-5, 8-11, and 14-20 are currently pending.
Response to Arguments
Applicant’s arguments with respect to the rejection of claims 1, 4-5, 8-11, and 14-20 under 35 U.S.C. 103, which are based on the amendment, have been considered. The arguments are moot in view of the new grounds of rejection set forth below.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1, 4-5, 8-11, and 14-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
Regarding claims 1, 5, 11, and 15, “the data from the middle buffer” lacks antecedent basis. “Data from the middle buffer” is recommended.
The remaining claims are rejected with respect to their dependence on the rejected claims.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 4, 11, and 14 are rejected under 35 U.S.C. § 103 as being unpatentable over the combination of Canedo (US20190095806A1) and Baum (US11216717B2).
FIG. 1 of US20190095806A1
Regarding claim 1, Canedo teaches A neural network tiling method by a processor to reduce or eliminate access to external memory while performing processing operations, comprising: generating, by the processor, a neural network graph,([Abstract] "A computer-implemented method for learning structural relationships between nodes of a graph includes generating a knowledge graph comprising nodes representing a system and applying a graph-based convolutional neural network (GCNN) to the knowledge graph to generate feature vectors describing structural relationships between the nodes" [¶0071] "the processing of each subgraph is coded such that it primarily utilizes registers and shared memory and only utilizes device memory as necessary to move data in and out of a thread block")
wherein the neural network graph represents a neural network deployed in the processor,([¶0072] "thread blocks in the same grid can run on the same multiprocessor within the GPU at the same time [...] the individual thread blocks can be selected and configured to optimize training of the SGCNN [...] each thread block is assigned a subgraph of the knowledge graph" See also FIG. 1)
the neural network graph comprises a plurality of vertices, and each vertex represents a calculation unit in the neural network; and ([¶0027] "a graph is defined as G=(V,E), where V is the set of vertices and E is the set of edges. The graph edges can be weighted and directed […] For each vi∈V, the features are defined to be fi" Feature fi is interpreted as synonymous with a calculation unit.)
tiling, by the processor, the neural network graph to obtain a depth subgraph, wherein the depth subgraph represents a depth subnetwork, ([¶0027] "A subgraph is defined as Gs=(Vs,Es) where Vs∈V and Es∈E" See also FIG. 1 subgraphs.)
wherein a plurality of vertices comprised in the processor exchange data with each other by reading and writing an on-chip buffer, ([¶0071] "the processing of each subgraph is coded such that it primarily utilizes registers and shared memory and only utilizes device memory as necessary to move data in and out of a thread block" See also FIG. 12)
the processor is configured to cause the depth subgraph to successively process at least two groups of data obtained by tiling first input data,([¶0031] "NNs are generally divided into an input layer, one or more hidden layers, and an output layer. Various techniques can be used for implementing the SGCNN software. For example, in some embodiments, each layer is implemented as a separate software component configured to perform certain functions" [¶0032] "Continuing with reference to FIG. 1, each function is used as input to an input layer of the SGCNN" layer interpreted as synonymous with group of data. FIG. 1 shows two successive groups of data for processing the input data to obtain first output data.)
to obtain first output data, the first input data is input data of the depth subnetwork, and the first input data comprises one or more signals that can be processed by a computer, ([¶0031] "NNs are generally divided into an input layer, one or more hidden layers, and an output layer. Various techniques can be used for implementing the SGCNN software. For example, in some embodiments, each layer is implemented as a separate software component configured to perform certain functions" [¶0032] "Continuing with reference to FIG. 1, each function is used as input to an input layer of the SGCNN" layer interpreted as synonymous with group of data. FIG. 1 shows two successive groups of data for processing the input data to obtain first output data.)
wherein the depth subnetwork is a subnetwork obtained by tiling the neural network, ([¶0027] "A subgraph is defined as Gs=(Vs,Es) where Vs∈V and Es∈E" See also FIG. 1 subgraphs.)
wherein the depth subnetwork includes a part of the plurality of vertices of the neural network;([¶0027] "A subgraph is defined as Gs=(Vs,Es) where Vs∈V and Es∈E" See also FIG. 1 subgraphs.)
wherein when a quantity of vertices comprised in the depth subgraph is not less than a first threshold, tiling the depth subgraph to obtain a first second-order subgraph and a second second-order subgraph, wherein the first second-order subgraph is used to represent a first second-order subnetwork, the second second-order subgraph is used to represent a second second-order subnetwork, both the first second-order subnetwork and the second second-order subnetwork are comprised in the depth subnetwork, ([¶0054] "we randomly select n paths, and use each path as a row to form a neighbor feature matrix N. Thus N is an n by d matrix with each element being a feature vector of a neighbor node. An example of the relevant matrix operations is shown in FIG. 7. Notice that in this case the number of paths found in P^d is smaller than n, we can simply pad the P^d to make it at least have n number of paths. Next task extracts feature vectors from N as the output of this layer." See also FIG. 1. Canedo teaches that each vertex corresponds to a feature ([¶0027] "For each vi∈V, the features are defined to be fi"). Paths Pd0 and Pd1 are interpreted as the first and second second-order subgraphs.)
and vertices comprised in the first second-order subnetwork are all different from vertices comprised in the second second-order subnetwork, ([¶0030] "In order to classify structures, we utilize labelled examples of structures within the graph. We can then build up the neighborhood around such substructures and sample from these neighborhoods to provide context for the structure. This gives us two levels of context, context within a structure and context around the structure" Context substructures/subgraphs interpreted as second order subnetworks. FIG. 1 shows that GPS context subgraphs have completely different vertices than the "INS" and "HVAC" context subgraphs.)
wherein output data of the first second-order subnetwork is stored into a middle buffer, and the second second-order subnetwork reads the data from the middle buffer,([¶0059] "New fully-connected graph Gm with m number of vertices, and for xi k is the feature vector for node i. Gm which can be used to input into another subgraph convolution kernel layer 260 in the SGCNN architecture" [¶0074] "each thread block is assigned a subgraph of the knowledge graph").
However, Canedo does not explicitly teach wherein the middle buffer is an off-chip buffer whose reading and writing speed is lower than that of the on-chip buffer, and the reading and writing speed of the middle buffer is faster than that of an external memory.
Baum, in the same field of endeavor, teaches wherein the middle buffer is an off-chip buffer whose reading and writing speed is lower than that of the on-chip buffer, and the reading and writing speed of the middle buffer is faster than that of an external memory([Col. 16 l. 56-65] "the interconnect between the layers provides a dual buffer mechanism, so that one layer writes its output to one buffer as the second layer reads the previous output as its input from the second buffer." [Col. 12 l. 48-67] "The lowest hierarchical level is the processing element (PE) 76 with its own dedicated internal Layer 1 or L1 memory 78 in which individual neurons are implemented. A plurality of N PEs 76 along with dedicated Layer 2 or L2 memory 74 make up the next hierarchical level termed a subcluster 70. A plurality of M subclusters 70 along with dedicated Layer 3 or L3 memory 72, a plurality of activation function circuits 80, and a plurality of layer controller (LC) circuits 82 make up a cluster 66. A plurality of L clusters along with dedicated Layer 4 or L4 memory 64 are in the NN processor core 60 which also comprises NN manager circuit 62, and memory interface 68 to off-chip Layer 5 or L5 memory 98. A plurality of bus interfaces 86 (i.e. chip-to-chip interfaces) couple the NN processor to other off-chip NN processor chips for additional network capacity. Bus interface 84 (i.e. chip-to-chip interface) couples the NN processor to a conventional rule based machine (RBM) co-processor 88 comprising a CPU 90" See Table 1).
Canedo and Baum are both directed towards neural network processing and are therefore reasonably pertinent analogous art. It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of Canedo with the teachings of Baum by reading and writing neural network weights from and to a source/destination in the middle of the memory hierarchy (such as RAM). This obviousness is reinforced by Baum, which provides additional motivation for the combination ([Col. 26 l. 60-65] “Limiting access to memory by the compute elements using a memory windowing scheme significantly improves the available bandwidth while greatly reducing the required address and control routing”). This motivation for combination also applies to the remaining claims which depend on this combination.
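For orientation only, the tiling limitations of claim 1 as mapped above can be sketched in a few lines of Python. This is a minimal illustration of the claim language, not the Applicant's implementation and not code from Canedo or Baum. The names `tile_depth_subgraph` and `run_via_middle_buffer`, the midpoint split rule, the threshold value, and the list standing in for the middle buffer are all hypothetical.

```python
from typing import Callable, List, Sequence


def tile_depth_subgraph(vertices: Sequence[str], first_threshold: int) -> List[List[str]]:
    """Split a depth subgraph into two second-order subgraphs.

    Assumption: vertices are listed in execution order and the split
    point is the midpoint. The claim only requires that the two
    second-order subgraphs share no vertices.
    """
    if len(vertices) < first_threshold:
        # Below the threshold the depth subgraph is left untiled.
        return [list(vertices)]
    mid = len(vertices) // 2
    return [list(vertices[:mid]), list(vertices[mid:])]


def run_via_middle_buffer(
    first: Callable[[int], int],
    second: Callable[[int], int],
    groups: Sequence[int],
) -> List[int]:
    """Process each group of tiled input data in turn.

    The first second-order subnetwork stores its output in a middle
    buffer; the second second-order subnetwork reads it from there.
    """
    outputs = []
    for group in groups:
        middle_buffer = [first(group)]               # first subnetwork writes
        outputs.append(second(middle_buffer.pop()))  # second subnetwork reads
    return outputs


# Hypothetical four-vertex depth subgraph and a threshold of 4.
parts = tile_depth_subgraph(["conv1", "relu1", "conv2", "relu2"], first_threshold=4)
assert set(parts[0]).isdisjoint(parts[1])  # the two halves share no vertices

# Three groups of tiled input data, processed successively.
result = run_via_middle_buffer(lambda x: x * 2, lambda x: x + 1, [1, 2, 3])
```

The sketch says nothing about where the middle buffer physically resides; the relative speed of the on-chip buffer, middle buffer, and external memory is the limitation for which Baum is relied upon.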
Regarding claim 4, the combination of Canedo and Baum teaches The neural network tiling method according to claim 1, wherein after the tiling the neural network graph to obtain a depth subgraph, the method further comprises: generating, by the processor, a target instruction corresponding to the depth subgraph, wherein the target instruction is used to execute a target subtask, (Canedo [¶0006] "a Structured Graph Convolutional Neural Network (SGCNN) that is able to perform graph invariant learning tasks at a graph- and subgraph-level." [¶0054] "An example of the relevant matrix operations is shown in FIG. 7. Notice that in this case the number of paths found in P^d is smaller than n, we can simply pad the P^d to make it at least have n number of paths. Next task extracts feature vectors from N as the output of this layer." The relevant matrix operation and feature vector extraction are interpreted as synonymous with target instructions corresponding to target subtasks of the depth subgraph.)
the neural network is configured to execute a target task, and the target subtask is a part of the target task. (Canedo [¶0056] "The task of the subgraph convolution kernel layer 260 shown in FIG. 2 is to extract feature vectors from graphs or subgraphs. An example visualization of the operations performed by the subgraph convolution kernel is presented in FIG. 8" See also FIG. 2 and FIG. 8).
Regarding claims 11 and 14, claims 11 and 14 are directed towards an apparatus for performing the method of claims 1 and 4. Therefore, the rejections applied to claims 1 and 4 also apply to claims 11 and 14. Claims 11 and 14 recite the additional elements A neural network graph tiling apparatus, comprising a memory and a processor, wherein the memory is configured to store code, and the processor is configured to perform the following operations by reading the code stored in the memory: (Canedo [¶0025] “FIG. 12 provides an example of a parallel processing memory architecture 1200 that may be utilized by to perform computations related to execution of SGCNN discussed herein, according to some embodiments of the present invention”).
Claims 5, 8, 15, 16, and 18 are rejected under 35 U.S.C. § 103 as being unpatentable over the combination of Canedo and Dai (“GraphH: A Processing-in-Memory Architecture for Large-Scale Graph Processing”, 2019).
Regarding claim 5, Canedo teaches A neural network-based prediction method by a processor, comprising: obtaining, by the processor, original input data, wherein the original input data comprises one or more signals that can be processed by the processor;([¶0003] "classification for grid-structured Euclidean data (e.g., 1-D signals such as time series, and 2-D data sets such as images)" [¶0032] "Continuing with reference to FIG. 1, each function is used as input to an input layer of the SGCNN. In this example, the subgraphs are “GPS,” “INS” and “HVAC”; however, as indicated by the ellipses, many subgraphs may be identified from the complete graph. Techniques for identifying subgraphs are generally known in the art and, thus not described herein in detail. The input layer generates a feature vector and location information for each functional indicating features of interest and the location of these functional graphs in the complete graph. Using the feature vectors and location information as input, a plurality of hidden layers process the data to provide a functional score for the subgraph")
tiling, by the processor a neural network graph that represents a neural network to generate depth subgraphs;([¶0027] "A subgraph is defined as Gs=(Vs,Es) where Vs∈V and Es∈E" See also FIG. 1 subgraphs.)
inputting, by the processor, the original input data to a neural network deployed in the processor for prediction processing by the depth subgraphs tiling the input data to obtain a prediction result, wherein the prediction processing comprises:([¶0003] "classification for grid-structured Euclidean data (e.g., 1-D signals such as time series, and 2-D data sets such as images)" [¶0032] "Continuing with reference to FIG. 1, each function is used as input to an input layer of the SGCNN. In this example, the subgraphs are “GPS,” “INS” and “HVAC”; however, as indicated by the ellipses, many subgraphs may be identified from the complete graph. Techniques for identifying subgraphs are generally known in the art and, thus not described herein in detail. The input layer generates a feature vector and location information for each functional indicating features of interest and the location of these functional graphs in the complete graph. Using the feature vectors and location information as input, a plurality of hidden layers process the data to provide a functional score for the subgraph")
successively inputting, to a depth subnetwork for processing, at least two groups of data obtained by tiling first input data, ([¶0031] "NNs are generally divided into an input layer, one or more hidden layers, and an output layer. Various techniques can be used for implementing the SGCNN software. For example, in some embodiments, each layer is implemented as a separate software component configured to perform certain functions" [¶0032] "Continuing with reference to FIG. 1, each function is used as input to an input layer of the SGCNN" layer interpreted as synonymous with group of data. FIG. 1 shows two successive groups of data for processing the input data to obtain first output data.)
to obtain first output data, the first input data is input data of the depth subnetwork, and the first input data comprises one or more signals that can be processed by a computer, wherein the depth subnetwork is a subnetwork obtained by tiling the neural network, ([¶0031] "NNs are generally divided into an input layer, one or more hidden layers, and an output layer. Various techniques can be used for implementing the SGCNN software. For example, in some embodiments, each layer is implemented as a separate software component configured to perform certain functions" [¶0032] "Continuing with reference to FIG. 1, each function is used as input to an input layer of the SGCNN" layer interpreted as synonymous with group of data. FIG. 1 shows two successive groups of data for processing the input data to obtain first output data.)
wherein the depth subnetwork includes a part of the plurality of vertices of the neural network;([¶0027] "A subgraph is defined as Gs=(Vs,Es) where Vs∈V and Es∈E" See also FIG. 1 subgraphs.)
wherein when a quantity of vertices comprised in the depth subgraph is not less than a first threshold, tiling the depth subgraph to obtain a first second-order subgraph and a second second-order subgraph, wherein the first second-order subgraph is used to represent a first second-order subnetwork, the second second-order subgraph is used to represent a second second-order subnetwork, both the first second-order subnetwork and the second second-order subnetwork are comprised in the depth subnetwork, ([¶0054] "we randomly select n paths, and use each path as a row to form a neighbor feature matrix N. Thus N is an n by d matrix with each element being a feature vector of a neighbor node. An example of the relevant matrix operations is shown in FIG. 7. Notice that in this case the number of paths found in P^d is smaller than n, we can simply pad the P^d to make it at least have n number of paths. Next task extracts feature vectors from N as the output of this layer." See also FIG. 1. Canedo teaches that each vertex corresponds to a feature ([¶0027] "For each vi∈V, the features are defined to be fi"). Paths Pd0 and Pd1 are interpreted as the first and second second-order subgraphs.)
and vertices comprised in the first second-order subnetwork are all different from vertices comprised in the second second-order subnetwork, ([¶0030] "In order to classify structures, we utilize labelled examples of structures within the graph. We can then build up the neighborhood around such substructures and sample from these neighborhoods to provide context for the structure. This gives us two levels of context, context within a structure and context around the structure" Context substructures/subgraphs interpreted as second order subnetworks. FIG. 1 shows that GPS context subgraphs have completely different vertices than the "INS" and "HVAC" context subgraphs.)
wherein output data of the first second-order subnetwork is stored into a middle buffer, and the second second-order subnetwork reads the data from the middle buffer, ([¶0059] "New fully-connected graph Gm with m number of vertices, and for xi k is the feature vector for node i. Gm which can be used to input into another subgraph convolution kernel layer 260 in the SGCNN architecture" [¶0074] "each thread block is assigned a subgraph of the knowledge graph").
However, Canedo does not explicitly teach wherein the middle buffer is an off-chip buffer whose reading and writing speed is lower than that of the on-chip buffer, and the reading and writing speed of the middle buffer is faster than that of an external memory.
Dai, in the same field of endeavor, teaches wherein the middle buffer is an off-chip buffer whose reading and writing speed is lower than that of the on-chip buffer, and the reading and writing speed of the middle buffer is faster than that of an external memory.([p. 4] "we introduce OVBs as the bridge between the in-order core and the memory […] 16 HMCs can provide up to 16 GB/s×32×16 = 8192 GB/s bandwidth" [p. 10] "the bandwidth of DDR memory is 12.8 GB/s" [p. 9] "By implementing OVB in the logic layer, GraphH achieves 4.58× speedup compared with directly accessing DRAM layers on average" OVB interpreted as on-chip SRAM, HMC/DRAM interpreted as middle buffer, and DDR interpreted as external memory. See also FIG. 3).
Canedo and Dai are both directed towards graph processing and are therefore reasonably pertinent analogous art. It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of Canedo with the teachings of Dai by implementing the neural network graphs of Canedo on the graph accelerator of Dai. Dai provides additional motivation for the combination ([p. 1] “The essential way to improve the performance of large scale graph processing is to provide a higher bandwidth of data access” [p. 9] "By implementing OVB in the logic layer, GraphH achieves 4.58× speedup compared with directly accessing DRAM layers on average"). This motivation for combination also applies to the remaining claims which depend on this combination.
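The bandwidth figures quoted from Dai can be checked with simple arithmetic. The sketch below uses only the numbers quoted above; the variable names are illustrative, and no assumption is made about what the 32 multiplier physically represents. It covers only the two tiers for which the quoted passages give a figure (HMC and DDR).

```python
# Figures quoted from Dai above; variable names are illustrative only.
BASE_GBPS = 16     # the "16 GB/s" term in the quoted product
MULTIPLIER = 32    # the "x32" term in the quoted product
NUM_HMCS = 16      # "16 HMCs"
DDR_GBPS = 12.8    # "the bandwidth of DDR memory is 12.8 GB/s"

# Aggregate HMC bandwidth as the product stated in the quotation.
hmc_aggregate_gbps = BASE_GBPS * MULTIPLIER * NUM_HMCS

# How much faster the HMC tier is than the DDR tier.
ratio = hmc_aggregate_gbps / DDR_GBPS

print(f"HMC aggregate: {hmc_aggregate_gbps} GB/s")
print(f"HMC / DDR ratio: {ratio:.0f}x")
```

The product is 8192 GB/s, matching the quoted figure, and is 640 times the quoted DDR bandwidth, which is consistent with interpreting HMC/DRAM as faster than the DDR external memory.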
Regarding claim 8, the combination of Canedo and Dai teaches The method according to claim 5, wherein at least one vertex in the direct subnetwork performs one processing operation in a process of processing the second input data. (Canedo [¶0027] "a graph is defined as G=(V,E), where V is the set of vertices and E is the set of edges. The graph edges can be weighted and directed […] For each vi∈V, the features are defined to be fi" See also FIG. 1).
Regarding claim 15, claim 15 is directed towards an apparatus for performing the method of claim 5. Therefore, the rejection applied to claim 5 also applies to claim 15.
Regarding claim 16, the combination of Canedo and Dai teaches The apparatus according to claim 15, wherein the prediction processing further comprises: processing, by using a direct subnetwork, second input data as a whole, wherein the direct subnetwork is comprised in the neural network and comprises a part of vertices in the neural network, and the second input data is obtained in the process of inputting the original input data to the neural network for prediction processing. (Canedo [¶0027] "A subgraph is defined as Gs=(Vs,Es) where Vs∈V and Es∈E" See also FIG. 1 subgraphs. The "INS" subgraph is interpreted as synonymous with a direct subnetwork.)
Regarding claim 18, claim 18 is directed towards an apparatus for performing the method of claim 8. Therefore, the rejection applied to claim 8 also applies to claim 18.
Claims 9, 17, and 19 are rejected under 35 U.S.C. § 103 as being unpatentable over the combination of Canedo and Dai, and further in view of Peng (“Towards Efficient Learning of Neural Network Ensembles from Arbitrarily Large Datasets”, 2004).
Regarding claim 9, the combination of Canedo and Dai teaches The method according to claim 5.
However, the combination of Canedo and Dai does not explicitly teach wherein storage space required by the first input data is larger than the available storage space of the on-chip buffer, and storage space required by each of the at least two groups of data is not larger than the available storage space of the on-chip buffer.
Peng, in the same field of endeavor, teaches The method according to claim 5, wherein storage space required by the first input data is larger than the available storage space of the on-chip buffer, ([p. 1 §1] "a dataset is too large to fit into computer main memory")
and storage space required by each of the at least two groups of data is not larger than the available storage space of the on-chip buffer. ([p. 1 §1] and [p. 2] "This problem can be often avoided by reducing redundancies common in real-life data, i.e. selecting a data sample that is as small as possible but still sufficient for learning high-quality predictor" See also Algorithm in FIG. 1).
Peng, like the combination of Canedo and Dai, is directed towards neural networks, and the references are therefore reasonably pertinent analogous art. It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of the combination of Canedo and Dai with the teachings of Peng by chunking the dataset into smaller subsets that fit in local memory. Peng provides additional motivation for the combination ([p. 5 §4] "In this study we proposed a procedure for cost-effective learning of an ensemble of single-layer feedforward neural network predictors from arbitrary large datasets. It builds a series of networks on samples much smaller than the original data and thus avoids the computational overhead associated with learning a complex network using all available data [...] As our experimental study suggested, the proposed approach could learn predictors with near-optimal accuracy with high probability while requiring only modest computational effort that is a function of the inherent complexity of the learning task at hand").
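The size relationship recited in claim 9 (the first input data exceeds the on-chip buffer, while each group obtained by tiling fits) can be illustrated with a short sketch. This is a hypothetical example of the chunking rationale stated above, not Peng's algorithm, which selects reduced data samples rather than partitioning the input. The function `split_into_groups` and the capacity value are assumptions made for illustration.

```python
from typing import List, Sequence


def split_into_groups(data: Sequence[int], buffer_capacity: int) -> List[List[int]]:
    """Partition data into consecutive groups of at most buffer_capacity items."""
    if buffer_capacity <= 0:
        raise ValueError("buffer_capacity must be positive")
    return [
        list(data[start:start + buffer_capacity])
        for start in range(0, len(data), buffer_capacity)
    ]


first_input = list(range(10))  # hypothetical first input data: 10 items
on_chip_capacity = 4           # hypothetical on-chip buffer size, in items

groups = split_into_groups(first_input, on_chip_capacity)

assert len(first_input) > on_chip_capacity               # whole input does not fit
assert all(len(g) <= on_chip_capacity for g in groups)   # every group fits
assert len(groups) >= 2                                  # at least two groups
```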
Regarding claim 17, the combination of Canedo and Dai teaches The apparatus according to claim 16.
However, the combination of Canedo and Dai does not explicitly teach wherein storage space required by the second input data is not larger than available storage space of the on-chip buffer.
Peng, in the same field of endeavor, teaches wherein storage space required by the second input data is not larger than available storage space of the on-chip buffer. ([p. 1 §1] and [p. 2] "This problem can be often avoided by reducing redundancies common in real-life data, i.e. selecting a data sample that is as small as possible but still sufficient for learning high-quality predictor" See also Algorithm in FIG. 1).
Peng, like the combination of Canedo and Dai, is directed towards neural networks, and the references are therefore reasonably pertinent analogous art. It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of the combination of Canedo and Dai with the teachings of Peng by chunking the dataset into smaller subsets that fit in local memory. Peng provides additional motivation for the combination ([p. 5 §4] "In this study we proposed a procedure for cost-effective learning of an ensemble of single-layer feedforward neural network predictors from arbitrary large datasets. It builds a series of networks on samples much smaller than the original data and thus avoids the computational overhead associated with learning a complex network using all available data [...] As our experimental study suggested, the proposed approach could learn predictors with near-optimal accuracy with high probability while requiring only modest computational effort that is a function of the inherent complexity of the learning task at hand").
Regarding claim 19, claim 19 is directed towards an apparatus for performing the method of claim 9. Therefore, the rejection applied to claim 9 also applies to claim 19.
Claims 10 and 20 are rejected under 35 U.S.C. § 103 as being unpatentable over the combination of Canedo, Dai, and Lie (US11488004B2).
Regarding claim 10, the combination of Canedo and Dai teaches The method according to claim 5, wherein the prediction processing further comprises: processing, by the processor using a third second-order subnetwork, fourth input data to obtain second intermediate data; (Canedo [¶0031] "NNs are generally divided into an input layer, one or more hidden layers, and an output layer. Various techniques can be used for implementing the SGCNN software. For example, in some embodiments, each layer is implemented as a separate software component configured to perform certain functions" [¶0032] "Continuing with reference to FIG. 1, each function is used as input to an input layer of the SGCNN" See also FIG. 1 outputs)
and storing the second intermediate data into a middle buffer, wherein the middle buffer is not the on-chip buffer;(Canedo [¶0073] "Continuing with reference to FIG. 12, registers 1255, 1260, and 1265 represent the fast memory available to thread block 1230. Each register is only accessible by a single thread. Thus, for example, register 1255 may only be accessed by thread 1240. Conversely, shared memory is allocated per thread block, so all threads in the block have access to the same shared memory. Thus, shared memory 1235 is designed to be accessed, in parallel, by each thread 1240, 1245, and 1250 in thread block 1230. Threads can access data in shared memory 1235 loaded from device memory 1220 by other threads within the same thread block (e.g., thread block 1230). The device memory 1220 is accessed by all blocks of the grid and may be implemented by using, for example, Dynamic Random-Access Memory (DRAM)." Shared memory 1235 or Device memory 1220 interpreted as synonymous with middle buffer different from thread register (fast memory/buffer) 1255, 1260, and 1265.)
wherein both the third second-order subnetwork and the fourth second-order subnetwork are comprised in the neural network, (Canedo [¶0054] "we randomly select n paths, and use each path as a row to form a neighbor feature matrix N. Thus N is an n by d matrix with each element being a feature vector of a neighbor node. An example of the relevant matrix operations is shown in FIG. 7. Notice that in this case the number of paths found in P^d is smaller than n, we can simply pad the P^d to make it at least have n number of paths. Next task extracts feature vectors from N as the output of this layer." See also FIG. 1. Canedo teaches that each vertex corresponds to a feature ([¶0027] "For each vi∈V, the features are defined to be fi"). Paths Pd0 and Pd1 interpreted as first and second second-order subgraphs.)
vertices comprised in the third second-order subnetwork are all different from vertices comprised in the fourth second-order subnetwork, (Canedo [¶0030] "In order to classify structures, we utilize labelled examples of structures within the graph. We can then build up the neighborhood around such substructures and sample from these neighborhoods to provide context for the structure. This gives us two levels of context, context within a structure and context around the structure" Context substructures/subgraphs are interpreted as second-order subnetworks. FIG. 1 shows that the "GPS" context subgraphs have completely different vertices than the "INS" and "HVAC" context subgraphs.)
and the fourth input data is obtained in the process of inputting the original input data to the neural network for prediction processing. (Canedo [¶0031] "NNs are generally divided into an input layer, one or more hidden layers, and an output layer. Various techniques can be used for implementing the SGCNN software. For example, in some embodiments, each layer is implemented as a separate software component configured to perform certain functions" [¶0032] "Continuing with reference to FIG. 1, each function is used as input to an input layer of the SGCNN" See also FIG. 1 outputs).
However, the combination of Canedo and Dai does not explicitly teach processing, by a fourth second-order subnetwork, the second intermediate data obtained from the middle buffer to obtain fourth output data.
Lie, in the same field of endeavor, teaches processing, by a fourth second-order subnetwork, the second intermediate data obtained from the middle buffer to obtain fourth output data ([Col. 103 l. 15-30] "Multiplier 2911 receives as operands Src A 2951 and Src B 2952 from the data storage locations specified by Source Bits 3024 of Instruction 2950 (see FIG. 30A) and performs an FP multiply (without normalizing and without rounding) of the operands to generate Intermediate Result 2953 (having exponent and mantissa portions). Accumulator 2912 is coupled to Multiplier 2911 and the data storage locations. Accumulator 2912 receives as operands Intermediate Result 2953 from Multiplier 2911 and Src C 2954 from the data storage location specified by Source Bits 3024 of Instruction 2950, and performs an FP add (without normalizing and without rounding) of the operands to generate Mantissa 2955 (as well as an exponent provided to Exponent DP 2915).").
The combination of Canedo and Dai, and Lie, are both directed towards graph neural network processing. Therefore, the combination of Canedo and Dai, and Lie, are analogous art in the same field of endeavor. It would have been obvious before the effective filing date of the claimed invention to combine the teachings of the combination of Canedo and Dai with the teachings of Lie by processing the graph neural network in the combination of Canedo and Dai on a systolic array according to a dataflow graph. Lie provides additional motivation for the combination ([Col. 14 l. 4-20] "In an aspect conceptually related to neuron smearing for accelerated deep learning, techniques in advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency. An array of processing elements performs flow-based computations on wavelets of data").
Regarding claim 20, claim 20 is directed towards an apparatus for performing the method of claim 10. Therefore, the rejection applied to claim 10 also applies to claim 20.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Gao (“TETRIS: Scalable and Efficient Neural Network Acceleration with 3D Memory”, 2017) is directed towards a neural network accelerator with multiple levels of memory.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SIDNEY VINCENT BOSTWICK whose telephone number is (571)272-4720. The examiner can normally be reached M-F 7:30am-5:00pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang, can be reached at (571)270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SIDNEY VINCENT BOSTWICK/Examiner, Art Unit 2124