Prosecution Insights
Last updated: April 19, 2026
Application No. 17/525,636

MEMORY ALLOCATION USING GRAPHS

Status: Non-Final Office Action (§103, §112)

Filed: Nov 12, 2021
Examiner: RICKS, DONNA J
Art Unit: 2618
Tech Center: 2600 (Communications)
Assignee: Nvidia Corporation
OA Round: 6 (Non-Final)
Grant Probability: 77% (favorable)
Estimated OA Rounds: 6-7
Estimated Time to Grant: 2y 9m
Grant Probability with Interview: 86%

Examiner Intelligence

Career Allow Rate: 77%, above average (387 granted / 502 resolved; +15.1% vs Tech Center average)
Interview Lift: +8.8% higher allowance among resolved cases with an interview than without (a moderate lift of roughly +9 points)
Typical Timeline: 2y 9m average prosecution; 30 applications currently pending
Career History: 532 total applications across all art units

Statute-Specific Performance

Statute   This Examiner   vs TC avg
§101      11.1%           -28.9%
§103      58.3%           +18.3%
§102      13.7%           -26.3%
§112       8.5%           -31.5%

Tech Center averages are estimates. Figures are based on career data from 502 resolved cases.

Office Action

Rejections: §103, §112
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 3/27/2026 has been entered.

Claim Rejections - 35 USC § 112

The following is a quotation of the first paragraph of 35 U.S.C. 112(a):

(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:

The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claim 34 is rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention. Claim 34 recites “wherein the one or more graph code nodes are added during execution of the graph.” There is no support for adding graph code nodes during execution of the graph. Therefore this claim limitation is considered to be NEW MATTER.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows: 1. Determining the scope and contents of the prior art. 2. Ascertaining the differences between the prior art and the claims at issue. 3. Resolving the level of ordinary skill in the pertinent art. 4. Considering objective evidence present in the application indicating obviousness or nonobviousness. Claim(s) 1, 8, 14, 21 and 27; 2, 3, 5, 6, 7, 10, 12, 13, 16, 18, 20, 23, 26, 28, 31, 32 and 33 is/are rejected under 35 U.S.C. 103 as being unpatentable over Gupta et al. U.S. Pub. No. 2020/0371761 in view of Norman U.S. Pub. No. 2020/0320367, Hutchison et al. U.S. Pub. No. 2016/0291942, Re: claims 1, 8, 14, 21 and 27, Gupta teaches 1. (Currently Amended) One or more processors comprising: circuitry to, in response to a call of an application programming interface (API), increase a number of graph code nodes in one or more graphs by adding one or more graph code nodes to one or more graphs to allocate memory, (“For example, the array 105 may include digital signal processing engines, cryptographic engines, graphic processing engines, and the like… the DPEs 110 are formed from software-configurable hardened logic – i.e., are hardened… using hardened logic circuitry to form the hardware elements in the DPE 110… can significantly reduce the footprint of the array 105 in the SoC 100.”; Gupta, [0035], [0036], Fig. 1) Fig. 1 illustrates that the SoC includes multiple DPEs, formed from hardened logic circuitry (one or more circuits). (“… the programmer can use the control APIs 1905 to change parameters that control the execution of the dataflow graph 440 on the SoC 100. That is, embodiments herein use the APIs 1905 and corresponding methods to control, interact, and at least partially reconfigure a user application (e.g., the dataflow graph 440) executing on the heterogeneous processing system of the SoC 100 through a local control program compiled from the control source code 430, or by executing the control source code on the PS itself). Using the control APIs 1905, users can manipulate such remotely executing graphs directly as local objects and perform control operations on them, (e.g., for loading and initializing the graphs;…)”; Gupta, [0137], Fig. 4) Via the SoC, the user controls the API, and in response to a call of the APIs to reconfigure, load and initialize dataflow graphs, which include nodes (graph code nodes). Gupta is silent regarding add one or more graph code nodes to one or more graphs to allocate memory, however, Norman teaches this limitation. (“After it is generated, the API 509 is then automatically called, which comprises calling the allocation functions. When called each allocation function adds its respective data nodes 512 or vertex 514 into a version of the graph 512 represented in a second format 512’, and also determines which tile or tiles 4 that node or vertex 512, 514 is to be implemented on in the final program 506, and annotates the graph with this allocation… The result is thus to output a graph 502’ in a second format, which does include the tile-mapping, i.e., is tagged with the allocations of which nodes 512 and vertices 514 are to be implemented on while tiles 4, as determined by calling the memory and vertex allocation functions in the API 509. ”; Norman, [0052], [0053], Fig. 
4) The API includes allocation functions, which when called (in response to a call of the API), add respective data nodes (increase a number of graph code nodes in one or more graphs by adding one or more graph code nodes) or vertex into a version of the graph (to one or more graphs) represented in a second format. (“Note that the API 509 itself does not contain the tile mappings. Rather, it contains allocation functions which, when the API 509 is called, will generate the tile mappings. A memory allocation function such as g.addVariable() declares that there is data node 512 to add to the second-format graph 502’. A vertex function such as g.addVertex() declares that there is compute vertex 514 to add to the second-format graph 502’. When these functions are called they add the relevant nodes 512 and vertices 514 to the second-format graph 502’ and allocate the respective data and computations to tiles 4, tagging the graph 502’ with this information.”; Norman, [0055]) A memory allocation function of the API declares that there is a data node to add (increase a number of graph code nodes in one or more graphs by adding one or more graph code nodes) to the second format graph (to the one or more graphs). A vertex function of the API declares that there is a compute vertex to add (increase a number of graph code nodes in one or more graphs by adding one or more graph code nodes) to the second format graph (to the one or more graphs). Adding nodes and vertices to the graph includes adding nodes or vertices that depend from the one or more graph code nodes to allocate memory. Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta by adding the feature of the circuitry to, in response to a call of an application programming interface (API), increase a number of graph code nodes in one or more graphs by adding one or more graph code nodes to one or more graphs to allocate memory, in order to output a graph in a second format, which is tagged with allocations of which nodes and vertices are to be implemented on which tiles as determined by calling the memory and vertex allocation functions in the API, as taught by Norman ([0053]).

Gupta and Norman are silent regarding the circuitry is further to cause the one or more processors to allocate the memory, based at least in part, on the one or more graph code nodes; however, Hutchison teaches wherein the circuitry is to further cause the one or more processors to allocate the memory based, at least in part, on the one or more graph code nodes. (“Methods and systems for providing an integrated development environment are provided. The method and systems describe an environment that can in some configurations display a DAG, nodes, and edges along with corresponding source code... Since a value cannot be computed before the values it depends upon are computed, a data dependency graph induces partial ordering, i.e., a data dependency graph is a DAG, and the DAG is a partial ordering.”; Hutchison, [0005], [0054]) The DAG (directed acyclic graph) has nodes (graph code nodes), edges with corresponding source code. The DAG can be a data dependency graph (DDG), such that values cannot be computed before the values it depends upon are computed. (“As shown in Fig. 13B, a DDG analyzer (1315) may be configured to receive DDG nodes and arcs (1314) data concerning the arcs connecting a node in the DDG to its dependency nodes and dependent nodes; and generate 1316P memory allocation (1391) and deallocation logic (1392) for the nodes. The memory allocation logic is configured to allocate memory for the intermediate values generated during the computation of the expression corresponding to a specific DDG node and for the final value or collection of values generated by the expression.”; Hutchison, [0098], Fig.
13B) The DDG analyzer (one or more processors) receives DDG nodes, where memory allocation logic is generated for the corresponding nodes. The memory allocation logic of the nodes (graph code nodes) allocates memory (to allocate the memory based, at least in part, on the one or more graph code nodes). Hutchison is combined with Gupta and Norman such that the data flow graphs of Gupta are the DAG-structured DDGs of Hutchison. Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta by adding the feature of the circuitry is further to cause the one or more processors to allocate the memory, based at least in part, on the one or more graph code nodes, in order to enable the DAG-structured DDG to include partial ordering such that the value computed by a dependency node in a DAG-structured DDG to be computed before it can be used as a parameter to the expression corresponding with a dependent node, as taught by Hutchison. ([0080]). Re: claim 8, Gupta teaches 8. (Currently Amended) A system, comprising: one or more computers having one or more processors to, in response to a call of an application programming interface (API), increase a number of graph code nodes in one or more graphs by adding one or more graph code nodes to the one or more graphs to allocate memory, (“For example, the array 105 may include digital signal processing engines, cryptographic engines, graphic processing engines, and the like… the DPEs 110 are formed from software-configurable hardened logic – i.e., are hardened… using hardened logic circuitry to form the hardware elements in the DPE 110… can significantly reduce the footprint of the array 105 in the SoC 100.”; Gupta, [0035], [0036], Fig. 1) Fig. 1 illustrates that the SoC includes multiple DPEs, formed from hardened logic circuitry (one or more computers having one or more processors). (“… the programmer can use the control APIs 1905 to change parameters that control the execution of the dataflow graph 440 on the SoC 100. That is, embodiments herein use the APIs 1905 and corresponding methods to control, interact, and at least partially reconfigure a user application (e.g., the dataflow graph 440) executing on the heterogeneous processing system of the SoC 100 through a local control program compiled from the control source code 430, or by executing the control source code on the PS itself). Using the control APIs 1905, users can manipulate such remotely executing graphs directly as local objects and perform control operations on them, (e.g., for loading and initializing the graphs;…)”; Gupta, [0137], Fig. 4) Via the SoC, the user controls the API, and in response to a call of the APIs to reconfigure, load and initialize dataflow graphs, which include nodes (graph code nodes). Gupta is silent regarding add one or more graph code nodes to one or more graphs to allocate memory, however, Norman teaches this limitation. (“After it is generated, the API 509 is then automatically called, which comprises calling the allocation functions. 
When called each allocation function adds its respective data nodes 512 or vertex 514 into a version of the graph 512 represented in a second format 512’, and also determines which tile or tiles 4 that node or vertex 512, 514 is to be implemented on in the final program 506, and annotates the graph with this allocation… The result is thus to output a graph 502’ in a second format, which does include the tile-mapping, i.e., is tagged with the allocations of which nodes 512 and vertices 514 are to be implemented on while tiles 4, as determined by calling the memory and vertex allocation functions in the API 509. ”; Norman, [0052], [0053], Fig. 4) The API includes allocation functions, which when called (in response to a call of the API), add respective data nodes (increase a number of graph code nodes in one or more graphs by adding one or more graph code nodes) or vertex (increase a number of graph code nodes in one or more graphs by adding one or more graph code nodes) into a version of the graph (to the one or more graphs) represented in a second format. (“Note that the API 509 itself does not contain the tile mappings. Rather, it contains allocation functions which, when the API 509 is called, will generate the tile mappings. A memory allocation function such as g.addVariable() declares that there is data node 512 to add to the second-format graph 502’. A vertex function such as g.addVertex() declares that there is compute vertex 514 to add to the second-format graph 502’. When these functions are called they add the relevant nodes 512 and vertices 514 to the second-format graph 502’ and allocate the respective data and computations to tiles 4, tagging the graph 502’ with this information.”; Norman, [0055]) A memory allocation function of the API declares that there is a data node to add (increase a number of graph code nodes in one or more graphs by adding one or more graph code nodes) to the second format graph (to the one or more graphs). A vertex function of the API declares that there is a compute vertex to add (increase a number of graph code nodes in one or more graphs by adding one or more graph code nodes) to the second format graph (to the one or more graphs). Adding nodes and vertices to the graph includes adding nodes or vertices that depend from the one or more graph code nodes to allocate memory. Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta by adding the feature of the circuitry to, in response to a call of an application programming interface (API), increase a number of graph code nodes in one or more graphs by adding one or more graph code nodes to one or more graphs to allocate memory, in order to output a graph in a second format, which is tagged with allocations of which nodes and vertices are to be implemented on which tiles as determined by calling the memory and vertex allocation functions in the API, as taught by Norman ([0053]). Gupta and Norman are silent regarding, the one or more processors are to allocate the memory based, at least in part, on the one or more graph code nodes, however, Hutchison teaches wherein the one or more processors are to allocate the memory based, at least in part, on the one or more graph code nodes. (“Methods and systems for providing an integrated development environment are provided. The method and systems describe an environment that can in some configurations display a DAG, nodes, and edges along with corresponding source code... 
Since a value cannot be computed before the values it depends upon are computed, a data dependency graph induces partial ordering, i.e., a data dependency graph is a DAG, and the DAG is a partial ordering.”; Hutchison, [0005], [0054]) The DAG (directed acyclic graph) has nodes (graph code nodes), edges with corresponding source code. The DAG can be a data dependency graph (DDG), such that values cannot be computed before the values it depends upon are computed. (“As shown in Fig. 13B, a DDG analyzer (1315) may be configured to receive DDG nodes and arcs (1314) data concerning the arcs connecting a node in the DDG to its dependency nodes and dependent nodes; and generate 1316P memory allocation (1391) and deallocation logic (1392) for the nodes. The memory allocation logic is configured to allocate memory for the intermediate values generated during the computation of the expression corresponding to a specific DDG node and for the final value or collection of values generated by the expression.”; Hutchison, [0098], Fig. 13B) The DDG analyzer (one or more processors) receives DDG nodes, where memory allocation logic is generated for the corresponding nodes. The memory allocation logic of the nodes (graph code nodes) allocates memory (to allocate the memory based, at least in part, on the one or more graph code nodes). Hutchison is combined with Gupta and Norman such that the data flow graphs of Gupta are the DAG-structured DDGs of Hutchison. Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta by adding the feature of the one or more processors are to allocate the memory based, at least in part, on the one or more graph code nodes, in order to enable the DAG-structured DDG to include partial ordering such that the value computed by a dependency node in a DAG-structured DDG to be computed before it can be used as a parameter to the expression corresponding with a dependent node, as taught by Hutchison. ([0080]). Re: claim 14, Gupta teaches, 14. (Currently Amended) A non-transitory machine-readable medium having stored thereon a set of instructions, which if performed by one or more processors, cause the one or more processors to at least: (“One example described herein is non-transitory computer readable storage medium comprising computer readable program code embodied thereon, the program code performs an operation when executed on a computer processor…”; Gupta, [0005]) In response to a call of an application programming interface (API), increase the number of graph code nodes in one or more graphs by adding one or more graph code nodes to one or more graphs to allocate memory (“… the programmer can use the control APIs 1905 to change parameters that control the execution of the dataflow graph 440 on the SoC 100. That is, embodiments herein use the APIs 1905 and corresponding methods to control, interact, and at least partially reconfigure a user application (e.g., the dataflow graph 440) executing on the heterogeneous processing system of the SoC 100 through a local control program compiled from the control source code 430, or by executing the control source code on the PS itself). Using the control APIs 1905, users can manipulate such remotely executing graphs directly as local objects and perform control operations on them, (e.g., for loading and initializing the graphs;…)”; Gupta, [0137], Fig. 
4) Via the SoC, the user controls the API to reconfigure, load and initialize dataflow graphs, which include nodes (graph code nodes) (generate one or more graph code nodes). Gupta is silent regarding add one or more graph code nodes to one or more graphs to allocate memory, however, Norman teaches this limitation. (“After it is generated, the API 509 is then automatically called, which comprises calling the allocation functions. When called each allocation function adds its respective data nodes 512 or vertex 514 into a version of the graph 512 represented in a second format 512’, and also determines which tile or tiles 4 that node or vertex 512, 514 is to be implemented on in the final program 506, and annotates the graph with this allocation… The result is thus to output a graph 502’ in a second format, which does include the tile-mapping, i.e., is tagged with the allocations of which nodes 512 and vertices 514 are to be implemented on while tiles 4, as determined by calling the memory and vertex allocation functions in the API 509. ”; Norman, [0052], [0053], Fig. 4) The API includes allocation functions, which when called (in response to a call of the API), add respective data nodes (increase a number of graph code nodes in one or more graphs by adding one or more graph code nodes) or vertex (increase a number of graph code nodes in one or more graphs by adding one or more graph code nodes) into a version of the graph (to the one or more graphs) represented in a second format. (“Note that the API 509 itself does not contain the tile mappings. Rather, it contains allocation functions which, when the API 509 is called, will generate the tile mappings. A memory allocation function such as g.addVariable() declares that there is data node 512 to add to the second-format graph 502’. A vertex function such as g.addVertex() declares that there is compute vertex 514 to add to the second-format graph 502’. When these functions are called they add the relevant nodes 512 and vertices 514 to the second-format graph 502’ and allocate the respective data and computations to tiles 4, tagging the graph 502’ with this information.”; Norman, [0055]) A memory allocation function of the API declares that there is a data node to add (increase a number of graph code nodes in one or more graphs by adding one or more graph code nodes) to the second format graph (to the one or more graphs). A vertex function of the API declares that there is a compute vertex to add (increase a number of graph code nodes in one or more graphs by adding one or more graph code nodes) to the second format graph (to the one or more graphs). Adding nodes and vertices to the graph includes adding nodes or vertices that depend form the one or more graph code nodes to allocate memory. Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta by adding the feature of the circuitry to, in response to a call of an application programming interface (API), increase a number of graph code nodes in one or more graphs by adding one or more graph code nodes to one or more graphs to allocate memory, in order to output a graph in a second format, which is tagged with allocations of which nodes and vertices are to be implemented on which tiles as determined by calling the memory and vertex allocation functions in the API, as taught by Norman ([0053]). 
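For orientation only, the allocation-function mechanism that the rejection attributes to Norman can be sketched in a few lines of C++. Every name below (Graph, addVariable, addVertex, the round-robin tile picker) is an illustrative assumption, not Norman's published code, Gupta's, or the claimed invention; the sketch only shows an API call that grows a graph by adding data nodes and compute vertices while recording where their memory will live.

    // Hypothetical names throughout; illustrative of an "allocation function"
    // style graph-builder API, not any party's actual implementation.
    #include <cstddef>
    #include <string>
    #include <vector>

    struct Node {
        enum Kind { Data, Compute } kind;
        std::string name;
        std::size_t bytes;  // for Data nodes: how much memory the node represents
        int tile;           // placement recorded when the node is added
    };

    class Graph {
    public:
        // Adding a data node is the allocation step: the graph grows by one node
        // and the node is tagged with the memory (tile) that will back it.
        int addVariable(const std::string& name, std::size_t bytes) {
            nodes_.push_back({Node::Data, name, bytes, pickTile()});
            return static_cast<int>(nodes_.size()) - 1;
        }
        // Adding a compute vertex likewise increases the node count.
        int addVertex(const std::string& name) {
            nodes_.push_back({Node::Compute, name, 0, pickTile()});
            return static_cast<int>(nodes_.size()) - 1;
        }
        std::size_t nodeCount() const { return nodes_.size(); }

    private:
        // Trivial round-robin placement; a real tool would balance memory use.
        int pickTile() { return static_cast<int>(nodes_.size()) % 4; }
        std::vector<Node> nodes_;
    };

    int main() {
        Graph g;
        g.addVariable("in", 1024);   // each call adds a graph node...
        g.addVariable("out", 1024);  // ...and fixes a memory placement for it
        g.addVertex("multiply");
        return g.nodeCount() == 3 ? 0 : 1;
    }

The point the sketch isolates is the one the rejection relies on: calling the builder API is what increases the number of nodes in the graph, and each added data node carries a memory placement with it.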
Gupta and Norman are silent regarding, the one or more processors are to allocate the memory based, at least in part, on the one or more graph code nodes, however Hutchison teaches wherein the one or more processors are to allocate the memory based, at least in part, on the one or more graph code nodes. (“Methods and systems for providing an integrated development environment are provided. The method and systems describe an environment that can in some configurations display a DAG, nodes, and edges along with corresponding source code... Since a value cannot be computed before the values it depends upon are computed, a data dependency graph induces partial ordering, i.e., a data dependency graph is a DAG, and the DAG is a partial ordering.”; Hutchison, [0005], [0054]) The DAG (directed acyclic graph) has nodes (graph code nodes), edges with corresponding source code. The DAG can be a data dependency graph (DDG), such that values cannot be computed before the values it depends upon are computed. (“As shown in Fig. 13B, a DDG analyzer (1315) may be configured to receive DDG nodes and arcs (1314) data concerning the arcs connecting a node in the DDG to its dependency nodes and dependent nodes; and generate 1316P memory allocation (1391) and deallocation logic (1392) for the nodes. The memory allocation logic is configured to allocate memory for the intermediate values generated during the computation of the expression corresponding to a specific DDG node and for the final value or collection of values generated by the expression.”; Hutchison, [0098], Fig. 13B) The DDG analyzer (one or more processors) receives DDG nodes, where memory allocation logic is generated for the corresponding nodes. The memory allocation logic of the nodes (graph code nodes) allocates memory (to allocate the memory based, at least in part, on the one or more graph code nodes). Hutchison is combined with Gupta and Norman such that the data flow graphs of Gupta are the DAG-structured DDGs of Hutchison. Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta by adding the feature of the one or more processors are to allocate the memory based, at least in part, on the one or more graph code nodes, in order to enable the DAG-structured DDG to include partial ordering such that the value computed by a dependency node in a DAG-structured DDG to be computed before it can be used as a parameter to the expression corresponding with a dependent node, as taught by Hutchison. ([0080]). Re: claim 21, Gupta teaches, 21. (Currently Amended) One or more processors, comprising: circuitry to, in response to a call of an application programming interface (API), increase a number of graph code nodes in one or more graphs by adding one or more graph code nodes to one or more graphs to allocate or deallocate memory, (“For example, the array 105 may include digital signal processing engines, cryptographic engines, graphic processing engines, and the like… the DPEs 110 are formed from software-configurable hardened logic – i.e., are hardened… using hardened logic circuitry to form the hardware elements in the DPE 110… can significantly reduce the footprint of the array 105 in the SoC 100.”; Gupta, [0035], [0036], Fig. 1) Fig. 1 illustrates that the SoC includes multiple DPEs, formed from hardened logic circuitry (one or more circuits). 
(“… the programmer can use the control APIs 1905 to change parameters that control the execution of the dataflow graph 440 on the SoC 100. That is, embodiments herein use the APIs 1905 and corresponding methods to control, interact, and at least partially reconfigure a user application (e.g., the dataflow graph 440) executing on the heterogeneous processing system of the SoC 100 through a local control program compiled from the control source code 430, or by executing the control source code on the PS itself). Using the control APIs 1905, users can manipulate such remotely executing graphs directly as local objects and perform control operations on them, (e.g., for loading and initializing the graphs;…)”; Gupta, [0137], Fig. 4) Via the SoC, the user controls the API to reconfigure, load and initialize dataflow graphs, which include nodes (graph code nodes) (generate one or more graph code nodes). Gupta is silent regarding add one or more graph code nodes to one or more graphs to allocate or deallocate memory, however, Norman teaches this limitation. (“After it is generated, the API 509 is then automatically called, which comprises calling the allocation functions. When called each allocation function adds its respective data nodes 512 or vertex 514 into a version of the graph 512 represented in a second format 512’, and also determines which tile or tiles 4 that node or vertex 512, 514 is to be implemented on in the final program 506, and annotates the graph with this allocation… The result is thus to output a graph 502’ in a second format, which does include the tile-mapping, i.e., is tagged with the allocations of which nodes 512 and vertices 514 are to be implemented on while tiles 4, as determined by calling the memory and vertex allocation functions in the API 509. ”; Norman, [0052], [0053], Fig. 4) The API includes allocation functions, which when called (in response to a call of the API), add respective data nodes (increase a number of graph code nodes in one or more graphs by adding one or more graph code nodes) or vertex (increase a number of graph code nodes in one or more graphs by adding one or more graph code nodes) into a version of the graph (to the one or more graphs) represented in a second format. (“Note that the API 509 itself does not contain the tile mappings. Rather, it contains allocation functions which, when the API 509 is called, will generate the tile mappings. A memory allocation function such as g.addVariable() declares that there is data node 512 to add to the second-format graph 502’. A vertex function such as g.addVertex() declares that there is compute vertex 514 to add to the second-format graph 502’. When these functions are called they add the relevant nodes 512 and vertices 514 to the second-format graph 502’ and allocate the respective data and computations to tiles 4, tagging the graph 502’ with this information.”; Norman, [0055]) A memory allocation function of the API declares that there is a data node to add (increase a number of graph code nodes in one or more graphs by adding one or more graph code nodes) to the second format graph (to the one or more graphs). A vertex function of the API declares that there is a compute vertex to add (increase a number of graph code nodes in one or more graphs by adding one or more graph code nodes) to the second format graph (to the one or more graphs). Adding nodes and vertices to the graph includes adding nodes or vertices that depend form the one or more graph code nodes to allocate memory. 
Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta by adding the feature of the circuitry to, in response to a call of an application programming interface (API), increase a number of graph code nodes in one or more graphs by adding one or more graph code nodes to one or more graphs to allocate or deallocate memory, in order to output a graph in a second format, which is tagged with allocations of which nodes and vertices are to be implemented on which tiles as determined by calling the memory and vertex allocation functions in the API, as taught by Norman ([0053]). Gupta and Norman are silent regarding, the circuitry is further to generate a first graph code node to allocate the memory and generate a second graph code node to deallocate the memory, however, Hutchison teaches wherein the circuitry is further to add a first graph code node to allocate the memory or add a second graph code node to deallocate the memory according to the call. (“Methods and systems for providing an integrated development environment are provided. The method and systems describe an environment that can in some configurations display a DAG, nodes, and edges along with corresponding source code.”; Hutchison, [0005]) The DAG (directed acyclic graph) has nodes (graph code nodes), edges with corresponding source code. The DAG can be a data dependency graph (DDG), such that values cannot be computed before the values it depends upon are computed. (“As shown in Fig. 13B, a DDG analyzer (1315) may be configured to receive DDG nodes and arcs (1314) data concerning the arcs connecting a node in the DDG to its dependency nodes and dependent nodes; and generate 1316P memory allocation (1391) and deallocation logic (1392) for the nodes. The memory allocation logic is configured to allocate memory for the intermediate values generated during the computation of the expression corresponding to a specific DDG node and for the final value or collection of values generated by the expression... The memory deallocation logic (1392) may be configured: to free memory for the intermediate values generated during the computation of the expression corresponding to a specific DDG node once the expression has been executed, and to free memory for the final value or collection of values generated by the expression once the final value or collection of values is no longer needed...”; Hutchison, [0098], Fig. 13B) The DDG analyzer (one or more processors) receives DDG nodes, where memory allocation logic and memory deallocation logic is generated for the corresponding nodes. The memory allocation logic of the nodes (first graph code node) allocates memory (to allocate the memory based, at least in part, on the one or more graph code nodes). The memory deallocation logic of the nodes (second graph code node) deallocates memory. Hutchison is combined with Gupta and Norman such that the data flow graphs of Gupta are the DAG-structured DDGs of Hutchison. 
Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta by adding the feature of the circuitry is further to generate a first graph code node to allocate the memory and generate a second graph code node to deallocate the memory, in order to enable the DAG-structured DDG to include partial ordering such that the value computed by a dependency node in a DAG-structured DDG to be computed before it can be used as a parameter to the expression corresponding with a dependent node, as taught by Hutchison ([0080]). Re: claim 27, Gupta teaches, 27. (Currently Amended) A non-transitory machine-readable medium having stored thereon a set of instructions, which if performed by one or more processors, cause the one or more processors to at least: (“One example described herein is non-transitory computer readable storage medium comprising computer readable program code embodied thereon, the program code performs an operation when executed on a computer processor… ”; Gupta, [0005]) an application programming interface (API), increase the number of graph code nodes in one or more graphs by adding one or more graph code nodes to one or more graphs to allocate or deallocate memory (“… the programmer can use the control APIs 1905 to change parameters that control the execution of the dataflow graph 440 on the SoC 100. That is, embodiments herein use the APIs 1905 and corresponding methods to control, interact, and at least partially reconfigure a user application (e.g., the dataflow graph 440) executing on the heterogeneous processing system of the SoC 100 through a local control program compiled from the control source code 430, or by executing the control source code on the PS itself). Using the control APIs 1905, users can manipulate such remotely executing graphs directly as local objects and perform control operations on them, (e.g., for loading and initializing the graphs;…)”; Gupta, [0137], Fig. 4) Based on the 35 U.S.C. 112(a) for claim 27 above, Examiner interprets this independent claim as “… to generate one or more graph code nodes to allocate and deallocate memory.” Via the SoC, the user controls the API to reconfigure, load and initialize dataflow graphs, which include nodes (graph code nodes) (generate one or more graph code nodes). Gupta is silent regarding add one or more graph code nodes to one or more graphs to allocate memory, however, Norman teaches this limitation. (“After it is generated, the API 509 is then automatically called, which comprises calling the allocation functions. When called each allocation function adds its respective data nodes 512 or vertex 514 into a version of the graph 512 represented in a second format 512’, and also determines which tile or tiles 4 that node or vertex 512, 514 is to be implemented on in the final program 506, and annotates the graph with this allocation… The result is thus to output a graph 502’ in a second format, which does include the tile-mapping, i.e., is tagged with the allocations of which nodes 512 and vertices 514 are to be implemented on while tiles 4, as determined by calling the memory and vertex allocation functions in the API 509. ”; Norman, [0052], [0053], Fig. 
4) The API includes allocation functions, which when called (in response to a call of the API), add respective data nodes (increase a number of graph code nodes in one or more graphs by adding one or more graph code nodes) or vertex (increase a number of graph code nodes in one or more graphs by adding one or more graph code nodes) into a version of the graph (to the one or more graphs) represented in a second format. (“Note that the API 509 itself does not contain the tile mappings. Rather, it contains allocation functions which, when the API 509 is called, will generate the tile mappings. A memory allocation function such as g.addVariable() declares that there is data node 512 to add to the second-format graph 502’. A vertex function such as g.addVertex() declares that there is compute vertex 514 to add to the second-format graph 502’. When these functions are called they add the relevant nodes 512 and vertices 514 to the second-format graph 502’ and allocate the respective data and computations to tiles 4, tagging the graph 502’ with this information.”; Norman, [0055]) A memory allocation function of the API declares that there is a data node to add (increase a number of graph code nodes in one or more graphs by adding one or more graph code nodes) to the second format graph (to the one or more graphs). A vertex function of the API declares that there is a compute vertex to add (increase a number of graph code nodes in one or more graphs by adding one or more graph code nodes) to the second format graph (to the one or more graphs). Adding nodes and vertices to the graph includes adding nodes or vertices that depend form the one or more graph code nodes to allocate memory. Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta by adding the feature of the circuitry to, in response to a call of an application programming interface (API), increase a number of graph code nodes in one or more graphs by adding one or more graph code nodes to one or more graphs to allocate memory, in order to output a graph in a second format, which is tagged with allocations of which nodes and vertices are to be implemented on which tiles as determined by calling the memory and vertex allocation functions in the API, as taught by Norman ([0053]). Gupta is are silent regarding, the one or more processors are further to generate a first graph code node to allocate the memory and generate a second graph code node to deallocate the memory, however, Norman and Hutchison teach wherein the one or more processors are further to add a first graph code node to allocate the memory or add a second graph code node to deallocate the memory according to the call. (“Methods and systems for providing an integrated development environment are provided. The method and systems describe an environment that can in some configurations display a DAG, nodes, and edges along with corresponding source code.”; Hutchison, [0005]) The DAG (directed acyclic graph) has nodes (graph code nodes), edges with corresponding source code. The DAG can be a data dependency graph (DDG), such that values cannot be computed before the values it depends upon are computed. (“As shown in Fig. 13B, a DDG analyzer (1315) may be configured to receive DDG nodes and arcs (1314) data concerning the arcs connecting a node in the DDG to its dependency nodes and dependent nodes; and generate 1316P memory allocation (1391) and deallocation logic (1392) for the nodes. 
The memory allocation logic is configured to allocate memory for the intermediate values generated during the computation of the expression corresponding to a specific DDG node and for the final value or collection of values generated by the expression... The memory deallocation logic (1392) may be configured: to free memory for the intermediate values generated during the computation of the expression corresponding to a specific DDG node once the expression has been executed, and to free memory for the final value or collection of values generated by the expression once the final value or collection of values is no longer needed...”; Hutchison, [0098], Fig. 13B) The DDG analyzer (one or more processors) receives DDG nodes, where memory allocation logic and memory deallocation logic is generated for the corresponding nodes. The memory allocation logic of the nodes (first graph code node) allocates memory (to allocate the memory based, at least in part, on the one or more graph code nodes). The memory deallocation logic of the nodes (second graph code node) deallocates memory. Hutchison is combined with Gupta and Norman such that the data flow graphs of Gupta are the DAG-structured DDGs of Hutchison and the nodes of Hutchinson are the added nodes of Norman. Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta by adding the feature of the one or more processors are further to add a first graph code node to allocate the memory or add a second graph code node to deallocate the memory according to the call, in order to enable the DAG-structured DDG to include partial ordering such that the value computed by a dependency node in a DAG-structured DDG to be computed before it can be used as a parameter to the expression corresponding with a dependent node, as taught by Hutchison ([0080]). Re: claim 2, Gupta in view of Norman and Hutchison teach 2. (Previously Presented) The one or more processors of claim 1, wherein the circuitry is further to: obtain code indicating at least the API; and perform the API by at least executing the code. (“Fig. 8 is a kernel source code 425 for defining a kernel in a dataflow graph… the wrapper 610 in the source code of Fig. 6 permits the arguments of the function defined by the kernel to be accessed as ports. In Fig. 8, the kernel source code 425 includes arguments 805 that specify a pointer… to the input data and a pointer... to the output data… the kernel operates on the input data provided by the arguments 805 using an application programming interface (API)… In Fig. 8, the kernel source code 425 includes window APIs for processing input data before it is outputted. For example, the window_readincr is an API which reads the next window using the pointer inputw. Once the operation is performed… another API can be used to output the processed data – e.g., window_writeincr.”; Gupta, [0081], [0082], Fig. 8) The kernel of the dataflow graph (obtain code indicating at least the API) operates on the input data provided by the arguments using an API (perform the API by at least executing the code). Re: claim 3, Gupta in view of Norman and Hutchison teach 3. (Previously Presented) The one or more processors of claim 1, wherein the circuitry is further to: generate a graph data structure; and add the one or more graph code nodes to graph data structure. (“After it is generated, the API 509 is then automatically called, which comprises calling the allocation functions. 
When called each allocation function adds its respective data nodes 512 or vertex 514 into a version of the graph 512 represented in a second format 512’, and also determines which tile or tiles 4 that node or vertex 512, 514 is to be implemented on in the final program 506, and annotates the graph with this allocation… The result is thus to output a graph 502’ in a second format, which does include the tile-mapping, i.e., is tagged with the allocations of which nodes 512 and vertices 514 are to be implemented on while tiles 4, as determined by calling the memory and vertex allocation functions in the API 509. ”; Norman, [0052], [0053], Fig. 4) The API includes allocation functions, which when called, add respective data nodes (add the one or more graph code nodes) or vertex (add the one or more graph code nodes) into a version of the graph (generate a graph structure) represented in a second format (generate a graph data structure and add one or more graph code nodes the graph data structure). (“Note that the API 509 itself does not contain the tile mappings. Rather, it contains allocation functions which, when the API 509 is called, will generate the tile mappings. A memory allocation function such as g.addVariable() declares that there is data node 512 to add to the second-format graph 502’. A vertex function such as g.addVertex() declares that there is compute vertex 514 to add to the second-format graph 502’. When these functions are called they add the relevant nodes 512 and vertices 514 to the second-format graph 502’ and allocate the respective data and computations to tiles 4, tagging the graph 502’ with this information.”; Norman, [0055]) A memory allocation function of the API declares that there is a data node to add (add the one or more graph code nodes to the graph data structure) to the second format graph (generate a graph data structure). A vertex function of the API declares that there is a compute vertex to add (add the one or more graph code nodes to the graph data structure) to the second format graph (generate a graph data structure). Adding nodes and vertices to the graph includes adding nodes or vertices that depend form the one or more graph code nodes generated to allocate memory. Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta by adding the feature of the circuitry is further to: generate a graph data structure; and add the one or more graph code nodes to graph data structure in order to output a graph in a second format, which is tagged with allocations of which nodes and vertices are to be implemented on which tiles as determined by calling the memory and vertex allocation functions in the API, as taught by Norman ([0053]). Re: claim 5, Gupta in view of Norman and Hutchison teach 5. (Previously Presented) The one or more processors of claim 1, wherein the one or more graph code nodes to allocate the memory correspond to a set of graph code nodes to deallocate the memory. (“Methods and systems for providing an integrated development environment are provided. The method and systems describe an environment that can in some configurations display a DAG, nodes, and edges along with corresponding source code.”; Hutchison, [0005]) The DAG (directed acyclic graph) has nodes (graph code nodes), edges with corresponding source code. The DAG can be a data dependency graph (DDG), such that values cannot be computed before the values it depends upon are computed. 
(“As shown in Fig. 13B, a DDG analyzer (1315) may be configured to receive DDG nodes and arcs (1314) data concerning the arcs connecting a node in the DDG to its dependency nodes and dependent nodes; and generate 1316P memory allocation (1391) and deallocation logic (1392) for the nodes. The memory allocation logic is configured to allocate memory for the intermediate values generated during the computation of the expression corresponding to a specific DDG node and for the final value or collection of values generated by the expression... The memory deallocation logic (1392) may be configured: to free memory for the intermediate values generated during the computation of the expression corresponding to a specific DDG node once the expression has been executed, and to free memory for the final value or collection of values generated by the expression once the final value or collection of values is no longer needed...”; Hutchison, [0098], Fig. 13B) The DDG analyzer (one or more processors) receives DDG nodes, where memory allocation logic and memory deallocation logic is generated for the corresponding nodes. The memory allocation logic of the nodes (one or more graph code nodes) allocate memory (to allocate the memory) for values computed for the expression of the DDG nodes. The memory deallocation logic of the nodes (correspond to a set of graph code nodes to deallocate the memory) deallocate or free memory once the expression has been executed or once the values generated by the expression are no longer needed. Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta by adding the feature of the one or more graph code nodes to allocate the memory correspond to a set of graph code nodes to deallocate the memory, in order to enable the DAG-structured DDG to include partial ordering such that the value computed by a dependency node in a DAG-structured DDG to be computed before it can be used as a parameter to the expression corresponding with a dependent node, as taught by Hutchison ([0080]). Re: Claim 6, Gupta in view of Norman and Hutchison teach 5. (Previously Presented) The one or more processors of claim 1, wherein the one or more processors comprise a graphics processing unit (GPU). (“The computer may be embodied as a single computer, as a parallel computing system (a GPU or set of GPUs, a multicore system, etc.)... Any execution orderings, including parallel execution plans, that respect the partial ordering of the DAG are valid, creating opportunities for parallelization. Parallel code may be generated, or parallel execution plans may be followed, across a wide range of potential parallel computing platform targets, including operating system or virtual machine threads/processes running on one or multiple cores on a shared memory machine, a GPU, networked clusters of processors or GPUs,...”; Hutchison, [0069], [0176]) The one or more processors include, for example, a GPU. Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta by adding the feature of the one or more processors comprise a graphics processing unit (GPU), in order to enable the DAG-structured DDG to include partial ordering such that the value computed by a dependency node in a DAG-structured DDG to be computed before it can be used as a parameter to the expression corresponding with a dependent node, as taught by Hutchison ([0080]). 
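As a reader's aid for the Hutchison mapping relied on above (memory allocation logic paired with deallocation logic per DDG node), here is a minimal sketch of that pattern: nodes are visited in dependency order, a buffer is allocated for each node's value, and a dependency's buffer is freed once its last dependent has run. All structure and names are assumptions for illustration; this is not Hutchison's implementation or the applicant's.

    // Illustrative sketch only; structure and names are assumptions.
    #include <cstddef>
    #include <cstdlib>
    #include <vector>

    struct DdgNode {
        std::vector<int> deps;   // indices of dependency nodes
        std::size_t bytes = 64;  // size of the value this node produces
    };

    // Nodes are assumed to be listed in a valid topological (dependency) order.
    void executeDdg(const std::vector<DdgNode>& nodes) {
        std::vector<void*> value(nodes.size(), nullptr);
        std::vector<int> remainingUses(nodes.size(), 0);
        for (const DdgNode& n : nodes)
            for (int d : n.deps) ++remainingUses[d];

        for (std::size_t i = 0; i < nodes.size(); ++i) {
            // "Allocation logic": reserve memory for this node's value.
            value[i] = std::malloc(nodes[i].bytes);
            // ... compute the node's expression into value[i] here ...

            // "Deallocation logic": free a dependency's value once this node
            // was its last remaining dependent.
            for (int d : nodes[i].deps)
                if (--remainingUses[d] == 0) { std::free(value[d]); value[d] = nullptr; }
        }
        // Final values (nodes with no dependents) are freed here to keep the
        // sketch leak-free; a caller would normally keep them until unneeded.
        for (std::size_t i = 0; i < nodes.size(); ++i)
            if (value[i]) std::free(value[i]);
    }

    int main() {
        // c depends on a and b, so a and b are freed right after c runs.
        std::vector<DdgNode> g(3);
        g[2].deps = {0, 1};
        executeDdg(g);
        return 0;
    }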
Re: claims 7 and 26 (which are rejected under the same rationale), Gupta in view of Norman and Hutchison teach 7. (Previously Presented) The one or more processors of claim 1, wherein the circuitry is further to cause one or more devices to perform one or more operations using the memory. (“For each data node 512 in the graph 502, the graph interface 508a automatically inserts in the API 509 a corresponding memory allocation function for determining upon the memory of which tile or tiles 4 the data of the node 512 is to be stored.”; Norman, [0047], Fig. 5) The graph interface (one or more circuits) inserts into the API a memory allocation function (one or more operations using memory) for determining which memory of which tile (one or more devices), of the multi-tile processor, to store the data of a node 512. Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta by adding the feature of the circuitry is further to cause one or more devices to perform one or more operations using the memory, in order to make efficient use of the multi-tile processor by allocating data nodes and vertices across different tiles, as taught by Norman. ([0046]) Re: claim 10, Gupta in view of Norman and Hutchison teach 10. (Previously Presented) The system of claim 8, wherein the one or more processors comprise a parallel processing unit (PPU). (“The computer may be embodied as a single computer, as a parallel computing system (a GPU or set of GPUs, a multicore system, etc.)... Any execution orderings, including parallel execution plans, that respect the partial ordering of the DAG are valid, creating opportunities for parallelization. Parallel code may be generated, or parallel execution plans may be followed, across a wide range of potential parallel computing platform targets, including operating system or virtual machine threads/processes running on one or multiple cores on a shared memory machine, a GPU, networked clusters of processors or GPUs,...”; Hutchison, [0069], [0176]) The computing system is, for example, a parallel computing system that includes a set of GPUs or a multicore system (one or more processors comprise a parallel processing unit (PPU)). Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta by adding the feature of the one or more processors comprise a parallel processing unit (PPU), in order to enable the DAG-structured DDG to include partial ordering such that the value computed by a dependency node in a DAG-structured DDG to be computed before it can be used as a parameter to the expression corresponding with a dependent node, as taught by Hutchison ([0080]). Re: claim 12, Gupta in view of Norman and Hutchison teach 12. (Original) The system of claim 8, wherein the one or more processors are further to: obtain a graph data structure indicating one or more operations; and cause one or more devices to use the graph data structure to perform the one or more operations using the allocated memory. 
(“The input graph 502 comprises a plurality of data nodes 512, a plurality of compute vertices 514, and a plurality of directional edges 516 each connecting from a data node to a vertex 514 or between vertices 514… To make efficient use of the multi-tile processor 2, tis requires that the data nodes 512 and vertices 514 are allocated across different tiles 4, determining which nodes 513 and vertices 514 are to be stored and run on the memory of which tiles… For each data node 512 in the graph 502, the graph interface 508a automatically inserts in the API 509 a corresponding memory allocation function for determining upon the memory of which tile or tiles 4 the data of the node 512 is to be stored.”; Norman, [0043], [0046], [0047], Fig. 5) The multi-tile processor receives the input graph and determines which nodes and vertices are to be stored and run on the memory of which tiles (obtain a graph data structure indicating one or more operations, cause one or more devices to use the graph data structure to perform the one or more operations using the allocated memory). Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta by adding the feature of obtain a graph data structure indicating one or more operations, cause one or more devices to use the graph data structure to perform the one or more operations using the allocated memory, in order to make efficient use of the multi-tile processor by allocating data nodes and vertices across different tiles, as taught by Norman. ([0046]) Re: claim 13, Gupta in view of Norman and Hutchison teach 13. (Original) The system of claim 8, wherein the API is a runtime API. (“The update( ) APIs permit the programmer to update runtime parameters in the dataflow graph by specifying a graph object…”; Gupta, [0143]) The update( ) APIs update runtime parameters for the dataflow graph and are considered to be runtime APIs. Re: claim 16, Gupta in view of Norman and Hutchison teach 16. (Previously Presented) The non-transitory machine-readable medium of claim 14, wherein the set of instructions further include instructions, which if performed by the one or more processors, cause the one or more processors to obtain code comprising parameter values for the API. (“In general, the programmer can use the control APIs 1905 to change parameters that control the execution of the dataflow graph 440 on the SoC 100… embodiments herein use the APIs 1905 and corresponding methods to control, interact, and at least partially reconfigure a user application (e.g., the dataflow graph 440) executing on the heterogeneous processing system of the SoC 100 through a local control program compiled from the control source code 430, or by executing the control source code on the PS itself).”; Gupta, [0137]) The APIs are used to change parameters of the dataflow graph that is executed on the SoC (cause the one or more processors to obtain code comprising parameter values for the API). Re: claim 18, Gupta in view of Norman and Hutchison teach 18. (Previously Presented) The non-transitory machine-readable medium of claim 14, wherein the API is a driver API. (“… the compiler 435 can configure the drivers 1910 in response to detecting the corresponding API 1905 in the control source code 430.”; Gupta, [0140]) The compiler configures drivers that correspond to the API (the API is a driver API). Re: claim 20, Gupta in view of Norman and Hutchison teach 20. 
(Previously Presented) The non-transitory machine readable medium of claim 14, wherein the one or more processors comprise a general-purpose graphics processing unit (GPGPU). (“The computer may be embodied as a single computer, as a parallel computing system (a GPU or set of GPUs, a multicore system, etc.)... Any execution orderings, including parallel execution plans, that respect the partial ordering of the DAG are valid, creating opportunities for parallelization. Parallel code may be generated, or parallel execution plans may be followed, across a wide range of potential parallel computing platform targets, including operating system or virtual machine threads/processes running on one or multiple cores on a shared memory machine, a GPU, networked clusters of processors or GPUs,...”; Hutchison, [0069], [0176]) The computing system is, for example, a parallel computing system that includes a set of GPUs or a multicore system (one or more processors comprise a general purpose graphics processing unit (GPGPU)). Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta by adding the feature of the one or more processors comprise a general purpose graphics processing unit (GPGPU), in order to enable the DAG-structured DDG to include partial ordering such that the value computed by a dependency node in a DAG-structured DDG to be computed before it can be used as a parameter to the expression corresponding with a dependent node, as taught by Hutchison ([0080]). Re: claim 23, Gupta in view of Norman and Hutchison teach 23. (Previously Presented) The one or more processors of claim 21, wherein the one or more graph code nodes are part of a first graph data structure. (“Examples herein describe techniques for generating dataflow graphs using source code for defining kernels and communication links between those kernels… the graph is formed using nodes (e.g., kernels) which are communicatively coupled by edges (e.g., the communication links between the kernels).”; Gupta, [0032]) A dataflow graph (graph data structure) is generated. The dataflow graph is generated by defining kernels (graph code nodes) and edges as part of the data structure (first graph data structure). Re: claim 28, Gupta in view of Norman and Hutchison teach 28. (Previously Presented) The non-transitory machine readable medium of claim 27, wherein the set of instructions further include instructions, which if performed by the one or more processors, cause the one or more processors to: obtain code indicating at least the API to generate the one or more graph code nodes to allocate or deallocate the memory; (“In general, the programmer can use the control APIs 1905 to change parameters that control the execution of the dataflow graph 440 on the SoC 100… embodiments herein use the APIs 1905 and corresponding methods to control, interact, and at least partially reconfigure a user application (e.g., the dataflow graph 440) executing on the heterogeneous processing system of the SoC 100 through a local control program compiled from the control source code 430, or by executing the control source code on the PS itself).”; Gupta, [0137]) The APIs are used to change parameters of the dataflow graph that is executed on the SoC (cause the one or more processors to obtain code indicating at least one API to generate the one or more graph code nodes). 
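Purely for illustration of the control-API behavior described in the Gupta passages above (this is a hypothetical sketch; the DataflowGraph class and its init/update/run methods are invented and are not Gupta's API), a runtime interface that loads a graph and updates its runtime parameters might look like:

```cpp
#include <cstdio>
#include <map>
#include <string>

// Hypothetical graph object exposing runtime control calls of the kind the
// rejection describes: load/initialize the graph, update a runtime parameter,
// and run it for a number of iterations.
class DataflowGraph {
public:
    void init() { std::puts("graph loaded and initialized"); }
    void update(const std::string& param, float value) { params_[param] = value; }
    void run(int iterations) {
        std::printf("running %d iterations with scale=%.2f\n",
                    iterations, params_.count("scale") ? params_.at("scale") : 1.0f);
    }
private:
    std::map<std::string, float> params_;
};

int main() {
    DataflowGraph g;
    g.init();
    g.update("scale", 2.0f);   // change a runtime parameter before (or between) runs
    g.run(4);
}
```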
(“… the programmer can use the control APIs 1905 to change parameters that control the execution of the dataflow graph 440 on the SoC 100. That is, embodiments herein use the APIs 1905 and corresponding methods to control, interact, and at least partially reconfigure a user application (e.g., the dataflow graph 440) executing on the heterogeneous processing system of the SoC 100 through a local control program compiled from the control source code 430, or by executing the control source code on the PS itself). Using the control APIs 1905, users can manipulate such remotely executing graphs directly as local objects and perform control operations on them, (e.g., for loading and initializing the graphs;…)”; Gupta, [0137], Fig. 4) Via the SoC, the user controls the API to reconfigure, load and initialize dataflow graphs, which include nodes (graph code nodes) (generate one or more graph code nodes). execute the code to perform the API to generate the one or more graph code nodes to allocate or deallocate the memory. (“… the programmer can use the control APIs 1905 to change parameters that control the execution of the dataflow graph 440 on the SoC 100. That is, embodiments herein use the APIs 1905 and corresponding methods to control, interact, and at least partially reconfigure a user application (e.g., the dataflow graph 440) executing on the heterogeneous processing system of the SoC 100 through a local control program compiled from the control source code 430, or by executing the control source code on the PS itself). Using the control APIs 1905, users can manipulate such remotely executing graphs directly as local objects and perform control operations on them, (e.g., for loading and initializing the graphs;…)”; Gupta, [0137], Fig. 4) Via the SoC, the user controls the API to reconfigure, load and initialize dataflow graphs, which include nodes (graph code nodes) (generate one or more graph code nodes). Gupta is silent, regarding allocating and deallocating memory, however, Hutchison teaches this limitation. (“As shown in Fig. 13B, a DDG analyzer (1315) may be configured to receive DDG nodes and arcs (1314) data concerning the arcs connecting a node in the DDG to its dependency nodes and dependent nodes; and generate 1316P memory allocation (1391) and deallocation logic (1392) for the nodes. The memory allocation logic is configured to allocate memory for the intermediate values generated during the computation of the expression corresponding to a specific DDG node and for the final value or collection of values generated by the expression... The memory deallocation logic (1392) may be configured: to free memory for the intermediate values generated during the computation of the expression corresponding to a specific DDG node once the expression has been executed, and to free memory for the final value or collection of values generated by the expression once the final value or collection of values is no longer needed...”; Hutchison, [0098], Fig. 13B) The DDG analyzer receives DDG nodes, where memory allocation logic and memory deallocation logic is generated for the corresponding nodes (generating graph code nodes). The memory allocation logic of the nodes (graph code nodes) allocates memory (generate one or more graph code nodes to allocate memory). The memory deallocation logic of the nodes (graph code nodes) deallocates memory (generate one or more graph code nodes to deallocate memory). 
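For illustration only, a minimal sketch of the per-node allocation and deallocation logic Hutchison [0098] describes, allocate before a node's expression is evaluated and free once the value is no longer needed, might look like the following (hypothetical types; not Hutchison's code):

```cpp
#include <cstdlib>

// Hypothetical DDG node: output size plus how many dependents still need its value.
struct DdgNode {
    size_t outputBytes;
    int pendingConsumers;
    void* buffer;
};

// Allocation logic: reserve memory for a node's result before evaluating it.
void allocateForNode(DdgNode& n) { n.buffer = std::malloc(n.outputBytes); }

// Deallocation logic: free a dependency's buffer once every dependent has consumed it.
void releaseDependency(DdgNode& dep) {
    if (--dep.pendingConsumers == 0 && dep.buffer) {
        std::free(dep.buffer);
        dep.buffer = nullptr;
    }
}

int main() {
    DdgNode a{64, 1, nullptr}, b{64, 1, nullptr}, sum{64, 0, nullptr};
    allocateForNode(a);
    allocateForNode(b);
    allocateForNode(sum);      // evaluate sum = f(a, b) into sum.buffer here
    releaseDependency(a);      // a's value no longer needed
    releaseDependency(b);
    std::free(sum.buffer);     // final value released by the caller when done
}
```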
Hutchison is combined with Gupta and Norman such that the data flow graphs of Gupta are the DAG-structured DDGs of Hutchison. Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta by adding the feature of execute the code to perform the API to generate the one or more graph code nodes to allocate or deallocate the memory, in order to enable the DAG-structured DDG to include partial ordering such that the value computed by a dependency node in a DAG-structured DDG to be computed before it can be used as a parameter to the expression corresponding with a dependent node, as taught by Hutchison ([0080]). Re: claim 31, Gupta in view of Norman and Hutchison teach 31. (Previously Presented) The non-transitory machine-readable medium of claim 27, wherein the set of instructions further include instructions, which if performed by the one or more processors, cause the one or more processors to: calculate a first set of operations; and cause one or more devices to use the memory to perform the first set of operations. (“The input graph 502 comprises a plurality of data nodes 512, a plurality of compute vertices 514, and a plurality of directional edges 516 each connecting from a data node to a vertex 514 or between vertices 514… To make efficient use of the multi-tile processor 2, tis requires that the data nodes 512 and vertices 514 are allocated across different tiles 4, determining which nodes 513 and vertices 514 are to be stored and run on the memory of which tiles… For each data node 512 in the graph 502, the graph interface 508a automatically inserts in the API 509 a corresponding memory allocation function for determining upon the memory of which tile or tiles 4 the data of the node 512 is to be stored.”; Norman, [0043], [0046], [0047], Fig. 5) The multi-tile processor receives the input graph and determines which nodes and vertices are to be stored and run on the memory of which tiles (calculate a first set of operations and cause one or more devices to use the memory to perform the set of operations). Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta by adding the feature of the set of instructions further include instructions, which if performed by the one or more processors, cause the one or more processors to: calculate a first set of operations; and cause one or more devices to use the memory to perform the first set of operations, in order to make efficient use of the multi-tile processor by allocating data nodes and vertices across different tiles, as taught by Norman. ([0046]) Re: claim 32, Gupta in view of Norman and Hutchison teach 32. (Previously Presented) The non-transitory machine-readable medium of claim 27, wherein the set of instructions further include instructions, which if performed by the one or more processors, cause the one or more processors to cause one or more devices to perform one or more operations indicated by a graph data structure comprising at least one of the one or more graph code nodes. 
(“The input graph 502 comprises a plurality of data nodes 512, a plurality of compute vertices 514, and a plurality of directional edges 516 each connecting from a data node to a vertex 514 or between vertices 514… To make efficient use of the multi-tile processor 2, tis requires that the data nodes 512 and vertices 514 are allocated across different tiles 4, determining which nodes 513 and vertices 514 are to be stored and run on the memory of which tiles… For each data node 512 in the graph 502, the graph interface 508a automatically inserts in the API 509 a corresponding memory allocation function for determining upon the memory of which tile or tiles 4 the data of the node 512 is to be stored.”; Norman, [0043], [0046], [0047], Fig. 5) The multi-tile processor receives the input graph and determines which nodes and vertices are to be stored and run on the memory of which tiles (perform one or more operations indicated by a graph data structure comprising at least one or more graph code nodes). Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta by adding the feature of the set of instructions further include instructions, which if performed by the one or more processors, cause the one or more processors to cause one or more devices to perform one or more operations indicated by a graph data structure comprising at least one of the one or more graph code nodes, in order to make efficient use of the multi-tile processor by allocating data nodes and vertices across different tiles, as taught by Norman. ([0046]) Re: claim 33, Gupta in view of Fan and Hutchison are silent regarding the one or more circuits are further to perform the API to add one or more graph node codes that depend from the one or more graph code nodes generated to allocate memory, however, Norman teaches 33. (Previously Presented) The processor of claim 1, wherein the circuitry is further to add other one or more graph code nodes that depend from the one or more graph code nodes generated to allocate memory. (“After it is generated, the API 509 is then automatically called, which comprises calling the allocation functions. When called each allocation function adds its respective data nodes 512 or vertex 514 into a version of the graph 512 represented in a second format 512’, and also determines which tile or tiles 4 that node or vertex 512, 514 is to be implemented on in the final program 506, and annotates the graph with this allocation… The result is thus to output a graph 502’ in a second format, which does include the tile-mapping, i.e., is tagged with the allocations of which nodes 512 and vertices 514 are to be implemented on while tiles 4, as determined by calling the memory and vertex allocation functions in the API 509. ”; Norman, [0052], [0053], Fig. 4) The API includes allocation functions, which when called (perform the API), add respective data nodes (add one or more graph code nodes) or vertex (add one or more graph code node) into a version of the graph represented in a second format (perform the API to add one or more graph code nodes that depend from the one or more graph code nodes generated to allocate memory). (“Note that the API 509 itself does not contain the tile mappings. Rather, it contains allocation functions which, when the API 509 is called, will generate the tile mappings. A memory allocation function such as g.addVariable() declares that there is data node 512 to add to the second-format graph 502’. 
A vertex function such as g.addVertex() declares that there is compute vertex 514 to add to the second-format graph 502’. When these functions are called they add the relevant nodes 512 and vertices 514 to the second-format graph 502’ and allocate the respective data and computations to tiles 4, tagging the graph 502’ with this information.”; Norman, [0055]) A memory allocation function of the API declares that there is a data node to add (add one or more graph code nodes) to the second format graph. A vertex function of the API declares that there is a compute vertex to add (add one or more graph code nodes) to the second format graph. Adding nodes and vertices to the graph includes adding nodes or vertices that depend from the one or more graph code nodes generated to allocate memory. Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta by adding the feature of the circuitry is further to perform the API to add one or more graph code nodes that depend from the one or more graph code nodes generated to allocate memory, in order to output a graph in a second format, which is tagged with allocations of which nodes and vertices are to be implemented on which tiles as determined by calling the memory and vertex allocation functions in the API, as taught by Norman ([0053]).

Claim(s) 4, 9, 24 and 25 is/are rejected under 35 U.S.C. 103 as being unpatentable over Gupta in view of Norman and Hutchison as applied to claim 1 above, and further in view of Fan et al. U.S. Patent No. 9,753,813.

Re: claim 4, Gupta, Norman and Hutchison are silent regarding the circuitry is further to perform the API based at least in part on one or more parameter values indicating at least properties of the memory to be allocated, however, Fan teaches 4. (Previously Presented) The one or more processors of claim 1, wherein the circuitry is further to perform the API based at least in part on one or more parameter values indicating at least properties of the memory to be allocated. (“In one example, a customer with at least one provisioned instance can call a “CreateVolume” or similar API, via Web services, which enables the customer to specify the amount of storage to be allocated, such as a value between gigabyte (GB) and 1 terabyte(TB), in 1 GB increments.”; Fan, col. 6, lines 48-53) The CreateVolume API is used to allocate 1GB to 1TB of memory in 1GB increments (one or more parameter values indicating at least properties of the memory to be allocated). The amount of memory to be allocated in 1GB increments is considered to be parameter values indicating at least properties of the memory to be allocated. Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta by adding the feature of the circuitry is further to perform the API based at least in part on one or more parameter values indicating at least properties of the memory to be allocated, in order to allocate the desired amount of resources from the available resources, as taught by Fan. (col 6, lines 53-57)

Re: claim 9, Gupta, Norman and Hutchison are silent regarding the one or more processors are further to perform the API based at least in part on a set of parameter values indicating at least a size of the memory to be allocated, however, Fan teaches 9.
(Original) The system of claim 8, wherein the one or more processors are further to perform the API based at least in part on a set of parameter values indicating at least a size of the memory to be allocated. (“In one example, a customer with at least one provisioned instance can call a “CreateVolume” or similar API, via Web services, which enables the customer to specify the amount of storage to be allocated, such as a value between gigabyte (GB) and 1 terabyte(TB), in 1 GB increments.”; Fan, col. 6, lines 48-53) The CreateVolume API is used to allocate 1GB to 1TB of memory in 1GB increments (a set of parameter values indicating at least a size of the memory to be allocated). The amount of memory to be allocated in 1 GB increments is considered to be parameter values indicating at least a size of the memory to be allocated. Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta by adding the feature of the one or more processors are further to perform the API based at least in part on a set of parameter values indicating at least a size of the memory to be allocated, in order to allocate the desired amount of resources from the available resources, as taught by Fan. (col 6, lines 53-57)

Re: claim 24, Gupta, Norman and Hutchison are silent regarding the circuitry is further to cause a device to allocate the memory based at least in part on an identified memory region, however, Fan teaches 24. (Previously Presented) The one or more processors of claim 21, wherein the circuitry is further to cause a device to allocate the memory based at least in part on an identified memory region. (“… a customer can make a Web service call into an appropriate Application Programming Interface (API) of a Web service layer in the system to provision a data volume and attach that volume to a data instance for that customer… a customer with at least one provisioned instance can call a “CreateVolume” or similar API, via Web services, which enables the customer to specify the amount of storage to be allocated, such as a value between 1 gigabyte (GB) and 1 terabyte (TB), in 1GB increments. Components of the control plane, such as a BDS system manager module, can call into the data plane to allocate the desired amount of storage from the available resources, and can provide the customer with an identifier for the data volume.”; Fan, col. 6, lines 40-57, Fig. 2) Memory is allocated in 1 GB increments to a memory region and identified with an identifier (allocate the memory based at least in part on an identified memory region). Fan is combined with Gupta such that the API generating the graphs of Gupta is used for allocation of the memory of Fan. Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta by adding the feature of the circuitry is further to cause a device to allocate the memory based at least in part on an identified memory region, in order to allocate the desired amount of resources from the available resources, as taught by Fan. (col 6, lines 53-57)

Re: claim 25, Gupta, Norman and Hutchison are silent regarding the circuitry is further to perform the API based at least in part on parameter values indicating constraints for allocating and deallocating the memory, however, Fan teaches 25. (Previously Presented) The one or more processors of claim 21, wherein the circuitry is further to perform the API based at least in part on parameter values indicating constraints for allocating and deallocating the memory.
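For illustration only (a hypothetical sketch, not Fan's Web service), the size constraint that appears in the Fan passages quoted above and below, a whole number of gigabytes between 1 GB and 1 TB, could be enforced by a CreateVolume-style call as follows; the function name is borrowed from the quote, everything else is invented:

```cpp
#include <cstdint>
#include <cstdio>

constexpr std::uint64_t kGiB = 1ULL << 30;

// Hypothetical CreateVolume-style request: the caller passes the amount of
// storage to allocate as a parameter; only whole-GB sizes from 1 GB to 1 TB
// (1024 GB) are accepted.
bool createVolume(std::uint64_t requestedBytes) {
    bool ok = requestedBytes >= kGiB &&
              requestedBytes <= 1024 * kGiB &&
              requestedBytes % kGiB == 0;
    if (ok)
        std::printf("allocated %llu GB volume\n",
                    static_cast<unsigned long long>(requestedBytes / kGiB));
    return ok;
}

int main() {
    createVolume(8 * kGiB);        // accepted: 8 GB, a whole-GB increment
    createVolume(3 * kGiB / 2);    // rejected: 1.5 GB is not a 1 GB increment
}
```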
(“In one example, a customer with at least one provisioned instance can call a “CreateVolume” or similar API, via Web services, which enables the customer to specify the amount of storage to be allocated, such as a value between gigabyte (GB) and 1 terabyte(TB), in 1 GB increments.”; Fan, col. 6, lines 48-53) The CreateVolume API is used to allocate 1GB to 1TB of memory in 1GB increments (a set of parameter values indicating at least a size of the memory to be allocated). The amount of memory to be allocated in 1GB increments is considered to be parameter values indicating constraints for allocating and deallocating memory. Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta by adding the feature of the circuitry is further to perform the API based at least in part on parameter values indicating constraints for allocating and deallocating the memory, in order to allocate the desired amount of resources from the available resources, as taught by Fan. (col 6, lines 53-57) Claim(s) 11, 15 and 17 is/are rejected under 35 U.S.C. 103 as being unpatentable over Gupta in view of Norman and Hutchison as applied to claims 8 and 14 above, and further in view of Garcia-Morchon EP 3 534 253 A1. Re: claim 11 and 15 (which are rejected under the same rationale), Gupta in view of Norman and Hutchison are silent regarding, the one or more graph code nodes encode properties of the allocated memory, however, Garcia-Morchon teaches 11. (Previously Presented) The system of claim 8, wherein the one or more graph code nodes are to encode properties of the allocated memory. (“For example, operation nodes in the dataflow graph may have an associated encoding memory requirement.”; Garcia-Morchon, [0126]) Operation nodes in a dataflow graph (graph code nodes) have an associated encoding memory requirement (encode properties of the allocated memory). Garcia-Morchon is combined with Gupta, Norman and Hutchison such that the method of Gupta includes the encoding of Garcia-Morchon. Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta by adding the feature of the one or more graph code nodes are to encode properties of the allocated memory, in order to assign different encoding from the smallest associated size until the available memory for encodings is exhausted, as taught by Garcia-Morchon. ([0126]) Re: claim 17, Gupta in view of Norman and Hutchison are silent regarding, the one or more graph code nodes are data objects that encode information regarding memory allocation, and further wherein the information is calculated based at least in part on one or more parameter values, however, Garcia-Morchon teaches 17. (Previously Presented) The non-transitory machine-readable medium of claim 14, wherein the one or more graph code nodes are data objects that encode information regarding memory allocation, and further wherein the information is calculated based at least in part on one or more parameter values. (“For example, operation nodes in the dataflow graph may have an associated encoding memory requirement. For example operator may be implemented as an encoded operator in the form of a polynomial over a finite field, or in the form of a look-up table. 
The associated encoding memory requirement may be two to the power of the total bit size of the input(s) times the number of bits in the output of the operator.”; Garcia-Morchon, [0126]) The operator in an operation node in a dataflow graph (one or more graph code nodes) may be implemented as an encoded operator (data objects that encode information regarding memory allocation) in the form of a look-up table. The associated encoding memory requirement (memory allocation) may be two to the power of the total bit size of the input(s), times the number of bits in the output of the operator; a look-up-table operator with 8 input bits and 8 output bits, for instance, would require 2^8 × 8 bits. For example, operation nodes in the dataflow graph may have an associated encoding memory requirement (calculated based at least in part on one or more parameter values). Garcia-Morchon is combined with Gupta, Norman and Hutchison such that the method of Gupta includes the encoding of Garcia-Morchon. Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta by adding the feature of the one or more graph code nodes are data objects that encode information regarding memory allocation, and further wherein the information is calculated based at least in part on one or more parameter values, in order to assign different encoding from the smallest associated size until the available memory for encodings is exhausted, as taught by Garcia-Morchon. ([0126])

Claim(s) 19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Gupta in view of Norman and Hutchison as applied to claim 14 above, and further in view of Fan and Lin U.S. Pub. No. 2020/0202246.

Re: claim 19, Gupta in view of Norman and Hutchison teach 19. (Previously Presented) The non-transitory machine-readable medium of claim 14, wherein the set of instructions further include instructions, which if performed by the one or more processors, cause the one or more processors to: add the one or more graph code nodes as part of a first graph data structure; (“… the programmer can use the control APIs 1905 to change parameters that control the execution of the dataflow graph 440 on the SoC 100. That is, embodiments herein use the APIs 1905 and corresponding methods to control, interact, and at least partially reconfigure a user application (e.g., the dataflow graph 440) executing on the heterogeneous processing system of the SoC 100 through a local control program compiled from the control source code 430, or by executing the control source code on the PS itself). Using the control APIs 1905, users can manipulate such remotely executing graphs directly as local objects and perform control operations on them, (e.g., for loading and initializing the graphs;…)”; Gupta, [0137], Fig. 4) Via the SoC, the user controls the API to reconfigure, load and initialize dataflow graphs, which include nodes (graph code nodes). (“Fig. 6 is a graph source code 420 for defining a dataflow graph… the graph source code 420 can be thought of establishing a data structure in the heterogeneous programming environment which the programmer builds using the kernels 605 and communication links 620.”; Gupta, [0070], Fig. 6) The graph source code defines a data flow graph and establishes a data structure (first graph data structure). The dataflow graph is built using kernels (generate graph code nodes as part of a first graph data structure). Gupta and Hutchison are silent regarding adding the one or more graph code nodes; however, Norman teaches this limitation.
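To make the mechanism in the Norman passages quoted below easier to follow, here is a hypothetical sketch (not Norman's actual API) of allocation functions in the style of g.addVariable() and g.addVertex(): each call adds a node to a second-format graph, so the node count increases, and tags the node with a tile assignment:

```cpp
#include <string>
#include <vector>

// Hypothetical second-format graph: calling an allocation function adds a node
// (increasing the node count) and records which tile the node is mapped to.
struct SecondFormatGraph {
    struct Node { std::string name; bool isVertex; int tile; };
    std::vector<Node> nodes;

    // Memory allocation function: declares a data node and assigns it to a tile.
    int addVariable(const std::string& name) {
        nodes.push_back({name, false, pickTile()});
        return static_cast<int>(nodes.size()) - 1;
    }
    // Vertex function: declares a compute vertex and assigns it to a tile.
    int addVertex(const std::string& name) {
        nodes.push_back({name, true, pickTile()});
        return static_cast<int>(nodes.size()) - 1;
    }
private:
    int nextTile_ = 0;
    int pickTile() { return nextTile_++ % 4; }   // toy round-robin tile mapping
};

int main() {
    SecondFormatGraph g;
    g.addVariable("weights");   // data node added, tagged with a tile
    g.addVertex("matmul");      // compute vertex added, tagged with a tile
}
```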
“After it is generated, the API 509 is then automatically called, which comprises calling the allocation functions. When called each allocation function adds its respective data nodes 512 or vertex 514 into a version of the graph 512 represented in a second format 512’, and also determines which tile or tiles 4 that node or vertex 512, 514 is to be implemented on in the final program 506, and annotates the graph with this allocation… The result is thus to output a graph 502’ in a second format, which does include the tile-mapping, i.e., is tagged with the allocations of which nodes 512 and vertices 514 are to be implemented on while tiles 4, as determined by calling the memory and vertex allocation functions in the API 509. ”; Norman, [0052], [0053], Fig. 4) The API includes allocation functions, which when called, add respective data nodes (add the one or more graph code nodes) or vertex (add the one or more graph code nodes) into a version of the graph (as part of a first graph data structure) represented in a second format. (“Note that the API 509 itself does not contain the tile mappings. Rather, it contains allocation functions which, when the API 509 is called, will generate the tile mappings. A memory allocation function such as g.addVariable() declares that there is data node 512 to add to the second-format graph 502’. A vertex function such as g.addVertex() declares that there is compute vertex 514 to add to the second-format graph 502’. When these functions are called they add the relevant nodes 512 and vertices 514 to the second-format graph 502’ and allocate the respective data and computations to tiles 4, tagging the graph 502’ with this information.”; Norman, [0055]) A memory allocation function of the API declares that there is a data node to add (add the one or more graph code nodes as part of the first data structure) to the second format graph. A vertex function of the API declares that there is a compute vertex to add (add the one or more graph code nodes as part of the first graph data structure) to the second format graph. Adding nodes and vertices to the graph includes adding nodes or vertices that depend form the one or more graph code nodes generated to allocate memory. Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta by adding the feature of add the one or more graph code nodes as part of a first graph data structure, in order to output a graph in a second format, which is tagged with allocations of which nodes and vertices are to be implemented on which tiles as determined by calling the memory and vertex allocation functions in the API, as taught by Norman ([0053]). Gupta, Norman and Hutchison teach cause the memory to be allocated based at least in part on the one or more graph code nodes; (“… the programmer can use the control APIs 1905 to change parameters that control the execution of the dataflow graph 440 on the SoC 100. That is, embodiments herein use the APIs 1905 and corresponding methods to control, interact, and at least partially reconfigure a user application (e.g., the dataflow graph 440) executing on the heterogeneous processing system of the SoC 100 through a local control program compiled from the control source code 430, or by executing the control source code on the PS itself). 
Using the control APIs 1905, users can manipulate such remotely executing graphs directly as local objects and perform control operations on them, (e.g., for loading and initializing the graphs;…)”; Gupta, [0137], Fig. 4) Via the SoC, the user controls the API to reconfigure, load and initialize dataflow graphs, which include nodes (one or more graph code nodes). Gupta is silent regarding allocating memory, however, Fan teaches this limitation. (“… a customer can make a Web service call into an appropriate Application Programming Interface (API) of a Web service layer in the system to provision a data volume and attach that volume to a data instance for that customer… a customer with at least one provisioned instance can call a “CreateVolume” or similar API, via Web services, which enables the customer to specify the amount of storage to be allocated, such as a value between 1 gigabyte (GB) and 1 terabyte (TB), in 1GB increments.”; Fan, col. 6, lines 40-53, Fig. 2) The user uses an API to specify an amount of storage to be allocated. Fan is combined with Gupta such that the API generating the graphs of Gupta is used for allocation of the memory of Fan. Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta by adding the feature of cause the memory to be allocated based at least in part on the one or more graph code nodes, in order to allocate the desired amount of resources from the available resources, as taught by Fan. (col 6, lines 53-57)

Gupta in view of Norman, Hutchison and Fan are silent regarding obtain a second graph data structure indicating one or more operations, however, Lin teaches obtain a second graph data structure indicating one or more operations; (“Data flow graph: A data flow graph is a data structure, in a graph form, that represents a flow direction and a computation relationship of data in computational logic to reflect a design principle and an embodiment of the computational logic… the data flow graph is preloaded onto the platform before computation. This preloading process includes defining a node, an edge, and a parameter at the edge that are included in the data flow graph.”; Lin, [0065]) A data flow graph includes at least one node (a first node of the one or more graph code nodes). (“The second graph data structure in the second computing node stores the name, the size, and the communication peer side identifier of the first data flow graph parameter in the second data flow graph.”; Lin, [0118]) The second graph data structure is obtained and includes a second data flow graph that indicates a flow direction and a computation relationship of data in computational logic (obtain a second graph data structure indicating one or more operations). Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta by adding the feature of obtain a second graph data structure indicating one or more operations, in order to include peer side identifiers to resolve a problem that a process of a communication peer side is unknown in a data flow graph running process, as taught by Lin. ([0129])

Gupta, Hutchison, Fan and Lin are silent regarding cause one or more devices to perform the one or more operations utilizing the allocated memory, however, Norman teaches cause one or more devices to perform the one or more operations utilizing the allocated memory.
(“The input graph 502 comprises a plurality of data nodes 512, a plurality of compute vertices 514, and a plurality of directional edges 516 each connecting from a data node to a vertex 514 or between vertices 514… To make efficient use of the multi-tile processor 2, tis requires that the data nodes 512 and vertices 514 are allocated across different tiles 4, determining which nodes 513 and vertices 514 are to be stored and run on the memory of which tiles… For each data node 512 in the graph 502, the graph interface 508a automatically inserts in the API 509 a corresponding memory allocation function for determining upon the memory of which tile or tiles 4 the data of the node 512 is to be stored.”; Norman, [0043], [0046], [0047], Fig. 5) The multi-tile processor receives the input graph and determines which nodes and vertices are to be stored and run on the memory of which tiles (cause one or more devices to perform the one or more operations using the allocated memory). Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta by adding the feature of cause one or more devices to perform the one or more operations utilizing the allocated memory, in order to make efficient use of the multi-tile processor by allocating data nodes and vertices across different tiles, as taught by Norman. ([0046]) Claim(s) 29 is/are rejected under 35 U.S.C. 103 as being unpatentable over Gupta in view of Norman and Hutchison as applied to claims 27 above, and further in view of Lin. Re: claim 29, Gupta in view of Norman and Hutchison are silent regarding, a first node of the one or more graph code nodes is part of a first graph data structure and a second node of the one or more graph code nodes is part of a second graph data structure, however, Lin teaches 29. (Previously Presented) The non-transitory machine-readable medium of claim 27, wherein the first graph code node of the one or more graph code nodes is part of a first graph data structure and the second graph code node of the one or more graph code nodes is part of a second graph code node data structure. (“Data flow graph: A data flow graph is a data structure, in a graph form, that represents a flow direction and a computation relationship of data in computational logic to reflect a design principle and an embodiment of the computational logic… the data flow graph is preloaded onto the platform before computation. This preloading process includes defining a node, an edge, and a parameter at the edge that are included in the data flow graph.”; Lin, [0065]) A data flow graph includes at least one node (first graph code node of the one or more graph code nodes) (“The first computing node generates a first triplet based on a name, a size and a communication peer side identifier of a first data flow graph parameter in a first graph data structure according to a first interface parameter generation algorithm… The second computing node generates a second triplet based on the name, the size, and a communication peer side identifier of the first data flow graph parameter in the second graph data structure according to a second interface parameter generation algorithm…”; Lin, [0107], [0117]) The first graph data structure includes a first data flow graph, which includes the first graph code node (first graph code node of the one or more graph code nodes is part of a first graph data structure). 
And, the second graph data structure includes a second data flow graph, which includes a second node (second graph code node of the one or more graph code nodes is part of a second graph code node data structure). Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta by adding the feature of the first graph code node of the one or more graph code nodes is part of a first graph data structure and the second graph code node of the one or more graph code nodes is part of a second graph data structure, in order to include peer side identifiers to resolve a problem that a process of a communication peer side is unknown in a data flow graph running process, as taught by Lin. ([0129]) Claim(s) 30 is/are rejected under 35 U.S.C. 103 as being unpatentable over Gupta in view of Norman and Hutchison as applied to claims 27 above, and further in view of Kemisetti et al. U.S. Pub. No. 2022/0083464. Re: claim 30, Gupta in view of Fan and Hutchison are silent, regarding, the set of instructions further include instructions, which if performed by the one or more processors, cause the one or more processors to cause a central processing unit (CPU) to use the one or more graph code nodes to allocate or deallocate the memory, however, Kemisetti teaches 30. (Previously Presented) The non-transitory machine-readable medium of claim 27, wherein the set of instructions further include instructions, which if performed by the one or more processors, cause the one or more processors to cause a central processing unit (CPU) to use the one or more graph code nodes to allocate or deallocate the memory. (“… the memory pool management 198 executes on a central processing unit (CPU) of a system on a chip (SoC)… the request is a request for user space system memory of the SoC to be used by one or more processes of a graphics processing unit (GPU) of the SoC… A binary search tree is a data structure formed as a rooted tree wherein each node stores a key i) that is greater than all keys in the node’s left subtree, and ii) that is less than all the keys in the node’s right subtree… Binary search tree 500 is a self-balancing red-black tree for logical pages memory of 4kB page order… The color of a node is used to facilitate balancing of the binary search tree, e.g., upon insertion (memory block free/available) or deletion (memory block allocated) of a node…”; Kemisetti, [0034], [0051], [0052], Fig. 5) The binary search tree (graph code nodes) is used to allocate memory blocks and free (deallocate) memory blocks. (“… the memory management process executes on an CPU 215 of an SoC device 104 as a kernel graphics support layer (KGSL)… The KGSL maintains separate binary search trees (such as binary search tree 500 for memory blocks of 4 kB page order) for each page order.”; Kemisetti, [0054]) The memory management process executes on a CPU as a KGSL, which includes separate binary search trees (graph code nodes) for each page order (CPU to use one or more graph code nodes to allocate and deallocate memory). 
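For illustration only (a hypothetical sketch, not Kemisetti's KGSL code), the per-page-order free-block tracking just described can be pictured as an ordered tree, commonly a red-black tree in C++ standard library implementations, into which a node is inserted when a block is freed and from which a node is removed when a block is allocated:

```cpp
#include <cstdint>
#include <optional>
#include <set>

// Hypothetical free-block tracker for a single page order (e.g. 4 kB pages):
// the ordered set plays the role of the self-balancing search tree, keyed by
// block address. Freeing inserts a node; allocating removes one.
class FreeBlockTree {
public:
    void freeBlock(std::uint64_t addr) { free_.insert(addr); }

    std::optional<std::uint64_t> allocateBlock() {
        if (free_.empty()) return std::nullopt;
        auto it = free_.begin();           // lowest-addressed free block
        std::uint64_t addr = *it;
        free_.erase(it);                   // node removed => block allocated
        return addr;
    }
private:
    std::set<std::uint64_t> free_;         // typically a red-black tree internally
};

int main() {
    FreeBlockTree order4k;                 // one tree per page order
    order4k.freeBlock(0x1000);
    order4k.freeBlock(0x2000);
    auto a = order4k.allocateBlock();      // returns 0x1000
    (void)a;
}
```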
Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta, by adding the feature of the set of instructions further include instructions, which if performed by the one or more processors, cause the one or more processors to cause a central processing unit (CPU) to use the one or more graph code nodes to allocate or deallocate the memory, in order to enable the memory management process to execute on a CPU of an SoC such that memory is provided for use by one or more processes of the GPU, as taught by Kemisetti ([0006]).

Claim(s) 34, 35, 36 and 37 is/are rejected under 35 U.S.C. 103 as being unpatentable over Gupta in view of Norman and Hutchison as applied to claim 1 above, and further in view of Yu C. et al. OpenMP to CUDA graphs: a compiler-based transformation to enhance the programmability of NVIDIA devices. In Proceedings of the 23rd International Workshop on Software and Compilers for Embedded Systems 2020 May 25 (pp. 42-47).

Re: claim 34, Gupta, Norman and Hutchison are silent regarding the one or more graph code nodes are added during execution of the graph, however, Yu teaches 34. (New) The one or more processors of claim 1, wherein the one or more graph code nodes are added during execution of the graph. (“A CUDA graph is a set of nodes representing operations, i.e., memory operations and kernel launches, connected by edges representing run-after dependencies... CUDA 10 includes explicit APIs for creating graphs, e.g., cudaGraphCreate, to create a graph; cudaGraphAddKernelNode/ cudaGraphAddHostNode, to add a new node to the graph with the corresponding run-after dependencies with previous nodes, to be executed on the host/GPU”; Yu, p. 4, 1st para under “3.2 CUDA Graphs”) The API uses cudaGraphAddKernelNode/cudaGraphAddHostNode to add a new node to the graph for execution on the GPU. Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta, by adding the feature of the one or more graph code nodes are added during execution of the graph, in order to leverage the benefits of CUDA graphs while managing tedious tasks such as data movements, dependencies, and synchronizations, as taught by Yu (p. 3, 7th para under “2 Converging performance and programmability”).

Re: claim 35, Gupta, Norman and Hutchison are silent regarding the one or more graph code nodes are added according to one or more input parameters provided to the API, however, Yu teaches 35. (New) The one or more processors of claim 1, wherein the one or more graph code nodes are added according to one or more input parameters provided to the API. (“A CUDA graph is a set of nodes representing operations, i.e., memory operations and kernel launches, connected by edges representing run-after dependencies... CUDA 10 includes explicit APIs for creating graphs, e.g., cudaGraphCreate, to create a graph; cudaGraphAddKernelNode/ cudaGraphAddHostNode, to add a new node to the graph with the corresponding run-after dependencies with previous nodes, to be executed on the host/GPU”; Yu, p. 4, 1st para under “3.2 CUDA Graphs”) New nodes are added to the graph using parameters such as cudaGraphAddKernelNode/cudaGraphAddHostNode (input parameters) provided to the API.
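The calls Yu names (cudaGraphCreate and cudaGraphAddKernelNode) are public CUDA runtime API; the host-side sketch below creates a graph and adds two kernel nodes, with the dependency list passed to the second call serving as the input parameter that places the new node after the first. The kernel, launch dimensions, and teardown are illustrative only:

```cpp
#include <cuda_runtime.h>
#include <cstdio>

__global__ void scaleKernel(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

int main() {
    int n = 1024;
    float* d_data = nullptr;
    cudaMalloc(&d_data, n * sizeof(float));

    cudaGraph_t graph;
    cudaGraphCreate(&graph, 0);                       // create an empty graph

    // Parameters for a kernel launch node.
    void* args[] = { &d_data, &n };
    cudaKernelNodeParams kp = {};
    kp.func = reinterpret_cast<void*>(scaleKernel);
    kp.gridDim = dim3((n + 255) / 256);
    kp.blockDim = dim3(256);
    kp.sharedMemBytes = 0;
    kp.kernelParams = args;
    kp.extra = nullptr;

    // First node: no dependencies.
    cudaGraphNode_t first;
    cudaGraphAddKernelNode(&first, graph, nullptr, 0, &kp);

    // Second node: the dependency list passed to the call places it after 'first'.
    cudaGraphNode_t second;
    cudaGraphAddKernelNode(&second, graph, &first, 1, &kp);

    cudaGraphDestroy(graph);
    cudaFree(d_data);
    std::printf("graph built with two dependent kernel nodes\n");
}
```

The pDependencies/numDependencies arguments are what the rejection maps to the "input parameters" and "locations" language of claims 35 and 36.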
Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta, by adding the feature of the one or more graph code nodes are added according to one or more input parameters provided to the API, in order to leverage the benefits of CUDA graphs while managing tedious tasks such as data movements, dependencies, and synchronizations, as taught by Yu (p. 3, 7th para under “2 Converging performance and programmability”). Re: claim 36, Gupta, Norman and Hutchison are silent regarding one or more input parameters of the API specify one or more locations to add the one or more graph code nodes to the one or more graphs, however, Yu teaches 36. (New) The one or more processors of claim 1, wherein one or more input parameters of the API specify one or more locations to add the one or more graph code nodes to the one or more graphs. (“A CUDA graph is a set of nodes representing operations, i.e., memory operations and kernel launches, connected by edges representing run-after dependencies... CUDA 10 includes explicit APIs for creating graphs, e.g., cudaGraphCreate, to create a graph; cudaGraphAddKernelNode/ cudaGraphAddHostNode, to add a new node to the graph with the corresponding run-after dependencies with previous nodes, to be executed on the host/GPU”; Yu, p. 4, 1st para under “3.2 CUDA Graphs”) New nodes are added to the graph using input parameters such as, cudaGraphAddKernelNode/ cudaGraphAddHostNode, which include run-after dependencies with previous nodes (specify one or more locations to add the one or more graph code nodes to the one or more graphs). Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta, by adding the feature of one or more input parameters of the API specify one or more locations to add the one or more graph code nodes to the one or more graphs, in order to leverage the benefits of CUDA graphs while managing tedious tasks such as data movements, dependencies, and synchronizations, as taught by Yu (p. 3, 7th para under “2 Converging performance and programmability”). Re: claim 37, Gupta, Norman and Hutchison are silent regarding the one or more graphs include one or more other graph code nodes corresponding to one or more operations and one or more edges corresponding to one or more dependencies between the one or more other graph code nodes, and wherein the one or more graph code nodes are to be added as being dependent from the other graph code nodes, however, Yu teaches 37. (New) The one or more processors of claim 1, wherein the one or more graphs include one or more other graph code nodes corresponding to one or more operations and one or more edges corresponding to one or more dependencies between the one or more other graph code nodes, and wherein the one or more graph code nodes are to be added as being dependent from the other graph code nodes. (“A CUDA graph is a set of nodes representing operations, i.e., memory operations and kernel launches, connected by edges representing run-after dependencies... CUDA 10 includes explicit APIs for creating graphs, e.g., cudaGraphCreate, to create a graph; cudaGraphAddKernelNode/ cudaGraphAddHostNode, to add a new node to the graph with the corresponding run-after dependencies with previous nodes, to be executed on the host/GPU”; Yu, p. 
4, 1st para under “3.2 CUDA Graphs”) A CUDA graph is a set of nodes representing operations (one or more graphs include one or more other graph code nodes corresponding to one or more operations) connected by edges representing run-after dependencies (one or more edges corresponding to one or more dependencies between the one or more other graph code nodes). New nodes are added to the graph using input parameters such as, cudaGraphAddKernelNode/ cudaGraphAddHostNode, which include run-after dependencies with previous nodes (one or more graph code nodes are to be added as being dependent from other graph code nodes). Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date, to modify the method of Gupta, by adding the feature of the one or more graphs include one or more other graph code nodes corresponding to one or more operations and one or more edges corresponding to one or more dependencies between the one or more other graph code nodes, and wherein the one or more graph code nodes are to be added as being dependent from the other graph code nodes, in order to leverage the benefits of CUDA graphs while managing tedious tasks such as data movements, dependencies, and synchronizations, as taught by Yu (p. 3, 7th para under “2 Converging performance and programmability”). Response to Arguments Applicant's arguments filed 3/19/2026 have been fully considered but they are not persuasive. Applicant argues regarding claim 1: “Norman fails to remedy the deficiencies of Gupta. For example, paragraph [0046] of Norman states: "To generate the API 509, the graph interface 508 a systematically searches through the input graph 502 and allocates a memory allocation function to each of the nodes 502 and compute vertices 514." Paragraph [0052] of Norman further states "[a]fter it is generated, the API 509 is then automatically called, which comprises calling the allocation functions. When called, each allocation functions adds its respective data node 512 or vertex 514 into a version of the graph 512 represented in a second format 512', and also determines which tile or tiles 4 that node or vertex 512, 514 is to be implemented on in the final program 506, and annotates the graph with this allocation." As stated, Norman at best searches an input graph that already exists, and then "annotates the graph with this allocation." As also shown in FIGs. 5 and 6 of Norman, the input data node 512 and compute vertex 514 already exist in the input graph. Thus, for at least the reasons above, Norman fails to teach or suggest "circuitry to, in response to a call of an application programming interface (API), increase a number of graph code nodes in one or more graphs by adding one or more graph code nodes to the one or more graphs to allocate memory, wherein the circuitry is to further cause the one or more processors to allocate the memory based, at least in part, on the one or more graph code nodes," as recited in claim 1. For at least reasons discussed above, Applicant respectfully submits that the proposed combination of Gupta, Norman, and Hutchison does not teach such subject matter as recited in claims 8 and 14. Therefore, Applicant respectfully submits that claims 8 and 14 are allowable under 35 U.S.C. § 103 over Gupta in view of Norman and further in view of Hutchison.“ Examiner disagrees. Norman teaches this amended limitation of claim 1. 
Norman teaches in [0046], “To generate the API 509, the graph interface 508a systematically searches through an input graph 502 and allocates a memory allocation function to each of the nodes 502 and compute vertices 514.” Thus, the API searches through an input graph and allocates a memory allocation function to each of the nodes and compute vertices. And, Norman teaches in [0052], “After it is generated, the API 509 is then automatically called, which comprises calling the allocation functions. When called each allocation function adds its respective data nodes 512 or vertex 514 into a version of the graph 512 represented in a second format 512’, and also determines which tile or tiles 4 that node or vertex 512, 514 is to be implemented on in the final program 506, and annotates the graph with this allocation…” (Norman, [0052], Fig. 4). Thus, after the memory allocation functions are added to each of the nodes and compute vertices, the allocation functions are called, which causes each of the allocation functions to add its respective data nodes or vertex to a version of the graph. Adding data nodes or vertices to the graph teaches the amended limitation of “increases a number of graph code nodes in one or more graphs to allocate memory.“ Applicant's arguments filed 3/19/2026 have been fully considered but they are not persuasive. Applicant argues regarding claims 8 and 14: “Applicant respectfully submits that claims 8 and 14 are allowable at least for reasons including some of those discussed above in connection with claim 1. For example, claim 8 recites "one or more computers having one or more processors to, in response to a call of an application programming interface (API), increase a number of graph code nodes in one or more graphs by adding one or more graph code nodes to one or more graphs to allocate memory, wherein the one or more processors are to allocate the memory based, at least in part, on the one or more graph code nodes." For example, claim 14 recites in response to a call of an application programming interface (API), increase a number of graph code nodes in one or more graphs by adding one or more graph code nodes to one or more graphs to allocate memory, wherein the one or more processors are to allocate the memory based, at least in part, on the one or more graph code nodes." ” Examiner disagrees. As discussed immediately above regarding claim 1, Norman teaches the amended limitations of claims 8 and 14. Applicant's arguments filed 3/19/2026 have been fully considered but they are not persuasive. Applicant argues regarding claim 21: “Without acquiescing to the rejection, claim 21 has been amended to further expedite prosecution. Applicant respectfully submits that claims 21 is allowable at least for reasons including some of those discussed above in connection with claim 1. For example, claim 21 recites "circuitry to, in response to a call of an application programming interface (API), increase a number of graph code nodes in one or more graphs by adding one or more graph code nodes to one or more graphs to allocate or deallocate memory, wherein the circuitry is further to add a first graph code node to allocate the memory or add a second graph code node to deallocate the memory according to the call." ” Examiner disagrees. As discussed above regarding claim 1, this amended limitation is taught by Norman. Please see the rejection for claim 21. Applicant's arguments filed 3/19/2026 have been fully considered but they are not persuasive. 
Applicant argues regarding claim 27: “Applicant respectfully submits that claims 27 is allowable at least for reasons including some of those discussed above in connection with claim 21. For example, claim 27 recites "in response to a call of an application programming interface (API), increase a number of graph code nodes in one or more graphs by adding one or more graph code nodes to one or more graphs to allocate or deallocate memory, wherein the one or more processors are further to add a first graph code node to allocate the memory or add a second graph code node to deallocate the memory according to the call." ” Examiner disagrees. As discussed above regarding claim 1, this amended limitation is taught by Norman. Please see the rejection for claim 27. Applicant's arguments filed 3/19/2026 have been fully considered but they are not persuasive. Applicant argues regarding dependent claims 2-7, 9-13, 15-20, 22-26 and 28-32: “Claims 2-7, 9-13, 15-20, 22-26 and 28-32 each depend from one of claims 1, 8, 14, 21 and 27 described above. Accordingly, Applicant respectfully submits that claims 2-7, 9-13, 15- 20, 22-26 and 28-32 are allowable at least for depending from an allowable independent claim. In addition, Applicant respectfully submits that at least some of claims 2-7, 9-13, 15-20, 22-26 and 28-32 additionally recite patentable subject matter not taught or otherwise rendered obvious by Gupta, Fan, Hutchison, Norman, Garcia-Morchon, Lin, and Kemisetti, individually or in combination.” Examiner disagrees. Claims 1, 8, 14, 21, 27 as well as claims 2-7, 9-13, 15-20, 22-26 and 28-32 have been rejected, please see the corresponding rejections. Conclusion Any inquiry concerning this communication or earlier communications from the examiner should be directed to DONNA J RICKS whose telephone number is (571)270-7532. The examiner can normally be reached on M-F 7:30am-5pm EST (alternate Fridays off). Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Devona Faulk can be reached on 571-272-7776. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /Donna J. Ricks/Examiner, Art Unit 2612 /DEVONA E FAULK/Supervisory Patent Examiner, Art Unit 2618

Prosecution Timeline

Nov 12, 2021
Application Filed
Oct 18, 2023
Non-Final Rejection — §103, §112
Mar 04, 2024
Interview Requested
Mar 12, 2024
Examiner Interview Summary
Mar 12, 2024
Applicant Interview (Telephonic)
Apr 23, 2024
Response Filed
Jul 18, 2024
Final Rejection — §103, §112
Jan 27, 2025
Request for Continued Examination
Jan 29, 2025
Response after Non-Final Action
Feb 10, 2025
Non-Final Rejection — §103, §112
Apr 04, 2025
Interview Requested
May 06, 2025
Examiner Interview Summary
May 06, 2025
Applicant Interview (Telephonic)
Jun 13, 2025
Response Filed
Jul 11, 2025
Non-Final Rejection — §103, §112
Aug 22, 2025
Interview Requested
Oct 02, 2025
Examiner Interview Summary
Oct 02, 2025
Applicant Interview (Telephonic)
Oct 16, 2025
Response Filed
Jan 20, 2026
Final Rejection — §103, §112
Feb 13, 2026
Interview Requested
Mar 19, 2026
Response after Non-Final Action
Mar 27, 2026
Request for Continued Examination
Mar 28, 2026
Response after Non-Final Action
Apr 01, 2026
Non-Final Rejection — §103, §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602751
SAMPLE DISTRIBUTION-INFORMED DENOISING & RENDERING
2y 5m to grant Granted Apr 14, 2026
Patent 12592021
GRAPHICS PROCESSING
2y 5m to grant Granted Mar 31, 2026
Patent 12579726
HIERARCHICAL TILING MECHANISM
2y 5m to grant Granted Mar 17, 2026
Patent 12573133
Reprojection method of generating reprojected image data, XR projection system, and machine-learning circuit
2y 5m to grant Granted Mar 10, 2026
Patent 12555281
MANAGING MULTIPLE DATASETS FOR DATA BOUND OBJECTS
2y 5m to grant Granted Feb 17, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

6-7
Expected OA Rounds
77%
Grant Probability
86%
With Interview (+8.8%)
2y 9m
Median Time to Grant
High
PTA Risk
Based on 502 resolved cases by this examiner. Grant probability derived from career allow rate.
