Last updated: May 29, 2026
Application No. 17/234,477
SYSTEMS AND METHODS FOR DYNAMICALLY UPDATING A NEURAL NETWORK HAVING A PLURALITY OF KERNELS

Non-Final OA §103
Filed
Apr 19, 2021
Examiner
MILLS, FRANK D
Art Unit
2194
Tech Center
2100 — Computer Architecture & Software
Assignee
Nvidia Corporation
OA Round
3 (Non-Final)
Interview Optional

— +22.8% interview lift. Examiner has a relatively high allowance rate (69%); +22.8% interview lift. A written response may suffice.
Based on 600 resolved cases, 2023–2026
Examiner Intelligence

MILLS, FRANK D View full profile →
Grants 69% — above average
Career Allowance Rate
415 granted / 600 resolved
+14.2% vs TC avg
Strong +23% interview lift
Without
With
+22.8%
Interview Lift
resolved cases with interview
Typical timeline
3y 4m
Avg Prosecution
10 currently pending
Career history
623
Total Applications
across all art units
Statute-Specific Performance

§101
2.2%
-37.8% vs TC avg
§103
88.9%
+48.9% vs TC avg
§102
3.7%
-36.3% vs TC avg
§112
3.0%
-37.0% vs TC avg
Black line = Tech Center average estimate • Based on career data from 600 resolved cases
Office Action

§103
DETAILED ACTION
This action is in response to the request for continuing examination received 12/30/2025. After consideration of applicant's amendments and/or remarks:
Applicant amends claims 1, 2, 4-11, 17-18, 26-27, 29, 50-51, 53-54, and 56.
Claims 1-33 and 50-65 rejected under 35 USC § 103.

Notice of Pre-AIA  or AIA  Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA .

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 5-8, 10, 12-13, 17-18, 22-24, 26-29, 33, 50, 55-59, and 61-62 are rejected under 35 U.S.C. 103 as being unpatentable over Shafiq et al., U.S. PG-Publication No. 2021/0182036 A1, in view of Chen et al., US PG-Publication No. 2021/0081769 A1, further in view of Sekiyama et al., U.S. PG-Publication No. 2018/0365558 A1.

Claim 1
	Shafiq discloses a method. Shafiq discloses methods “to enable hardware platform specific operator fusions in a machine learning neural network,” wherein the “operator fusions may be performed on a computation graph (e.g., representing a Neural Network) during an optimization process.” Shafiq, ¶ 5.
Shafiq discloses the method comprising: determining one or more characteristics of kernels of a neural network, the one or more characteristics indicating types of operations performed by the kernels. A computation graph is a “representation of the neural network, in which ops are computation blocks and the connections between ops represents how data flow therebetween” (i.e., determining characteristics of a neural network). Id. at ¶ 27. The computation graph “indicates a set of operations to be performed” (i.e., indicating types of operations performed), wherein the “operations are performed by respective operations” (respective operations → types of operations). Each node in the computation graph “is associated with an operator of the neural network.” Id. at ¶ 31. Further, for “each op supported by the target hardware, a corresponding execution kernel is provided” (i.e., operations performed by the kernels of a neural network). A compiler “maps the ops specified therein (in the computation graph) to their corresponding execution kernels.” Id. at ¶ 38.
	Shafiq discloses comparing the one or more characteristics to a dynamic rule set to determine that the types of operations performed by a set of the kernels are replaceable with a smaller set of one or more kernels. Shafiq discloses a pattern file 204 indicating “a list of fusion patterns associated with a … target hardware platform,” wherein the fusion patterns “represent sets of operators that can be performed … by the hardware execution device as a unitary operation.” The unitary operation “can implement the fused operations based on a single instruction, rather than a series of instructions.” Each pattern in the pattern file “is assigned a corresponding fused operator name, which also corresponding to the underlying execution kernel” (i.e. fused operator → different operation type). Id. at ¶¶ 36-37. Compiler 201 includes a pattern matcher 205 that “analyzes the computation graph 202 using the pattern file 204,” for “identifying pattern matches” that occur “when a portion of the computation graph 202 matches with one of the patterns in the pattern file 204.” (i.e. comparing characteristics to a dynamic rule set). When a “pattern match is identifier, the compiler replaces the matches portion with a single node … referred to as a fused operator.” (i.e., replaceable with different operation type). Id. at ¶ 41.
Shafiq discloses updating the instantiated neural network to combine the set of the kernels into the smaller set of one or more kernels. Shafiq discloses “the pattern matcher outputs a new computation graph 203 that corresponds to the input computation graph 202, but in which at least some and typically all instances of those patterns in the pattern file 204 which occur in the input computation graph 202 are replaced with their corresponding fused operators” (i.e., generating instruction to combine the set of kernels into a smaller set of kernels). Each fused operator “includes at least two operators, of the input computation graph, which can be fused together, in the sense that the target hardware platform can implement the two operators together in a particular way that reflects the operator interdependencies given in the computation graph 202” (fused operator → smaller set of kernels that implement one or more operation types). Id. at ¶ 42. . The method uses a “fused kernel library 206” containing “supported underlying fused kernels corresponding to each fused operator” and a “kernel library 208” containing “kernels corresponding to each individual operator that may potentially occur in the new computation graph 203.” Id. at ¶ 44. The method generates “a new host source code module 207” that “reflects the new computation graph … in terms of launching and implementing, in order, those kernels which correspond to nodes in the new computation graph 203” (i.e., updating the neural network based on the instructions). The implemented kernels are “obtained from the libraries 206, 208.” Id. at ¶ 45.
	Shafiq does not expressly disclose instantiating the neural network using a first quantity of a hardware resource allocated to the kernels based at least on the one or more determined characteristics of the kernels.
	Chen discloses instantiating the neural network using a first quantity of a hardware resource allocated to the kernels based at least on the one or more determined characteristics of the kernels. Chen is directed to technology for “allocating available physical compute units (PCUs) and/or physical memory units (PMUs) of a reconfigurable data processor to operation unit of an operation unit graph for execution thereof.” Chen, Abstract. Chen illustrates fusion 200, wherein fuser 214 takes as input operation unit graph 204 (e.g., code implementing “convolutional neural network … processing”) and architectural hints 202 (e.g., list of node patterns “that are fused into one operation which can be executed on … the processor 100”) to produce “a fused operation unit graph 224” (e.g., code implementing CNN based on characteristics of the kernels). Id. at ¶¶ 45-46, 71, FIG. 2. Fusion 200 further implements an allocator 234 that “allocates the physical compute units and/or physical memory units of the reconfigurable data processor 100 to the fused operation unit graph 224” (i.e., first quantity of a hardware resource allocated to the kernels) and an executer 244 that “executes the fused operation unit graph 224 on the reconfigurable data processor 100 based on the allocation.” Id. at ¶¶ 86-87. In one embodiment, “performance estimates are used for allocating available physical compute unit and/or physical memory units of the reconfigurable data processor 100 to operation units of the operation unit graph for execution thereof.” Id. at ¶ 90.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the method of performing hardware specific fusions in an operator graph representing a neural network of Shafiq to incorporate allocating hardware for a fused operator graph representing a neural network as taught by Chen. One of ordinary skill in the art would be motivated to integrate allocating hardware for a fused operator graph representing a neural network into Shafiq, with a reasonable expectation of success, in order to optimize hardware usable through “[i]mproved parallelization and hardware utilization.” Chen, ¶ 17.
	Shafiq-Chen does not expressly disclose based at least on the updating, reducing, for the instantiated neural network, the first quantity of the hardware resource to a second quantity that corresponds to the smaller set of the one or more kernels.
	Sekiyama discloses based at least on the updating, reducing, for the instantiated neural network, the first quantity of the hardware resource to a second quantity that corresponds to the smaller set of the one or more kernels. Sekiyama discloses methods  for “artificial neural network performance improvements” through “algorithm identification and replacement during artificial neural network operation.” Sekiyama, ¶ 14. Method 200 illustrates method for “reducing memory utilization of an artificial neural network.” Id. at ¶ 40, FIG. 2. At 224, potential “candidate algorithms that may replace the generated algorithms of the network may be selected” that “may perform operations in a different manner than those of the define-by-run network.” (e.g., candidate algorithm → fused operation graph). At 232, testing is performed on the potential candidate algorithms 232, and the results of “memory usage” are recorded and ranked at 244. Id. at ¶¶ 43-46. At 250, a better performing candidate algorithm is used to update the neural network by replacing an algorithm with the better performing candidate. Id. at ¶ 48.
Sekiyama discloses that the “neural network may begin to use less memory during monitoring at 226 (e.g., the neural network may use less memory as a result of using a candidate algorithm instead of a generated algorithm after a network update which occurred at 252).” At 260, the unused allocated memory is flagged 262. At 270, “the flagged memory may be deallocated from use by the neural network” (i.e., reducing first quantity of hardware resource to a second quantity). Id. at ¶ 49.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the method of performing hardware specific fusions in an operator graph representing a neural network of Shafiq-Chen to incorporate deallocating memory resources by updating a neural network as taught by Sekiyama. One of ordinary skill in the art would be motivated to integrate deallocating memory resources by updating a neural network into Shafiq-Chen, with a reasonable expectation of success, in order to optimize computer performance by enabling the “unallocated memory” to be “utilized … for additional tasks (e.g., running any overrating environments, loading drivers, etc.” ). Sekiyama, ¶ 50.

Claim 5
	Chen discloses wherein the instantiated neural network is deployed in an execution environment. Chen is directed to technology for “allocating available physical compute units (PCUs) and/or physical memory units (PMUs) of a reconfigurable data processor to operation unit of an operation unit graph for execution thereof.” Chen, Abstract. Chen illustrates fusion 200, wherein fuser 214 takes as input operation unit graph 204 (e.g., code implementing “convolutional neural network … processing”) and architectural hints 202 (e.g., list of node patterns “that are fused into one operation which can be executed on … the processor 100”) to produce “a fused operation unit graph 224” (e.g., code implementing CNN based on characteristics of the kernels). Id. at ¶¶ 45-46, 71, FIG. 2. Fusion 200 further implements an allocator 234 that “allocates the physical compute units and/or physical memory units of the reconfigurable data processor 100 to the fused operation unit graph 224” (i.e., first quantity of a hardware resource allocated to the kernels) and an executer 244 that “executes the fused operation unit graph 224 on the reconfigurable data processor 100 based on the allocation.” Id. at ¶¶ 86-87. In one embodiment, “performance estimates are used for allocating available physical compute unit and/or physical memory units of the reconfigurable data processor 100 to operation units of the operation unit graph for execution thereof.” Id. at ¶ 90.

Claim 6
	Shafiq discloses wherein the dynamic rule set includes an input count rule, and the one or more characteristics indicate an input count to at least one kernel of the kernels and comparing includes comparing the input count to the input count rule. Shafiq discloses that the computation graphs comprise nodes that “can feed into one another, such that output of a node is provided as input to another node.” Neural network operators refer to “computational functions which process one or more given inputs … to produce one or more outputs.” A fused operator “operates the same as a corresponding structured collection of component operators, including accepting all inputs of the collection of component operators,” wherein the input behavior “is substantially the same … to that of the structured collection of component operators” (i.e., comparing the input count to the input count rule). Shafiq, ¶¶ 32-33. In one embodiment, fused operators specify “ a dataflow of computations” that includes “the dataflow of computations performed by those nodes which the fused operator present.” The data flow includes “a specification of the ordering of computations being performed, and how output of some computations are provided as inputs to other computations.” Id. at ¶ 67.

Claim 7
Sekiyama discloses wherein the first quantity and the second quantity are of memory of cache. Sekiyama discloses that the “neural network may begin to use less memory during monitoring at 226 (e.g., the neural network may use less memory as a result of using a candidate algorithm instead of a generated algorithm after a network update which occurred at 252).” At 260, the unused allocated memory is flagged 262. At 270, “the flagged memory may be deallocated from use by the neural network” (i.e., reducing first quantity of memory to a second quantity). Sekiyama, ¶ 49. Further, the method is implemented using a processor 410 including “one or more memory … caches … that provide temporary storage of instructions and data,” wherein the cores 412 “may perform instructions on input provided from the caches … and output the result to caches.” Id. at ¶ 56.

Claim 8
	Chen discloses after the instantiating of the neural network, generate the dynamic rule set based at least on availability of the hardware resource. Chen discloses that the method “generates performance estimates for execution of an operation unit graph on the reconfigurable data processor 100,” wherein the “performance estimates are used for allocating available physical compute units and/or physical memory units of the reconfigurable data processor 100 to operation units of the operation unit graph for execution thereof.” Chen, ¶ 90, emphasis added.  For example, performance estimates “can be used for comparative analysis to compare performance estimates of a first fused operation unit graph against … a second fused operation unit graph” (i.e., a dynamic rule set). Id. at ¶ 93.

Claim 10
	Sekiyama discloses wherein the hardware resource level comprises one or more of a memory quantity, a processing circuitry, a graphical processing unit circuitry, a cache quantity, a number of discrete processing modules, or a hard disk space. Sekiyama discloses methods  for “artificial neural network performance improvements” through “algorithm identification and replacement during artificial neural network operation.” Sekiyama, ¶ 14. Method 200 illustrates method for “reducing memory utilization of an artificial neural network.” Id. at ¶ 40, FIG. 2. At 224, potential “candidate algorithms that may replace the generated algorithms of the network may be selected” that “may perform operations in a different manner than those of the define-by-run network.” (e.g., candidate algorithm → fused operation graph). At 232, testing is performed on the potential candidate algorithms 232, and the results of “memory usage” are recorded and ranked at 244. Id. at ¶¶ 43-46. At 250, a better performing candidate algorithm is used to update the neural network by replacing an algorithm with the better performing candidate. Id. at ¶ 48.

Claim 12
	Shafiq discloses generating one or more instructions to perform multiple executions of the smaller set of one or more kernels, each execution being performed using a subset of a full set of inputs to the smaller set of one or more kernels. The method uses a “fused kernel library 206” containing “supported underlying fused kernels corresponding to each fused operator” and a “kernel library 208” containing “kernels corresponding to each individual operator that may potentially occur in the new computation graph 203.” The implemented kernels are “obtained from the libraries 206, 208.” Shafiq, ¶¶ 44- 45. The fused kernel library 206 contains instructions to perform multiple executions of the smaller set of the one or more kernels (i.e., multiple executions of fused kernel corresponding to fused operator). Further, Shafiq describes an example wherein two functions f(a,b) and g(c,d) wherein input value c is the output of f(a,b). The method replaces the two functions with fused operator g(f(a,b),d). In this example, the full set of inputs (a, b, c, and d) is reduced to the subset of inputs (a, b, and d). See Id. at ¶ 34.

Claim 13
	Shafiq discloses generating one or more instructions to combine output of the multiple executions. The method uses a “fused kernel library 206” containing “supported underlying fused kernels corresponding to each fused operator” and a “kernel library 208” containing “kernels corresponding to each individual operator that may potentially occur in the new computation graph 203.” Shafiq, ¶ 44. The method generates “a new host source code module 207” that “reflects the new computation graph … in terms of launching and implementing, in order, those kernels which correspond to nodes in the new computation graph 203” (i.e., updating the neural network based on the instructions). The implemented kernels are “obtained from the libraries 206, 208.” Id. at ¶ 45.

Claim 17
	Shafiq discloses wherein the updating is based at least one determining the smaller set of one or more kernels will include a smaller number of memory access operations relative to the set of kernels. Shafiq discloses that operator fusion “combines multiple operators into a single kernel without saving the intermediate results in memory.” Shafiq, ¶ 3; See Also ¶ 34 (describing operator fusion example avoiding memory storage of intermediate value c). Operator fusion patterns are applied in a priority order, wherein the patterns are sorted with criteria such as “maximum memory optimization; maximum compute utilization; minimum number of operations in the new computation graph; etc.” Id. at ¶¶ 48, 51.

Claim 18
Claim 18 is rejected utilizing the aforementioned rationale for Claim 1; the claim is directed to circuitry performing the method.

Claim 22
	Shafiq discloses wherein the one or more characteristics indicate a layer pattern of the set of kernels and the comparing is of the one or more characteristics to one or more layer patterns. Shafiq discloses that “each fusion pattern in the list of the fusion pattern may be associated with a condition for generating a fused operator,” wherein the condition may include “a size of a feature map input to a layer of the neural network, and a size of a filter of a layer of the neural network … a shape of inputs of the convolution layer,” and “a size of the inputs of convolution layer.” Shafiq, ¶ 65.

Claim 23
Claim 23 is rejected utilizing the aforementioned rationale for Claim 6; the claim is directed to circuity for performing the method.

Claim 24
	Shafiq discloses wherein the combining further comprises combining a subset of kernels from the set of kernels according to an execution order having one or more of a reduced number of memory fetch operations or a reduced number of memory store operations. Shafiq discloses that operator fusion “combines multiple operators into a single kernel without saving the intermediate results in memory.” Shafiq, ¶ 3; See Also ¶ 34 (describing operator fusion example avoiding memory storage of intermediate value c). Operator fusion patterns are applied in a priority order, wherein the patterns are sorted with criteria such as “maximum memory optimization; maximum compute utilization; minimum number of operations in the new computation graph; etc.” Id. at ¶¶ 48, 51.

Claim 26
Sekiyama discloses wherein the operations further include adjusting, based at least on the updating, one or more of a memory quantity, a processing circuitry, a graphical processing unit circuitry, a cache quantity, a number of discrete processing modules, or a hard disk space associated with the neural network. Sekiyama discloses methods  for “artificial neural network performance improvements” through “algorithm identification and replacement during artificial neural network operation.” Sekiyama, ¶ 14. Method 200 illustrates method for “reducing memory utilization of an artificial neural network.” Id. at ¶ 40, FIG. 2. At 224, potential “candidate algorithms that may replace the generated algorithms of the network may be selected” that “may perform operations in a different manner than those of the define-by-run network.” (e.g., candidate algorithm → fused operation graph). At 232, testing is performed on the potential candidate algorithms 232, and the results of “memory usage” are recorded and ranked at 244. Id. at ¶¶ 43-46. At 250, a better performing candidate algorithm is used to update the neural network by replacing an algorithm with the better performing candidate. Id. at ¶ 48.
Sekiyama discloses that the “neural network may begin to use less memory during monitoring at 226 (e.g., the neural network may use less memory as a result of using a candidate algorithm instead of a generated algorithm after a network update which occurred at 252).” At 260, the unused allocated memory is flagged 262. At 270, “the flagged memory may be deallocated from use by the neural network” (i.e., reducing first quantity of hardware resource to a second quantity). Id. at ¶ 49.
	
Claim 27
	Sekiyama discloses wherein the reducing is during execution of the updated neural network. Sekiyama discloses methods for “artificial neural network performance improvements” through “algorithm identification and replacement during artificial neural network operation.” Sekiyama, ¶ 14. Method 200 illustrates method for “reducing memory utilization of an artificial neural network.” Id. at ¶ 40, FIG. 2. At 224, potential “candidate algorithms that may replace the generated algorithms of the network may be selected” that “may perform operations in a different manner than those of the define-by-run network.” (e.g., candidate algorithm → fused operation graph). At 232, testing is performed on the potential candidate algorithms 232, and the results of “memory usage” are recorded and ranked at 244. Id. at ¶¶ 43-46. At 250, a better performing candidate algorithm is used to update the neural network by replacing an algorithm with the better performing candidate. Id. at ¶ 48.
Sekiyama discloses that the “neural network may begin to use less memory during monitoring at 226 (e.g., the neural network may use less memory as a result of using a candidate algorithm instead of a generated algorithm after a network update which occurred at 252).” At 260, the unused allocated memory is flagged 262. At 270, “the flagged memory may be deallocated from use by the neural network” (i.e., reducing first quantity of hardware resource to a second quantity). Id. at ¶ 49.

Claims 28-29
Claims 28-29 are rejected utilizing the aforementioned rationale for Claims 12-13; the claims are directed to circuity for performing the method.

Claim 33
	Shafiq discloses wherein the dynamic rule set includes one or more rules for reducing a number of memory access operations. Shafiq discloses a pattern file 204 indicating “a list of fusion patterns associated with a … target hardware platform,” wherein the fusion patterns “represent sets of operators that can be performed … by the hardware execution device as a unitary operation.” The unitary operation “can implement the fused operations based on a single instruction, rather than a series of instructions.” Each pattern in the pattern file “is assigned a corresponding fused operator name, which also corresponding to the underlying execution kernel” (i.e. fused operator → different operation type). Shafiq, ¶¶ 36-37. Shafiq discloses that operator fusion “combines multiple operators into a single kernel without saving the intermediate results in memory.” Id. at ¶ 3; See Also ¶ 34 (describing operator fusion example avoiding memory storage of intermediate value c). Operator fusion patterns are applied in a priority order, wherein the patterns are sorted with criteria such as “maximum memory optimization; maximum compute utilization; minimum number of operations in the new computation graph; etc.” Id. at ¶¶ 48, 51.

Claims 50 and 55-57
Claims 50 and 55-57 are rejected utilizing the aforementioned rationale for Claims 1 and 6-8; the claims are directed to a system performing the method.

Claim 58
	Sekiyama discloses wherein the operations further include adjusting a hardware resource level based at least on the updated neural network. Sekiyama discloses methods for “artificial neural network performance improvements” through “algorithm identification and replacement during artificial neural network operation.” Sekiyama, ¶ 14. Method 200 illustrates method for “reducing memory utilization of an artificial neural network.” Id. at ¶ 40, FIG. 2. At 224, potential “candidate algorithms that may replace the generated algorithms of the network may be selected” that “may perform operations in a different manner than those of the define-by-run network.” (e.g., candidate algorithm → fused operation graph). At 232, testing is performed on the potential candidate algorithms 232, and the results of “memory usage” are recorded and ranked at 244. Id. at ¶¶ 43-46. At 250, a better performing candidate algorithm is used to update the neural network by replacing an algorithm with the better performing candidate. Id. at ¶ 48.
Sekiyama discloses that the “neural network may begin to use less memory during monitoring at 226 (e.g., the neural network may use less memory as a result of using a candidate algorithm instead of a generated algorithm after a network update which occurred at 252).” At 260, the unused allocated memory is flagged 262. At 270, “the flagged memory may be deallocated from use by the neural network” (i.e., reducing first quantity of hardware resource to a second quantity). Id. at ¶ 49.

Claim 59
Sekiyama discloses wherein the hardware resource level comprises one or more of a memory quantity, a processing circuitry, a graphical processing unit circuitry, a cache quantity, a number of discrete processing modules, or a hard disk space. Sekiyama discloses methods for “artificial neural network performance improvements” through “algorithm identification and replacement during artificial neural network operation.” Sekiyama, ¶ 14. Method 200 illustrates method for “reducing memory utilization of an artificial neural network.” Id. at ¶ 40, FIG. 2. At 224, potential “candidate algorithms that may replace the generated algorithms of the network may be selected” that “may perform operations in a different manner than those of the define-by-run network.” (e.g., candidate algorithm → fused operation graph). At 232, testing is performed on the potential candidate algorithms 232, and the results of “memory usage” are recorded and ranked at 244. Id. at ¶¶ 43-46. At 250, a better performing candidate algorithm is used to update the neural network by replacing an algorithm with the better performing candidate. Id. at ¶ 48.
Sekiyama discloses that the “neural network may begin to use less memory during monitoring at 226 (e.g., the neural network may use less memory as a result of using a candidate algorithm instead of a generated algorithm after a network update which occurred at 252).” At 260, the unused allocated memory is flagged 262. At 270, “the flagged memory may be deallocated from use by the neural network” (i.e., reducing first quantity of hardware resource to a second quantity). Id. at ¶ 49.

Claim 60
Sekiyama discloses wherein the operations further include generating one or more instructions to dynamically allocate a memory during execution of the neural network. Sekiyama discloses methods for “artificial neural network performance improvements” through “algorithm identification and replacement during artificial neural network operation.” Sekiyama, ¶ 14. Method 200 illustrates method for “reducing memory utilization of an artificial neural network.” Id. at ¶ 40, FIG. 2. At 224, potential “candidate algorithms that may replace the generated algorithms of the network may be selected” that “may perform operations in a different manner than those of the define-by-run network.” (e.g., candidate algorithm → fused operation graph). At 232, testing is performed on the potential candidate algorithms 232, and the results of “memory usage” are recorded and ranked at 244. Id. at ¶¶ 43-46. At 250, a better performing candidate algorithm is used to update the neural network by replacing an algorithm with the better performing candidate. Id. at ¶ 48.
Sekiyama discloses that the “neural network may begin to use less memory during monitoring at 226 (e.g., the neural network may use less memory as a result of using a candidate algorithm instead of a generated algorithm after a network update which occurred at 252).” At 260, the unused allocated memory is flagged 262. At 270, “the flagged memory may be deallocated from use by the neural network” (i.e., reducing first quantity of hardware resource to a second quantity). Id. at ¶ 49.

Claims 61-62
Claims 61-62 are rejected utilizing the aforementioned rationale for Claims 12-13; the claims are directed to a system performing the method.

Claims 2, 19, 25, 51, and 54 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Shafiq-Chen-Sekiyama, further in view of Jain et al., U.S. Patent No. 11,809,981 B1.

Claim 2
	Jain discloses wherein updating includes copying two or more tensors from multiple memory blocks to a single memory block prior to performance of a concatenation operation1. Jain discloses “a method of generating execution instructions for fused neural network operators,” comprising steps of “receiving a kernel of a first operator (‘a first kernel’) and “receiving a kernel of a second operator (‘a second kernel’), and “generating an instruction file representing a fused operator of the first operator and the second operator.” Jain, 3:7-24. Jain discloses that the “write instruction in the first kernel can include the tensor addresses to which the output data elements are to be stored” and the “read instructions in the second kernel can include the tensor addresses of the output data elements (of the first operator) to be included in each input data element to the second operator.” Accordingly, the tensor addresses of two kernels are copied to a virtual data node representing a logical tensor. Id. at 3:25-45. Each virtual data node “can represent a logical tensor which can store output data elements of an operator, and from which input data elements can be fetched to another operator.” Id. at 14:27-30. This enables a “logical tensor” to “provide a generic interface for data transfer between two kernels.” Id. at 14:46-47.
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the neural network kernel fusion techniques of Shafiq-Chen-Sekiyama to incorporate the neural network kernel fusion techniques taught by Jain. One of ordinary skill in the art would be motivated to integrate the neural network kernel fusion techniques into Shafiq-Chen-Sekiyama, with a reasonable expectation of success, in order to “reduce the latency of data transfer between the operators/layers and improve performance.” Jain, 1:11-21.

Claim 19
Claim 19 is rejected utilizing the aforementioned rationale for Claim 2; the claim is directed to circuitry for performing the method.

Claim 25
	Jain discloses wherein the types of operation include a convolution operations and a batch normalization operation, and the one or more operation types include a summation programming function. Jain discloses wherein the operations include “a convolution operation that layer 209 may perform.” Jain, 7:25-8:61. The operations include “summation operations.” Id. at 1:59-61; 7:51-61; 10:27-41; 19:63-67; 24:13-17. Further, combining outputs “can include … computing a maximum value, a minimum value, an average value, a median value, a summation, a multiplication, or another logical or mathematical combination” (e.g., normalization). See Id. at 24:11-28.

Claim 51
Claim 51 is rejected utilizing the aforementioned rationale for Claim 2; the claim is directed to a system performing the method.

Claim 54
	Jain discloses wherein the combination is into a summation programming function. Jain discloses “a method of generating execution instructions for fused neural network operators,” comprising steps of “receiving a kernel of a first operator (‘a first kernel’) and “receiving a kernel of a second operator (‘a second kernel’), and “generating an instruction file representing a fused operator of the first operator and the second operator.” Jain, 3:7-24. The operations include “summation operations.” Id. at 1:59-61; 7:51-61; 10:27-41; 19:63-67; 24:13-17.
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the neural network kernel fusion techniques of Shafiq to incorporate the neural network kernel fusion techniques taught by Jain. One of ordinary skill in the art would be motivated to integrate the neural network kernel fusion techniques into Shafiq, with a reasonable expectation of success, in order to “reduce the latency of data transfer between the operators/layers and improve performance.” Jain, 1:11-21.

Claims 3, 20, and 52 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Shafiq-Chen-Sekiyama, further in view of Temam et al., U.S. Patent No. 10,108,538 B1.

Claim 3
Temam discloses wherein the types of operations include at least two of: prolog operation; a main operation; or an epilog operation. Temam discloses a method for “determining memory addresses for prologue and/or epilogue data and accessing the data for use in machine learning computations using a special purpose computational unit.”  A processor is “configured to identify … a prologue or epilogue loop having a corresponding data array.” Temam, 1:28-45. A tensor traversal unit “can use the memory address to access the data elements and the tensor elements, e.g., to perform neural network computations using the values of the elements.” Id. at 5:1-6. In one embodiment, the tensor traversal unit “determines a memory address offset value of each tensor element and/or each data element of a prologue data array and/or an epilogue data array.” Id. at 7:23-33.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the neural network kernel fusion techniques of Shafiq-Chen-Sekiyama to incorporate accessing prologue and epilogue data used in a neural network as taught by Temam. One of ordinary skill in the art would be motivated to integrate accessing prologue and epilogue data into Shafiq-Chen-Sekiyama, with a reasonable expectation of success, in order to access neural network prologue and epilog data using “a single instruction” that provides the benefits of “denser encoding, fewer memory resources used, and/or fewer required memory resources.” See Temam, 3:16-21.

Claim 20
Claim 20 is rejected utilizing the aforementioned rationale for Claim 3; the claim is directed to circuitry for performing the method.

Claim 52
Claim 52 is rejected utilizing the aforementioned rationale for Claim 3; the claim is directed to a system performing the method.

Claims 4, 21, and 53 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Shafiq-Chen-Sekiyama, further in view of Brothers et al., U.S. PG-Publication No. 2016/0358070 A1.

Claim 4
	Brothers discloses wherein updating reduces a numerical precision of the set of the kernels. The disclosure of Brothers is directed “to automated tuning of artificial neural networks.” Brothers, ¶ 20. A neural network analyzer may “identify a portion of the first neural network for modification to improve performance” (portion of neural network → first subset of kernels). Id. at ¶ 46. Network modifications include “pruning, decomposition, precision and/or numerical format modification, convolution kernel substitution, activation function substitution, kernel fusion, and/or scaling.” Id. at ¶ 23.
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the neural network kernel fusion techniques of Shafiq-Chen-Sekiyama to incorporate the neural network optimization techniques taught by Jain. One of ordinary skill in the art would be motivated to integrate the neural network kernel fusion techniques into Shafiq-Chen-Sekiyama, with a reasonable expectation of success, in order to obtain “improved computations efficiency” that “can result in improved, or reduced, runtime of the neural network and/or reduced power consumption.” Brothers, ¶ 28.

Claim 21
Claim 21 is rejected utilizing the aforementioned rationale for Claim 4; the claim is directed to circuitry for performing the method.

Claim 53
Claim 53 is rejected utilizing the aforementioned rationale for Claim 4; the claim is directed to a system performing the method.

Claims 9, 11, 14-16, 30-32, and 63-65 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Shafiq-Chen-Sekiyama, further in view of Dhurandhar et al., U.S. PG-Publication No. 2021/0342685 A1.

Claim 9
	Dhurandhar discloses wherein the updating further includes inserting one or more analysis nodes into the instantiated neural network, and verifying, using the one or more analysis nodes, accuracy of the instantiated neural network. Dhurandhar discloses a method that uses “probes to judge the hardness of predictability of a data point in a neural network.” Probes are attached to a respective layers in the neural network “to determine the hardness of each layer,” wherein hardness is “a probability of the accuracy of a predictive input” (probes → analysis nodes). Dhurandhar, ¶¶ 46-48.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the neural network kernel fusion techniques of Shafiq-Chen-Sekiyama to incorporate linear neural network probes for measuring prediction accuracy as taught by Dhurandhar. One of ordinary skill in the art would be motivated to integrate linear neural network probes for measuring prediction accuracy into Shafiq-Chen-Sekiyama, with a reasonable expectation of success, in order to “provide an improvement in the accuracy of predictive modeling and an improvement in the efficiency of computer operations,” because “the performance of simple modeling as reweighted” leads to “reduced storage and processing considerations.” Dhurandhar, ¶ 43.

Claim 11
	Dhurandhar discloses based at least on the verifying, disabling the one or more analysis nodes in the instantiated neural network. Dhurandhar discloses a re-weighting process for the neural network, wherein when the ratio of the hardness value for a data point to the confidence value is less than β, “a re-weighted training set is created by discarding one or more data points of the training data set with a relatively low hardness value to increase the ratio of hardness” (discard data point → disable linear probe → disable analysis node). Dhurandhar, ¶¶ 69, 71; FIG. 8B (step 845).

Claim 14
	Dhurandhar discloses inspecting a predetermine portion of the updated neural network during execution of the updated neural network. Dhurandhar discloses a method that uses “probes to judge the hardness of predictability of a data point in a neural network.” Probes are attached to a respective layers in the neural network “to determine the hardness of each layer,” wherein hardness is “a probability of the accuracy of a predictive input” (layer → portion of neural network). Dhurandhar, ¶¶ 46-48. Further, Dhurandhar discloses that the method provides “the received re-weighting training data including the hardness value,” and a simple model (i.e., updated neural network) “is retrained utilizing the re-weighted training data set.” Id.at ¶¶ 14, 21.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the neural network kernel fusion techniques of Shafiq-Chen-Sekiyama to incorporate linear neural network probes for measuring prediction accuracy as taught by Dhurandhar. One of ordinary skill in the art would be motivated to integrate linear neural network probes for measuring prediction accuracy into Shafiq-Chen-Sekiyama, with a reasonable expectation of success, in order to “provide an improvement in the accuracy of predictive modeling and an improvement in the efficiency of computer operations,” because “the performance of simple modeling as reweighted” leads to “reduced storage and processing considerations.” Dhurandhar, ¶ 43.

Claim 15
	Dhurandhar discloses inserting one or more analysis nodes at portions of the updated neural network, each analysis node configured to generate an output of a corresponding portion of the portions of the updated neural network. Dhurandhar discloses a method that uses “probes to judge the hardness of predictability of a data point in a neural network.” Probes are attached to a respective layers in the neural network “to determine the hardness of each layer,” wherein hardness is “a probability of the accuracy of a predictive input” (layer → portion of neural network). Dhurandhar, ¶¶ 46-48. Further, Dhurandhar discloses that the method provides “the received re-weighting training data including the hardness value,” and a simple model (i.e., updated neural network) “is retrained utilizing the re-weighted training data set.” Id.at ¶¶ 14, 21.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the neural network kernel fusion techniques of Shafiq-Chen-Sekiyama to incorporate linear neural network probes for measuring prediction accuracy as taught by Dhurandhar. One of ordinary skill in the art would be motivated to integrate linear neural network probes for measuring prediction accuracy into Shafiq-Chen-Sekiyama, with a reasonable expectation of success, in order to “provide an improvement in the accuracy of predictive modeling and an improvement in the efficiency of computer operations,” because “the performance of simple modeling as reweighted” leads to “reduced storage and processing considerations.” Dhurandhar, ¶ 43.

Claim 16
	Dhurandhar discloses dynamically enabling or disabling one or more of the analysis nodes during execution of the updated neural network. Dhurandhar discloses that the “model is re-trained with the re-weighted data set,” and the “performance of the first model is enhanced.” The first model “utilizes the re-weighted training data set, while discarding data points so that the ratio if higher than a value of β.” Dhurandhar, ¶¶ 70-71. The discarding of data points (i.e., disabling of analysis nodes) is performed during utilization of a re-weighted training set (i.e., dynamically while the model is updated with ne weighting data).

Claims 30-32
Claims 30-32 are rejected utilizing the aforementioned rationale for Claims 14-16; the claims are directed to circuitry for performing the method.

Claims 63-65
Claims 63-65 are rejected utilizing the aforementioned rationale for Claims 14-16; the claims are directed to a system performing the method.

Response to Arguments
Applicant’s arguments with respect to claim(s) 1 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to FRANK D MILLS whose telephone number is (571)270-3194. The examiner can normally be reached M-F 10-6 ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, KAVITA PADMANABHAN can be reached at (571)272-8352. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/FRANK D MILLS/Primary Examiner, Art Unit 2194                                                                                                                                                                                                        March 20, 2026

        1 The limitation “prior to performance of a concatenation operation” employs functional language because it recites a feature by what it does rather than by what it is. MPEP 2173.05(g). The broadest reasonable interpretation of this limitation are instructions having structure to “copy two or more tensors to a single memory block” The recitation of an intention for performing the action “prior to performance of a concatenation operation” has no patentable weight because it merely states an intended use for the copying. See MPEP 2111.04. The claim fails to expressly recite a step of concatenating as part of the method, nor does the claim specify exactly what performs said concatenation operation.
Read full office action
Prosecution Timeline

Apr 19, 2021
Application Filed
Mar 27, 2025
Non-Final Rejection mailed — §103
Jun 27, 2025
Response Filed
Oct 07, 2025
Final Rejection mailed — §103
Dec 30, 2025
Request for Continued Examination
Jan 20, 2026
Response after Non-Final Action
Mar 24, 2026
Non-Final Rejection mailed — §103 (current)
Precedent Cases

Applications granted by this same examiner with similar technology

18/106,265
Patent 12632811
BUSINESS PROCESS DEFINITION AND CONTROL SERVICE
3y 3m to grant Granted May 19, 2026
18/105,297
Patent 12613754
DATABASE MANAGEMENT METHODS AND ASSOCIATED APPARATUS
3y 2m to grant Granted Apr 28, 2026
18/919,126
Patent 12613900
Document Summarizer
1y 6m to grant Granted Apr 28, 2026
18/158,824
Patent 12596575
DATA STREAMING PIPELINE FOR COMPUTE MAPPING SYSTEMS AND APPLICATIONS
3y 2m to grant Granted Apr 07, 2026
17/963,716
Patent 12591453
METHOD AND SYSTEM FOR MULTI-CORE LOAD SCHEDULING IN AN OPERATING SYSTEM (OS) LESS COMMUNICATION NETWORK
3y 5m to grant Granted Mar 31, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.
Strategy Recommendation AI-generated — please review before filing

Get a prosecution strategy drawn from examiner precedents, rejection analysis, and claim mapping.
Typically takes 5-10 seconds — AI-generated, attorney review required before filing
Prosecution Projections

3-4
Expected OA Rounds
69%
Grant Probability
92%
With Interview (+22.8%)
3y 4m (~0m remaining)
Median Time to Grant
High
PTA Risk
Based on 600 resolved cases by this examiner. Grant probability derived from career allowance rate.