Prosecution Insights
Last updated: April 19, 2026
Application No. 18/135,100

HARDWARE REALIZATION OF NEURAL NETWORKS USING BUFFERS

Non-Final OA — §101, §103

Filed: Apr 14, 2023
Examiner: SHINE, NICHOLAS B
Art Unit: 2126
Tech Center: 2100 — Computer Architecture & Software
Assignee: Polyn Technology Limited
OA Round: 1 (Non-Final)

Grant Probability: 38% (At Risk)
Predicted OA Rounds: 1–2
Predicted Time to Grant: 5y 1m
Grant Probability with Interview: 82%

Examiner Intelligence

Career Allowance Rate: 38% (14 granted / 37 resolved; -17.2% vs Tech Center average)
Interview Lift: +44.6% allowance on resolved cases with an interview (a strong lift)
Typical Timeline: 5y 1m average prosecution; 25 applications currently pending
Career History: 62 total applications across all art units

Statute-Specific Performance

§101: 34.9% (-5.1% vs TC avg)
§103: 46.0% (+6.0% vs TC avg)
§102: 5.3% (-34.7% vs TC avg)
§112: 13.4% (-26.6% vs TC avg)

Tech Center averages are estimates. Based on career data from 37 resolved cases.
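The headline numbers above are simple ratios over resolved cases. A minimal sketch of how such metrics can be derived, assuming hypothetical case records with `granted` and `had_interview` flags (the field names and the aggregation are illustrative, not this tool's actual pipeline):

```python
from dataclasses import dataclass

@dataclass
class ResolvedCase:
    granted: bool        # allowed/issued vs. abandoned
    had_interview: bool  # at least one examiner interview on record

def allowance_rate(cases: list[ResolvedCase]) -> float:
    # Career allowance rate: granted / resolved (e.g., 14 / 37 ~= 38%).
    return sum(c.granted for c in cases) / len(cases)

def interview_lift(cases: list[ResolvedCase]) -> float:
    # Lift in percentage points: allowance rate on cases with an interview
    # minus the rate on cases without one (e.g., ~82% - ~37% ~= +44.6).
    with_iv = [c for c in cases if c.had_interview]
    without_iv = [c for c in cases if not c.had_interview]
    return allowance_rate(with_iv) - allowance_rate(without_iv)
```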

Office Action

§101 §103
DETAILED ACTION

This action is responsive to claims filed 04/14/2023. Claims 1–20 are pending for examination.

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections — 35 U.S.C. § 101

35 U.S.C. 101 reads as follows: "Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title."

Claims 1–13, 15, and 17–20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Regarding Claim 1:

Step 1 — Is the claim to a process, machine, manufacture, or composition of matter? Yes, claim 1 is directed to a method, i.e., a process.

Step 2A, Prong 1 — Does the claim recite an abstract idea, law of nature, or natural phenomenon? Yes, the claim recites an abstract idea:

“computing a measure of locality for tensors of the trained convolutional neural network based on dependencies between the set of input tensors and the set of intermediate tensors”

“generating a schematic model for implementing the equivalent buffered neural network, including selecting component parameter values for neurons of the equivalent buffered neural network and connections between the neurons”

These limitations, under their broadest reasonable interpretation, cover mental processes, i.e., concepts performed in the human mind (including an observation, evaluation, judgment, or opinion). See MPEP 2106.04(a)(2). In particular, with the aid of pen and paper, a human can compute a measure of distance based on data dependencies and generate a schematic model that includes parameters.

Step 2A, Prong 2 — Does the claim recite additional elements that integrate the judicial exception into a practical application? No, there are no additional elements that integrate the judicial exception into a practical application. The additional elements:

“obtaining a neural network topology for a trained convolutional neural network that transforms a set of input tensors and generates a set of intermediate tensors” — This limitation is insignificant extra-solution activity and is merely data gathering. See MPEP 2106.05(g).

“transforming the trained convolutional neural network into an equivalent buffered neural network” — This limitation recites only the idea of a solution or outcome; i.e., the claim fails to recite details of how a solution to a problem is accomplished, such that it amounts to no more than mere instructions to apply the exception. See MPEP 2106.05(f); see also Electric Power Group, LLC v. Alstom, S.A., 830 F.3d 1350, 1356, 119 USPQ2d 1739 (Fed. Cir. 2016).

“that includes a left subnetwork and a right subnetwork, based on the neural network topology and the measure of locality, wherein the left subnetwork and the right subnetwork are interconnected via a buffer” — This limitation recites generic computer components at a high level of generality (i.e., as a generic computer component performing a generic computer function), such that it amounts to no more than mere instructions to apply the exception using a generic computer component. See MPEP 2106.05(f).

Step 2B — Does the claim recite additional elements that amount to significantly more than the judicial exception? No, there are no additional elements that amount to significantly more than the judicial exception.

“obtaining a neural network topology for a trained convolutional neural network that transforms a set of input tensors and generates a set of intermediate tensors” — This limitation is directed to the activity of data gathering, which is not an inventive concept because it is insignificant extra-solution activity of mere data gathering. See Mayo, 566 U.S. at 79, 101 USPQ2d at 1968; OIP Techs., Inc. v. Amazon.com, Inc., 788 F.3d 1359, 1363, 115 USPQ2d 1090, 1092–93 (Fed. Cir. 2015); MPEP 2106.05(g)(3). This limitation is well-understood, routine, and conventional because it involves transmitting information over a network. MPEP 2106.05(d)(II).

Regarding Claim 2:

Step 1 — Is the claim to a process, machine, manufacture, or composition of matter? Yes, claim 2 depends from claim 1 (see analysis of claim 1 above), which is directed to a method, i.e., a process.

Step 2A, Prong 1 — Does the claim recite an abstract idea, law of nature, or natural phenomenon? Yes, the claim recites an abstract idea:

“computing the measure of locality for an intermediate tensor comprises computing a distance between a minimum spatial index and a maximum spatial index of input tensors that the intermediate tensor depends on”

This limitation, under its broadest reasonable interpretation, covers mental processes. See MPEP 2106.04(a)(2). In particular, with the aid of pen and paper, a human can compute a measure of distance based on data dependencies.

Step 2A, Prong 2 — Does the claim recite additional elements that integrate the judicial exception into a practical application? No. The additional limitation:

“each intermediate tensor is associated with a corresponding spatial index range for input tensors that the respective intermediate tensor depends on” — This limitation is insignificant extra-solution activity and is merely data gathering. See MPEP 2106.05(g). The Examiner notes this limitation merely adds further limits to the obtaining limitation, which is directed to insignificant extra-solution activity of mere data gathering.

Step 2B — Does the claim recite additional elements that amount to significantly more than the judicial exception? No.

“each intermediate tensor is associated with a corresponding spatial index range for input tensors that the respective intermediate tensor depends on” — This limitation is directed to the activity of data gathering, which is not an inventive concept because it is insignificant extra-solution activity of mere data gathering. See Mayo, 566 U.S. at 79; OIP Techs., 788 F.3d at 1363; MPEP 2106.05(g)(3). This limitation is well-understood, routine, and conventional because it involves transmitting information over a network. MPEP 2106.05(d)(II).
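For context on what the rejected "measure of locality" limitations cover: if each intermediate tensor's dependencies are known as a set of input spatial indices, the claim-2 measure is just the spread of that set. A minimal sketch under that assumption; the dependency map and tensor names below are hypothetical, not taken from the application.

```python
def spatial_locality(dependencies: dict[str, set[int]]) -> dict[str, int]:
    """Claim-2-style measure: for each intermediate tensor, the distance
    between the minimum and maximum spatial index of the input tensors
    it depends on (a small spread means the tensor is 'local')."""
    return {name: max(idx) - min(idx) for name, idx in dependencies.items()}

# Hypothetical dependency map for two intermediate tensors.
deps = {"t1": {0, 1, 2},   # depends on inputs 0..2 -> locality 2
        "t2": {0, 5, 9}}   # spans inputs 0..9      -> locality 9
print(spatial_locality(deps))  # {'t1': 2, 't2': 9}
```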
Regarding Claim 3:

Step 1 — Is the claim to a process, machine, manufacture, or composition of matter? Yes, claim 3 depends from claim 1 (see analysis of claim 1 above), which is directed to a method, i.e., a process.

Step 2A, Prong 1 — Does the claim recite an abstract idea, law of nature, or natural phenomenon? Yes, the claim recites an abstract idea:

“computing the measure of locality for an intermediate tensor comprises computing a distance between a minimum temporal index and a maximum temporal index of input tensors that the intermediate tensor depends on”

This limitation, under its broadest reasonable interpretation, covers mental processes. See MPEP 2106.04(a)(2). In particular, with the aid of pen and paper, a human can compute a measure of distance based on data dependencies.

Step 2A, Prong 2 — Does the claim recite additional elements that integrate the judicial exception into a practical application? No. The additional limitation:

“each intermediate tensor is associated with a corresponding temporal index range for input tensors that the respective intermediate tensor depends on” — This limitation is insignificant extra-solution activity and is merely data gathering. See MPEP 2106.05(g). The Examiner notes this limitation merely adds further limits to the obtaining limitation, which is directed to insignificant extra-solution activity of mere data gathering.

Step 2B — Does the claim recite additional elements that amount to significantly more than the judicial exception? No.

“each intermediate tensor is associated with a corresponding temporal index range for input tensors that the respective intermediate tensor depends on” — This limitation is directed to the activity of data gathering, which is not an inventive concept because it is insignificant extra-solution activity of mere data gathering. See Mayo, 566 U.S. at 79; OIP Techs., 788 F.3d at 1363; MPEP 2106.05(g)(3). This limitation is well-understood, routine, and conventional because it involves transmitting information over a network. MPEP 2106.05(d)(II).

Regarding Claim 4:

The claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim depends from claim 1, which recites an abstract idea (see the rejection of claim 1 above), and merely recites a further limitation on the computing limitation, which is directed to an abstract idea that can be performed in the human mind. The additional limitation:

“wherein computing the measure of locality is further based on parameters of convolution operations, including kernels, strides, padding, and dilation, for a predetermined set of layers of the trained convolutional neural network” — This limitation is directed to a field of use (see MPEP 2106.05(h)), as it merely limits the data obtained for computation to the field of convolution operations.

Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d)(I)), failing Step 2A, Prong 2. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under Step 2B.
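Claim 4 ties the locality computation to convolution parameters. One standard way such an input index range can be derived is textbook receptive-field arithmetic, sketched below; this is a generic recurrence, not necessarily the applicant's method, and the layer stack is invented for illustration.

```python
def receptive_field(layers: list[tuple[int, int, int]]) -> int:
    """Span of input indices that one output element depends on, for a
    stack of 1-D conv/pool layers given as (kernel, stride, dilation).
    Standard recurrence: r <- r + (k - 1) * d * jump; jump <- jump * s."""
    r, jump = 1, 1
    for kernel, stride, dilation in layers:
        r += (kernel - 1) * dilation * jump
        jump *= stride
    return r

# Hypothetical stack: two 3-wide convs (stride 1), then a stride-2 pool.
print(receptive_field([(3, 1, 1), (3, 1, 1), (2, 2, 1)]))  # -> 6
```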
Regarding Claim 5:

The claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim depends from claim 1, which recites an abstract idea (see the rejection of claim 1 above), and merely recites a further limitation on the transforming limitation, which is directed to mere instructions to implement the abstract idea. The additional limitation:

“wherein a size of the buffer is the size of an input tensor for the right subnetwork” — This limitation recites only the idea of a solution or outcome; i.e., the claim fails to recite details of how a solution to a problem is accomplished, such that it amounts to no more than mere instructions to apply the exception. See MPEP 2106.05(f); Electric Power Group, 830 F.3d at 1356.

Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d)(I)), failing Step 2A, Prong 2. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under Step 2B.

Regarding Claim 6:

Step 1 — Is the claim to a process, machine, manufacture, or composition of matter? Yes, claim 6 depends from claim 1 (see analysis of claim 1 above), which is directed to a method, i.e., a process.

Step 2A, Prong 1 — Does the claim recite an abstract idea, law of nature, or natural phenomenon? Yes, the claim recites an abstract idea:

“selecting an intermediate tensor from the set of intermediate tensors based on the measure of locality for intermediate tensors, thereby determining a size of the buffer based on the size of the selected intermediate tensor”

This limitation, under its broadest reasonable interpretation, covers mental processes. See MPEP 2106.04(a)(2). In particular, with the aid of pen and paper, a human can select a tensor from a set of tensors based on a computed measure.

Step 2A, Prong 2 — Does the claim recite additional elements that integrate the judicial exception into a practical application? No, there are no additional elements that integrate the judicial exception into a practical application.

Step 2B — Does the claim recite additional elements that amount to significantly more than the judicial exception? No, there are no additional elements that amount to significantly more than the judicial exception.

Regarding Claim 7:

The claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim depends from claim 1, which recites an abstract idea (see the rejection of claim 1 above), and merely recites a further limitation on the transforming limitation, which is directed to mere instructions to implement the abstract idea. The additional limitation:

“wherein the buffer is a rotating FIFO queue having a fixed length” — This limitation recites generic computer components at a high level of generality (i.e., as a generic computer component performing a generic computer function), such that it amounts to no more than mere instructions to apply the exception using a generic computer component. See MPEP 2106.05(f).

Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d)(I)), failing Step 2A, Prong 2. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under Step 2B.
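Claim 7's "rotating FIFO queue having a fixed length" is the familiar ring buffer. A generic sketch follows; `collections.deque` with `maxlen` gives the rotate-and-overwrite behavior, and nothing here is taken from the application's hardware design.

```python
from collections import deque

# Fixed-length rotating FIFO: once full, each push evicts the oldest entry,
# so the buffer always holds the most recent `maxlen` intermediate values.
buffer = deque(maxlen=4)
for value in range(7):
    buffer.append(value)  # oldest entries rotate out automatically
print(list(buffer))       # [3, 4, 5, 6]
```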
Regarding Claim 8:

Step 1 — Is the claim to a process, machine, manufacture, or composition of matter? Yes, claim 8 depends from claim 1 (see analysis of claim 1 above), which is directed to a method, i.e., a process.

Step 2A, Prong 1 — Does the claim recite an abstract idea, law of nature, or natural phenomenon? Yes, the claim recites an abstract idea:

“selecting an input size of the left subnetwork based on input shape of a reduced dimension shape of a portion of the trained convolutional neural network that corresponds to the left subnetwork, wherein the reduced dimension shape corresponds to a dimension of data along which convolution or a pooling kernel is applied”

This limitation, under its broadest reasonable interpretation, covers mental processes. See MPEP 2106.04(a)(2). In particular, with the aid of pen and paper, a human can select an input tensor size to fit a neural network.

Step 2A, Prong 2 — Does the claim recite additional elements that integrate the judicial exception into a practical application? No, there are no additional elements that integrate the judicial exception into a practical application.

Step 2B — Does the claim recite additional elements that amount to significantly more than the judicial exception? No, there are no additional elements that amount to significantly more than the judicial exception.

Regarding Claim 9:

Step 1 — Is the claim to a process, machine, manufacture, or composition of matter? Yes, claim 9 depends from claim 1 (see analysis of claim 1 above), which is directed to a method, i.e., a process.

Step 2A, Prong 1 — Does the claim recite an abstract idea, law of nature, or natural phenomenon? Yes, the claim recites an abstract idea:

“wherein input shape in reduced dimension for the trained convolutional neural network is X, the measure of locality for an intermediate tensor of the set of intermediate tensors selected as buffer tensor is Z, and the method further comprises: selecting an input reduced dimension shape for the left subnetwork between Z and X, wherein the reduced dimension shape corresponds to a dimension of data along which convolution or a pooling kernel is applied”

This limitation, under its broadest reasonable interpretation, covers mental processes. See MPEP 2106.04(a)(2). In particular, with the aid of pen and paper, a human can select an input shape corresponding to a particular dimension for neural network computation.

Step 2A, Prong 2 — Does the claim recite additional elements that integrate the judicial exception into a practical application? No, there are no additional elements that integrate the judicial exception into a practical application.

Step 2B — Does the claim recite additional elements that amount to significantly more than the judicial exception? No, there are no additional elements that amount to significantly more than the judicial exception.
Regarding Claim 10:

Step 1 — Is the claim to a process, machine, manufacture, or composition of matter? Yes, claim 10 depends from claim 1 (see analysis of claim 1 above), which is directed to a method, i.e., a process.

Step 2A, Prong 1 — Does the claim recite an abstract idea, law of nature, or natural phenomenon? Yes, the claim recites an abstract idea:

“defining an input shape W for of a reduced dimension shape of the left subnetwork, wherein the reduced dimension shape corresponds to a dimension of data along which convolution or a pooling kernel is applied”

“defining an approximate number of data points M that the left subnetwork generates using an equation M = 1 + (W - Z) * (N - 1)/(X - Z)”

“computing an aggregation rate of operation for the left subnetwork based on M”

These limitations, under their broadest reasonable interpretation, cover mental processes. See MPEP 2106.04(a)(2). In particular, with the aid of pen and paper, a human can select an input shape corresponding to a particular dimension for neural network computation.

Step 2A, Prong 2 — Does the claim recite additional elements that integrate the judicial exception into a practical application? No, there are no additional elements that integrate the judicial exception into a practical application.

Step 2B — Does the claim recite additional elements that amount to significantly more than the judicial exception? No, there are no additional elements that amount to significantly more than the judicial exception.
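The claim-10 relation M = 1 + (W - Z) * (N - 1)/(X - Z) can be sanity-checked numerically: at W = Z the left subnetwork yields a single data point, and at W = X it yields all N. The sample values below are arbitrary, chosen only to exercise the endpoints.

```python
def data_points(W: float, Z: float, N: int, X: float) -> float:
    # Claim 10: approximate number of data points M the left subnetwork
    # generates for input shape W, buffer-tensor locality Z, full input
    # shape X, and N data points in the selected intermediate tensor.
    return 1 + (W - Z) * (N - 1) / (X - Z)

X, Z, N = 100, 10, 10                # arbitrary illustrative values
print(data_points(Z, Z, N, X))       # 1.0  (minimal input window)
print(data_points(X, Z, N, X))       # 10.0 (full input -> all N points)
print(data_points(40, Z, N, X))      # 4.0  (intermediate window)
```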
Regarding Claim 11:

The claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim depends from claim 1, which recites an abstract idea (see the rejection of claim 1 above), and merely recites a further limitation on the transforming limitation, which amounts to mere instructions to apply an abstract idea that can be performed in the human mind. The additional limitations:

“wherein transforming the trained convolutional neural network comprises: connecting the left subnetwork to a recurrent neural network (RNN) layer and connecting an output of the RNN layer to the right subnetwork” — This limitation amounts to no more than mere instructions to apply the exception and is equivalent to mere instructions to implement the abstract idea on a computer. See MPEP 2106.05(f). Configuring a neural network merely invokes computers or other machinery as a tool to perform an existing process.

“wherein a reduced dimension index of the trained convolutional neural network is defined as a time series dimension for the RNN layer” — This limitation likewise amounts to no more than mere instructions to apply the exception on a computer. See MPEP 2106.05(f).

“the output of the RNN layer is flattened in time series dimension before being input to the right subnetwork” — This limitation likewise amounts to no more than mere instructions to apply the exception on a computer. See MPEP 2106.05(f).

Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d)(I)), failing Step 2A, Prong 2. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under Step 2B.

Regarding Claim 12:

The claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim depends from claim 1, which recites an abstract idea (see the rejection of claim 1 above), and merely recites a further limitation on the transforming limitation, which amounts to mere instructions to apply an abstract idea that can be performed in the human mind. The additional limitations:

“wherein transforming the trained convolutional neural network to the equivalent buffered neural network is performed in accordance with a determination that the measure of locality of a selected buffer tensor is below a predetermined threshold” — This limitation amounts to no more than mere instructions to apply the exception and is equivalent to mere instructions to implement the abstract idea on a computer. See MPEP 2106.05(f). Configuring a neural network merely invokes computers or other machinery as a tool to perform an existing process.

“wherein a reduced dimension index of the trained convolutional neural network is defined as a time series dimension for the RNN layer” — This limitation likewise amounts to no more than mere instructions to apply the exception on a computer. See MPEP 2106.05(f).

“the output of the RNN layer is flattened in time series dimension before being input to the right subnetwork” — This limitation likewise amounts to no more than mere instructions to apply the exception on a computer. See MPEP 2106.05(f).

Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d)(I)), failing Step 2A, Prong 2. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under Step 2B.
Regarding Claim 13:

Step 1 — Is the claim to a process, machine, manufacture, or composition of matter? Yes, claim 13 depends from claim 1 (see analysis of claim 1 above), which is directed to a method, i.e., a process.

Step 2A, Prong 1 — Does the claim recite an abstract idea, law of nature, or natural phenomenon? Yes, the claim recites an abstract idea:

“computing a first measure of locality for a first portion of the trained convolutional neural network and a second measure of locality for a second portion of the trained convolutional neural network, based on the dependencies between the set of input tensors and the set of intermediate tensors”

This limitation, under its broadest reasonable interpretation, covers mental processes. See MPEP 2106.04(a)(2). In particular, with the aid of pen and paper, a human can compute a measure of locality for portions of a neural network based on dependencies between inputs.

Step 2A, Prong 2 — Does the claim recite additional elements that integrate the judicial exception into a practical application? No. The additional limitations:

“in accordance with a determination that the first measure of locality is below a predetermined threshold, transforming the first portion of the trained convolutional neural network into an equivalent neural network that includes the left subnetwork and the right subnetwork interconnected via the buffer” — This limitation recites only the idea of a solution or outcome; i.e., the claim fails to recite details of how a solution to a problem is accomplished, such that it amounts to no more than mere instructions to apply the exception. See MPEP 2106.05(f); Electric Power Group, 830 F.3d at 1356.

“in accordance with a determination that the second measure of locality is above the predetermined threshold, interconnecting the second portion of the trained convolutional network with the equivalent neural network to obtain the equivalent buffered neural network” — This limitation amounts to no more than mere instructions to apply the exception and is equivalent to mere instructions to implement the abstract idea on a computer. See MPEP 2106.05(f). Integrating neural network components merely invokes computers or other machinery as a tool to perform an existing process.

Step 2B — Does the claim recite additional elements that amount to significantly more than the judicial exception? No, there are no additional elements that amount to significantly more than the judicial exception.

Regarding Claim 15:

Step 1 — Is the claim to a process, machine, manufacture, or composition of matter? Yes, claim 15 depends from claim 1 (see analysis of claim 1 above), which is directed to a method, i.e., a process.

Step 2A, Prong 1 — Does the claim recite an abstract idea, law of nature, or natural phenomenon? Yes, the claim recites an abstract idea:

“computing a respective measure of locality for each layer of the trained convolutional neural network based on dependencies between the set of input tensors and a respective subset of the set of intermediate tensors”

This limitation, under its broadest reasonable interpretation, covers mental processes. See MPEP 2106.04(a)(2). In particular, with the aid of pen and paper, a human can compute a measure of locality for each layer of a neural network based on dependencies between inputs.

Step 2A, Prong 2 — Does the claim recite additional elements that integrate the judicial exception into a practical application? No. The additional limitation:

“transforming the trained convolutional neural network into the equivalent buffered neural network further based on the respective measure of locality for each layer of the trained convolutional neural network” — This limitation recites only the idea of a solution or outcome; i.e., the claim fails to recite details of how a solution to a problem is accomplished, such that it amounts to no more than mere instructions to apply the exception. See MPEP 2106.05(f); Electric Power Group, 830 F.3d at 1356.

Step 2B — Does the claim recite additional elements that amount to significantly more than the judicial exception? No, there are no additional elements that amount to significantly more than the judicial exception.
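Claims 12, 13, and 15 condition the transformation on comparing per-portion locality against a threshold. A minimal decision sketch using a hypothetical locality map like the one earlier; the threshold value and the policy strings are illustrative only.

```python
def split_plan(locality: dict[str, int], threshold: int) -> dict[str, str]:
    """Claims 12/13-style gate: portions whose measure of locality falls
    below the threshold are transformed into the buffered left/right form;
    the rest are kept and interconnected with the result."""
    return {portion: ("transform: left + buffer + right" if m < threshold
                      else "keep and interconnect")
            for portion, m in locality.items()}

print(split_plan({"portion_1": 2, "portion_2": 9}, threshold=5))
# {'portion_1': 'transform: left + buffer + right',
#  'portion_2': 'keep and interconnect'}
```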
Regarding Claim 17:

The claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim depends from claim 1, which recites an abstract idea (see the rejection of claim 1 above), and merely recites a further limitation on the transforming limitation, which amounts to mere instructions to apply an abstract idea that can be performed in the human mind. The additional limitation:

“wherein transforming the trained convolutional neural network further comprises: associating the left subnetwork with an aggregation rate of operation that indicates a number of times the left subnetwork should be run for each time the right subnetwork is run” — This limitation amounts to no more than mere instructions to apply the exception and is equivalent to mere instructions to implement the abstract idea on a computer. See MPEP 2106.05(f). Associating neural networks with data merely invokes computers or other machinery as a tool to perform an existing process.

Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d)(I)), failing Step 2A, Prong 2. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under Step 2B.

Regarding Claim 18:

Step 1 — Is the claim to a process, machine, manufacture, or composition of matter? Yes, claim 18 depends from claim 17 (see analysis of claim 17 above), which is directed to a method, i.e., a process.

Step 2A, Prong 1 — Does the claim recite an abstract idea, law of nature, or natural phenomenon? Yes, the claim recites an abstract idea:

“associating the buffer with the intermediate tensor T and defining the buffer to have size equal to N”

“defining the left subnetwork to generate M data points each time it operates and to have an aggregation rate of approximately X * N/M”

“defining the right subnetwork to have an aggregation rate of X”

These limitations, under their broadest reasonable interpretation, cover mental processes. See MPEP 2106.04(a)(2). In particular, with the aid of pen and paper, a human can associate buffers with data, define their size, and define configurations.

Step 2A, Prong 2 — Does the claim recite additional elements that integrate the judicial exception into a practical application? No. The additional limitation:

“wherein the trained convolutional neural network has an aggregation rate of operation X, generates an intermediate tensor T of N data points each time it operates, and the method further comprises” — This limitation is insignificant extra-solution activity and is merely data gathering. See MPEP 2106.05(g).

Step 2B — Does the claim recite additional elements that amount to significantly more than the judicial exception? No.

“wherein the trained convolutional neural network has an aggregation rate of operation X, generates an intermediate tensor T of N data points each time it operates, and the method further comprises” — This limitation is directed to the activity of data gathering, which is not an inventive concept because it is insignificant extra-solution activity of mere data gathering. See Mayo, 566 U.S. at 79; OIP Techs., 788 F.3d at 1363; MPEP 2106.05(g)(3). This limitation is well-understood, routine, and conventional because it involves transmitting information over a network. MPEP 2106.05(d)(II).
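Claim 18's aggregation rates follow from simple throughput matching: if the right subnetwork runs at rate X and consumes a buffer of N points that the left subnetwork fills M points at a time, the left side must run roughly X * N / M times as often. The numbers below are illustrative only.

```python
def left_aggregation_rate(X: float, N: int, M: int) -> float:
    # Claim 18: left subnetwork rate ~= X * N / M, so that M points per run
    # refill the N-point buffer fast enough for the right subnetwork's rate X.
    return X * N / M

print(left_aggregation_rate(X=2, N=12, M=4))  # 6.0 left-side runs per unit
                                              # time, i.e., 3 per right run
```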
Regarding Claim 19:

Step 1 — Is the claim to a process, machine, manufacture, or composition of matter? Yes, claim 19 is directed to a system, i.e., a machine.

Step 2A, Prong 1 — Does the claim recite an abstract idea, law of nature, or natural phenomenon? Yes, the claim recites an abstract idea:

“computing a measure of locality for tensors of the trained convolutional neural network based on dependencies between the set of input tensors and the set of intermediate tensors”

“generating a schematic model for implementing the equivalent buffered neural network, including selecting component parameter values for neurons of the equivalent buffered neural network and connections between the neurons”

These limitations, under their broadest reasonable interpretation, cover mental processes. See MPEP 2106.04(a)(2). In particular, with the aid of pen and paper, a human can compute a measure of distance based on data dependencies and generate a schematic model that includes parameters.

Step 2A, Prong 2 — Does the claim recite additional elements that integrate the judicial exception into a practical application? No. The additional elements:

“one or more processors; and memory storing one or more programs for execution by the one or more processors, the one or more programs including instructions for” — This limitation recites generic computer components at a high level of generality (i.e., as a generic computer component performing a generic computer function), such that it amounts to no more than mere instructions to apply the exception using a generic computer component. See MPEP 2106.05(f).

“obtaining a neural network topology for a trained convolutional neural network that transforms a set of input tensors and generates a set of intermediate tensors” — This limitation is insignificant extra-solution activity and is merely data gathering. See MPEP 2106.05(g).

“transforming the trained convolutional neural network into an equivalent buffered neural network” — This limitation recites only the idea of a solution or outcome; i.e., the claim fails to recite details of how a solution to a problem is accomplished, such that it amounts to no more than mere instructions to apply the exception. See MPEP 2106.05(f); Electric Power Group, 830 F.3d at 1356.

“that includes a left subnetwork and a right subnetwork, based on the neural network topology and the measure of locality, wherein the left subnetwork and the right subnetwork are interconnected via a buffer” — This limitation recites generic computer components at a high level of generality, such that it amounts to no more than mere instructions to apply the exception using a generic computer component. See MPEP 2106.05(f).

Step 2B — Does the claim recite additional elements that amount to significantly more than the judicial exception? No.

“obtaining a neural network topology for a trained convolutional neural network that transforms a set of input tensors and generates a set of intermediate tensors” — This limitation is directed to the activity of data gathering, which is not an inventive concept because it is insignificant extra-solution activity of mere data gathering. See Mayo, 566 U.S. at 79; OIP Techs., 788 F.3d at 1363; MPEP 2106.05(g)(3). This limitation is well-understood, routine, and conventional because it involves transmitting information over a network. MPEP 2106.05(d)(II).

Regarding Claim 20:

Step 1 — Is the claim to a process, machine, manufacture, or composition of matter? Yes, claim 20 is directed to a non-transitory computer-readable storage medium, i.e., an article of manufacture.

Step 2A, Prong 1 — Does the claim recite an abstract idea, law of nature, or natural phenomenon? Yes, the claim recites an abstract idea:

“compute a measure of locality for tensors of the trained convolutional neural network based on dependencies between the set of input tensors and the set of intermediate tensors”

“generate a schematic model for implementing the equivalent buffered neural network, including selecting component parameter values for neurons of the equivalent buffered neural network and connections between the neurons”

These limitations, under their broadest reasonable interpretation, cover mental processes. See MPEP 2106.04(a)(2). In particular, with the aid of pen and paper, a human can compute a measure of distance based on data dependencies and generate a schematic model that includes parameters.

Step 2A, Prong 2 — Does the claim recite additional elements that integrate the judicial exception into a practical application? No. The additional elements:

“non-transitory computer-readable storage medium, storing one or more programs configured for execution by one or more processors of a server system, the one or more programs including instructions, which when executed by the one or more processors cause the server system to” — This limitation recites generic computer components at a high level of generality, such that it amounts to no more than mere instructions to apply the exception using a generic computer component. See MPEP 2106.05(f).

“obtain a neural network topology for a trained convolutional neural network that transforms a set of input tensors and generates a set of intermediate tensors” — This limitation is insignificant extra-solution activity and is merely data gathering. See MPEP 2106.05(g).

“transform the trained convolutional neural network into an equivalent buffered neural network” — This limitation recites only the idea of a solution or outcome; i.e., the claim fails to recite details of how a solution to a problem is accomplished, such that it amounts to no more than mere instructions to apply the exception. See MPEP 2106.05(f); Electric Power Group, 830 F.3d at 1356.

“that includes a left subnetwork and a right subnetwork, based on the neural network topology and the measure of locality, wherein the left subnetwork and the right subnetwork are interconnected via a buffer” — This limitation recites generic computer components at a high level of generality, such that it amounts to no more than mere instructions to apply the exception using a generic computer component. See MPEP 2106.05(f).

Step 2B — Does the claim recite additional elements that amount to significantly more than the judicial exception? No.

“obtain a neural network topology for a trained convolutional neural network that transforms a set of input tensors and generates a set of intermediate tensors” — This limitation is directed to the activity of data gathering, which is not an inventive concept because it is insignificant extra-solution activity of mere data gathering. See Mayo, 566 U.S. at 79; OIP Techs., 788 F.3d at 1363; MPEP 2106.05(g)(3). This limitation is well-understood, routine, and conventional because it involves transmitting information over a network. MPEP 2106.05(d)(II).
Claim Rejections — 35 U.S.C. § 103

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action: "A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made."

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1–3, 5–6, and 8–20 are rejected under 35 U.S.C. 103 as being unpatentable over Ko et al. (US-12159214-B1), hereinafter "Ko", in view of Timofejevs et al. (US-20210406661-A1), hereinafter "Timofejevs".

Regarding claim 1, Ko teaches:

a method for hardware realization of neural networks, comprising (Ko Abstract; Claim 1):

obtaining a neural network topology for a trained convolutional neural network that transforms a set of input tensors and generates a set of intermediate tensors (Ko Col. 11, Lines 28–47: “the firmware loads a neural network program object”);

computing a measure of locality for tensors of the trained convolutional neural network based on dependencies between the set of input tensors and the set of intermediate tensors (Ko Fig. 6, Col. 15, Lines 31–64; Col. 28, Line 66 – Col. 29, Line 16: “The system memory interface 640 also includes a virtual address to physical address mapping table 645 … the physical address for the first SRAM for a particular core is based on the number of the particular core and the total amount of memory per core, with the addresses for subsequent memory banks in the core assigned consecutively” and “The configuration data from the cluster controller 1605 specifies whether to send these dot products in one direction or the other along the global channel for each dot product bus lane, or to aggregate the dot products from the neighboring channels locally, depending on where post-processing will occur for each dot product” [wherein the system maps the address based on spatial (i.e., physical address) and temporal (e.g., subsequent and consecutive) dependencies between inputs and outputs (i.e., tensors)]);

transforming the trained convolutional neural network into an equivalent buffered neural network that includes a left subnetwork and a right subnetwork, based on the neural network topology and the measure of locality (Ko Figs. 3–4, Col. 16, Line 59 – Col. 17, Line 39: “The process 700 then assigns (at 710) cores of the IC for each layer of the neural network … As described in more detail below, some embodiments assign two sets of memory banks for storing network inputs and/or outputs, to allow for double buffering of the inputs and outputs (i.e., so that the network can execute for a first input while a second input is being stored in the unified memory)” [wherein the system assigns cores for each layer (i.e., transforming the network), including two sets of memory banks (i.e., right and left subnetworks) for inputs and outputs]),

wherein the left subnetwork and the right subnetwork are interconnected via a buffer (Ko Figs. 3–4, Col. 15, Lines 18–30: “This system memory interface, in some embodiments, includes a crossbar circuit that enables various read/write access mechanisms from the CPU 630 and input processing circuit 635 to share the interface and access the various different clusters 605-620 of unified memory banks” (emphasis added)); and

[generating a schematic model for implementing the equivalent buffered neural network,] including selecting component parameter values for neurons of the equivalent buffered neural network and connections between the neurons (Ko Figs. 3–4, Col. 16, Lines 30–65; Col. 18, Lines 20–33: “In some embodiments, a compiler program generates the configuration data for the IC that enables the neural network computation fabric to execute a particular neural network for a particular function. This compiler receives a specification of the network and determines which memory banks of which cores will be involved in the execution of each network layer. Based on the assignment of core memories to the various neural network data (weights, input data, intermediate activations), the compiler determines the virtual to physical address mapping table that is used by the system memory interface” [the Examiner notes that Ko indeed teaches generating the neural network instructions program that includes the selected component parameters and connections required, but does not explicitly teach generating a schematic model]).

Ko does not appear to explicitly teach: generating a schematic model for implementing the equivalent buffered neural network[, including selecting component parameter values for neurons of the equivalent buffered neural network and connections between the neurons].

However, Timofejevs teaches: generating a schematic model for implementing the equivalent buffered neural network [including selecting component parameter values for neurons of the equivalent buffered neural network and connections between the neurons] (Timofejevs ¶0379: “The method also includes generating (2714) a schematic model for implementing the equivalent analog network based on the weight matrix, including selecting component values for the analog components”).

The methods of Ko, the teachings of Timofejevs, and the instant application are analogous art because they pertain to optimizing neural network structures by incorporating buffers. It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the methods of Ko with the teachings of Timofejevs to provide a structure map of the optimized model implementing the neural network computations. One would have been motivated to do so in order to select the specific hardware best suited to operate the neural network (Timofejevs ¶¶0006–0011).

Regarding claim 2, Ko in view of Timofejevs teaches all the limitations of claim 1. Ko teaches:

each intermediate tensor is associated with a corresponding spatial index range for input tensors that the respective intermediate tensor depends on (Ko Col. 19, Line 61 – Col. 20, Line 8: “The neural network instructions that specify, e.g., from which memory locations to read weight and activation values, and to which memory location to write intermediate activation values, for a given layer of the network, are unchanged each time the computation fabric executes the network” [wherein the neural network includes instructions associating memory locations (i.e., a spatial index range) with intermediate activations (i.e., tensors)]); and

computing the measure of locality for an intermediate tensor comprises computing a distance between a minimum spatial index and a maximum spatial index of input tensors that the intermediate tensor depends on (Ko Col. 9, Lines 37–53; Col. 12, Lines 23–34: “Traditionally, the sigmoid function and the tanh function have been the activation functions of choice. More recently, the ReLU function (ƒ(x)=max(0, x)) has been proposed for the activation function in order to make it easier to compute the activation function” and “The computation fabric provides a set of circuits for performing the various computations required for neural networks (e.g., dot product computations, scaler and bias operations, activation functions, etc.), with the network parameters (weight values, bias values, node arrangement, filter size, etc.) configurable. In some embodiments, the computation fabric imposes certain requirements on the networks, such as a maximum size of the network (i.e., a maximum size of the dot product computations), that the weight values be ternary (e.g., 0, α, and −α for each layer of the network), and/or that at least a particular percentage of the weight values be equal to zero”).
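The double-buffering scheme the Office Action quotes from Ko (execute the network on one input while the next is written to the other bank) is a standard pattern. A toy sketch of two memory banks swapping roles each step, purely illustrative rather than Ko's circuit:

```python
banks = [[None], [None]]  # two memory banks for network inputs
active = 0                 # bank currently being executed from

for step, new_input in enumerate(["in0", "in1", "in2"]):
    banks[1 - active][0] = new_input  # write next input to the idle bank
    if step > 0:
        print(f"execute network on {banks[active][0]}")
    active = 1 - active               # swap banks for the next step
```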
Regarding claim 3, Ko in view of Timofejevs teaches all the limitations of claim 1. Ko teaches:

each intermediate tensor is associated with a corresponding temporal index range for input tensors that the respective intermediate tensor depends on (Ko Col. 13, Line 66 – Col. 14, Line 14; Col. 15, Lines 31–52; Col. 20, Lines 9–28: “In the simplest case, all of the partial dot products are computed in the same clock cycle and provided at the same time to the global channel 410” and “Rather, the physical address for the first SRAM for a particular core is based on the number of the particular core and the total amount of memory per core, with the addresses for subsequent memory banks in the core assigned consecutively” and “To solve this problem, the IC firmware of some embodiments programs a logical address to physical address translation table in the computation fabric before each execution of the network for a new input. The instructions for the cores to read weight and activation values (and write activation values) are given as logical memory addresses, which the computation fabric converts to physical addresses before accessing the unified memory to perform the read/write operations”); and

computing the measure of locality for an intermediate tensor comprises computing a distance between a minimum temporal index and a maximum temporal index of input tensors that the intermediate tensor depends on (Ko Col. 9, Lines 37–53; Col. 13, Line 66 – Col. 14, Line 14; Col. 15, Lines 31–52: “Traditionally, the sigmoid function and the tanh function have been the activation functions of choice. More recently, the ReLU function (ƒ(x)=max(0, x)) has been proposed for the activation function in order to make it easier to compute the activation function” and “In the simplest case, all of the partial dot products are computed in the same clock cycle and provided at the same time to the global channel 410” and “Rather, the physical address for the first SRAM for a particular core is based on the number of the particular core and the total amount of memory per core, with the addresses for subsequent memory banks in the core assigned consecutively”).

Regarding claim 5, Ko in view of Timofejevs teaches all the limitations of claim 1. Ko teaches:

wherein a size of the buffer is the size of an input tensor for the right subnetwork (Ko Col. 13, Lines 14–22; Col. 17, Lines 20–39; Col. 23, Lines 3–19; Col. 31, Lines 4–19: “As described in more detail below, some embodiments assign two sets of memory banks for storing network inputs and/or outputs, to allow for double buffering of the inputs and outputs (i.e., so that the network can execute for a first input while a second input is being stored in the unified memory)” and “The input values, in some embodiments, are quantized to have a fixed size (e.g., 4 bits), or set of fixed sizes (e.g., 4 bits or 8 bits) for ease and simplicity of computation” and “the allocation of the memory banks is the same first configuration for all of the even network inputs (InputA0 and InputB0, InputA2 and InputB2, etc.) and the same second configuration for all of the odd network inputs (InputA1 and InputB1, InputA3 and InputB3, etc.)”).

Regarding claim 6, Ko in view of Timofejevs teaches all the limitations of claim 1. Ko teaches:

selecting an intermediate tensor from the set of intermediate tensors based on the measure of locality for intermediate tensors, thereby determining a size of the buffer based on the size of the selected intermediate tensor (Ko Col. 23, Line 53 – Col. 24, Line 5: “As shown in the first stage 1405, memory banks 0-2 are again allocated to weight and/or intermediate activation values. The network inputs for the first network A require two memory banks, so memory banks 3 and 4 are allocated to network input for network A. The network inputs for the second network B require a single memory bank B, so memory bank 5 is allocated to network input for network B. Network A is therefore the network that requires the maximum number of memory banks for its input (two), so two memory banks are allocated as “swap” banks (without a specific network designation). The two remaining memory banks 8 and 9 are allocated to the cpu cache. At this point, the logical addresses and physical addresses match up, with the input for network A being written to memory banks 3 and 4 and the input for network B being written to memory bank 5 (assuming inputs are received at respective sensors for both networks). While the first network input (e.g., InputA0 and InputB0) for each of these networks is being written, the computation fabric is not yet actually executing either network”).

Regarding claim 8, Ko in view of Timofejevs teaches all the limitations of claim 1. Ko teaches:

selecting an input size of the left subnetwork based on input shape of a reduced dimension shape of a portion of the trained convolutional neural network that corresponds to the left subnetwork, wherein the reduced dimension shape corresponds to a dimension of data along which convolution or a pooling kernel is applied (Ko Col. 7, Lines 8–45: “Pooling layers combine clusters of outputs from one layer into a single node at the next layer, as part of the process of reducing an image (which may have a large number of pixels) or other input item down to a smaller size (e.g., a vector output). In some embodiments, pooling layers can use max pooling (in which the maximum value among the clusters of node outputs is selected) or average pooling (in which the clusters of node outputs are averaged)”).
The kernels (also referred to as filters) are three-dimensional, and multiple kernels are used to process each group of input values in a layer (resulting in a set of three-dimensional output grids, also referred to as output feature maps). Pooling layers combine clusters of outputs from one layer into a single node at the next layer, as part of the process of reducing an image (which may have a large number of pixels) or other input item down to a smaller size (e.g., a vector output). In some embodiments, pooling layers can use max pooling (in which the maximum value among the clusters of node outputs is selected) or average pooling (in which the clusters of node outputs are averaged)”). Regarding claim 10, Ko in view of Timofejevs teaches all the limitations of claim 1. Ko teaches: defining an input shape W for of a reduced dimension shape of the left subnetwork, wherein the reduced dimension shape corresponds to a dimension of data along which convolution or a pooling kernel is applied (Ko Col. 7, Lines 8–45: “the intermediate layers (referred to as “hidden” layers) may include convolutional layers, pooling layers, element-wise operation layers, fully-connected layers, and/or normalization layers. The convolutional layers of some embodiments use a small kernel (e.g., 2×2, 3×3, 5×5, etc.) to process blocks of input values (output values from a previous layer) in a set of two-dimensional grids (e.g., channels of pixels of an image, input feature maps) with the same set of parameters. The kernels (also referred to as filters) are three-dimensional, and multiple kernels are used to process each group of input values in a layer (resulting in a set of three-dimensional output grids, also referred to as output feature maps)”); Timofejevs teaches: defining an approximate number of data points M that the left subnetwork generates using an equation M = 1 + (W - Z) * (N - 1)/(X - Z) (Timofejevs ¶0042: “In some implementations, performing the trapezium transformation further includes: in accordance with a determination that K·L≥L·NI+K·NO: (i) splitting the layer Lp to obtain a sub-layer Lp1 with K′ neurons and a sub-layer Lp2 with (K−K′) neurons such that K′·L≥L·NI+K′·NO; (ii) for the sub-layer Lp1 with K′ neurons, performing the constructing, and generating steps; and (iii) for the sub-layer Lp2 with K−K′ neurons, recursively performing the splitting, constructing, and generating steps”—[wherein W, Z, N, and X are simply undefined variables and the BRI of W, Z, N, and X is any suitable mathematical substitution for the variable]); and computing an aggregation rate of operation for the left subnetwork based on M (Timofejevs ¶0043: “In some implementations, the neural network topology includes a multilayer perceptron network. In such cases, the method further includes, for each pair of consecutive layers of the multilayer perceptron network, iteratively performing the trapezium transformation and computing the weight matrix for the equivalent sparsely connected network”). The same motivation that was utilized for combining Ko with Timofejevs, as set forth in claim 1, is equally applicable to claim 10. Regarding claim 11, Ko in view of Timofejevs teaches all the limitations of claim 1. Ko teaches: wherein transforming the trained convolutional neural network comprises: connecting the left subnetwork to a recurrent neural network (RNN) layer and connecting an output of the RNN layer to the right subnetwork (Ko Col. 6, Line 63 – Col. 
7, Line 45: “Other neural networks of other embodiments have several output nodes that provide more than one output value. Furthermore, while the network 100 includes only a few nodes 102 per layer, a typical neural network may include a varying number of nodes per layer (with some layers having several thousand nodes) and significantly more layers than shown (e.g., several dozen layers). In addition, the neural networks of other embodiments may be types of networks other than feed forward networks (e.g., recurrent networks, regulatory feedback networks, radial basis function networks, etc.) … However, as mentioned, the neural networks of some embodiments are convolutional feed-forward neural networks. In this case, the intermediate layers (referred to as “hidden” layers) may include convolutional layers, pooling layers, element-wise operation layers, fully-connected layers, and/or normalization layers. The convolutional layers of some embodiments use a small kernel (e.g., 2×2, 3×3, 5×5, etc.) to process blocks of input values (output values from a previous layer) in a set of two-dimensional grids (e.g., channels of pixels of an image, input feature maps) with the same set of parameters. The kernels (also referred to as filters) are three-dimensional, and multiple kernels are used to process each group of input values in a layer (resulting in a set of three-dimensional output grids, also referred to as output feature maps) … [t]he convolutional layer receives a set of input activation values 200 organized as a three-dimensional array. This three-dimensional array is typically either (i) a set of input values for the network, if the convolutional layer is the first layer of the network, or (ii) a set of output values of a previous layer of the network (e.g., a previous convolutional layer, a pooling layer, etc.). The array can be conceptualized as a set of two-dimensional grids, also referred to as input feature maps or input channels for the layer, as shown in the figure. In this example, the dimensions of the input values are 6×6×3 (i.e., three 6×6 input channels)”—[wherein other neural networks include feed forward recurrent networks (i.e., RNNs), which are integrated with convolutional layers (i.e., connected)]). Timofejevs teaches: wherein a reduced dimension index of the trained convolutional neural network is defined as a time series dimension for the RNN layer (Timofejevs ¶0104, ¶0444: “In some implementations, the trained neural network is a recurrent neural network (RNN), and the analog network further includes one or more analog components other than the plurality of operational amplifiers, and the plurality of resistors” and “Referring next to FIG. 31F, in some implementations, the trained neural network is a long short-term memory (LSTM) network, and the integrated circuit further includes one or more clock modules to synchronize signal tacts and to allow time series processing”), and the output of the RNN layer is flattened in time series dimension before being input to the right subnetwork (Timofejevs ¶0173: “The output of the pooling layer 436 is flattened by the layer 438 and input to a fully connected neural network with one or more layers (e.g., the layers 440 and 442)”). The same motivation that was utilized for combining Ko with Timofejevs, as set forth in claim 1, is equally applicable to claim 11. Regarding claim 12, Ko in view of Timofejevs teaches all the limitations of claim 1. 
Ko teaches: wherein transforming the trained convolutional neural network to the equivalent buffered neural network is performed in accordance with a determination that the measure of locality of a selected buffer tensor [is below a predetermined threshold] (Ko Fig. 7, Col. 16, lines 59–65: “The process 700 then assigns (at 710) cores of the IC for each layer of the neural network. In some embodiments, this core assignment involves various optimizations to determine how to minimize the power and/or memory usage. Optimized core assignment is described in more detail in U.S. patent application Ser. No. 16/525,445, which is incorporated herein by reference”). Timofejevs teaches: [wherein transforming the trained convolutional neural network to the equivalent buffered neural network is performed in accordance with a determination that the measure of locality of a selected buffer tensor] is below a predetermined threshold (Timofejevs ¶0069: “in accordance with a determination that the respective bias value is below a predetermined minimum bias threshold, replacing the respective analog neuron with a linear junction in the equivalent analog network”). The same motivation that was utilized for combining Ko with Timofejevs, as set forth in claim 1, is equally applicable to claim 12. Regarding claim 13, Ko in view of Timofejevs teaches all the limitations of claim 1. Ko teaches: computing a first measure of locality for a first portion of the trained convolutional neural network and a second measure of locality for a second portion of the trained convolutional neural network, based on the dependencies between the set of input tensors and the set of intermediate tensors (Ko Fig. 6, Col. 15, Lines 31–64; Col. 28, Line 66 – Col. 29, Line 16; Col. 2, Line 19–33; Col. 21, Line 28–41: “The system memory interface 640 also includes a virtual address to physical address mapping table 645 … the physical address for the first SRAM for a particular core is based on the number of the particular core and the total amount of memory per core, with the addresses for subsequent memory banks in the core assigned consecutively” and “The configuration data from the cluster controller 1605 specifies whether to send these dot products in one direction or the other along the global channel for each dot product bus lane, or to aggregate the dot products from the neighboring channels locally, depending on where post-processing will occur for each dot product” and “Once the first portion of the network is completed by the fabric, the system controller provides the fabric with the instructions for the second portion (either a second portion of the first layer, or the second layer of the network), and so on until the network has been fully executed” and “This allows the computation fabric to read the second network input (Input1) from physical memory banks 5 and 6 using logical memory bank addresses 5 and 6 while the input processing circuit is instructed to write the second network input”—[wherein the system maps the address based on spatial (i.e., physical address) and temporal (e.g., subsequent and consecutively) dependencies between inputs and outputs (i.e., tensors) for more than one portion]). transforming the first portion of the trained convolutional neural network into an equivalent neural network that includes the left subnetwork and the right subnetwork interconnected via the buffer (Ko Figs. 3–4, Col. 16, Line 59 – Col. 17, Line 39; Col. 
15, Lines 18–30: “The process 700 then assigns (at 710) cores of the IC for each layer of the neural network … As described in more detail below, some embodiments assign two sets of memory banks for storing network inputs and/or outputs, to allow for double buffering of the inputs and outputs (i.e., so that the network can execute for a first input while a second input is being stored in the unified memory)” and “This system memory interface, in some embodiments, includes a crossbar circuit that enables various read/write access mechanisms from the CPU 630 and input processing circuit 635 to share the interface and access the various different clusters 605-620 of unified memory banks”), interconnecting the second portion of the trained convolutional network with the equivalent neural network to obtain the equivalent buffered neural network (Ko Figs. 3–4, Col. 15, Lines 18–30: “This system memory interface, in some embodiments, includes a crossbar circuit that enables various read/write access mechanisms from the CPU 630 and input processing circuit 635 to share the interface and access the various different clusters 605-620 of unified memory banks”—[(emphasis added)]). Timofejevs teaches: in accordance with a determination that the first measure of locality is below a predetermined threshold (Timofejevs ¶0069: “in accordance with a determination that the respective bias value is below a predetermined minimum bias threshold, replacing the respective analog neuron with a linear junction in the equivalent analog network”—[Examiner notes this is a contingent limitation pursuant to MPEP § 2111.04(II). Thus, this limitation is not required and holds no patentable weight. However, in the interest of compact prosecution, examiner is including these limitations in the examination]); and in accordance with a determination that the second measure of locality is above the predetermined threshold (Timofejevs ¶0069: “In some implementations, the method further includes, for each analog neuron of the equivalent analog network: (i) computing a respective bias value for the respective analog neuron based on the weights of the trained neural network, while computing the weight matrix; (ii) in accordance with a determination that the respective bias value is above a predetermined maximum bias threshold”—[Examiner notes this is a contingent limitation pursuant to MPEP § 2111.04(II). Thus, this limitation is not required and holds no patentable weight. However, in the interest of compact prosecution, examiner is including these limitations in the examination]). The same motivation that was utilized for combining Ko with Timofejevs, as set forth in claim 1, is equally applicable to claim 13. Regarding claim 14, Ko in view of Timofejevs teaches all the limitations of claim 1. Ko teaches: wherein the neural network topology includes a first portion of the trained convolutional neural network and a second portion of the trained convolutional neural network, wherein the first portion transforms a first set of input tensors and generates a first set of intermediate tensors, the second portion transforms a second set of input tensors and generates a second set of intermediate tensors, and the method further comprises (Ko Fig. 18, Col. 2, Line 19–33; Col. 32, Line 40 – Col. 
33, Line 8: “Once the first portion of the network is completed by the fabric, the system controller provides the fabric with the instructions for the second portion (either a second portion of the first layer, or the second layer of the network), and so on until the network has been fully executed” and “FIG. 18 conceptually illustrates a process 1800 of some embodiments for executing a set of instructions (or a portion of a set of instructions) to compute the output of a neural network node (specifically, a convolutional or fully-connected node). The process 1800 is executed by the computation fabric of an IC, such as that described above. Typically, the process 1800 is executed simultaneously for multiple nodes, and operations 1810-1840 are performed repeatedly for multiple activation windows (i.e., multiple groups of layer input values loaded into the activation window buffer) in order to completely execute a layer (or portion of a layer) of the neural network. In the case of the process 1800, the dot product can be computed in a single cycle and does not involve any split filter slices (i.e., no time-multiplexing is required)”—[wherein the instructions (i.e., topology) include more than one portion to transform a first input and a second input respectively to generate intermediate activations (i.e., intermediate tensors)]): computing (i) a first measure of locality for the first portion of the trained convolutional neural network based on the dependencies between the first set of input tensors and the set of intermediate tensors, and (ii) a second measure of locality for the second portion of the trained convolutional neural network, based on the dependencies between the second set of input tensors and the second set of intermediate tensors (Ko Col. 33, Lines 9–40: “As shown, the process begins (at 1805) by loading the weights for a node into filter slice buffers of one or more cores. In addition, the process loads (at 1810) the input (activation) values for the node into the activation window buffer of these cores … The process 1800 then provides (at 1825) the aggregated dot product to an activation post-processor specified by configuration data. This configuration data, in some embodiments, is generated by a compiler and parsed by the hierarchical controller circuits of the neural network chip fabric, and indicates which channel segment will perform the post-processing. Each of the channel segments has an equal number of post-processing units, and the post-processing unit in the selected channel that corresponds to the dot product bus lane that aggregates the dot product is the post-processing unit that receives the aggregated dot product”—[wherein the configuration data includes the measure of locality (e.g., generated by the compiler) to indicate which portions of the neural network will be processed according to the dependencies between the inputs]); transforming the first portion of the trained convolutional neural network into a first equivalent neural network that includes a third subnetwork and a fourth subnetwork interconnected via a first buffer (Ko Col. 4, Lines 12 – Col. 5, Line 9: “For the IC to execute a single network using this synchronous execution/I/O scheme, in some embodiments the compiler allocates the first set of memory banks for the input processing circuit to write the first input (and third input, fifth input, etc.) and the second set of memory banks for the input processing circuit to write the second input (and fourth input, sixth input, etc.). 
While the computation fabric executes the network for the first input, the first set of memory banks are allocated for storing the current input (which the fabric reads in at least the first layer of the network), and in some embodiments can be used as intermediate activation storage for subsequent layers once the input is no longer needed”); transforming the second portion of the trained convolutional neural network into a second equivalent neural network that includes a fifth subnetwork and a sixth subnetwork interconnected via a second buffer (Ko Col. 4, Lines 12 – Col. 5, Line 9: “For the IC to execute a single network using this synchronous execution/I/O scheme, in some embodiments the compiler allocates the first set of memory banks for the input processing circuit to write the first input (and third input, fifth input, etc.) and the second set of memory banks for the input processing circuit to write the second input (and fourth input, sixth input, etc.). While the computation fabric executes the network for the first input, the first set of memory banks are allocated for storing the current input (which the fabric reads in at least the first layer of the network), and in some embodiments can be used as intermediate activation storage for subsequent layers once the input is no longer needed”); and generating the equivalent buffered neural network based on the first equivalent neural network and the second equivalent neural network (Ko Figs. 3–4, Col. 16, Line 59 – Col. 17, Line 39: “The process 700 then assigns (at 710) cores of the IC for each layer of the neural network … As described in more detail below, some embodiments assign two sets of memory banks for storing network inputs and/or outputs, to allow for double buffering of the inputs and outputs (i.e., so that the network can execute for a first input while a second input is being stored in the unified memory)”—[wherein the system assigns cores for each layer (i.e., generates the network) including two sets of memory banks (i.e., first and second) to process inputs and outputs]). Timofejevs teaches: in accordance with a determination that the first measure of locality is below a predetermined threshold (Timofejevs ¶0069: “in accordance with a determination that the respective bias value is below a predetermined minimum bias threshold, replacing the respective analog neuron with a linear junction in the equivalent analog network”—[Examiner notes this is a contingent limitation pursuant to MPEP § 2111.04(II). Thus, this limitation is not required and holds no patentable weight. However, in the interest of compact prosecution, examiner is including these limitations in the examination]), and in accordance with a determination that the second measure of locality is below the predetermined threshold (Timofejevs ¶0069: “In some implementations, the method further includes, for each analog neuron of the equivalent analog network: (i) computing a respective bias value for the respective analog neuron based on the weights of the trained neural network, while computing the weight matrix; (ii) in accordance with a determination that the respective bias value is above a predetermined maximum bias threshold”—[Examiner notes this is a contingent limitation pursuant to MPEP § 2111.04(II). Thus, this limitation is not required and holds no patentable weight. However, in the interest of compact prosecution, examiner is including these limitations in the examination]). 
The same motivation that was utilized for combining Ko with Timofejevs, as set forth in claim 1, is equally applicable to claim 14. Regarding claim 15, Ko in view of Timofejevs teaches all the limitations of claim 1. Ko teaches: computing a respective measure of locality for each layer of the trained convolutional neural network based on dependencies between the set of input tensors and a respective subset of the set of intermediate tensors (Ko Col. 33, Lines 9–40: “As shown, the process begins (at 1805) by loading the weights for a node into filter slice buffers of one or more cores. In addition, the process loads (at 1810) the input (activation) values for the node into the activation window buffer of these cores … The process 1800 then provides (at 1825) the aggregated dot product to an activation post-processor specified by configuration data. This configuration data, in some embodiments, is generated by a compiler and parsed by the hierarchical controller circuits of the neural network chip fabric, and indicates which channel segment will perform the post-processing. Each of the channel segments has an equal number of post-processing units, and the post-processing unit in the selected channel that corresponds to the dot product bus lane that aggregates the dot product is the post-processing unit that receives the aggregated dot product”—[wherein the configuration data includes the measure of locality (e.g., generated by the compiler) to indicate which portions (e.g., each layer or portion of each layer) of the neural network will be processed according to the dependencies between the inputs and outputs]); and transforming the trained convolutional neural network into the equivalent buffered neural network further based on the respective measure of locality for each layer of the trained convolutional neural network (Ko Col. 4, Lines 12 – Col. 5, Line 9: “For the IC to execute a single network using this synchronous execution/I/O scheme, in some embodiments the compiler allocates the first set of memory banks for the input processing circuit to write the first input (and third input, fifth input, etc.) and the second set of memory banks for the input processing circuit to write the second input (and fourth input, sixth input, etc.). While the computation fabric executes the network for the first input, the first set of memory banks are allocated for storing the current input (which the fabric reads in at least the first layer of the network), and in some embodiments can be used as intermediate activation storage for subsequent layers once the input is no longer needed”). Regarding claim 16, Ko in view of Timofejevs teaches all the limitations of claim 1. Ko teaches: wherein transforming the trained convolutional neural network into the equivalent buffered neural network comprises: splitting the trained convolutional neural network into the left subnetwork and the right subnetwork, based on the neural network topology and the measure of locality (Ko Figs. 3–4, Col. 18, Lines 20–33; Col. 16, Line 30 – Col. 
17, Line 39: “The process 700 then assigns (at 710) cores of the IC for each layer of the neural network … As described in more detail below, some embodiments assign two sets of memory banks for storing network inputs and/or outputs, to allow for double buffering of the inputs and outputs (i.e., so that the network can execute for a first input while a second input is being stored in the unified memory)” and “In some embodiments, a compiler program generates the configuration data for the IC that enables the neural network computation fabric to execute a particular neural network for a particular function. This compiler receives a specification of the network and determines which memory banks of which cores will be involved in the execution of each network layer. Based on the assignment of core memories to the various neural network data (weights, input data, intermediate activations), the compiler determines the virtual to physical address mapping table that is used by the system memory interface”); and reducing a size of the buffer to a value that is the greater of the output size of the left subnetwork and the input size of the right-left subnetwork (Ko Col. 32, Lines 17–39: “To reduce the circuit area and power required for dot product computations (which use the majority of resources for neural network inference), the partial dot product computation circuits (e.g., the adder trees 1735) of some embodiments map each of a first number of input values to a second number (e.g., 25% of the first number) of dot product inputs, such that each dot product input only receives at most one input value with a non-zero corresponding weight value. Specifically, in some embodiments, the partial dot product computation circuit includes at least two sets of wires for each input (activation) value, with each of the sets of wires for a given input value connected to at least two different dot product inputs (so that each input value can be provided to at least two different inputs). With a guarantee of at least 75% weight sparsity (i.e., at least 75% of the weight values for any set of input values are zero), the number of dot product inputs is set at 25% (or slightly more than 25%, for redundancy) of the number of input values loaded in an activation window for the dot product computation circuit. In some embodiments, the weight sparsity is guaranteed by the training algorithm used to train the weights to perform a specific purpose, and the IC is adaptable for any set of weights that meets the guarantee”—[wherein the BRI of a value that is the greater of the output size of the left subnetwork and the input size of the right-left subnetwork is a buffer larger than the output size and the input size of the sub-networks]). 
Timofejevs teaches: recursively splitting the right subnetwork further into a right-left subnetwork and a right-right subnetwork based on another measure of locality for a layer in the right subnetwork and the neural network topology (Timofejevs ¶0042: “In some implementations, performing the trapezium transformation further includes: in accordance with a determination that K·L≥L·NI+K·NO: (i) splitting the layer Lp to obtain a sub-layer Lp1 with K′ neurons and a sub-layer Lp2 with (K−K′) neurons such that K′·L≥L·NI+K′·NO; (ii) for the sub-layer Lp1 with K′ neurons, performing the constructing, and generating steps; and (iii) for the sub-layer Lp2 with K−K′ neurons, recursively performing the splitting, constructing, and generating steps”—[(emphasis added)]). The same motivation that was utilized for combining Ko with Timofejevs, as set forth in claim 1, is equally applicable to claim 16. Regarding claim 17, Ko in view of Timofejevs teaches all the limitations of claim 1. Ko teaches: wherein transforming the trained convolutional neural network further comprises: associating the left subnetwork with an aggregation rate of operation that indicates a number of times the left subnetwork should be run for each time the right subnetwork is run (Ko Col. 28, Lines 51–65: “These partial dot products are output to the dot product bus 1610, which aggregates the partial dot products from the cores 1630 of the local cluster. The dot product bus 1610, in some embodiments, includes a number of independent dot product bus lanes that each receives partial dot products from the cores, aggregates these together, and provides the aggregated dot products to the post-processing circuits. In some embodiments, each lane of the dot product bus corresponds to (i) one of the adder trees in each of the cores (i.e., dot product bus lane N receives the partial dot products from each of the adder trees of index N in the cores), and (ii) one of the post-processing units in each of the clusters (i.e., dot product bus lane N provides its aggregated output to the post-processing unit N in one of the clusters, as specified by the configuration data)”—[(emphasis added)]). Regarding claim 18, Ko in view of Timofejevs teaches all the limitations of claim 17. Ko teaches: wherein the trained convolutional neural network has an aggregation rate of operation X, generates an intermediate tensor T of N data points each time it operates, and the method further comprises (Ko Col. 28, Lines 15–19: “The cluster controller 1605 configures the dot product bus 1610, post-processor 1615, and activation write bus 1620 as per the configuration instructions received from the fabric controller in some embodiments. For the dot product bus 1610, this configuration data specifies, in some embodiments, (i) which partial dot products are to be added together as part of the same neural network computation node and (ii) to which post-processing unit each aggregated dot product is sent (the post-processor 1615 of some embodiments includes numerous post-processing units with the same circuitry). 
In other embodiments, the post-processing unit that receives each aggregated dot product is not specified as configuration data because there are an equal number of dot product bus lanes and post-processing units, so that the data from each lane is provided as the primary input to a different post-processing unit”—[wherein the controller configures (i.e., sets aggregation rates) to generate the tensors for data points]): associating the buffer with the intermediate tensor T and defining the buffer to have size equal to N (Ko Col. 31, Lines 4–19: “The read control and cache 1715 also reads data (layer input values) from the activation memory 1710 into the activation window buffers 1730 and 1732. In addition, the read controller 1715 arranges the input values within the activation window buffers 1730 and 1732 in some embodiments to match up with the weight values in the filters. In some embodiments, the input values in an activation window read into the buffers 1730 (and 1732) include all of the values (as opposed to the 25% of the values needed for a particular filter), because the activation window is multiplied by numerous filters simultaneously (i.e., some or all of the filters stored in the filter slice buffers). The input values, in some embodiments, are quantized to have a fixed size (e.g., 4 bits), or set of fixed sizes (e.g., 4 bits or 8 bits) for ease and simplicity of computation”—[wherein the controller arranges (i.e., associates) the data to match up (i.e., equal size) with the weight values in the filters (i.e., tensors)]); defining the left subnetwork to generate M data points each time it operates and to have an aggregation rate of approximately X * N/M (Ko Col. 28, Line 66 – Col. 29, Line 41; Col. 31, Line 47 – Col. 32, Line 16: “The configuration data from the cluster controller 1605 specifies whether to send these dot products in one direction or the other along the global channel for each dot product bus lane, or to aggregate the dot products from the neighboring channels locally, depending on where post-processing will occur for each dot product … Using a small, fixed number of bits for the outputs of each computation node allows for (i) power and resource savings by enabling smaller computations and (ii) certainty in the scheduling of computations (i.e., by knowing that all input values will be within a particular range) that enables further power and resource savings in design. The non-linear activation function, in some embodiments, is implemented as a lookup table or a piecewise linear function based on configuration data, rather than a hardwired function. This enables the IC to execute different neural networks that use different activation functions and, in some embodiments, allows for different activation functions to be used in different layers of the neural network or even for different filters in the same layer” and “In some embodiments, the number of filter slice buffers in each of the sets 1720 and 1722 is equal to the number of adder trees 1735 in the core, as well as the number of dot product bus lanes, post-processing units, and activation write bus lanes in each segment. 
Thus, for a typical neural network computation node, the partial dot products computed by the adder trees 1735 in multiple cores having a particular index are aggregated by the dot product bus lane with the same index, that aggregated dot product is provided for post-processing to one of the post-processing units with the same index (i.e., the post-processing unit with that index in one of the channel segments), and the output of the post-processing unit is transported by the activation write bus with the same index to its destination core”—[wherein the aggregation rate (i.e., approximately X * N/M) is set by the defined filter slice buffers governed by configuration data from the cluster controller 1605]); and defining the right subnetwork to have an aggregation rate of X (Ko Col. 28, Line 66 – Col. 29, Line 41; Col. 31, Line 47 – Col. 32, Line 16: “The configuration data from the cluster controller 1605 specifies whether to send these dot products in one direction or the other along the global channel for each dot product bus lane, or to aggregate the dot products from the neighboring channels locally, depending on where post-processing will occur for each dot product … Using a small, fixed number of bits for the outputs of each computation node allows for (i) power and resource savings by enabling smaller computations and (ii) certainty in the scheduling of computations (i.e., by knowing that all input values will be within a particular range) that enables further power and resource savings in design. The non-linear activation function, in some embodiments, is implemented as a lookup table or a piecewise linear function based on configuration data, rather than a hardwired function. This enables the IC to execute different neural networks that use different activation functions and, in some embodiments, allows for different activation functions to be used in different layers of the neural network or even for different filters in the same layer” and “In some embodiments, the number of filter slice buffers in each of the sets 1720 and 1722 is equal to the number of adder trees 1735 in the core, as well as the number of dot product bus lanes, post-processing units, and activation write bus lanes in each segment. Thus, for a typical neural network computation node, the partial dot products computed by the adder trees 1735 in multiple cores having a particular index are aggregated by the dot product bus lane with the same index, that aggregated dot product is provided for post-processing to one of the post-processing units with the same index (i.e., the post-processing unit with that index in one of the channel segments), and the output of the post-processing unit is transported by the activation write bus with the same index to its destination core”—[wherein the aggregation rate is set by the defined filter slice buffers governed by configuration data from the cluster controller 1605]). Regarding claim 19, Ko teaches: A system, comprising: one or more processors; and memory storing one or more programs for execution by the one or more processors, the one or more programs including instructions for (Ko Fig. 20, Col. 36, Lines 1–15: “FIG. 20 conceptually illustrates an electronic system 2000 with which some embodiments of the invention are implemented. The electronic system 2000 can be used to execute any of the control and/or compiler systems described above in some embodiments. 
The electronic system 2000 may be a computer (e.g., a desktop computer, personal computer, tablet computer, server computer, mainframe, a blade computer etc.), phone, PDA, or any other sort of electronic device. Such an electronic system includes various types of computer readable media and interfaces for various other types of computer readable media. Electronic system 2000 includes a bus 2005, processing unit(s) 2010, a system memory 2025, a read-only memory 2030, a permanent storage device 2035, input devices 2040, and output devices 2045”). Regarding claim 20, Ko teaches: A non-transitory computer-readable storage medium, storing one or more programs configured for execution by one or more processors of a server system, the one or more programs including instructions, which when executed by the one or more processors cause the server system to (Ko Fig. 20, Col. 36, Lines 1–15: “FIG. 20 conceptually illustrates an electronic system 2000 with which some embodiments of the invention are implemented. The electronic system 2000 can be used to execute any of the control and/or compiler systems described above in some embodiments. The electronic system 2000 may be a computer (e.g., a desktop computer, personal computer, tablet computer, server computer, mainframe, a blade computer etc.), phone, PDA, or any other sort of electronic device. Such an electronic system includes various types of computer readable media and interfaces for various other types of computer readable media. Electronic system 2000 includes a bus 2005, processing unit(s) 2010, a system memory 2025, a read-only memory 2030, a permanent storage device 2035, input devices 2040, and output devices 2045”). Regarding the remaining limitations of claims 19 and 20, although varying in scope, the remaining limitations of claims 19 and 20 are substantially the same as the limitations of claim 1. Thus, the remaining limitations of claims 19 and 20 are rejected using the same reasoning and analysis as claim 1 above. Claims 4 and 7 are rejected under 35 U.S.C. 103 as being unpatentable over Ko in view of Timofejevs, and further in view of Nez et al., (US-20240169192-A1), hereinafter “Nez”. Regarding claim 4, Ko in view of Timofejevs teaches all the limitations of claim 1. Ko teaches: wherein computing the measure of locality is further based on parameters of convolution operations, including kernels, [strides,] padding, [and dilation,] for a predetermined set of layers of the trained convolutional neural network (Ko Col. 7, Lines 8–58: “In this case, the intermediate layers (referred to as “hidden” layers) may include convolutional layers, pooling layers, element-wise operation layers, fully-connected layers, and/or normalization layers. The convolutional layers of some embodiments use a small kernel (e.g., 2×2, 3×3, 5×5, etc.) to process blocks of input values (output values from a previous layer) in a set of two-dimensional grids (e.g., channels of pixels of an image, input feature maps) with the same set of parameters. The kernels (also referred to as filters) are three-dimensional, and multiple kernels are used to process each group of input values in a layer (resulting in a set of three-dimensional output grids, also referred to as output feature maps). Pooling layers combine clusters of outputs from one layer into a single node at the next layer, as part of the process of reducing an image (which may have a large number of pixels) or other input item down to a smaller size (e.g., a vector output). 
In some embodiments, pooling layers can use max pooling (in which the maximum value among the clusters of node outputs is selected) or average pooling (in which the clusters of node outputs are averaged)”). Timofejevs teaches: [wherein computing the measure of locality is further based on parameters of convolution operations, including kernels,] strides, [padding, and dilation, for a predetermined set of layers of the trained convolutional neural network] (Timofejevs ¶0403: “Referring next to FIG. 28Q, the neural network topology includes (28172) a convolutional neural network (CNN) that includes (i) a plurality of partially connected layers (e.g., sequence of convolutional and pooling layers; each pooling layer is assumed to be a convolutional layer with stride larger than 1) and (ii) one or more fully-connected layers (the sequence ends in the fully-connected layers)”). Ko in view of Timofejevs does not appear to explicitly teach: [wherein computing the measure of locality is further based on parameters of convolution operations, including kernels, strides, padding,] and dilation, [for a predetermined set of layers of the trained convolutional neural network.] However, Nez teaches: [wherein computing the measure of locality is further based on parameters of convolution operations, including kernels, strides, padding,] and dilation, [for a predetermined set of layers of the trained convolutional neural network] (Nez ¶0032, ¶0074: “Convolution modules 110A, 110B, 110C, and 110D are in communication with input data memory 122, and are each configured to perform mathematical operations on input values from input data memory 122, and weight values. Each convolution module may output values to one or more of adder modules 112A, 112B, 112C, and 112D or accumulation memory 124. Each convolution module may provide direct support for different parameters of mathematical operations, such as a kernel size of height (KH)×width (KW), vertical and horizontal strides, dilation, padding, etc. In some embodiments of device 100” and “At S852, a determining section determines the size of a kernel used for inference of the neural network. The determining section may determine other characteristics of the kernel, such as dilation, etc. Because these values are not configurable, and are part of the neural network configuration, they may be obtained as part of the neural network configuration, and the determining section may determine these characteristics by simply referring to the values in the neural network configuration obtained at S850.”). The methods of Ko, the teachings of Nez, and the instant application are analogous art because they pertain to processing data with convolutional neural networks. It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the methods of Ko with the teachings of Nez to provide determinations based on dilation. One would be motivated to do so to perform mathematical operations on input values from input data memory (Nez ¶0032). Regarding claim 7, Ko in view of Timofejevs teaches all the limitations of claim 1. Ko in view of Timofejevs does not appear to explicitly teach: wherein the buffer is a rotating FIFO queue having a fixed length. However, Nez teaches: wherein the buffer is a rotating FIFO queue having a fixed length (Nez ¶0114: “In the foregoing embodiment, the receiving section stores instructions in the writable memory. 
In other embodiments, instructions stored in the external memory, such as DDR, are later loaded into on-chip FIFO queues. The receiving section may include a dedicated instruction fetching module which loads instructions from external DDR memory, and stores them into FIFOs as instructions are consumed by other modules”). The same motivation that was utilized for combining Ko with Nez, as set forth in claim 4, is equally applicable to claim 7. Prior Art of Record The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure. Kim et al., (“Device and method with flexible neural network”) discloses partitioning parts of neural networks for optimization in processing: “A device includes: an operation module configured to store and operate a weight for an operation of a layer of a neural network model; a control module configured to generate setting information for performing the operation of the layer by the neural network model using the stored weight; an input module configured to receive input data for the operation of the layer based on the generated setting information; a merging module configured to receive operation results of the operation of the layer from the operation module and merge the received operation results of the layer; a post-processing module configured to receive the merged operation results of the layer from the merging module and post-process the received merged operation results of the layer; and an output stream module configured to convert and store the post-processed operation results based on the generated setting information.” Kim, Abstract. Ma et al., (“Neural network hardware accelerator architectures and operating method thereof”) discloses using different memory buffers to optimize neural networks: “A memory-centric neural network system and operating method thereof includes: a processing unit; semiconductor memory devices coupled to the processing unit, the semiconductor memory devices contain instructions executed by the processing unit; a weight matrix constructed with rows and columns of memory cells, inputs of the memory cells of a same row are connected to one of Axons, outputs of the memory cells of a same column are connected to one of Neurons; timestamp registers registering timestamps of the Axons and the Neurons; and a lookup table containing adjusting values indexed in accordance with the timestamps, the processing unit updates the weight matrix in accordance with the adjusting values.” Ma, Abstract. Conclusion Any inquiry concerning this communication or earlier communications from the examiner should be directed to NICHOLAS SHINE whose telephone number is (571)272-2512. The examiner can normally be reached M-F, 11a-7p ET. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, David Yi, can be reached on (571) 270-7519. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. 
To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /N.B.S./Examiner, Art Unit 2126 /DAVID YI/Supervisory Patent Examiner, Art Unit 2126
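
Worked Example: Claimed Formulas

The following minimal Python sketch is an editor's illustration, not part of the Office Action. It works through the claim 10 relation M = 1 + (W - Z) * (N - 1)/(X - Z), the claim 18 aggregation rates, and the claim 7 fixed-length rotating FIFO buffer. All numeric values are hypothetical; variable names follow the claim letters, and because claim 18 reuses X for a rate, that rate is written X_rate below to keep it distinct from the claim 9 input shape X.

from collections import deque

# Hypothetical values, chosen only for illustration; the claims leave
# W, Z, N, and X as free variables.
X = 128   # claim 9: input shape of the trained CNN along the reduced dimension
Z = 8     # claim 9: measure of locality of the selected buffer tensor
W = 32    # claim 10: input shape of the left subnetwork in the reduced dimension
N = 61    # claim 18: data points in the intermediate (buffer) tensor T per run

# Claim 10: approximate number of data points the left subnetwork
# generates per run. Here: 1 + 24 * 60 / 120 = 13.0.
M = 1 + (W - Z) * (N - 1) / (X - Z)

# Boundary behavior implied by the formula: a left subnetwork that sees
# the whole input (W == X) yields all N points; one that sees only the
# locality window (W == Z) yields a single point.
assert 1 + (X - Z) * (N - 1) / (X - Z) == N
assert 1 + (Z - Z) * (N - 1) / (X - Z) == 1

# Claim 18: if the full network operates at aggregation rate X_rate and
# the left subnetwork emits M of the N buffered points per run, the left
# side must run roughly N/M times as often; the right side keeps rate X.
X_rate = 4
left_rate = X_rate * N / M    # approximately X * N / M, as claimed
right_rate = X_rate

# Claim 7: the interconnecting buffer modeled as a rotating FIFO of fixed
# length N; deque(maxlen=N) evicts the oldest entry on each overflow append.
buffer_T = deque(maxlen=N)
for i in range(N + 5):        # appending past capacity simply rotates
    buffer_T.append(i)
assert len(buffer_T) == N and buffer_T[0] == 5

print(f"M = {M}, left rate ~ {left_rate:.2f}, right rate = {right_rate}")

On these illustrative numbers M = 13, so the left subnetwork runs roughly N/M ≈ 4.7 times for each run of the right subnetwork; deque(maxlen=N) is one standard way to model a fixed-length rotating FIFO, since it discards the oldest element when full.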

Prosecution Timeline

Apr 14, 2023
Application Filed
Mar 03, 2026
Non-Final Rejection — §101, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12579449
HYDROCARBON OIL FRACTION PREDICTION WHILE DRILLING
2y 5m to grant · Granted Mar 17, 2026
Patent 12572440
AUTOMATICALLY DETECTING WORKLOAD TYPE-RELATED INFORMATION IN STORAGE SYSTEMS USING MACHINE LEARNING TECHNIQUES
2y 5m to grant · Granted Mar 10, 2026
Patent 12561554
ERROR IDENTIFICATION FOR AN ARTIFICIAL NEURAL NETWORK
2y 5m to grant · Granted Feb 24, 2026
Patent 12533800
TRAINING REINFORCEMENT LEARNING AGENTS TO LEARN FARSIGHTED BEHAVIORS BY PREDICTING IN LATENT SPACE
2y 5m to grant · Granted Jan 27, 2026
Patent 12536428
KNOWLEDGE GRAPHS IN MACHINE LEARNING DECISION OPTIMIZATION
2y 5m to grant · Granted Jan 27, 2026
Based on the examiner’s 5 most recent grants.


Prosecution Projections

1-2
Expected OA Rounds
38%
Grant Probability
82%
With Interview (+44.6%)
5y 1m
Median Time to Grant
Low
PTA Risk
Based on 37 resolved cases by this examiner. Grant probability derived from career allow rate.
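Arithmetic check (assuming the interview lift is additive in percentage points, as the +44.6% notation suggests): 38% + 44.6 points ≈ 82.6%, consistent with the 82% with-interview estimate shown above.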
