Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Detailed Action
This action is in response to the claims filed 6/13/2023:
Claims 1-20 are pending.
Claims 1, 9, and 15 are independent.
Specification
The disclosure is objected to because of the following informalities:
The title of the invention is not descriptive. A new title is required that is clearly indicative of the invention to which the claims are directed.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 15-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
Regarding claim 15, "an activation operation on the first and second portions of the plurality of activation values alternatively with a plurality of weight values to generate" is indefinite. It's grammatically unclear what is alternating with what. Specifically, one of ordinary skill in the art could read any number of elements of the claim as "alternating": the portions, the weights, the activation operation alternating between membrane-potential portions and weight sets, temporal alternation, spatial alternation, a logical OR, etc. It would be similarly unclear to one of ordinary skill in the art if the plurality of weight values to generate is mutually exclusive or selectable. Since the scope of the claim cannot be reasonably determined, the claim is seen as being indefinite. In the interest of further examination the claim limitation is interpreted as performing an activation operation optionally with or without weight values, i.e. the claim limitation optionally collapses to "an activation operation on the first and second portions of the plurality of activation values, wherein the second memory [...]".
Regarding claim 16, "the locations" lacks antecedent basis. "Locations" is recommended.
Claims 17-20 are rejected by virtue of their dependence on rejected claims 15 and 16.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-14 are rejected under 35 U.S.C. § 101 because the claimed invention is directed to non-statutory subject matter.
Regarding Claim 1: Claim 1 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 1 is directed to a method, i.e., a process, which is one of the statutory categories.
Step 2A Prong One Analysis: Claim 1, under its broadest reasonable interpretation, recites a series of mental processes and mathematical calculations and relationships. For example, but for the generic computer component language, the limitations in the context of this claim encompass machine learning processing, including the following:
generating, based on a plurality of activation values corresponding to the plurality of spikes, location information including a plurality of count numbers each corresponding to non-zero values, in the plurality of activation values, in one of a plurality of rows of the input (observation, evaluation, and judgement),
performing, based on the location information, a matrix multiplication with the non-zero values in a first number of rows in the plurality of rows with a first group of filters of weight values to generate a plurality of first membrane potentials for outputting a first output spike at a first time (mathematical calculations and relationships); and
performing, based on the location information, the matrix multiplication with the non-zero values in a second number of rows in the plurality of rows with the first group of filters of weight values to generate a plurality of second membrane potentials for outputting a second output spike at a second time after the first time (mathematical calculations and relationships).
Therefore, claim 1 recites an abstract idea which is a judicial exception.
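For illustration only, the following minimal sketch (hypothetical names; NumPy assumed) shows the kind of generic, pen-and-paper-style computation that the recited generating and performing steps encompass; it is offered to show the generic character of the computation and is not Applicant's implementation:

import numpy as np

def generate_location_info(activations):
    # Count the non-zero activation values in each row (the recited
    # "plurality of count numbers" of the location information).
    return [int(np.count_nonzero(row)) for row in activations]

def membrane_potentials(activations, counts, row_indices, filters):
    # Matrix-multiply only the rows that contain non-zero values,
    # selected via the location information, with one group of filters.
    rows = [i for i in row_indices if counts[i] > 0]  # skip all-zero rows
    return activations[rows] @ filters

acts = np.array([[0., 1., 0.],
                 [0., 0., 0.],
                 [1., 0., 1.],
                 [0., 1., 1.]])
filters = np.ones((3, 2))                # one group of filters of weight values
counts = generate_location_info(acts)    # [1, 0, 2, 2]
v1 = membrane_potentials(acts, counts, range(0, 2), filters)  # first rows, first time
v2 = membrane_potentials(acts, counts, range(2, 4), filters)  # later rows, second time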
Step 2A Prong Two Analysis: Claim 1 recites the additional element “receiving an input in an input layer of a spiking neural network during a plurality of time steps, wherein the input includes a plurality of spikes”, which amounts to mere gathering and outputting of data, i.e., insignificant extra-solution activity (See MPEP 2106.05(g)) that does not integrate the judicial exception into a practical application. Therefore, claim 1 is directed to a judicial exception.
Step 2B Analysis: Claim 1 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 1 amount to no more than mere instructions to apply the judicial exception using a generic computer component and insignificant extra-solution activity. The gathering and outputting of data is considered well-understood, routine, and conventional in the art (See MPEP 2106.05(d)(II)(i)).
For the reasons above, claim 1 is rejected as being directed to patent-ineligible subject matter under § 101. This rejection applies equally to dependent claims 2-8. The additional limitations of the dependent claims are addressed briefly below:
Dependent claim 2 recites additional observation, evaluation, and judgement: “the first number is different from the second number”.
Dependent claim 3 recites additional observation, evaluation, and judgement: “wherein the first number is smaller than the second number.”
Dependent claim 4 recites additional observation, evaluation, and judgement: “the location information further includes a plurality of channel numbers and a plurality of height numbers, wherein each of the channel numbers and a corresponding height number correspond to a location of the one of the plurality of rows”.
Dependent claim 5 recites additional mathematical calculations and relationships: “performing, based on the location information, the matrix multiplication with the non-zero values in the first number of rows in the plurality of rows with a second group of filters of weight values to generate a plurality of third membrane potentials for outputting a third output spike at a third time after the second time”.
Dependent claim 6 recites additional observation, evaluation, and judgement: “wherein a number of filters in the first group equals to a number of filters in the second group”.
Dependent claim 7 recites additional mathematical calculations and relationships: “performing, based on the location information, the matrix multiplication with the non-zero values in the first number of rows in the plurality of rows with a third group of filters of weight values to generate a plurality of fourth membrane potentials for outputting a fourth output spike at a fourth time after the third time”.
Dependent claim 8 recites additional observation, evaluation, and judgement: “generating a neural network result, based on the first and second output spikes, for an image recognition operation of the input”.
Regarding Claim 9: Claim 9 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 9 is directed to a non-transitory computer-readable medium storing computer-executable instructions that cause a processor to implement a method, which is an article of manufacture, one of the statutory categories.
Step 2A Prong One Analysis: Claim 9, under its broadest reasonable interpretation, recites a series of mental processes and mathematical calculations and relationships. For example, but for the generic computer component language, the limitations in the context of this claim encompass machine learning processing, including the following:
performing an activation operation with one of groups of filters of weight values and a number of layers in the plurality of input feature maps to generate corresponding membrane potentials in a plurality of membrane potentials, wherein the one of groups of filters of weight values and the number of the layers in the plurality of input feature maps correspond to one of the plurality of time steps (observation, evaluation, and judgement);
when the one of the plurality of time steps in step (b) is not an initial time step in the plurality of time steps, updating the corresponding membrane potentials in step (b) by adding up with membrane potentials corresponding to a previous time step (observation, evaluation, and judgement);
generating one of a plurality of output spikes to provide a neural network result (observation, evaluation, and judgement); and
repeating steps (b) to (d) until the activation operation is performed to all groups of filters of weight values and all of the layers in the plurality of input feature maps (observation, evaluation, and judgement).
Therefore, claim 9 recites an abstract idea which is a judicial exception.
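For illustration only, the recited loop of steps (b) through (e), with its accumulation of membrane potentials across time steps, may be expressed in the following minimal sketch (hypothetical names; NumPy assumed; offered only to show the generic character of the computation):

import numpy as np

def run_layer(input_fmaps, filter_groups, threshold=1.0):
    # input_fmaps: one 2D array per time step; filter_groups: one weight
    # matrix per time step (one of the groups of filters of weight values).
    prev_v = None
    output_spikes = []
    for t, (fmap, weights) in enumerate(zip(input_fmaps, filter_groups)):
        v = fmap @ weights                          # (b) activation operation
        if t > 0:                                   # (c) not the initial time step:
            v = v + prev_v                          #     add potentials from step t-1
        output_spikes.append((v >= threshold) * 1)  # (d) generate output spikes
        prev_v = v                                  # (e) repeat for all groups/layers
    return output_spikes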
Step 2A Prong Two Analysis: Claim 9 recites the additional elements “A non-transitory computer-readable medium for storing computer-executable instructions, the computer-executable instructions when executed by a processor implementing a method comprising the following steps”. However, these additional features are computer components recited at a high level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component. An additional element that merely recites the words “apply it” (or an equivalent) with the judicial exception, merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea does not integrate the judicial exception into a practical application (See MPEP 2106.05(f)). Claim 9 also recites the additional element “reading a plurality of input feature maps corresponding to a plurality of time steps from a memory device”, which amounts to mere gathering and outputting of data, i.e., insignificant extra-solution activity (See MPEP 2106.05(g)). Therefore, claim 9 is directed to a judicial exception.
Step 2B Analysis: Claim 9 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 9 amount to no more than mere instructions to apply the judicial exception using a generic computer component and insignificant extra-solution activity. The gathering and outputting of data is considered well-understood, routine, and conventional in the art (See MPEP 2106.05(d)(II)(i)).
For the reasons above, claim 9 is rejected as being directed to patent-ineligible subject matter under § 101. This rejection applies equally to dependent claims 10-14.
The additional limitations of the dependent claims are addressed briefly below:
Dependent claim 10 recites additional observation, evaluation, and judgement: “(f) generating, based on the plurality of input feature maps, location information”, as well as additional insignificant extra-solution activity of gathering and outputting data, “(g) reading non-zero values in locations indicated by the location information for step (b)”, which is well-understood, routine, and conventional in the art (See MPEP 2106.05(d)(II)(i)).
Dependent claim 11 recites additional observation, evaluation, and judgement: “wherein the location information includes a plurality of count numbers each corresponding to the non-zero values in one of a plurality of rows, a plurality of channel numbers and a plurality of height numbers, wherein each of the channel numbers and a corresponding height number correspond to a location of the one of the plurality of rows.”
Dependent claim 12 recites additional observation, evaluation, and judgement: “(h) counting the non-zero values in a plurality of rows of the plurality of input feature maps to generate a plurality of count numbers included in the location information”.
Dependent claim 13 recites additional mathematical calculations and relationships: “in step (b) the activation operation is performed in a number of cycles on rows including the non-zero values, wherein the number of cycles is associated with count numbers corresponding to the rows including the non-zero values”.
Dependent claim 14 recites additional observation, evaluation, and judgement: “(f) eliminating the membrane potentials corresponding to the number of the layers associated with a time step before the previous time step”.
Therefore, when considering the additional elements separately and in combination, they do not amount to significantly more than the judicial exception. Accordingly, claims 1-14 are rejected under 35 U.S.C. § 101.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1, 4, 5, 7, 8, 9, 10, and 14-20 are rejected under 35 U.S.C. § 102(a)(1) as being anticipated by Sommer (“Efficient Hardware Acceleration of Sparsely Active Convolutional Spiking Neural Networks”, 2022).
Regarding claim 1, Sommer teaches A method, comprising: receiving an input in an input layer of a spiking neural network during a plurality of time steps, wherein the input includes a plurality of spikes; ([p. 1 §1] "the outputs of the neurons (activations) are not encoded with real-valued scalars like in standard NNs, but with sequences of binary events called spikes" [p. 2] "A binary spike from the previous layer l−1 arrives at the synapse i of a neuron j and is weighted with the synaptic weight wi" [p. 6] "an SNN is processed layer by layer. Each layer of an SNN needs to be simulated for multiple time steps T [...] for t ← 0 to T do Simulate layer for all time steps T" Sommer explicitly uses binary spikes (spike = 1))
generating, based on a plurality of activation values corresponding to the plurality of spikes, location information including a plurality of count numbers each corresponding to non-zero values, in the plurality of activation values, in one of a plurality of rows of the input;([p. 3] "Each output of a neuron (i.e. activation) is a pixel of the resulting 2D output image called a feature map (fmap). A convolutional layer l typically generates a number of Cl output fmaps called channels which can be interpreted as a third dimension. The channels of the input fmap Cl−1 dictate the number of channels of each kernel Kl. The number of kernels kl determine the number of output channels Cl, i.e. Cl = kl. For standard spiking convolutional layers, the binary input fmap [...] is convolved with a number of kl kernels" [pp. 4-5] "The AER of a spike is simply the spike's (i,j) coordinates in the 2D fmap [...] A 2D input fmap is stored as a queue of address events (the AEQ)" [p. 6] "The queues in the 9 columns can be filled in parallel. The 9 parallel write accesses are necessary since thresholding is performed in a 3×3 window. The write logic features 9 write counters, one for each column" Sommer's device generates and stores location information for nonzero spikes by converting binary spike activations into address events. Each spike is represented by its "(i,j) coordinates" and these address events are stored in the AEQ as the compressed representation of the activation feature map. The write counters in Sommer explicitly track non-zero values (spikes))
performing, based on the location information, a matrix multiplication with the non-zero values in a first number of rows in the plurality of rows with a first group of filters of weight values to generate a plurality of first membrane potentials for outputting a first output spike at a first time; and ([p. 3] "Each output of a neuron (i.e. activation) is a pixel of the resulting 2D output image called a feature map (fmap). A convolutional layer l typically generates a number of Cl output fmaps called channels which can be interpreted as a third dimension. The channels of the input fmap Cl−1 dictate the number of channels of each kernel Kl. The number of kernels kl determine the number of output channels Cl, i.e. Cl = kl. For standard spiking convolutional layers, the binary input fmap [...] is convolved with a number of kl kernels [...] This results in the following equations, with ∗ denoting the convolution operation [See Eqn. 3]" While Sommer discloses that the multiplication can be optimized out, Sommer explicitly teaches that this is only because, with binary spikes, simply summing the weighted spikes is mathematically equivalent to performing the multiplication ([p. 2] "The spikes that encode neuron activations are binary in nature. Weighting the binary activations ∈ {1,0} does not require an actual multiplication, as the multiplication reduces to: 1 · w = w and 0 · w = 0"). In other words, Sommer explicitly teaches that the implementation is not an approximation but rather the exact multiplicative convolution recited in the instant claims. The multiplication-by-1 optimization as performed in Sommer is still mathematically multiplication.)
performing, based on the location information, the matrix multiplication with the non-zero values in a second number of rows in the plurality of rows with the first group of filters of weight values to generate a plurality of second membrane potentials for outputting a second output spike at a second time after the first time. ([p. 5] "For each address event, 9 neurons (highlighted in white) can be updated in parallel, due to the 3x3 neighbourhood of the kernel […] all spike events are located inside the AEQ [...] the respective weights of the kernel can be added to the neuron potentials by rotating the kernel [...] a 3x3 kernel requires updating the neuron potentials at position (i,j) and all 8 neighboring membrane potentials" [p. 2] "When a membrane potential Vm exceeds the threshold Vt, then the neuron fires a spike itself and Vm is reset to 0. The membrane potential of a neuron j at layer l at each time step t" [p. 6] "an SNN is processed layer by layer. Each layer of an SNN needs to be simulated for multiple time steps T [...] for t ← 0 to T do Simulate layer for all time steps T" Sommer's disclosed operation repeats across algorithmic time steps t. The device continues applying the same kernel weights K via the convolution unit and continues thresholding Vm to generate output spikes indexed by time t, where t+1 corresponds to a second time after the first time (or similarly t corresponds to a second time after the first time t-1 [See Eqn. 1]).).
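To further illustrate the cited event-driven mechanism, the following sketch paraphrases the cited passages (hypothetical names; NumPy assumed; the kernel orientation is simplified relative to Sommer's rotation; this is not a reproduction of Sommer's implementation):

import numpy as np

def event_driven_conv(aeq, kernel, v):
    # aeq: list of (i, j) address events (locations of binary spikes);
    # kernel: 3x3 weights; v: zero-initialized, padded membrane-potential map.
    for (i, j) in aeq:                 # one clock cycle per event in Sommer
        v[i:i+3, j:j+3] += kernel      # Sommer rotates the kernel; orientation simplified
    return v

dense = np.array([[0, 1],
                  [1, 0]])             # binary input fmap
aeq = list(zip(*np.nonzero(dense)))    # address events: [(0, 1), (1, 0)]
k = np.arange(9.0).reshape(3, 3)
v = event_driven_conv(aeq, k, np.zeros((4, 4)))

# Cross-check that adding weights per event equals the multiplicative form,
# since 1 * w = w and 0 * w = 0:
v_ref = np.zeros((4, 4))
for i in range(2):
    for j in range(2):
        v_ref[i:i+3, j:j+3] += dense[i, j] * k
assert np.array_equal(v, v_ref)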
Regarding claim 4, Sommer teaches The method of claim 1, wherein the location information further includes a plurality of channel numbers and a plurality of height numbers, wherein each of the channel numbers and a corresponding height number correspond to a location of the one of the plurality of rows. (Sommer [pp. 4-5] "the AER of a spike is simply the spike's (i,j) coordinates in the 2D fmap" [p. 6] "Each element is addressed uniquely by its address (i,j) and its column s ∈ (0,...,8)" Each 2D fmap corresponds to a channel, with the i coordinate serving as the height number.).
Regarding claim 5, Sommer teaches The method of claim 1, further comprising: performing, based on the location information, the matrix multiplication with the non-zero values in the first number of rows in the plurality of rows with a second group of filters of weight values to generate a plurality of third membrane potentials for outputting a third output spike at a third time after the second time. (Sommer [p. 5] "For each address event, 9 neurons (highlighted in white) can be updated in parallel, due to the 3x3 neighbourhood of the kernel […] all spike events are located inside the AEQ [...] the respective weights of the kernel can be added to the neuron potentials by rotating the kernel [...] a 3x3 kernel requires updating the neuron potentials at position (i,j) and all 8 neighboring membrane potentials" [p. 2] "When a membrane potential Vm exceeds the threshold Vt, then the neuron fires a spike itself and Vm is reset to 0. The membrane potential of a neuron j at layer l at each time step t" [p. 6] "an SNN is processed layer by layer. Each layer of an SNN needs to be simulated for multiple time steps T [...] for t ← 0 to T do Simulate layer for all time steps T" Sommer explicitly supports multiple kernels/filters selected by channel and channel-wise sequential SNN processing).
Regarding claim 7, Sommer teaches The method of claim 5, further comprising: performing, based on the location information, the matrix multiplication with the non-zero values in the first number of rows in the plurality of rows with a third group of filters of weight values to generate a plurality of fourth membrane potentials for outputting a fourth output spike at a fourth time after the third time. ([p. 5] "For each address event, 9 neurons (highlighted in white) can be updated in parallel, due to the 3x3 neighbourhood of the kernel […] all spike events are located inside the AEQ [...] the respective weights of the kernel can be added to the neuron potentials by rotating the kernel [...] a 3x3 kernel requires updating the neuron potentials at position (i,j) and all 8 neighboring membrane potentials" [p. 2] "When a membrane potential Vm exceeds the threshold Vt, then the neuron fires a spike itself and Vm is reset to 0. The membrane potential of a neuron j at layer l at each time step t" [p. 6] "an SNN is processed layer by layer. Each layer of an SNN needs to be simulated for multiple time steps T [...] for t ← 0 to T do Simulate layer for all time steps T" Sommer explicitly supports multiple kernels/filters selected by channel and channel-wise sequential SNN processing).
Regarding claim 8, Sommer teaches The method of claim 1, further comprising: generating a neural network result, based on the first and second output spikes, for an image recognition operation of the input. ([p. 2] "To perform inference on a single sample (e.g. an input image)" [p. 3] "the SNN should provide a satisfactory classification" image classification interpreted as synonymous with an image recognition operation).
Regarding claim 9, Sommer teaches A non-transitory computer-readable medium for storing computer-executable instructions, the computer-executable instructions when executed by a processor implementing a method comprising the following steps:([p. 4] "We start by providing a top-level overview of the hardware architecture and then proceed to show how it can be implemented efficiently on either FPGAs or ASICs")
(a) reading a plurality of input feature maps corresponding to a plurality of time steps from a memory device; ([p. 6] "The output fmap of each channel is represented by its own AEQ. These AEQs can be implemented in a single dual-port RAM since each individual AEQ is processed sequentially […] Vm ← ConvolutionUnit(AEQ[cin, l-1, t], K[cout, cin, l], Vm)" Sommer stores feature maps in AEQs and explicitly states the AEQs are implementable in RAM (memory). The dataflow in Algorithm 1 explicitly reads the AEQ indexed by time step t, i.e., feature-map content corresponding to time steps is retrieved from memory for processing)
(b) performing an activation operation with one of groups of filters of weight values and a number of layers in the plurality of input feature maps to generate corresponding membrane potentials in a plurality of membrane potentials, ([p. 5] "The convolution core updates the MemPot memory depending on the address events. […] Note that K refers to all kernels of the SNN, thus for each convolution the correct kernel must be selected depending on the current layer l, the current input channel cin and output channel cout." [p. 6] "Vm ← ConvolutionUnit(AEQ[cin, l-1, t], K[cout, cin, l], Vm)" Sommer's "activation operation" is the event-based convolution update that updates membrane potentials (MemPot) using kernel weights K. Sommer is explicit that kernels (filters) exist across the SNN, that the correct kernel is selected based on layer and channels, and that Algorithm 1 shows membrane potential generation and update via ConvolutionUnit)
wherein the one of groups of filters of weight values and the number of the layers in the plurality of input feature maps correspond to one of the plurality of time steps;([p. 6] "Each layer of an SNN needs to be simulated for multiple time steps T")
(c) when the one of the plurality of time steps in step (b) is not an initial time step in the plurality of time steps, updating the corresponding membrane potentials in step (b) by adding up with membrane potentials corresponding to a previous time step; ([p. 2] "A binary spike from the previous layer l−1 arrives at the synapse i of a neuron j and is weighted with the synaptic weight wi. The weighted spike is then integrated (i.e. added) into the neurons membrane potential V l mj . When a membrane potential Vm exceeds the threshold Vt, then the neuron fires a spike itself and Vm is reset to 0. The membrane potential of a neuron j at layer l at each time step t is described as [See Eqn. 1]" See also Algorithm 1, where Vm ← ConvolutionUnit(...) is inside the "for t ← 0 to T do" loop. Sommer's neuron model is explicitly time-discrete and additive: the membrane potential at the current step is formed from the previous membrane potential plus the added weighted spikes.)
(d) generating one of a plurality of output spikes to provide a neural network result; and([Abstract] "neuronal outputs (i.e. activations) are not encoded with real-valued activations but with sequences of binary spikes")
(e) repeating steps (b) to (d) until the activation operation is performed to all groups of filters of weight values and all of the layers in the plurality of input feature maps.([p. 6] "Here, an SNN is processed layer by layer. Each layer of an SNN needs to be simulated for multiple time steps T. […] Each output channel of a layer is simulated for all time steps t, one channel after the other." [p. 6] "The sparse output activations are stored in the AEQ [...] The output fmap of each channel is represented by its own AEQ").
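For reference, the time-discrete update quoted above takes the standard integrate-and-fire form, reconstructed here from the quoted description (an assumption of the standard form, not a verbatim copy of Sommer's Eqn. 1):

V_j^l(t) = V_j^l(t-1) + \sum_i w_{ij} \, s_i^{l-1}(t), \qquad s_j^l(t) = \begin{cases} 1, & V_j^l(t) \ge V_t \\ 0, & \text{otherwise} \end{cases}

with V_j^l reset to 0 after a spike.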
Regarding claim 10, Sommer teaches The non-transitory computer-readable medium of claim 9, wherein the method further comprises the following steps: (f) generating, based on the plurality of input feature maps, location information; and(Sommer [p. 3] "Each output of a neuron (i.e. activation) is a pixel of the resulting 2D output image called a feature map (fmap). A convolutional layer l typically generates a number of Cl output fmaps called channels which can be interpreted as a third dimension. The channels of the input fmap Cl−1 dictate the number of channels of each kernel Kl. The number of kernels kl determine the number of output channels Cl, i.e. Cl = kl. For standard spiking convolutional layers, the binary input fmap [...] is convolved with a number of kl kernels" [pp. 4-5] "The AER of a spike is simply the spike's (i,j) coordinates in the 2D fmap [...] A 2D input fmap is stored as a queue of address events (the AEQ)" [p. 6] "The queues in the 9 columns can be filled in parallel. The 9 parallel write accesses are necessary since thresholding is performed in a 3×3 window. The write logic features 9 write counters, one for each column" Sommer's device generates and stores location information for nonzero spikes by converting binary spike activations into address events. Each spike is represented by its "(i,j) coordinates" and these address events are stored in the AEQ as the compressed representation of the activation feature map. The write counters in Sommer explicitly track non-zero values (spikes))
(g) reading non-zero values in locations indicated by the location information for step (b).(Sommer [p. 5] "The AEQ stores the address events that are read by the convolution core. The convolution core updates the MemPot memory depending on the address events.").
Regarding claim 14, Sommer teaches The non-transitory computer-readable medium of claim 9, wherein the method further comprises a step: (f) eliminating the membrane potentials corresponding to the number of the layers associated with a time step before the previous time step. (Sommer [p. 9] "If any of these two condition is true, then the spike indicator bit of the respective neuron is set to 1. It is only set back to 0 if a new sample has to be processed" Setting the state back to zero when a new sample (a later time step) has to be processed is interpreted as eliminating the membrane potential).
Regarding claim 15, Sommer teaches A system, comprising:([p. 5] "The architecture proposed here consists of six distinct units")
a mask generation circuit configured to generate, according to a plurality of activation values corresponding to a plurality of spikes, location information to a first memory circuit;([p. 4 §V] "spikes are represented as address events that are compressed into queues" [p. 5] "The AEQ stores the address events that are read by the convolution core. The convolution core updates the MemPot memory depending on the address events. The thresholding unit adds the bias to the neurons stored in MemPot. Also, the thresholding unit threshold the neurons to generate the address event that are stored in the AEQ [...] Threshold the MemPot and write the resulting address events to the AEQ" Threshold unit interpreted as mask generation circuit)
second and third memory circuits that are configured to store first and second portions of a plurality of activation values respectively; and([p. 6] "The sparse output activations are stored in the AEQ as address events [...] The AEQ has to store the address events in queues [...] The queues in the 9 columns can be filled in parallel. The 9 parallel write accesses are necessary since thresholding is performed in a 3×3 window. The write logic features 9 write counters, one for each column" Queues interpreted as memory circuits (see also FIG. 6))
a plurality of processing circuits configured to perform, based on the location information, ([p. 5] "The architecture proposed here consists of six distinct units")
an activation operation on the first and second portions of the plurality of activation values alternatively with a plurality of weight values to generate a plurality of first membrane potentials and a plurality of second membrane potentials to be stored in a fourth memory circuit, ([p. 6] "Vm ← ConvolutionUnit(AEQ[cin, l-1, t], K[cout, cin, l], Vm)" [p. 5] "The Read Only Memory (ROM) for storing the kernel weights K and biases b" [p. 9] "The updated membrane potentials are compared to the threshold Vt to determine if they fire a spike […] The updated membrane potentials are written back to MemPot" Sommer explicitly performs an activation operation that consumes the AEQ plus kernel weights K and updates membrane potentials Vm)
wherein the second memory circuit is further configured to store a plurality of first output values corresponding to the plurality of second membrane potentials generated based on the second portion of the plurality of activation values stored in the third memory circuit.([p. 5] "the thresholding unit thresholds the neurons to generate the addresses event that are stored in the AEQ" [p. 9] "Write MemPot and AEQ: The updated membrane potentials are written back to MemPot. If a spike is generated in S4, then the respective AEQ-column is written" Sommer explicitly stores output values (spike address events) into AEQ as the result of thresholding membrane potentials Vm. After the convolution produces/updates Vm the thresholding unit produces output spikes and stores them in AEQ).
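To illustrate the mapped thresholding/"mask generation" step, a hypothetical sketch follows (invented names; NumPy assumed; not Sommer's hardware description):

import numpy as np

def thresholding_unit(v, bias, v_t):
    # Threshold the membrane potentials and emit the address events
    # (spike locations) that are written to an AEQ as location information.
    fired = (v + bias) >= v_t
    aeq = list(zip(*np.nonzero(fired)))   # (i, j) address events
    v[fired] = 0.0                        # reset fired neurons
    return aeq, v

v = np.array([[0.2, 1.3],
              [0.9, 2.1]])
events, v = thresholding_unit(v, bias=0.0, v_t=1.0)  # events: [(0, 1), (1, 1)]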
Regarding claim 16, Sommer teaches The system of claim 15, further comprising: a controller circuit configured to send a control signal associated with the locations to the plurality of processing circuits to read non-zero values in the plurality of activation values as the first and second portions of the plurality of activation values.(Sommer [pp. 4-5] "The AER of a spike is simply the spike's (i,j) coordinates in the 2D fmap [...] A 2D input fmap is stored as a queue of address events (the AEQ)" [p. 6] "The AEQ is not only a data structure but also features two independent circuits: one for writing and one for reading the queue columns" In Sommer the locations are literally the address events identifying where spikes (non-zero values) occur and Sommer explicitly has dedicated read/write control around those location events).
Regarding claim 17, Sommer teaches The system of claim 16, wherein the first portion of the plurality of activation values are included in two rows of a plurality of input feature maps corresponding to one of a plurality of time steps, and (Sommer [pp. 4-5] "The AER of a spike is simply the spike's (i,j) coordinates in the 2D fmap [...] A 2D input fmap is stored as a queue of address events (the AEQ)" [p. 6] "The AEQ is not only a data structure but also features two independent circuits: one for writing and one for reading the queue columns" The portion included in any two rows of the fmap (shown as 5x5 in FIG. 7) is interpreted as the first portion included in two rows of a plurality of fmaps corresponding to one of time steps t to T.)
the second portion of the plurality of activation values is included in two rows of the plurality of input feature maps corresponding to another time step of the plurality of time steps. (The same mapping applies to the second portion, with the two rows corresponding to another of the time steps t to T.).
Regarding claim 18, Sommer teaches The system of claim 16, wherein the first portion of the plurality of activation values are included in three rows of a plurality of input feature maps corresponding to one of a plurality of time steps, and the second portion of the plurality of activation values is included in three rows of the plurality of input feature maps corresponding to another time step of the plurality of time steps. (Sommer [p. 4] "the proposed architecture is optimized for 3 × 3 kernels" [p. 6] "The 9 parallel write accesses are necessary since thresholding is performed in a 3×3 window" [p. 6] "Each layer of an SNN needs to be simulated for multiple time steps T" See also FIG. 7, which explicitly shows the feature map interlacing scheme).
Regarding claim 19, Sommer teaches The system of claim 15, wherein the plurality of weight values are divided into N groups, and the plurality of processing circuits are further configured to perform the activation operation on the first portion of the plurality of activation values with one group, in the N groups, of weight values to generate the plurality of first membrane potentials,(Sommer [p. 5] "K refers to all kernels of the SNN, thus for each convolution the correct kernel must be selected depending on the current layer l, the current input channel cin and output channel cout. We use the following notation to indicate the selection of the correct kernel for the current convolution:" [p. 6] "To maximize the reuse of MemPot, processing is done in a channel-wise fashion. Each output channel of a layer is simulated for all time steps t, one channel after the other" Examiner notes that N is not restricted such that it would be very reasonable to interpret N=1.)
wherein the system further comprises: a neuron core circuit configured to generate an output spike based on the plurality of first membrane potentials.(Sommer [p. 4] "The idea of m-TTFS is that once a neuron has exceeded Vt , it emits a spike every algorithmic time step. After a sample has been processed for T time steps, the entire SNN is reset and all neurons can fire again.").
Regarding claim 20, Sommer teaches The system of claim 19, wherein each of the N groups includes M number of filters, and a number of the plurality of processing circuits is associated with a product of the number M and a dimension of the filters. (Sommer [p. 2] "the multiplication reduces to: 1 · w = w and 0 · w = 0" [p. 3] "Each output of a neuron (i.e. activation) is a pixel of the resulting 2D output image called a feature map (fmap). A convolutional layer l typically generates a number of Cl output fmaps called channels which can be interpreted as a third dimension. The channels of the input fmap Cl−1 dictate the number of channels of each kernel Kl. The number of kernels kl determine the number of output channels Cl, i.e. Cl = kl. For standard spiking convolutional layers, the binary input fmap [...] is convolved with a number of kl kernels [...] This results in the following equations, with ∗ denoting the convolution operation [See Eqn. 3]" Examiner notes that M is not restricted such that it would be very reasonable to interpret M=1.).
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 2, 3, and 6 are rejected under 35 U.S.C. § 103 as being unpatentable over the combination of Sommer and Chen (“Skydiver: A Spiking Neural Network Accelerator Exploiting Spatio-Temporal Workload Balance”, 2022).
Regarding claim 2, Sommer teaches The method of claim 1.
However, Sommer does not explicitly teach wherein the first number is different from the second number.
Chen, in the same field of endeavor, teaches The method of claim 1, wherein the first number is different from the second number. ([p. 5733] "arithmetic operations can be saved if connections without a spike are skipped. Moreover, the spike rate differs across timesteps, indicating that the proportion of active neurons varies over time").
Sommer and Chen are both directed towards spiking neural networks for convolution. Therefore, Sommer and Chen are analogous art in the same field of endeavor. It would have been obvious before the effective filing date of the claimed invention to combine the teachings of Sommer with the teachings of Chen by building the cluster list C corresponding to different numbers of non-zero activations. Chen provides additional motivation for the combination ([p. 5735] “When CBWS has applied alone, this design only achieved a 54.37% balance ratio. After combining with the APRC mechanism, the ratio was improved to 95.69%”). This motivation for combination also applies to the remaining claims which depend on this combination.
Regarding claim 3, Sommer teaches The method of claim 1.
However, Sommer does not explicitly teach wherein the first number is smaller than the second number.
Chen, in the same field of endeavor, teaches The method of claim 1, wherein the first number is smaller than the second number.([p. 5733] "arithmetic operations can be saved if connections without a spike are skipped. Moreover, the spike rate differs across timesteps, indicating that the proportion of active neurons varies over time" While one of ordinary skill in the art would recognize that if the two numbers are different, one must necessarily be smaller than the other, FIG. 4 of Chen explicitly shows a second subsequent output channel producing a smaller number of spikes (2) over the same inference window as a first output channel producing 6 spikes.).
Sommer and Chen are both directed towards spiking neural networks for convolution. Therefore, Sommer and Chen are analogous art in the same field of endeavor. It would have been obvious before the effective filing date of the claimed invention to combine the teachings of Sommer with the teachings of Chen by building the cluster list C corresponding to different numbers of non-zero activations. Chen provides additional motivation for the combination ([p. 5735] “When CBWS has applied alone, this design only achieved a 54.37% balance ratio. After combining with the APRC mechanism, the ratio was improved to 95.69%”). This motivation for combination also applies to the remaining claims which depend on this combination.
Regarding claim 6, Sommer teaches The method of claim 5.
However, Sommer does not explicitly teach wherein a number of filters in the first group equals to a number of filters in the second group.
Chen, in the same field of endeavor, teaches a number of filters in the first group equals to a number of filters in the second group. (See Algorithm 1: the cluster list C is bounded by the number of SPEs in a cluster, and one filter per iteration (for i = 0; i < K) is operated on.)
Sommer and Chen are both directed towards spiking neural networks for convolution. Therefore, Sommer and Chen are analogous art in the same field of endeavor. It would have been obvious before the effective filing date of the claimed invention to combine the teachings of Sommer with the teachings of Chen by building the cluster list C corresponding to different numbers of non-zero activations. Chen provides additional motivation for the combination ([p. 5735] “When CBWS has applied alone, this design only achieved a 54.37% balance ratio. After combining with the APRC mechanism, the ratio was improved to 95.69%”). This motivation for combination also applies to the remaining claims which depend on this combination.
Claim 11 is rejected under 35 U.S.C. § 103 as being unpatentable over the combination of Sommer and Pal (“OuterSPACE: An Outer Product based Sparse Matrix Multiplication Accelerator”, 2018).
Regarding claim 11, Sommer teaches The non-transitory computer-readable medium of claim 10.
However, Sommer does not explicitly teach wherein the location information includes a plurality of count numbers each corresponding to the non-zero values in one of a plurality of rows,
a plurality of channel numbers and a plurality of height numbers,
wherein each of the channel numbers and a corresponding height number correspond to a location of the one of the plurality of rows.
Pal, in an analogous art, teaches the location information includes a plurality of count numbers each corresponding to the non-zero values in one of a plurality of rows, ([p. 3 §3.2] "The vals array consists of the non-zero elements of the matrix in row-major order")
a plurality of channel numbers and a plurality of height numbers, ([p. 3 §3.2] "the cols array contains the column indices of the elements in vals")
wherein each of the channel numbers and a corresponding height number correspond to a location of the one of the plurality of rows.([p. 3 §3.2] "the row-ptrs array contains pointers to the start of each row of the matrix").
Sommer and Pal are both directed towards sparse matrix multiplication acceleration. Therefore, Pal is reasonably pertinent analogous art. It would have been obvious before the effective filing date of the claimed invention to combine the teachings of Sommer with the teachings of Pal by using the compressed sparse row (CSR) format for the convolution filters. While CSR is well established in the art and would be an obvious design choice for one of ordinary skill in the art, this is explicitly reinforced by Pal, who provides additional motivation for the combination ([p. 10] “greater use of local memory during the merge phase”).
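For reference, a minimal sketch of the CSR layout described by Pal (hypothetical helper name; NumPy assumed):

import numpy as np

def to_csr(m):
    vals, cols = [], []
    row_ptrs = [0]                    # row_ptrs[i]: start of row i in vals
    for row in m:
        for j, x in enumerate(row):
            if x != 0:
                vals.append(int(x))   # non-zero elements in row-major order
                cols.append(j)        # column index of each element in vals
        row_ptrs.append(len(vals))    # start of the next row / running total
    return vals, cols, row_ptrs

m = np.array([[5, 0, 0],
              [0, 0, 7],
              [0, 2, 0]])
print(to_csr(m))  # ([5, 7, 2], [0, 2, 1], [0, 1, 2, 3])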
Claims 12 and 13 are rejected under 35 U.S.C. § 103 as being unpatentable over the combination of Sommer and Weaver (“Memory-Side Acceleration and Sparse Compression for Quantized Packed Convolutions”, 2022).
Regarding claim 12, Sommer teaches The non-transitory computer-readable medium of claim 10.
However, Sommer does not explicitly teach wherein the step (f) comprises a step: (h) counting the non-zero values in a plurality of rows of the plurality of input feature maps to generate a plurality of count numbers included in the location information.
Weaver, in an analogous art, teaches the step (f) comprises a step: (h) counting the non-zero values in a plurality of rows of the plurality of input feature maps to generate a plurality of count numbers included in the location information. ([p. 2] "Any compression scheme involves at the very least storing the non-zero values along with at least one index to compute its original position within the dense matrix. CSR, for example, stores the column index of a value in a row as well as the cumulative number of non-zero values in each row" [p. 3] "the nnz array stores the number of values in each partition. Therefore, while the lengths of both the vals and offset arrays correspond to the total number of non-zero values in the dense matrix" See also Algorithm 2, where the filter is explicitly packed).
Sommer and Weaver are both directed towards acceleration of convolutional neural networks. Therefore, Weaver is reasonably pertinent analogous art. It would have been obvious before the effective filing date of the claimed invention to combine the teachings of Sommer with the teachings of Weaver by using the PSR format for the filters. Weaver provides additional motivation for the combination ([p. 6] “As the filter size increases, however, sparse CONV with PSR performs increasingly better than the dense version”). This motivation for combination also applies to the remaining claims which depend on this combination.
Regarding claim 13, the combination of Sommer and Weaver teaches The non-transitory computer-readable medium of claim 12, wherein in step (b) the activation operation is performed in a number of cycles on rows including the non-zero values, (Sommer [p. 3] "Each output of a neuron (i.e. activation) is a pixel of the resulting 2D output image called a feature map (fmap). A convolutional layer l typically generates a number of Cl output fmaps called channels which can be interpreted as a third dimension. The channels of the input fmap Cl−1 dictate the number of channels of each kernel Kl. The number of kernels kl determine the number of output channels Cl, i.e. Cl = kl. For standard spiking convolutional layers, the binary input fmap [...] is convolved with a number of kl kernels [...] This results in the following equations, with ∗ denoting the convolution operation [See Eqn. 3]" [p. 5] "Because all spike events are located inside the AEQ, the clock cycles required to perform the convolution scale directly with the number of spikes, i.e., one clock cycle per event" Since the binary spikes are the non-zero values, simply summing the spikes is mathematically equivalent to performing the multiplication ([p. 2] "The spikes that encode neuron activations are binary in nature. Weighting the binary activations ∈ {1,0} does not require an actual multiplication, as the multiplication reduces to: 1 · w = w and 0 · w = 0"))
wherein the number of cycles is associated with count numbers corresponding to the rows including the non-zero values.(Sommer [p. 5] "Because all spike events are located inside the AEQ, the clock cycles required to perform the convolution scale directly with the number of spikes, i.e., one clock cycle per event" [p. 6] "The AEQ is not only a data structure but also features two independent circuits: one for writing and one for reading the queue columns").
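For illustration of the count-number/cycle relationship mapped above, a minimal sketch (hypothetical names; NumPy assumed):

import numpy as np

fmap = np.array([[0, 3, 0, 1],
                 [0, 0, 0, 0],
                 [2, 0, 5, 0]])
count_numbers = np.count_nonzero(fmap, axis=1)  # step (h): per-row counts -> [2, 0, 2]
# Under the cited one-clock-cycle-per-event dataflow, the cycles spent on the
# rows containing non-zero values track these count numbers:
cycles = int(count_numbers.sum())               # -> 4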
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Liu (“SATO: spiking neural network acceleration via temporal-oriented dataflow and architecture”, 2022) is directed towards sparse row wise spike representations across discrete time steps in a spiking neural network.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SIDNEY VINCENT BOSTWICK whose telephone number is (571)272-4720. The examiner can normally be reached M-F 7:30am-5:00pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang can be reached on (571)270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SIDNEY VINCENT BOSTWICK/Examiner, Art Unit 2124