Office Action Analysis: 17440068 — NEURAL NETWORK DEVICE, NEURAL NETWORK SYSTEM, PROCESSING METHOD, AND RECORDING MEDIUM

Examiner Intelligence

BASOM, BLAINE T View full profile →
Grants 43% of resolved cases
Career Allowance Rate
140 granted / 326 resolved
-12.1% vs TC avg
Strong +23% interview lift
Without
With
+22.7%
Interview Lift
resolved cases with interview
Typical timeline
4y 6m
Avg Prosecution
23 currently pending
Career history
364
Total Applications
across all art units
Statute-Specific Performance

§101
1.1%
-38.9% vs TC avg
§103
85.8%
+45.8% vs TC avg
§102
1.0%
-39.0% vs TC avg
§112
2.6%
-37.4% vs TC avg
Black line = Tech Center average estimate • Based on career data from 326 resolved cases
Office Action

§103 §112
DETAILED ACTION
This Office action is responsive to the Request for Continued Examination (RCE) filed under 37 CFR §1.53(d) for the instant application on November 3, 2025.  The Applicants have properly set forth the RCE, which has been entered into the application, and an examination on the merits follows herewith.
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA .

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claim 3 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.  In particular, there is no antecedent basis for “the firing probability density” recited in claim 3.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1 and 5 are rejected under 35 U.S.C. 103 as being unpatentable over the article entitled “Supervised learning based on temporal coding in spiking neural networks” by Hesham Mostafa (“Mostafa”), over the article entitled “A gradient descent rule for spiking neurons emitting multiple spikes” by Booij et al. (“Booij”), over the article entitled “Error-backpropagation in temporally encoded networks of spiking neurons” by Bohte et al. (“Bohte”), and also over U.S. Patent Application Publication No. 2015/0278680 to Annapureddy et al. (“Annapureddy”).
Regarding claims 1 and 5, Mostafa generally describes “a new approach for controlling the behavior of spiking networks with realistic temporal dynamics, opening up the potential for using these networks to process spike patterns with complex temporal information” (Abstract).  Like claimed, Mostafa particularly teaches: 
	a plurality of neuron models, wherein each of the plurality of neuron models is a non-leaky integrate-and-fire spiking neuron and a spiking neuron with which a postsynaptic current is represented using a step function, and wherein one neuron model fires once at most in response to one input to the one neuron model (see e.g. section 2 “Network Model,” in which Mostafa states:
We use non-leaky integrate and fire neurons with exponentially decaying synaptic current kernels. The neuron’s membrane dynamics are described by:

                                    d
                                    
                                            V
                                        
                                            m
                                            e
                                            m
                                        
                                            j
                                        
                                            t
                                        
                                    d
                                    t
                                
                            =
                             
                                    ∑
                                    
                                        i
                                    
                                            w
                                        
                                            j
                                            i
                                        
                                    ∑
                                    
                                        r
                                    
                                    K
                                    
                                            t
                                            -
                                            
                                                    t
                                                
                                                    i
                                                
                                                    r
                                                
                                                 (1)
where                         
                            
                                    V
                                
                                    m
                                    e
                                    m
                                
                                    j
                                
                     is the membrane potential of neuron j. The right hand side of the equation is the synaptic current.                          
                            
                                    w
                                
                                    j
                                    i
                                
                     is the weight of the synaptic connection from neuron i to neuron j and                         
                            
                                    t
                                
                                    i
                                
                                    r
                                
                     is the time of the rth spike from neuron i.  K is the synaptic current kernel given by:

                            K
                            
                                    x
                                
                            =
                            Θ
                            
                                    x
                                
                                    exp
                                
                                ⁡
                                
                                            -
                                            
                                                    x
                                                
                                                            τ
                                                        
                                                            s
                                                            y
                                                            n
                                                        
                       where	                        
                            Θ
                            
                                    x
                                
                            =
                            
                                                    1
                                                     
                                                    i
                                                    f
                                                     
                                                    x
                                                     
                                                    ≥
                                                     
                                                    0
                                                
                                                    0
                                                     
                                                    o
                                                    t
                                                    h
                                                    e
                                                    r
                                                    w
                                                    i
                                                    s
                                                    e
                                                
	(2)

Synaptic current thus jumps instantaneously on the arrival of an input spike, then decays exponentially with time constant τsyn. Since τsyn is the only time constant in the model, we set it to 1 in the rest of the paper, i.e., normalize all times with respect to it. The neuron spikes when its membrane potential crosses a firing threshold which we set to 1, i.e., all synaptic weights are normalized with respect to the firing threshold. The membrane potential is reset to zero after a spike. We allow the membrane potential to go below zero if the integral of the synaptic current is negative.

Mostafa thus describes a neuron model for non-leaky integrate-and-fire neurons, i.e. spiking neurons, with which a postsynaptic current – e.g. the current from neuron i – is represented in part by using a step function, i.e.                         
                            Θ
                            
                                    x
                                
                     in equation (2) above.  Mostafa further teaches in section 2 “Network model” that the non-leaky integrate-and-fire neurons can be configured to fire at most once in one process of a neural network to indicate an output:
In the rest of the paper, we consider a neuron’s output value to be the time of its first spike. Moreover, once a neuron spikes, it is not allowed to spike again, i.e., we assume it enters an infinitely long refractory period. We allow each neuron to spike at most once for each input presentation in order to make the spiking activity sparse and to force the training algorithm to make optimal use of each spike.

Mostafa thus teaches a neuron model configured as a non-leaky integrate-and-fire spiking neuron and a spiking neuron with which a postsynaptic current is represented using a step function, the neuron means being fired once at most in one process of a neural network to indicate an output of the neural model means itself at firing timing.  Mostafa also demonstrates in e.g. section 4.1 “XOR task” that a plurality of such neuron models can form a neural network in which information is transferred between the neurons:
In the XOR task, two spike sources send a spike each to the network. Each of the two input spikes can occur at time 0.0 (early spike) or 2.0 (late spike). The two input spike sources project to a hidden layer of four neurons and the hidden neurons project to two output neurons. The first output neuron must spike before the second output neuron if exactly one input spike is an early spike. The network is shown in Fig. 2(a) together with the four input patterns.	

    PNG
    media_image1.png
    310
    251
    media_image1.png
    Greyscale

Figure 2(a)

Mostafa thus teaches a plurality of neuron models like claimed.); and
 	a plurality of synapse models, wherein one synapse model of the plurality of synapse models connects the one neuron model with another neuron model (As demonstrated by equation (1) above, Mostafa teaches that the membrane potential of neuron j is based on input spikes received from neurons i connected to neuron j.  The neuron j thus receives information – particularly, input spikes – transferred from neurons i.  Additionally, as noted above, Mostafa demonstrates in section 4.1 “XOR task” that a plurality of such spiking neurons can form a neural network in which information is transferred between the neurons via connections therebetween.  Each of the connections between the spiking neurons is considered a synapse model like claimed, and wherein a synapse model connects one neuron model with another neuron model.);
  	wherein the one neuron model outputs, in response to firing of the one neuron model, an output of the one neuron model at firing timing (As demonstrated by equation (1) above, Mostafa teaches that the membrane potential of neuron j is based on input spikes received from neurons i connected to neuron j.  Regarding equation (1), section 2 “Network Model” recites that “                        
                            
                                    w
                                
                                    j
                                    i
                                
                     is the weight of the synaptic connection from neuron i to neuron j and                         
                            
                                    t
                                
                                    i
                                
                                    r
                                
                     is the time of the rth spike from neuron i.”  The membrane potential of neuron j is thus based in part on the firing timing of neuron i.  Accordingly it is apparent that a neuron model corresponding to a neuron i outputs, in response to firing of the neuron model, an output spike at firing timing so as to be provided to neuron j.); and
	wherein the synapse model transfers information from the one neuron model to the other neuron model (As noted above, equation (1) described by Mostafa indicates that the membrane potential of neuron j is based on input spikes received from neurons i connected to neuron j.  Accordingly, it is apparent that a synapse model transfers information, i.e. an indication of an output spike, from a neuron i to the neuron j.);	
Mostafa however does not teach training at least one of an output layer and a hidden layer of an artificial neural network using a learning rule that applies a ratio of the change in firing timing of the one neuron model to a change in firing timing of the other neuron model, wherein the first ratio is obtained using a linear approximation of temporal development of membrane potential, as is required by claims 1 and 5.  Moreover, Mostafa does not explicitly disclose that such teachings are implemented on an application specific integrated circuit (ASIC) like further required by claim 1, or via a program stored on a non-transitory recording medium like in claim 5.
	Booij generally describes a “supervised learning rule for Spiking Neural Networks (SNNs)…that can cope with neurons that spike multiple times.”  (Abstract).  Regarding the claimed invention, Booij particularly teaches that the learning rule applies a ratio of the change in the firing timing of one neuron to a change in firing timing of another neuron (e.g. the derivative of a postsynaptic spike with respect to a presynaptic spike):

We now derive a gradient descent learning rule in a similar way as SpikeProp [1]. However we take into account that the neurons can fire multiple spikes. In SpikeProp every neuron backpropagates a single error value, namely the error with respect to its one spike. In our derivation the neuron needs to determine and backpropagate an error value for each spike it fired.

…

We define the network error to be the sum squared error of the first spike of the output neurons O, so later spikes of these neurons are ignored:
                        
                            E
                            =
                            
                                    1
                                
                                    2
                                
                                    ∑
                                    
                                        o
                                        ∈
                                        O
                                    
                                                            t
                                                        
                                                            o
                                                        
                                                            1
                                                        
                                                    -
                                                     
                                                                    t
                                                                
                                                                    o
                                                                
                                                                    1
                                                                
                                                        ^
                                                    
                                            2
                                        
                     ,					       (6)				           				
where                         
                            
                                            t
                                        
                                            o
                                        
                                            1
                                        
                                ^
                            
                     denotes the desired spike time. Other error functions, including functions over more than only the first output spike, are also possible, as long as the derivative of the error to one spike exists. In our case the error with respect to one output spike is determined by 

                                    ∂
                                    E
                                
                                    ∂
                                    
                                            t
                                        
                                            o
                                        
                                            1
                                        
                     =                         
                            
                                    t
                                
                                    o
                                
                                    1
                                
                            -
                             
                                            t
                                        
                                            o
                                        
                                            1
                                        
                                ^
                            
                    .							       (7)

The parameters that are tuned to reduce this error are the weights of the synapses                         
                            
                                    w
                                
                                    j
                                    i
                                
                                    k
                                
                    .  Other parameters, like the axonal delays, could also be learned [14]. Like SpikeProp we calculate the derivative of the network error with respect to the weight in order to calculate the appropriate weight change [1]: 

                            ∆
                            
                                    w
                                
                                    i
                                    h
                                
                                    k
                                
                            =
                             
                            -
                            k
                            
                                    ∂
                                    E
                                
                                    ∂
                                    
                                            w
                                        
                                            i
                                            h
                                        
                                            k
                                        
                    ,						      (8)

where                         
                            k
                        
                     denotes the learning rate. Because neuron                         
                            i
                        
                     can fire multiple times and all these firing times depend on the weight, this equation can be expanded with regard to these spikes:

                            ∆
                            
                                    w
                                
                                    i
                                    h
                                
                                    k
                                
                            =
                             
                            -
                            k
                            
                                    ∑
                                    
                                                t
                                            
                                                i
                                            
                                                f
                                            
                                        ∈
                                        
                                                F
                                            
                                                i
                                            
                                            ∂
                                            E
                                        
                                            ∂
                                            
                                                    t
                                                
                                                    i
                                                
                                                    f
                                                
                                    ∂
                                    
                                            t
                                        
                                            i
                                        
                                            f
                                        
                                    ∂
                                    
                                            w
                                        
                                            i
                                            h
                                        
                                            k
                                        
                     .					      (9)

…

We now calculate the first factor on the right-hand side of Eq. (9), the derivative of the network error with respect to a spike. This is derived for every spike of non-output neuron                         
                            i
                             
                            ∉
                             
                            O
                        
                    , which depends on all spikes by all its neural successors                         
                            
                                    Γ
                                
                                    i
                                
                     (see Fig. 1):

                                    ∂
                                    E
                                
                                    ∂
                                    
                                            t
                                        
                                            i
                                        
                                            f
                                        
                            =
                             
                                    ∑
                                    
                                        j
                                        ∈
                                        
                                                Γ
                                            
                                                i
                                            
                                            ∑
                                            
                                                        t
                                                    
                                                        j
                                                    
                                                        g
                                                    
                                                ∈
                                                
                                                        F
                                                    
                                                        j
                                                    
                                                    ∂
                                                    E
                                                
                                                    ∂
                                                    
                                                            t
                                                        
                                                            j
                                                        
                                                            g
                                                        
                                                    ∂
                                                    
                                                            t
                                                        
                                                            j
                                                        
                                                            g
                                                        
                                                    ∂
                                                    
                                                            t
                                                        
                                                            i
                                                        
                                                            f
                                                        
                    .					    (13)

Thus, in order to calculate the derivative of the error with respect to a spike, we already have to know the derivative of the error with respect to spikes of its neuronal successors that happened later in time. In practice this means that the error with respect to the last spike of the network has to be computed first, then to the spike preceding the last, and so on. This is the temporal equivalent of the conventional backpropagation technique, where the error was propagated back spatially through the network [9].
The derivative of a postsynaptic spike with respect to a presynaptic spike can be further expanded as in Eq. (10): 

                                    ∂
                                    
                                            t
                                        
                                            j
                                        
                                            g
                                        
                                    ∂
                                    
                                            t
                                        
                                            i
                                        
                                            f
                                        
                            =
                             
                                    ∂
                                    
                                            u
                                        
                                            i
                                        
                                                    t
                                                
                                                    j
                                                
                                                    g
                                                
                                    ∂
                                    
                                            t
                                        
                                            i
                                        
                                            f
                                        
                                    -
                                    1
                                
                                            ∂
                                            
                                                    u
                                                
                                                    j
                                                
                                                            t
                                                        
                                                            j
                                                        
                                                            g
                                                        
                                            ∂
                                            
                                                    t
                                                
                                                    j
                                                
                                                    g
                                                
                    .				  	    (14)

The second factor already appeared in Eq. (10) and is calculated in Eq. (12). The first factor, the derivative of the potential during a postsynaptic spike with respect to a presynaptic spike, can again be derived from Eq. (3): 

                            ∂
                            
                                    u
                                
                                    i
                                
                                            t
                                        
                                            j
                                        
                                            g
                                        
                            ∂
                            
                                    t
                                
                                    i
                                
                                    f
                                
                    =
                     
                    -
                    
                            ∑
                            
                                        t
                                    
                                        j
                                    
                                        l
                                    
                                ∈
                                
                                        F
                                    
                                        j
                                    
                                    n
                                
                                    '
                                
                                            t
                                        
                                            j
                                        
                                            g
                                        
                                    -
                                    
                                            t
                                        
                                            j
                                        
                                            l
                                        
                            ∂
                            
                                    t
                                
                                    j
                                
                                    l
                                
                            ∂
                            
                                    t
                                
                                    i
                                
                                    f
                                
                            -
                             
                                    ∑
                                    
                                        k
                                    
                                            w
                                        
                                            j
                                            i
                                        
                                            k
                                        
                                            ε
                                        
                                            '
                                        
                                                    t
                                                
                                                    j
                                                
                                                    g
                                                
                                            -
                                            
                                                    t
                                                
                                                    i
                                                
                                                    f
                                                
                                            -
                                            
                                                    d
                                                
                                                    j
                                                    i
                                                
                                                    k
                                                
                    . 			    (15)		
As can be seen this is again a recursive function, so the order in which the computations are done is important.
With this set of formulas it is now possible to compute the necessary weight changes that minimize a certain error measure of the spikes of the output neurons given a pattern of input spike trains.

(Section 3.1 “Gradient descent.”  Emphasis added.)

Booij further suggests that the learning rule can be used to train an output layer and hidden layer of an artificial neural network comprising spiking neurons (see e.g. section 4.1. “The XOR Benchmark,” in which the learning rule is used to train a neural network comprising multiple layers of spiking neurons to implement an XOR function.).
	Although Booij is directed to a learning rule for neurons that spike multiple times (see e.g. the Abstract), it would have been apparent to one of ordinary skill in the art that, with minor modifications (e.g. setting the number of spikes for each neuron in the above formulas to one), Booij’s learning rule could be applied to neurons that fire only once like claimed and taught by Mostafa.  Particularly, it would have been obvious to one of ordinary skill in the art, having the teachings of Mostafa and Booij before the effective filing date of the claimed invention, to modify the artificial neural network taught by Mostafa such that at least one of an output layer and a hidden layer of the neural network is trained using a learning rule like taught by Booij, which applies a ratio of the change in firing timing of one neuron model to a change in firing timing of another neuron model.  It would have been advantageous to one of ordinary skill to utilize such a learning rule because it “can be used for real-world applications requiring the processing of temporal data,” as is taught by Booij (see section 5 “Conclusion and future work”).
Booij, however, does not explicitly disclose that the ratio is obtained using a linear approximation of temporal development of membrane potential like claimed.  Nevertheless Booij does disclose that the learning rule is derived in a similar way as “SpikeProp,” and that some of the formulas use results of the SpikeProp algorithm:
The parameters that are tuned to reduce this error are the weights of the synapses                         
                            
                                    w
                                
                                    j
                                    i
                                
                                    k
                                
                    .  Other parameters, like the axonal delays, could also be learned [14]. Like SpikeProp we calculate the derivative of the network error with respect to the weight in order to calculate the appropriate weight change [1]: 

                            ∆
                            
                                    w
                                
                                    i
                                    h
                                
                                    k
                                
                            =
                             
                            -
                            k
                            
                                    ∂
                                    E
                                
                                    ∂
                                    
                                            w
                                        
                                            i
                                            h
                                        
                                            k
                                        
                    ,						      (8)

where                         
                            k
                        
                     denotes the learning rate. Because neuron                         
                            i
                        
                     can fire multiple times and all these firing times depend on the weight, this equation can be expanded with regard to these spikes:

                            ∆
                            
                                    w
                                
                                    i
                                    h
                                
                                    k
                                
                            =
                             
                            -
                            k
                            
                                    ∑
                                    
                                                t
                                            
                                                i
                                            
                                                f
                                            
                                        ∈
                                        
                                                F
                                            
                                                i
                                            
                                            ∂
                                            E
                                        
                                            ∂
                                            
                                                    t
                                                
                                                    i
                                                
                                                    f
                                                
                                    ∂
                                    
                                            t
                                        
                                            i
                                        
                                            f
                                        
                                    ∂
                                    
                                            w
                                        
                                            i
                                            h
                                        
                                            k
                                        
                     .					      (9)

First the second factor on the right-hand side, the derivative of a firing time with respect to the weight, is calculated using the results of the study by Bohte et al. for the SpikeProp algorithm; see [1] or [2] for a complete derivation. 

                                    ∂
                                    
                                            t
                                        
                                            i
                                        
                                            f
                                        
                                    ∂
                                    
                                            w
                                        
                                            i
                                            h
                                        
                                            k
                                        
                            =
                             
                                    ∂
                                    
                                            u
                                        
                                            i
                                        
                                                    t
                                                
                                                    i
                                                
                                                    f
                                                
                                    ∂
                                    
                                            w
                                        
                                            i
                                            h
                                        
                                            k
                                        
                                    -
                                    1
                                
                                            ∂
                                            
                                                    u
                                                
                                                    j
                                                
                                                            t
                                                        
                                                            i
                                                        
                                                            f
                                                        
                                            ∂
                                            
                                                    t
                                                
                                                    i
                                                
                                                    f
                                                
                     .					    (10)
…
We now calculate the first factor on the right-hand side of Eq. (9), the derivative of the network error with respect to a spike. This is derived for every spike of non-output neuron                         
                            i
                             
                            ∉
                             
                            O
                        
                    , which depends on all spikes by all its neural successors                         
                            
                                    Γ
                                
                                    i
                                
                     (see Fig. 1):

                                    ∂
                                    E
                                
                                    ∂
                                    
                                            t
                                        
                                            i
                                        
                                            f
                                        
                            =
                             
                                    ∑
                                    
                                        j
                                        ∈
                                        
                                                Γ
                                            
                                                i
                                            
                                            ∑
                                            
                                                        t
                                                    
                                                        j
                                                    
                                                        g
                                                    
                                                ∈
                                                
                                                        F
                                                    
                                                        j
                                                    
                                                    ∂
                                                    E
                                                
                                                    ∂
                                                    
                                                            t
                                                        
                                                            j
                                                        
                                                            g
                                                        
                                                    ∂
                                                    
                                                            t
                                                        
                                                            j
                                                        
                                                            g
                                                        
                                                    ∂
                                                    
                                                            t
                                                        
                                                            i
                                                        
                                                            f
                                                        
                    .					    (13)

Thus, in order to calculate the derivative of the error with respect to a spike, we already have to know the derivative of the error with respect to spikes of its neuronal successors that happened later in time. In practice this means that the error with respect to the last spike of the network has to be computed first, then to the spike preceding the last, and so on. This is the temporal equivalent of the conventional backpropagation technique, where the error was propagated back spatially through the network [9].
The derivative of a postsynaptic spike with respect to a presynaptic spike can be further expanded as in Eq. (10): 

                                    ∂
                                    
                                            t
                                        
                                            j
                                        
                                            g
                                        
                                    ∂
                                    
                                            t
                                        
                                            i
                                        
                                            f
                                        
                            =
                             
                                    ∂
                                    
                                            u
                                        
                                            i
                                        
                                                    t
                                                
                                                    j
                                                
                                                    g
                                                
                                    ∂
                                    
                                            t
                                        
                                            i
                                        
                                            f
                                        
                                    -
                                    1
                                
                                            ∂
                                            
                                                    u
                                                
                                                    j
                                                
                                                            t
                                                        
                                                            j
                                                        
                                                            g
                                                        
                                            ∂
                                            
                                                    t
                                                
                                                    j
                                                
                                                    g
                                                
                    .				  	    (14)

(Section 3.1 “Gradient descent.”  Emphasis added.)

Along these lines, Bohte generally describes the supervised learning rule, SpikeProp, for a network of spiking neurons that encodes information in the timing of individual spike times (see e.g. the Abstract).  In particular, Bohte teaches that the membrane potential (i.e. internal state variable) for each spiking neuron is represented by a function xj(t):
The network architecture consists of a feedforward network of spiking neurons with multiple delayed synaptic terminals (Fig. 1, as described by Natschläger and Ruf [35]). The neurons in the network generate action potentials, or spikes, when the internal neuron state variable, called “membrane potential”, crosses a threshold                         
                            ϑ
                        
                    .  The relationship between input spikes and the internal state variable is described by the spike response model (SRM), as introduced by Gerstner [18]. Depending on the choice of suitable spike-response functions, one can adapt this model to reflect the dynamics of a large variety of different spiking neurons.
Formally, a neuron j, having a set Гj of immediate predecessors (“pre-synaptic neurons”), receives a set of spikes with firing times ti, i ∈ Гj. Any neuron generates at most one spike during the simulation interval, and fires when the internal state variable reaches a threshold                         
                            ϑ
                        
                    . The dynamics of the internal state variable xj(t) are determined by the impinging spikes, whose impact is described by the spike-response function ɛ(t) weighted by the synaptic efficiency (“weight”) wij:

                                    x
                                
                                    j
                                
                                    t
                                
                            =
                             
                                    ∑
                                    
                                        i
                                        ∈
                                        Г
                                        j
                                    
                                            w
                                        
                                            i
                                            j
                                        
                            ɛ
                            
                                    t
                                    -
                                     
                                            t
                                        
                                            i
                                        
                    								(1)

The spike-response function in (1) effectively models the unweighted post-synaptic potential (PSP) of a single spike impinging on a neuron. The height of the PSP is modulated by the synaptic weight wij to obtain the effective post-synaptic potential. The spike-response function as used in our experiments is defined in (3).

(Section 2.  “A network of spiking neurons;” emphasis added).

Regarding the claimed invention, Bohte further teaches that the SpikeProp learning rule uses ratios that are obtained using a linear approximation of membrane potential (i.e. a linear approximation of xj):
The target of the algorithm is to learn a set of target firing times, denoted                         
                            
                                            t
                                        
                                            j
                                        
                                            d
                                        
                    , at the output neurons j ∈ J for a given set of input patterns {P[t1…th]}, where P[t1…th] defines a single input pattern described by single spike times for each neuron h ∈ H.  We choose as the error-function the least mean squares error-function, but other choices like entropy are also possible. Given desired spike times                         
                            
                                            t
                                        
                                            j
                                        
                                            d
                                        
                     and actual firing times                         
                            
                                            t
                                        
                                            j
                                        
                                            a
                                        
                    , this error-function is defined by

                    E
                    =
                     
                            1
                        
                            2
                        
                            ∑
                            
                                j
                                 
                                ∈
                                 
                                J
                                 
                                                    j
                                                
                                                    j
                                                
                                                    a
                                                
                                            -
                                             
                                                    t
                                                
                                                    j
                                                
                                                    d
                                                
                                    2
                                
                    .
                
(5)

For error-backpropagation, we treat each synaptic terminal as a separate connection k with weight                         
                            
                                    w
                                
                                    i
                                    j
                                
                                    k
                                
                    .  Hence, for a backprop rule, we need to calculate 

                    ∆
                    
                            w
                        
                            i
                            j
                        
                            k
                        
                    =
                     
                    -
                    n
                    
                            ∂
                            E
                        
                            ∂
                            
                                    w
                                
                                    i
                                    j
                                
                                    k
                                
								(6)

with n the learning rate and                         
                            
                                    w
                                
                                    i
                                    j
                                
                                    k
                                
                     the weight of connection k from neuron i to neuron j. As tj is a function of xj, which depends on the weights                         
                            
                                    w
                                
                                    i
                                    j
                                
                                    k
                                
                    , the derivative on the right-hand part of (6) can be expanded to

                            ∂
                            E
                        
                            ∂
                            
                                    w
                                
                                    i
                                    j
                                
                                    k
                                
                    =
                     
                            ∂
                            E
                        
                            ∂
                            
                                    t
                                
                                    j
                                
                                    t
                                
                                    j
                                
                                    a
                                
                            ∂
                            
                                    t
                                
                                    j
                                
                            ∂
                            
                                    w
                                
                                    i
                                    j
                                
                                    k
                                
                                    t
                                
                                    j
                                
                                    a
                                
                    =
                     
                            ∂
                            E
                        
                            ∂
                            
                                    t
                                
                                    j
                                
                                    t
                                
                                    j
                                
                                    a
                                
                            ∂
                            
                                    t
                                
                                    j
                                
                            ∂
                            
                                    x
                                
                                    j
                                
                                    t
                                
                                    t
                                
                                    j
                                
                                    a
                                
                            ∂
                            
                                    x
                                
                                    j
                                
                                    t
                                
                            ∂
                            
                                    w
                                
                                    i
                                    j
                                
                                    k
                                
                                    t
                                
                                    j
                                
                                    a
                                
                    .
                
(7)
								
In the last two factors on the right, we express tj as a function of the thresholded post-synaptic input xj(t) around t =                         
                            
                                    t
                                
                                    j
                                
                                    a
                                
                    . We assume that for a small enough region around t =                         
                            
                                    t
                                
                                    j
                                
                                    a
                                
                    , the function xj can be approximated by a linear function of t, as depicted in Fig. 3.  For such a small region, we approximate the threshold function                         
                            δ
                            
                                    t
                                
                                    j
                                
                                            x
                                        
                                            j
                                        
                            =
                             
                            -
                            
                                    δ
                                    
                                            x
                                        
                                            j
                                        
                                                    t
                                                
                                                    j
                                                
                                    α
                                
                    , with                         
                            
                                    ∂
                                    
                                            t
                                        
                                            j
                                        
                                    ∂
                                    
                                            x
                                        
                                            j
                                        
                                            t
                                        
                     the derivative of the inverse function of                         
                            
                                    x
                                
                                    j
                                
                                    t
                                
                    .  The value                         
                            α
                        
                     equals the local derivative of                         
                            
                                    x
                                
                                    j
                                
                                    t
                                
                     with respect to t, that is                         
                            α
                            =
                            
                                    ∂
                                    
                                            x
                                        
                                            j
                                        
                                            t
                                        
                                    ∂
                                    t
                                    
                                                    t
                                                
                                                    j
                                                
                                                    a
                                                
                            .
                        
The second factor in (7) evaluates to

                                    ∂
                                    
                                            t
                                        
                                            j
                                        
                                    ∂
                                    
                                            x
                                        
                                            j
                                        
                                            t
                                        
                                            t
                                        
                                            j
                                        
                                            a
                                        
                            =
                             
                                    ∂
                                    
                                            t
                                        
                                            j
                                        
                                                    x
                                                
                                                    j
                                                
                                    ∂
                                    
                                            x
                                        
                                            j
                                        
                                            t
                                        
                                            ​
                                        
                                            x
                                        
                                            j
                                        
                                    =
                                    ϑ
                                
                            =
                             
                                    -
                                    1
                                
                                    α
                                
                            =
                             
                                    -
                                    1
                                
                                            ∂
                                            
                                                    x
                                                
                                                    j
                                                
                                                    t
                                                
                                            ∂
                                            t
                                            (
                                            
                                                    t
                                                
                                                    j
                                                
                                                    a
                                                
                                            )
                                        
                            =
                             
                                    -
                                    1
                                
                                            ∑
                                            
                                                i
                                                ,
                                                l
                                            
                                                    w
                                                
                                                    i
                                                    j
                                                
                                                    l
                                                
                                                            ∂
                                                            
                                                                    y
                                                                
                                                                    i
                                                                
                                                                    l
                                                                
                                                                    t
                                                                
                                                            ∂
                                                            t
                                                        
                                                            t
                                                        
                                                            j
                                                        
                                                            a
                                                        
                     .					(8)

In further calculations, we will write terms like                         
                            
                                    ∂
                                    
                                            x
                                        
                                            j
                                        
                                                    t
                                                
                                                    j
                                                
                                                    a
                                                
                                    ∂
                                    
                                            t
                                        
                                            j
                                        
                                            a
                                        
                     for                         
                            
                                    ∂
                                    
                                            x
                                        
                                            j
                                        
                                            t
                                        
                                    ∂
                                    t
                                    (
                                    
                                            t
                                        
                                            j
                                        
                                            a
                                        
                                    )
                                
                    .

(Section 3 “Error-backpropagation;” emphasis added).

It would have been obvious to one of ordinary skill in the art, having the teachings of Mostafa, Booij and Bohte before the effective filing date of the claimed invention, to modify the learning rule taught by Mostafa and Booij such that the ratio therein is obtained at least in part using a linear approximation of temporal development of membrane potential like taught by Bohte.  It would have been advantageous to one of ordinary skill to utilize such a linear approximation because it would aid in the derivation of the learning rule, as is evident from Bohte (see section 3 “Error-backpropagation”).
Annapureddy generally describes an artificial neural system comprised of a plurality of levels of spiking neurons and in which information (e.g. spikes) is transferred between spiking neurons (see e.g. paragraphs 0030-0033 and FIG. 1).  Regarding the claimed invention, Annapureddy particularly teaches that the artificial neural system can be emulated via an ASIC executing a program (e.g. software modules) stored on a non-transitory machine-readable medium (see e.g. paragraphs 0035, 0160, 0163 and 0165-0166).
It would have been obvious to one of ordinary skill in the art, having the teachings of Mostafa, Booij, Bohte and Annapureddy before the effective filing date of the claimed invention, to implement the artificial neural network taught by Mostafa, Booij and Bohte via an ASIC executing a program stored on a non-transitory machine-readable medium, like taught by Annapureddy.  It would have been advantageous to one of ordinary skill to utilize such a combination because it would enable the spiking neurons to be effectively realized, as is suggested by Annapureddy (see e.g. paragraphs 0035, 0160, 0163 and 0165-0166).  Accordingly, Mostafa, Booij, Bohte and Annapureddy are considered to teach, to one of ordinary skill in the art, an application specific integrated circuit like that of claim 1.  The non-transitory machine-readable medium storing the program to implement the teachings of Mostafa, Booij, Bohte and Annapureddy is considered a non-transitory recording medium like that of claim 5.

Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Mostafa, Booij, Bohte and Annapureddy, which is described above, and also over the article entitled, “Matching Recall and Storage in Sequence Learning with Spiking Neural Networks” by Brea et al. (“Brea”).
Regarding claim 3, Mostafa, Booij, Bohte and Annapureddy teach an ASIC like that of claim 1, as is described above, and whereby at least one of an output layer and a hidden layer of an artificial neural network is trained using a learning rule.  Mostafa, Booij, Bohte and Annapureddy, however, do not explicitly disclose that the output layer of the artificial neural network is trained using a learning rule expressed using a slop of a firing probability density, as is required by claim 3.
Brea nevertheless teaches causing an output layer of a neural network comprised of spiking neurons to be learned by using a learning rule expressed using a slope (i.e. a derivative,                         
                            
                                    ρ
                                
                                    i
                                
                                    '
                                
                                    t
                                
                    ) of firing probability density (i.e. of a firing intensity function):
Neuron model. We consider a recurrent network of N spiking neurons over a duration of T time bins. Spiking of neuron i is characterized by the spike train xi, with xi(t) = 1 if a spike is emitted at time step t, and xi(t) = 0 otherwise.   The membrane potential of neuron i is described as in the spike-response model (Gerstner and Kistler, 2002):

                            u
                        
                            i
                        
                            t
                        
                    =
                     
                            u
                        
                            0
                        
                    +
                     
                            ∑
                            
                                j
                                =
                                1
                            
                                N
                            
                                    w
                                
                                    i
                                    j
                                
                                    x
                                
                                    j
                                
                                    ε
                                
                            (
                            t
                            )
                        
                    +
                     
                            x
                        
                            i
                        
                            k
                        
                            t
                        
                    ,
                
										(1)

where wij is the synaptic strength from neuron j to neuron i,                         
                            
                                    x
                                
                                    k
                                
                                    α
                                
                                    t
                                
                            =
                             
                                    ∑
                                    
                                        z
                                        =
                                        1
                                    
                                        ∞
                                    
                                    α
                                    (
                                    s
                                    )
                                    
                                            x
                                        
                                            k
                                        
                                    (
                                    t
                                    -
                                    s
                                    )
                                
                     represents the convolution of spike train xk with kernel                         
                            α
                        
                     and                         
                            
                                    u
                                
                                    0
                                
                     is the resting potential. The postsynaptic kernel is characterized by                         
                            ε
                            
                                    s
                                
                            =
                             
                                    δ
                                
                                    s
                                    ,
                                    1
                                
                     for the one time step response kernel scenarios of Figures 2 and 3, whereas in Figure 4 it is given by                         
                            ε
                            
                                    s
                                
                            =
                            (
                            e
                            x
                            p
                            ⁡
                            (
                            -
                            s
                            /
                            
                                    τ
                                
                                    1
                                
                            )
                             
                            -
                             
                            e
                            x
                            p
                            ⁡
                            (
                            -
                            s
                            /
                            
                                    τ
                                
                                    2
                                
                            )
                            )
                            /
                            (
                            
                                    τ
                                
                                    1
                                
                            -
                             
                                    τ
                                
                                    2
                                
                            )
                        
                     for                         
                            s
                            ≥
                            0
                        
                     and the adaptation kernel is characterized by                         
                            k
                            
                                    s
                                
                            =
                            c
                             
                            e
                            x
                            p
                            ⁡
                            ⁡
                            (
                            -
                            s
                            /
                            
                                    τ
                                
                                    r
                                
                            )
                        
                     for                         
                            s
                            ≥
                            0
                        
                    , with both kernels vanishing for                         
                            s
                             
                            <
                            0
                        
                    .  For the clarity of the exposition, we chose such a simple neural model description.  Note, however, that almost any neural model could be considered (e.g., conductance-based models).  The only constraint is that the dynamical model should be linear in the weights, i.e., any dynamical model of the form                         
                            
                                            u
                                        
                                            i
                                        
                                ˙
                            
                            =
                             
                                    f
                                
                                    i
                                
                                            u
                                        
                                            i
                                        
                            +
                             
                                    ∑
                                    
                                        j
                                    
                                            w
                                        
                                            i
                                            j
                                        
                                            g
                                        
                                            i
                                            j
                                        
                                    (
                                    
                                            u
                                        
                                            i
                                        
                                    ,
                                    
                                            x
                                        
                                            j
                                        
                                    )
                                
                     is suitable.
Consistently with the stochastic spike-response model or equivalently the generalized linear model (GLM; Pillow and Latham, 2008), noise is modeled by stochastic spiking given the (noise-free) membrane potential u in Equation 1, i.e., the probability that neuron i emits a spike in time bin t is a function of its membrane potential:                         
                            P
                            
                                            x
                                        
                                            i
                                        
                                            t
                                        
                                    =
                                    1
                                     
                                    u
                                
                                    i
                                
                            (
                            t
                            )
                            )
                            =
                            ρ
                            (
                            
                                    u
                                
                                    i
                                
                            (
                            t
                            )
                            )
                        
                    .  We stress the fact that given its own membrane potential, the spiking process is conditionally independent of the spiking of all the other neurons at this time. Due to this conditional independence, the probability that the network with (recurrent) weight matrix w is generating the spike trains x = (                        
                            
                                    x
                                
                                    1
                                
                    , …,                         
                            
                                    x
                                
                                    N
                                
                    ) can be calculated explicitly as the product of the probabilities for each individual spike train, hence a product of factors                         
                            ρ
                            (
                            
                                    u
                                
                                    i
                                
                            (
                            t
                            )
                            )
                        
                     and (1 -                         
                            ρ
                            (
                            
                                    u
                                
                                    i
                                
                            (
                            t
                            )
                            )
                        
                    , depending on whether neuron i did or did not fire at time t, respectively. Abbreviating                         
                            
                                    ρ
                                
                                    i
                                
                                    t
                                
                            =
                             
                            ρ
                            (
                            
                                    u
                                
                                    i
                                
                            (
                            t
                            )
                            )
                        
                    , this amounts for the log-likelihood (Pfister et al. (2006)) as follows:

                    l
                    o
                    g
                    
                            P
                        
                            w
                        
                            x
                        
                    =
                     
                            ∑
                            
                                T
                                =
                                1
                            
                                T
                            
                                    ∑
                                    
                                        i
                                        =
                                        1
                                    
                                        N
                                    
                                            x
                                        
                                            i
                                        
                                            t
                                        
                                    l
                                    o
                                    g
                                    
                                            ρ
                                        
                                            i
                                        
                                            t
                                        
                                    +
                                    
                                            1
                                            -
                                             
                                                    x
                                                
                                                    i
                                                
                                                    t
                                                
                                            log
                                        
                                        ⁡
                                        
                                                    1
                                                    -
                                                     
                                                            ρ
                                                        
                                                            i
                                                        
                                                            t
                                                        
                                    .
                                
										(2)

Unless mentioned otherwise, the firing probability will be assumed to be a sigmoidal                         
                            ρ
                            
                                    u
                                
                            =
                            1
                            /
                            (
                            1
                            +
                            
                                    exp
                                    ⁡
                                    (
                                
                                ⁡
                                
                                    -
                                    β
                                    u
                                
                            )
                            )
                        
                    , with parameter                         
                            β
                        
                     controlling the level of stochasticity. We introduced this parameter for convenience: for given weights w, the stochasticity of the network can be varied by changing the parameter                         
                            β
                        
                    , which multiplies the weights. In the limit                         
                            β
                             
                            →
                             
                            ∞
                        
                     each neuron acts like a threshold unit and therefore makes the network deterministic.

(Pages 9565-9566; emphasis added).

Link to the voltage-triplet rule. In the limit of continuous time, the learning rule for visible synapses can be written as a triplet potentiation term (2 post, 1 pre) and a depression term (1 post, 1 pre):

                            ẇ
                        
                            i
                            j
                        
                    =
                    n
                    
                            g
                        
                            i
                        
                            t
                        
                            x
                        
                            i
                        
                            t
                        
                            x
                        
                            j
                        
                            ε
                        
                            t
                        
                    -
                    n
                    
                            ρ
                        
                            i
                        
                            '
                        
                    (
                    t
                    )
                    
                            x
                        
                            j
                        
                            ε
                        
                    (
                    t
                    )
                
            ,
										(13)

Where                         
                            
                                    x
                                
                                    i
                                
                                    t
                                
                     denotes the Dirac spike train of neuron i,                         
                            
                                    ρ
                                
                                    i
                                
                                    '
                                
                                    t
                                
                            =
                            d
                            ρ
                            (
                            u
                            )
                            /
                            d
                            u
                            
                                    |
                                
                                    u
                                    =
                                    
                                            u
                                        
                                            i
                                        
                                    (
                                    t
                                    )
                                
                     denotes the derivative of the firing intensity function and the prefactor is defined by                         
                            
                                    g
                                
                                    i
                                
                                    t
                                
                            =
                             
                                            ρ
                                        
                                            i
                                        
                                            '
                                        
                                    (
                                    t
                                    )
                                
                                            ρ
                                        
                                            i
                                        
                                    (
                                    t
                                    )
                                
                    .  Note that in continuous time the prefactor has a slightly different form than in discrete time: to arrive at a continuous time description we explicitly introduce the time bin size                         
                            δ
                            t
                        
                    , set the probability of spiking in one time bin to                         
                            
                                    ρ
                                
                                    i
                                
                            (
                            t
                            )
                            δ
                            t
                        
                    , thereby reinterpreting                         
                            
                                    ρ
                                
                                    i
                                
                            (
                            t
                            )
                        
                     as a spike density function, and get in the limit of vanishing time bin size                         
                            
                                            lim
                                        
                                            δt
                                            →
                                            ∞
                                        
                                ⁡
                                
                                                    ρ
                                                
                                                    i
                                                
                                                    '
                                                
                                            (
                                            t
                                            )
                                            δ
                                            t
                                        
                                                    ρ
                                                
                                                    i
                                                
                                            (
                                            t
                                            )
                                            δ
                                            t
                                            (
                                            1
                                            -
                                             
                                                    ρ
                                                
                                                    i
                                                
                                                    t
                                                
                                            δ
                                            t
                                            )
                                        
                                    =
                                     
                                                    ρ
                                                
                                                    i
                                                
                                                    '
                                                
                                            (
                                            t
                                            )
                                        
                                                    ρ
                                                
                                                    i
                                                
                                                    t
                                                
                     (Pfister et al., 2006; Brea et al., 2011). Interestingly, formulated in this way, the learning rule closely resembles the voltage-triplet rule proposed by Clopath et al. (2010), which is an extension of the pure spike-based triplet rule (Pfister and Gerstner, 2006). The weight change prescribed by the voltage-triplet rule of Clopath et al. (2010), which we will compare with our rule (Eq. 13), can be also written as a post-post-pre potentiation term and a post-pre depression term:

                                    w
                                
                                    i
                                    j
                                
                        ˙
                    
                    =
                     
                            A
                        
                            3
                        
                                            u
                                        
                                            i
                                        
                                            α
                                        
                                            t
                                        
                                    -
                                     
                                            θ
                                        
                                            1
                                        
                            +
                        
                                            u
                                        
                                            i
                                        
                                            t
                                        
                                    -
                                     
                                            θ
                                        
                                            2
                                        
                            +
                        
                            x
                        
                            j
                        
                            β
                        
                            t
                        
                    -
                     
                            A
                        
                            2
                        
                            [
                            
                                    u
                                
                                    i
                                
                                    γ
                                
                                    t
                                
                            -
                             
                                    θ
                                
                                    1
                                
                            ]
                        
                            +
                        
                            x
                        
                            i
                        
                    (
                    t
                    )
                
            ,
										(14)

where the notation                         
                            
                                    [
                                     
                                    ∙
                                     
                                    ]
                                
                                    +
                                
                     denotes rectification, i.e.,                         
                            
                                    [
                                    x
                                    ]
                                
                                    +
                                
                            =
                            x
                            ,
                             
                            i
                            f
                             
                            x
                             
                            ≥
                            0
                        
                    , otherwise                         
                            
                                    x
                                
                                    +
                                
                            =
                            0
                        
                    .  The convolution kernels                         
                            α
                        
                    ,                         
                            β
                        
                    , and                         
                            γ
                        
                     are exponential decay kernels with time constants                         
                            
                                    τ
                                
                                    α
                                
                     (resp.                         
                            
                                    τ
                                
                                    β
                                
                    ,                         
                            
                                    τ
                                
                                    γ
                                
                    ), e.g.                         
                            α
                            
                                    s
                                
                            =
                             
                                    τ
                                
                                    α
                                
                                    -
                                    1
                                
                            e
                            x
                            p
                            ⁡
                            (
                            -
                            s
                            /
                            
                                    τ
                                
                                    α
                                
                            )
                            Θ
                            (
                            s
                            )
                        
                     where                         
                            Θ
                            (
                            s
                            )
                        
                     denotes the Heaviside function, i.e.,                         
                            Θ
                            
                                    s
                                
                            =
                            1
                             
                            f
                            o
                            r
                             
                            s
                            ≥
                            0
                        
                     and                         
                            Θ
                            
                                    s
                                
                            =
                            0
                        
                     otherwise.

(Page 9568; emphasis added).

It would have been obvious to one of ordinary skill in the art, having the teachings of Mostafa, Booij, Bohte, Annapureddy and Brea before the effective filing date of the claimed invention, to modify the ASIC taught by Mostafa, Booij, Bohte and Annapureddy such that the output layer of the artificial neural network is trained by using a learning rule expressed using a slope of the firing probability density, as is taught by Brea.  It would have been advantageous to one of ordinary skill to utilize such a learning rule because it is biologically plausible, as is suggested by Brea (see e.g. “Discussion” on page 9573).  Accordingly, Mostafa, Booij, Bohte, Annapureddy and Brea are considered to teach, to one of ordinary skill in the art, an ASIC like that of claim 3.

Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over the article entitled “Supervised learning based on temporal coding in spiking neural networks” by Hesham Mostafa (“Mostafa”), over the article entitled “A gradient descent rule for spiking neurons emitting multiple spikes” by Booij et al. (“Booij”), and also over the article entitled, “Error-backpropagation in temporally encoded networks of spiking neurons” by Bohte et al. (“Bohte”).
Regarding claim 4, Mostafa generally describes “a new approach for controlling the behavior of spiking networks with realistic temporal dynamics, opening up the potential for using these networks to process spike patterns with complex temporal information” (Abstract).  Like claimed, Mostafa particularly teaches: 
	a plurality of neuron models, wherein each of the plurality of neuron models is a non-leaky integrate-and-fire spiking neuron and a spiking neuron with which a postsynaptic current is represented using a step function, and wherein one neuron model fires once at most in response to one input to the one neuron model (see e.g. section 2 “Network Model,” in which Mostafa states:
We use non-leaky integrate and fire neurons with exponentially decaying synaptic current kernels. The neuron’s membrane dynamics are described by:

                            d
                            
                                    V
                                
                                    m
                                    e
                                    m
                                
                                    j
                                
                            (
                            t
                            )
                        
                            d
                            t
                        
                    =
                     
                            ∑
                            
                                i
                            
                                    w
                                
                                    j
                                    i
                                
                            ∑
                            
                                r
                            
                            K
                            (
                            t
                            -
                            
                                    t
                                
                                    i
                                
                                    r
                                
                            )
                        
                                         (1)
where                 
                    
                            V
                        
                            m
                            e
                            m
                        
                            j
                        
             is the membrane potential of neuron j. The right hand side of the equation is the synaptic current.                  
                    
                            w
                        
                            j
                            i
                        
             is the weight of the synaptic connection from neuron i to neuron j and                 
                    
                            t
                        
                            i
                        
                            r
                        
             is the time of the rth spike from neuron i.  K is the synaptic current kernel given by:

                    K
                    
                            x
                        
                    =
                     
                    Θ
                    (
                    x
                    )
                    e
                    x
                    p
                    ⁡
                    
                            -
                            
                                    x
                                
                                            τ
                                        
                                            s
                                            y
                                            n
                                        
               where	                
                    Θ
                    (
                    x
                    )
                    =
                    
                                            1
                                             
                                            i
                                            f
                                             
                                            x
                                             
                                            ≥
                                             
                                            0
                                        
                                            0
                                             
                                            o
                                            t
                                            h
                                            e
                                            r
                                            w
                                            i
                                            s
                                            e
                                        
	(2)

Synaptic current thus jumps instantaneously on the arrival of an input spike, then decays exponentially with time constant τsyn. Since τsyn is the only time constant in the model, we set it to 1 in the rest of the paper, i.e., normalize all times with respect to it. The neuron spikes when its membrane potential crosses a firing threshold which we set to 1, i.e., all synaptic weights are normalized with respect to the firing threshold. The membrane potential is reset to zero after a spike. We allow the membrane potential to go below zero if the integral of the synaptic current is negative.

Mostafa thus describes a neuron model for non-leaky integrate-and-fire neurons, i.e. spiking neurons, with which a postsynaptic current – e.g. the current from neuron i – is represented in part by using a step function, i.e.                 
                    Θ
                    (
                    x
                    )
                
             in equation (2) above.  Mostafa further teaches in section 2 “Network model” that the non-leaky integrate-and-fire neurons can be configured to fire at most once in one process of a neural network to indicate an output:
In the rest of the paper, we consider a neuron’s output value to be the time of its first spike. Moreover, once a neuron spikes, it is not allowed to spike again, i.e., we assume it enters an infinitely long refractory period. We allow each neuron to spike at most once for each input presentation in order to make the spiking activity sparse and to force the training algorithm to make optimal use of each spike.

Mostafa thus teaches a neuron model configured as a non-leaky integrate-and-fire spiking neuron and a spiking neuron with which a postsynaptic current is represented using a step function, the neuron means being fired once at most in one process of a neural network to indicate an output of the neural model means itself at firing timing.  Mostafa also demonstrates in e.g. section 4.1 “XOR task” that a plurality of such neuron models can form a neural network in which information is transferred between the neurons:
In the XOR task, two spike sources send a spike each to the network. Each of the two input spikes can occur at time 0.0 (early spike) or 2.0 (late spike). The two input spike sources project to a hidden layer of four neurons and the hidden neurons project to two output neurons. The first output neuron must spike before the second output neuron if exactly one input spike is an early spike. The network is shown in Fig. 2(a) together with the four input patterns.	

    PNG
    media_image1.png
    310
    251
    media_image1.png
    Greyscale

Figure 2(a)

Mostafa thus teaches a plurality of neuron models like claimed.); and
 	a plurality of synapse models, wherein one synapse model of the plurality of synapse models connects the one neuron model with another neuron model (As demonstrated by equation (1) above, Mostafa teaches that the membrane potential of neuron j is based on input spikes received from neurons i connected to neuron j.  The neuron j thus receives information – particularly, input spikes – transferred from neurons i.  Additionally, as noted above, Mostafa demonstrates in section 4.1 “XOR task” that a plurality of such spiking neurons can form a neural network in which information is transferred between the neurons via connections therebetween.  Each of the connections between the spiking neurons is considered a synapse model like claimed, and wherein a synapse model connects one neuron model with another neuron model.);
  	outputting by the one neuron model, in response to firing of one neuron model, an output of the one neuron model at firing timing (As demonstrated by equation (1) above, Mostafa teaches that the membrane potential of neuron j is based on input spikes received from neurons i connected to neuron j.  Regarding equation (1), section 2 “Network Model” recites that “                
                    
                            w
                        
                            j
                            i
                        
             is the weight of the synaptic connection from neuron i to neuron j and                 
                    
                            t
                        
                            i
                        
                            r
                        
             is the time of the rth spike from neuron i.”  The membrane potential of neuron j is thus based in part on the firing timing of neuron i.  Accordingly it is apparent that a neuron model corresponding to a neuron i outputs, in response to firing of the neuron model, an output spike at firing timing so as to be provided to neuron j.); and
	transferring by the one synapse model, information from the one neuron model to the other neuron model (As noted above, equation (1) described by Mostafa indicates that the membrane potential of neuron j is based on input spikes received from neurons i connected to neuron j.  Accordingly, it is apparent that a synapse model transfers information, i.e. an indication of an output spike, from a neuron i to the neuron j.);	
Mostafa thus teaches a processing method similar to that of claim 4, but does not explicitly teach causing at least one of an output layer and a hidden layer of a neural network device to be learned using a learning rule that applies a ratio of the change in firing timing of the one neuron model to a change in firing timing of the other neuron model, wherein the ratio is obtained using a linear approximation of temporal development of membrane potential, as is required by claim 4.
	Booij generally describes a “supervised learning rule for Spiking Neural Networks (SNNs)…that can cope with neurons that spike multiple times.”  (Abstract).  Regarding the claimed invention, Booij particularly teaches that the learning rule applies a ratio of the change in the firing timing of one neuron to a change in firing timing of another neuron (e.g. the derivative of a postsynaptic spike with respect to a presynaptic spike):

We now derive a gradient descent learning rule in a similar way as SpikeProp [1]. However we take into account that the neurons can fire multiple spikes. In SpikeProp every neuron backpropagates a single error value, namely the error with respect to its one spike. In our derivation the neuron needs to determine and backpropagate an error value for each spike it fired.

…

We define the network error to be the sum squared error of the first spike of the output neurons O, so later spikes of these neurons are ignored:
                
                    E
                    =
                    
                            1
                        
                            2
                        
                            ∑
                            
                                o
                                ∈
                                O
                            
                                                    t
                                                
                                                    o
                                                
                                                    1
                                                
                                            -
                                             
                                                            t
                                                        
                                                            o
                                                        
                                                            1
                                                        
                                                ^
                                            
                                    2
                                
             ,					       (6)				           				
where                 
                    
                                    t
                                
                                    o
                                
                                    1
                                
                        ^
                    
             denotes the desired spike time. Other error functions, including functions over more than only the first output spike, are also possible, as long as the derivative of the error to one spike exists. In our case the error with respect to one output spike is determined by 

                            ∂
                            E
                        
                            ∂
                            
                                    t
                                
                                    o
                                
                                    1
                                
             =                 
                    
                            t
                        
                            o
                        
                            1
                        
                    -
                     
                                    t
                                
                                    o
                                
                                    1
                                
                        ^
                    
            .							       (7)

The parameters that are tuned to reduce this error are the weights of the synapses                 
                    
                            w
                        
                            j
                            i
                        
                            k
                        
            .  Other parameters, like the axonal delays, could also be learned [14]. Like SpikeProp we calculate the derivative of the network error with respect to the weight in order to calculate the appropriate weight change [1]: 

                    ∆
                    
                            w
                        
                            i
                            h
                        
                            k
                        
                    =
                     
                    -
                    k
                    
                            ∂
                            E
                        
                            ∂
                            
                                    w
                                
                                    i
                                    h
                                
                                    k
                                
            ,						      (8)

where                 
                    k
                
             denotes the learning rate. Because neuron                 
                    i
                
             can fire multiple times and all these firing times depend on the weight, this equation can be expanded with regard to these spikes:

                    ∆
                    
                            w
                        
                            i
                            h
                        
                            k
                        
                    =
                     
                    -
                    k
                    
                            ∑
                            
                                        t
                                    
                                        i
                                    
                                        f
                                    
                                ∈
                                
                                        F
                                    
                                        i
                                    
                                    ∂
                                    E
                                
                                    ∂
                                    
                                            t
                                        
                                            i
                                        
                                            f
                                        
                            ∂
                            
                                    t
                                
                                    i
                                
                                    f
                                
                            ∂
                            
                                    w
                                
                                    i
                                    h
                                
                                    k
                                
             .					      (9)

…

We now calculate the first factor on the right-hand side of Eq. (9), the derivative of the network error with respect to a spike. This is derived for every spike of non-output neuron                 
                    i
                     
                    ∉
                     
                    O
                
            , which depends on all spikes by all its neural successors                 
                    
                            Γ
                        
                            i
                        
             (see Fig. 1):

                            ∂
                            E
                        
                            ∂
                            
                                    t
                                
                                    i
                                
                                    f
                                
                    =
                     
                            ∑
                            
                                j
                                ∈
                                
                                        Γ
                                    
                                        i
                                    
                                    ∑
                                    
                                                t
                                            
                                                j
                                            
                                                g
                                            
                                        ∈
                                        
                                                F
                                            
                                                j
                                            
                                            ∂
                                            E
                                        
                                            ∂
                                            
                                                    t
                                                
                                                    j
                                                
                                                    g
                                                
                                            ∂
                                            
                                                    t
                                                
                                                    j
                                                
                                                    g
                                                
                                            ∂
                                            
                                                    t
                                                
                                                    i
                                                
                                                    f
                                                
            .					    (13)

Thus, in order to calculate the derivative of the error with respect to a spike, we already have to know the derivative of the error with respect to spikes of its neuronal successors that happened later in time. In practice this means that the error with respect to the last spike of the network has to be computed first, then to the spike preceding the last, and so on. This is the temporal equivalent of the conventional backpropagation technique, where the error was propagated back spatially through the network [9].
The derivative of a postsynaptic spike with respect to a presynaptic spike can be further expanded as in Eq. (10): 

                            ∂
                            
                                    t
                                
                                    j
                                
                                    g
                                
                            ∂
                            
                                    t
                                
                                    i
                                
                                    f
                                
                    =
                     
                            ∂
                            
                                    u
                                
                                    i
                                
                                            t
                                        
                                            j
                                        
                                            g
                                        
                            ∂
                            
                                    t
                                
                                    i
                                
                                    f
                                
                            -
                            1
                        
                                    ∂
                                    
                                            u
                                        
                                            j
                                        
                                                    t
                                                
                                                    j
                                                
                                                    g
                                                
                                    ∂
                                    
                                            t
                                        
                                            j
                                        
                                            g
                                        
            .				  	    (14)

The second factor already appeared in Eq. (10) and is calculated in Eq. (12). The first factor, the derivative of the potential during a postsynaptic spike with respect to a presynaptic spike, can again be derived from Eq. (3): 

                            ∂
                            
                                    u
                                
                                    i
                                
                                            t
                                        
                                            j
                                        
                                            g
                                        
                            ∂
                            
                                    t
                                
                                    i
                                
                                    f
                                
                    =
                     
                    -
                    
                            ∑
                            
                                        t
                                    
                                        j
                                    
                                        l
                                    
                                ∈
                                
                                        F
                                    
                                        j
                                    
                                    n
                                
                                    '
                                
                                            t
                                        
                                            j
                                        
                                            g
                                        
                                    -
                                    
                                            t
                                        
                                            j
                                        
                                            l
                                        
                            ∂
                            
                                    t
                                
                                    j
                                
                                    l
                                
                            ∂
                            
                                    t
                                
                                    i
                                
                                    f
                                
                    -
                     
                            ∑
                            
                                k
                            
                                    w
                                
                                    j
                                    i
                                
                                    k
                                
                                    ε
                                
                                    '
                                
                                            t
                                        
                                            j
                                        
                                            g
                                        
                                    -
                                    
                                            t
                                        
                                            i
                                        
                                            f
                                        
                                    -
                                    
                                            d
                                        
                                            j
                                            i
                                        
                                            k
                                        
            . 			    (15)		
As can be seen this is again a recursive function, so the order in which the computations are done is important.
With this set of formulas it is now possible to compute the necessary weight changes that minimize a certain error measure of the spikes of the output neurons given a pattern of input spike trains.

(Section 3.1 “Gradient descent.”  Emphasis added.

Booij further suggests that the learning rule can be used to train an output layer and hidden layer of an artificial neural network (see e.g. section 4.1. “The XOR Benchmark,” in which the learning rule is used to train a neural network comprising multiple layers to implement an XOR function.).
	Although Booij is directed to a learning rule for neurons that spike multiple times (see e.g. the Abstract), it would have been apparent to one of ordinary skill in the art that, with minor modifications (e.g. setting the number of spikes for each neuron in the above formulas to one), Booij’s learning rule could be applied to neurons that fire only once like claimed and taught by Mostafa.  Particularly, it would have been obvious to one of ordinary skill in the art, having the teachings of Mostafa and Booij before the effective filing date of the claimed invention, to modify the artificial neural network taught by Mostafa such that at least one of an output layer and a hidden layer of the neural network is trained using a learning rule like taught by Booij, which applies a ratio of the change in firing timing of one neuron model to a change in firing timing of another neuron model.  It would have been advantageous to one of ordinary skill to utilize such a learning rule because it “can be used for real-world applications requiring the processing of temporal data,” as is taught by Booij (see section 5 “Conclusion and future work”).
Booij, however, does not explicitly disclose that the ratio is obtained using a linear approximation of temporal development of membrane potential like claimed.  Nevertheless Booij does disclose that the learning rule is derived in a similar way as “SpikeProp,” and that some of the formulas use results of the SpikeProp algorithm:
The parameters that are tuned to reduce this error are the weights of the synapses                 
                    
                            w
                        
                            j
                            i
                        
                            k
                        
            .  Other parameters, like the axonal delays, could also be learned [14]. Like SpikeProp we calculate the derivative of the network error with respect to the weight in order to calculate the appropriate weight change [1]: 

                    ∆
                    
                            w
                        
                            i
                            h
                        
                            k
                        
                    =
                     
                    -
                    k
                    
                            ∂
                            E
                        
                            ∂
                            
                                    w
                                
                                    i
                                    h
                                
                                    k
                                
            ,						      (8)

where                 
                    k
                
             denotes the learning rate. Because neuron                 
                    i
                
             can fire multiple times and all these firing times depend on the weight, this equation can be expanded with regard to these spikes:

                    ∆
                    
                            w
                        
                            i
                            h
                        
                            k
                        
                    =
                     
                    -
                    k
                    
                            ∑
                            
                                        t
                                    
                                        i
                                    
                                        f
                                    
                                ∈
                                
                                        F
                                    
                                        i
                                    
                                    ∂
                                    E
                                
                                    ∂
                                    
                                            t
                                        
                                            i
                                        
                                            f
                                        
                            ∂
                            
                                    t
                                
                                    i
                                
                                    f
                                
                            ∂
                            
                                    w
                                
                                    i
                                    h
                                
                                    k
                                
             .					      (9)

First the second factor on the right-hand side, the derivative of a firing time with respect to the weight, is calculated using the results of the study by Bohte et al. for the SpikeProp algorithm; see [1] or [2] for a complete derivation. 

                            ∂
                            
                                    t
                                
                                    i
                                
                                    f
                                
                            ∂
                            
                                    w
                                
                                    i
                                    h
                                
                                    k
                                
                    =
                     
                            ∂
                            
                                    u
                                
                                    i
                                
                                            t
                                        
                                            i
                                        
                                            f
                                        
                            ∂
                            
                                    w
                                
                                    i
                                    h
                                
                                    k
                                
                            -
                            1
                        
                                    ∂
                                    
                                            u
                                        
                                            j
                                        
                                                    t
                                                
                                                    i
                                                
                                                    f
                                                
                                    ∂
                                    
                                            t
                                        
                                            i
                                        
                                            f
                                        
             .					    (10)
…
We now calculate the first factor on the right-hand side of Eq. (9), the derivative of the network error with respect to a spike. This is derived for every spike of non-output neuron                 
                    i
                     
                    ∉
                     
                    O
                
            , which depends on all spikes by all its neural successors                 
                    
                            Γ
                        
                            i
                        
             (see Fig. 1):

                            ∂
                            E
                        
                            ∂
                            
                                    t
                                
                                    i
                                
                                    f
                                
                    =
                     
                            ∑
                            
                                j
                                ∈
                                
                                        Γ
                                    
                                        i
                                    
                                    ∑
                                    
                                                t
                                            
                                                j
                                            
                                                g
                                            
                                        ∈
                                        
                                                F
                                            
                                                j
                                            
                                            ∂
                                            E
                                        
                                            ∂
                                            
                                                    t
                                                
                                                    j
                                                
                                                    g
                                                
                                            ∂
                                            
                                                    t
                                                
                                                    j
                                                
                                                    g
                                                
                                            ∂
                                            
                                                    t
                                                
                                                    i
                                                
                                                    f
                                                
            .					    (13)

Thus, in order to calculate the derivative of the error with respect to a spike, we already have to know the derivative of the error with respect to spikes of its neuronal successors that happened later in time. In practice this means that the error with respect to the last spike of the network has to be computed first, then to the spike preceding the last, and so on. This is the temporal equivalent of the conventional backpropagation technique, where the error was propagated back spatially through the network [9].
The derivative of a postsynaptic spike with respect to a presynaptic spike can be further expanded as in Eq. (10): 

                            ∂
                            
                                    t
                                
                                    j
                                
                                    g
                                
                            ∂
                            
                                    t
                                
                                    i
                                
                                    f
                                
                    =
                     
                            ∂
                            
                                    u
                                
                                    i
                                
                                            t
                                        
                                            j
                                        
                                            g
                                        
                            ∂
                            
                                    t
                                
                                    i
                                
                                    f
                                
                            -
                            1
                        
                                    ∂
                                    
                                            u
                                        
                                            j
                                        
                                                    t
                                                
                                                    j
                                                
                                                    g
                                                
                                    ∂
                                    
                                            t
                                        
                                            j
                                        
                                            g
                                        
            .				  	    (14)

(Section 3.1 “Gradient descent.”  Emphasis added.)

Along these lines, Bohte generally describes the supervised learning rule, SpikeProp, for a network of spiking neurons that encodes information in the timing of individual spike times (see e.g. the Abstract).  In particular, Bohte teaches that the membrane potential (i.e. internal state variable) for each spiking neuron is represented by a function xj(t):
The network architecture consists of a feedforward network of spiking neurons with multiple delayed synaptic terminals (Fig. 1, as described by Natschläger and Ruf [35]). The neurons in the network generate action potentials, or spikes, when the internal neuron state variable, called “membrane potential”, crosses a threshold                 
                    ϑ
                
            .  The relationship between input spikes and the internal state variable is described by the spike response model (SRM), as introduced by Gerstner [18]. Depending on the choice of suitable spike-response functions, one can adapt this model to reflect the dynamics of a large variety of different spiking neurons.
Formally, a neuron j, having a set Гj of immediate predecessors (“pre-synaptic neurons”), receives a set of spikes with firing times ti, i ∈ Гj. Any neuron generates at most one spike during the simulation interval, and fires when the internal state variable reaches a threshold                 
                    ϑ
                
            . The dynamics of the internal state variable xj(t) are determined by the impinging spikes, whose impact is described by the spike-response function ɛ(t) weighted by the synaptic efficiency (“weight”) wij:

                            x
                        
                            j
                        
                            t
                        
                    =
                     
                            ∑
                            
                                i
                                ∈
                                Г
                                j
                            
                                    w
                                
                                    i
                                    j
                                
                    ɛ
                    
                            t
                            -
                             
                                    t
                                
                                    i
                                
            								(1)

The spike-response function in (1) effectively models the unweighted post-synaptic potential (PSP) of a single spike impinging on a neuron. The height of the PSP is modulated by the synaptic weight wij to obtain the effective post-synaptic potential. The spike-response function as used in our experiments is defined in (3).

(Section 2.  “A network of spiking neurons;” emphasis added).

Regarding the claimed invention, Bohte further teaches that the SpikeProp learning rule uses ratios that are obtained using a linear approximation of membrane potential (i.e. a linear approximation of xj):
The target of the algorithm is to learn a set of target firing times, denoted                 
                    
                                    t
                                
                                    j
                                
                                    d
                                
            , at the output neurons j ∈ J for a given set of input patterns {P[t1…th]}, where P[t1…th] defines a single input pattern described by single spike times for each neuron h ∈ H.  We choose as the error-function the least mean squares error-function, but other choices like entropy are also possible. Given desired spike times                 
                    
                                    t
                                
                                    j
                                
                                    d
                                
             and actual firing times                 
                    
                                    t
                                
                                    j
                                
                                    a
                                
            , this error-function is defined by

                    E
                    =
                     
                            1
                        
                            2
                        
                            ∑
                            
                                j
                                 
                                ∈
                                 
                                J
                                 
                                                    j
                                                
                                                    j
                                                
                                                    a
                                                
                                            -
                                             
                                                    t
                                                
                                                    j
                                                
                                                    d
                                                
                                    2
                                
                    .
                
(5)

For error-backpropagation, we treat each synaptic terminal as a separate connection k with weight                 
                    
                            w
                        
                            i
                            j
                        
                            k
                        
            .  Hence, for a backprop rule, we need to calculate 

                    ∆
                    
                            w
                        
                            i
                            j
                        
                            k
                        
                    =
                     
                    -
                    n
                    
                            ∂
                            E
                        
                            ∂
                            
                                    w
                                
                                    i
                                    j
                                
                                    k
                                
								(6)

with n the learning rate and                 
                    
                            w
                        
                            i
                            j
                        
                            k
                        
             the weight of connection k from neuron i to neuron j. As tj is a function of xj, which depends on the weights                 
                    
                            w
                        
                            i
                            j
                        
                            k
                        
            , the derivative on the right-hand part of (6) can be expanded to

                            ∂
                            E
                        
                            ∂
                            
                                    w
                                
                                    i
                                    j
                                
                                    k
                                
                    =
                     
                            ∂
                            E
                        
                            ∂
                            
                                    t
                                
                                    j
                                
                                    t
                                
                                    j
                                
                                    a
                                
                            ∂
                            
                                    t
                                
                                    j
                                
                            ∂
                            
                                    w
                                
                                    i
                                    j
                                
                                    k
                                
                                    t
                                
                                    j
                                
                                    a
                                
                    =
                     
                            ∂
                            E
                        
                            ∂
                            
                                    t
                                
                                    j
                                
                                    t
                                
                                    j
                                
                                    a
                                
                            ∂
                            
                                    t
                                
                                    j
                                
                            ∂
                            
                                    x
                                
                                    j
                                
                                    t
                                
                                    t
                                
                                    j
                                
                                    a
                                
                            ∂
                            
                                    x
                                
                                    j
                                
                                    t
                                
                            ∂
                            
                                    w
                                
                                    i
                                    j
                                
                                    k
                                
                                    t
                                
                                    j
                                
                                    a
                                
                    .
                
(7)
								
In the last two factors on the right, we express tj as a function of the thresholded post-synaptic input xj(t) around t =                 
                    
                            t
                        
                            j
                        
                            a
                        
            . We assume that for a small enough region around t =                 
                    
                            t
                        
                            j
                        
                            a
                        
            , the function xj can be approximated by a linear function of t, as depicted in Fig. 3.  For such a small region, we approximate the threshold function                 
                    δ
                    
                            t
                        
                            j
                        
                                    x
                                
                                    j
                                
                    =
                     
                    -
                    
                            δ
                            
                                    x
                                
                                    j
                                
                                            t
                                        
                                            j
                                        
                            α
                        
            , with                 
                    
                            ∂
                            
                                    t
                                
                                    j
                                
                            ∂
                            
                                    x
                                
                                    j
                                
                                    t
                                
             the derivative of the inverse function of                 
                    
                            x
                        
                            j
                        
                            t
                        
            .  The value                 
                    α
                
             equals the local derivative of                 
                    
                            x
                        
                            j
                        
                            t
                        
             with respect to t, that is                 
                    α
                    =
                    
                            ∂
                            
                                    x
                                
                                    j
                                
                                    t
                                
                            ∂
                            t
                            
                                            t
                                        
                                            j
                                        
                                            a
                                        
                    .
                
The second factor in (7) evaluates to

                            ∂
                            
                                    t
                                
                                    j
                                
                            ∂
                            
                                    x
                                
                                    j
                                
                                    t
                                
                                    t
                                
                                    j
                                
                                    a
                                
                    =
                     
                            ∂
                            
                                    t
                                
                                    j
                                
                                            x
                                        
                                            j
                                        
                            ∂
                            
                                    x
                                
                                    j
                                
                                    t
                                
                                    ​
                                
                                    x
                                
                                    j
                                
                            =
                            ϑ
                        
                    =
                     
                            -
                            1
                        
                            α
                        
                    =
                     
                            -
                            1
                        
                                    ∂
                                    
                                            x
                                        
                                            j
                                        
                                            t
                                        
                                    ∂
                                    t
                                    (
                                    
                                            t
                                        
                                            j
                                        
                                            a
                                        
                                    )
                                
                    =
                     
                            -
                            1
                        
                                    ∑
                                    
                                        i
                                        ,
                                        l
                                    
                                            w
                                        
                                            i
                                            j
                                        
                                            l
                                        
                                                    ∂
                                                    
                                                            y
                                                        
                                                            i
                                                        
                                                            l
                                                        
                                                            t
                                                        
                                                    ∂
                                                    t
                                                
                                                    t
                                                
                                                    j
                                                
                                                    a
                                                
             .					(8)

In further calculations, we will write terms like                 
                    
                            ∂
                            
                                    x
                                
                                    j
                                
                                            t
                                        
                                            j
                                        
                                            a
                                        
                            ∂
                            
                                    t
                                
                                    j
                                
                                    a
                                
             for                 
                    
                            ∂
                            
                                    x
                                
                                    j
                                
                                    t
                                
                            ∂
                            t
                            (
                            
                                    t
                                
                                    j
                                
                                    a
                                
                            )
                        
            .

(Section 3 “Error-backpropagation;” emphasis added).

It would have been obvious to one of ordinary skill in the art, having the teachings of Mostafa, Booij and Bohte before the effective filing date of the claimed invention, to modify the learning rule taught by Mostafa and Booij such that the ratio therein is obtained at least in part using a linear approximation of temporal development of membrane potential like taught by Bohte.  It would have been advantageous to one of ordinary skill to utilize such a linear approximation because it would aid in the derivation of the learning rule, as is evident from Bohte (see section 3 “Error-backpropagation”).  Accordingly, Mostafa, Booij and Bohte are considered to teach, to one of ordinary skill in the art, a processing method like that of claim 4.

Response to Arguments
The Examiner acknowledges the Applicant’s amendments to claims 1, 4 and 5.  In response to these amendments, the 35 U.S.C. § 112 rejections presented in the previous Office Action with respect to claims 1 and 3-5 are respectfully withdrawn.  The Examiner respectfully notes, however, that the amendments have resulted in the new 35 U.S.C. § 112 rejection presented above with respect to claim 3.
The Applicant’s arguments concerning the 35 U.S.C. § 101 rejections presented in the previous Office Action have been considered and are persuasive.  Particularly, like noted by the Applicant, the neuron model and associated learning rule recited in the claims reflects an improvement in the functioning of a computer, e.g. lower processing load and/or a smaller circuit area like described in paragraphs 0092 and 0093 of Applicant’s published specification (U.S. Patent Application Publication No. 2022/0101092).
The Applicant’s arguments concerning the 35 U.S.C. § 103 rejections presented in the previous Office Action have been considered, but are moot in view of the new grounds of rejection presented above.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BLAINE T BASOM whose telephone number is (571)272-4044. The examiner can normally be reached Monday-Friday, 9:00 am - 5:30 pm, EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Matt Ell can be reached at (571)270-3264. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/BTB/
1/9/2026

/MATTHEW ELL/Supervisory Patent Examiner, Art Unit 2141
Read full office action
Prosecution Timeline

Show 2 earlier events
Dec 17, 2024
Non-Final Rejection mailed — §103, §112
Mar 17, 2025
Response Filed
Aug 01, 2025
Final Rejection mailed — §103, §112
Nov 03, 2025
Applicant Interview (Telephonic)
Nov 03, 2025
Request for Continued Examination
Nov 04, 2025
Response after Non-Final Action
Jan 15, 2026
Non-Final Rejection mailed — §103, §112
Apr 15, 2026
Response Filed
Precedent Cases

Applications granted by this same examiner with similar technology

17/644,425
Patent 12632794
METHOD AND SYSTEM FOR CROSS-CHAIN CONSENSUS ORIENTED TO FEDERATED LEARNING
4y 5m to grant Granted May 19, 2026
17/806,556
Patent 12608647
MULTIMODAL DATA INFERENCE
3y 10m to grant Granted Apr 21, 2026
17/334,697
Patent 12566981
METHOD AND SYSTEM FOR EVENT PREDICTION BASED ON TIME-DOMAIN BOOTSTRAPPED MODELS
4y 9m to grant Granted Mar 03, 2026
16/817,836
Patent 12487727
Sensory Adjustment Mechanism
5y 8m to grant Granted Dec 02, 2025
17/649,045
Patent 12443420
Automatic Image Conversion
3y 8m to grant Granted Oct 14, 2025
Study what changed to get past this examiner. Based on 5 most recent grants.
Strategy Recommendation AI-generated — please review before filing

Get a prosecution strategy drawn from examiner precedents, rejection analysis, and claim mapping.
Typically takes 5-10 seconds — AI-generated, attorney review required before filing
Prosecution Projections

3-4
Expected OA Rounds
43%
Grant Probability
66%
With Interview (+22.7%)
4y 6m (~0m remaining)
Median Time to Grant
High
PTA Risk
Based on 326 resolved cases by this examiner. Grant probability derived from career allowance rate.
NEURAL NETWORK DEVICE, NEURAL NETWORK SYSTEM, PROCESSING METHOD, AND RECORDING MEDIUM

Interview Optional

Examiner Intelligence

Statute-Specific Performance

Office Action

Prosecution Timeline

Precedent Cases

Applications granted by this same examiner with similar technology

Strategy Recommendation AI-generated — please review before filing

Prosecution Projections

Ready to respond to this office action?

NEURAL NETWORK DEVICE, NEURAL NETWORK SYSTEM, PROCESSING METHOD, AND RECORDING MEDIUM

Interview Optional

Examiner Intelligence

Statute-Specific Performance

Office Action

Prosecution Timeline

Precedent Cases

Applications granted by this same examiner with similar technology

Strategy Recommendation AI-generated — please review before filing

Prosecution Projections

Ready to respond to this office action?

Sign in with your work email