DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant’s arguments filed 12/02/2025 have been fully considered and are partially persuasive.
Regarding applicant’s remarks directed to the rejection of claims under 35 U.S.C. § 103, the arguments are directed to newly amended limitations that were not previously examined. Applicant’s arguments are therefore moot. The examiner refers to the rejection under 35 U.S.C. § 103 in the current Office action for further details.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claim(s) 1, 5, 8-10, and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Pub. No. US20190370653A1 to Chakrabartty et al. (“Chakrabartty-2019”) in view of Chatterjee, Oindrila, and Shantanu Chakrabartty, “Resonant machine learning based on complex growth transform dynamical systems,” IEEE Transactions on Neural Networks and Learning Systems 32.3 (2020): 1289-1303 (“Chatterjee”), and further in view of WIPO Pub. No. WO2020210673A1 to Cleland et al. (“Cleland”).
Regarding claim 1,
Chakrabartty-2019 teaches A backpropagation-less learning (BPL) computing device comprising at least one processor in communication with a memory device, the at least one processor configured to:
(Chakrabartty-2019, “[0006] In an additional aspect, a spiking SVM is provided that includes a growth transform neural network system. The growth transform neural network system includes a computing device. The computing device includes at least one processor and a memory [computing device comprising at least one processor in communication with a memory device, the at least one processor configured to:] storing a plurality of modules. Each module includes instructions executable on the at least one processor. The plurality of modules includes a growth transform neural network module, a growth transform module, and a network convergence module. The growth transform neural network module defines a plurality of mirrored neuron pairs that include a plurality of first components and a plurality of second components that are interconnected according to an interconnection matrix. The growth transform module updates each first component of each mirrored neuron pair of a plurality of mirrored neuron pairs according to a growth transform neuron model.”)
Chakrabartty-2019 teaches retrieve, from the memory device,
(Chakrabartty-2019, “[0142] Processor 404 may also be operatively coupled to a storage device 410 [from the memory device]. Storage device 410 is any computer-operated hardware suitable for storing and/or retrieving data. In some embodiments, storage device 410 is integrated in server computing device 402. For example, server computing device 402 may include one or more hard disk drives as storage device 410. In other embodiments, storage device 410 is external to server computing device 402 and is accessed by a plurality of server computing devices 402. For example, storage device 410 may include multiple storage units such as hard disks or solid state disks in a redundant array of inexpensive disks (RAID) configuration. Storage device 410 may include a storage area network (SAN) and/or a network attached storage (NAS) system.”)
Chakrabartty-2019 teaches at least one or more training datasets;
(Chakrabartty-2019, “[0123] While the systems described herein were demonstrated using simple two-dimensional synthetic problems (for the ease of visualization), it is to be understood that the results are suitable for larger and more complex tasks. By way of non-limiting example, Table I summarizes the classification results of different variants of switching SVMs, trained and evaluated on a benchmark ‘Adult (a3a)’ dataset. The training dataset (3185 instances) [at least one or more training datasets] and testing dataset (29376 instances) provided on the LIBSVM website were used for training and cross-validation respectively.”)
Chakrabartty-2019 teaches build a spike-response model relating one or more aspects of the at least one or more training datasets;
(Chakrabartty-2019, “[0070] In one aspect, the disclosed neural network system is incorporated into a spiking support vector machine (SVM) [build a spike-response model] that includes a network of growth transform neurons, as described herein below. Each neuron in the SVM network learns to encode output parameters such as spike rate and time-to-spike responses according to an equivalent margin of classification and those neurons corresponding to regions near the classification boundary learn to exhibit noise-shaping dynamics similar to behaviors observed in biological networks. As a result, the disclosed spiking support vector machine (SVM) enables large-scale population encoding, for examples for a case when the spiking SVM learns to solve two benchmark classification tasks [relating one or more aspects of the at least one or more training datasets], resulting in classification performance similar to that of the state-of-the-art SVM implementations.”)
Chakrabartty-2019 teaches store the spike-response model in the memory device; and
(Chakrabartty-2019, “[0135] In various aspects, the methods described herein are implemented using a remote and/or local computing device as described herein below. FIG. 21 illustrates an example configuration of a remote device system 300 and depicts an exemplary configuration of a remote or user computing device 302, such as requestor computing device. Computing device 302 includes a processor 304 for executing instructions. In some embodiments, executable instructions are stored in a memory area 306. Processor 304 may include one or more processing units (e.g., in a multi-core configuration). Memory area 306 is any device allowing information such as executable instructions and/or other data to be stored and retrieved [store the spike-response model in the memory device]. Memory area 306 may include one or more computer-readable media.”)
Chakrabartty-2019 teaches design, using the spike-response model, a Growth Transform (GT) neural network including a plurality of ON-OFF neuron pairs using weight adaptation
(Chakrabartty-2019, Fig. 1C, “[0067]
In various aspects, a growth transform neuron model is mutually coupled with a network objective function of a machine learning model [using the spike-response model; wherein the machine learning model is the spiking machine learning model (SVM)], as illustrated in FIG. 1C. While each individual neuron traverses a trajectory within the dual optimization space as the system converges on a solution, the overall network traverses a trajectory in an equivalent primal optimization space [on overall network spiking activity]. As a result, the network of growth transform neurons solves classification tasks while producing unique but interpretable neural dynamics, such as noise-shaping, spiking and bursting [design… a Growth Transform (GT) neural network].”)
(Chakrabartty-2019, “[0007] …The growth transform neural network module defines a plurality of mirrored neuron pairs [including a plurality of ON-OFF neuron pairs; see fig. 6A for on/off using the sgn function to create the binary values of on/off] that include a plurality of first components and a plurality of second components that are interconnected according to an interconnection matrix. The growth transform module updates each first component of each mirrored neuron pair of a plurality of mirrored neuron pairs according to a growth transform neuron model. The network convergence module converges [using weight adaptation; wherein convergence is updating the weights] the plurality of mirrored neuron pairs to a steady state condition by solving a system objective function subject to at least one normalization constraint. The first component and the second component of each mirrored neuron pair in the steady state condition may each produce a neuron response that includes a steady state value or a limit cycle with ΣΔ modulation according to a user-defined potential function Φ(pik) given by:
W1ε1 + |p_ik − (½ − ε1)| for 0 ≤ p_ik < ½ − ε1,
W1|p_ik − ½| for ½ − ε1 ≤ p_ik ≤ ½,
W2|p_ik − ½| for ½ < p_ik < ½ + ε2, and
W2ε2 + |p_ik − (½ + ε2)| for ½ + ε2 < p_ik ≤ 1,
in which p_ik is the response of the ith neuron of the plurality of mirrored neuron pairs, k is 1 or 2, W1 > 1, W2 > 1, ε1 > 0, and ε2 > 0….
[0022] FIG. 6A is a schematic illustration showing the relationship of the quantized response S_ik = sgn(p_ik − 1/M) to the ΣΔ limit cycles.
[Image: media_image1.png (FIG. 6A)]
”)
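For illustration only (examiner’s own sketch, not drawn from the cited disclosure), the quoted user-defined potential function Φ(p_ik) and the FIG. 6A quantization S_ik = sgn(p_ik − 1/M) may be expressed in Python as follows; the parameter values are hypothetical and assume the hyperparameter conventions quoted above.

import numpy as np

def potential(p, w1=2.0, w2=2.0, eps1=0.1, eps2=0.1):
    # User-defined potential function Phi(p_ik) quoted above;
    # w1, w2 > 1 and eps1, eps2 > 0 mirror W1, W2, epsilon1, epsilon2.
    if p < 0.5 - eps1:
        return w1 * eps1 + abs(p - (0.5 - eps1))
    elif p <= 0.5:
        return w1 * abs(p - 0.5)
    elif p <= 0.5 + eps2:
        return w2 * abs(p - 0.5)
    else:  # 0.5 + eps2 < p <= 1
        return w2 * eps2 + abs(p - (0.5 + eps2))

def quantized_response(p, M=2):
    # Quantized response S_ik = sgn(p_ik - 1/M) of FIG. 6A; the ON and OFF
    # members of a mirrored neuron pair carry complementary binary responses.
    return np.sign(p - 1.0 / M)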
Chakrabartty-2019 teaches enforce sparsity constraints on overall network spiking activity.
In light of the specification, (“[0070] As shown in FIG. 4A, energy-efficiency in energy-based neuromorphic machine learning, there is a loss function for training and an additional loss for enforcing sparsity. The embodiments set forth herein make the loss for training and loss for enforcing sparsity equal, as shown in FIG. 4B”), Examiner interprets the loss function to enforce sparsity constraints on overall network spiking activity.
(Chakrabartty-2019, “[0073] In one aspect, the growth transform neural network design and analysis includes estimating an equivalent dual optimization function based on the mapping given by Eqn. (1). Each neuron implements a continuous mapping based on a polynomial growth transform update that also dynamically optimizes the cost function [enforce sparsity constraints on overall network spiking activity; wherein the cost function corresponds to primal loss-functions]. Because the growth transform mapping is designed to evolve over a constrained manifold, the neuronal responses and the network are stable. The switching, spiking and bursting dynamics of the neurons emerge by choosing different types of potential functions and hyper-parameters in the dual cost function. The use of this approach is suitable for use in the design of SVMs that exhibit ΣΔ modulation type limit-cycles, spiking behavior and bursting responses.”)
Chakrabartty-2019 teaches the cost function corresponding to primal loss-functions
(Chakrabartty-2019, “[0122] One insight that emerged from the disclosed geometric framework is that, while each individual neuron is optimizing a relatively simple dual cost function, the network as a whole exhibits complex dynamics corresponding to primal loss-functions with hysteresis and discontinuities. In each of the support vector networks that incorporate the disclosed growth transform neural networks as described above, irrespective of the nature of the output (ΣΔ modulation, spiking or busting), the output of the neuron faithfully encodes an equivalent classification margin.”)
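As an illustrative aid, a polynomial growth-transform update that optimizes a cost over a normalization (conservation) constraint can be sketched as below. This is a minimal Baum-Eagon style discretization under assumed conventions, not Chakrabartty-2019’s exact continuous-time neuron model; the constant lam is a hypothetical shift chosen to keep the growth factors positive.

import numpy as np

def growth_transform_step(p, grad_H, lam=10.0):
    # One multiplicative update over the simplex (sum_i p_i = 1, p_i >= 0):
    # p_i <- p_i * (lam - dH/dp_i) / sum_j p_j * (lam - dH/dp_j).
    g = lam - grad_H            # positive growth factors
    p_new = p * g
    return p_new / p_new.sum()  # renormalize: stays on the constraint manifold

# Example: minimize H(p) = 0.5 * p.Q.p over the simplex.
Q = np.array([[2.0, 0.5], [0.5, 1.0]])
p = np.array([0.5, 0.5])
for _ in range(100):
    p = growth_transform_step(p, Q @ p)
print(p)  # approaches [0.25, 0.75], the simplex minimizer of the quadratic

The fixed point of this update equalizes the gradient components across all active neurons, which is the first-order optimality condition over the normalization constraint.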
However, Chakrabartty-2019 does not explicitly teach wherein the sparsity constraints are configured to function as a regularizer such that the GT neural network is configured to (i) learn via few-shot learning and (ii) operate at the edge of a network in energy and resource-constrained environments.
Chatterjee teaches wherein the sparsity constraints are configured to function as a regularizer
(Chatterjee, Section III., “In this formulation, the active-power dissipation D in (10) acts as a regularization function with β ≥ 0 being a hyperparameter…. Example 1: Consider a single-variable quadratic optimization problem of the form H1(x) = x^2, subject to the constraint
|x| ≤ 1, x ∈ R. Substituting x = |V|^2 − |I|^2, the problem can be mapped (please see Appendix B for more details) into the form equivalent to (10) as
[Equation image: media_image2.png]
Fig. 4(a)–(c) plots L1 for different values of β [wherein the sparsity constraints are configured to function as a regularizer; wherein L1 optimization acts as a regularization function with β ≥ 0 being a hyperparameter and thus enables the network to perform (i) and (ii). Examiner refers to Chatterjee, Section V for support in combining the cost function of Chakrabartty-2019 with the regularization function of Chatterjee]. As shown in Fig. 4(a) and as expected for β = 0, the cost function has several minima (or attractors), whereas for β > 0, the minima corresponds to φ = ±π/2, for which the active-power dissipation is zero. Fig. 4(b) and (c) shows that controlling β will control the optimization landscape (without changing the location of the attractors) and will determine the attractor trajectory. This feature has been exploited in Sections IV and V to optimize the active-power dissipation profile during the learning phase.”
[Images: media_image3.png, media_image4.png (Fig. 4 plots)]
)
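For illustration, one way to read the proposed combination (Chakrabartty-2019’s kernel training cost with Chatterjee’s β-weighted regularizer) is sketched below. This is the examiner’s own stand-in, not either reference’s exact equations: the activity penalty Σ|p_i − ½| is borrowed from the cumulative potential function discussed above, and β ≥ 0 plays the regularization-hyperparameter role described by Chatterjee.

import numpy as np

def regularized_objective(p, y, Q, beta):
    # Training term: kernel distance between responses p and labels y.
    residual = p - y
    training_term = 0.5 * residual @ Q @ residual
    # Sparsity term: activity proxy acting as the regularizer.
    activity_term = np.sum(np.abs(p - 0.5))
    return training_term + beta * activity_term

# Sweeping beta reshapes the optimization landscape: larger beta favors
# less active (sparser) solutions at some cost to the pure training term.
Q = np.array([[1.0, 0.2], [0.2, 1.0]])
y = np.array([1.0, 0.0])
for beta in (0.0, 0.5, 2.0):
    print(beta, regularized_objective(np.array([0.9, 0.1]), y, Q, beta))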
However, Chatterjee does not explicitly teach such that the GT neural network is configured to (i) learn via few-shot learning and (ii) operate at the edge of a network in energy and resource-constrained environments.
Cleland teaches such that the GT neural network is configured to (i) learn via few-shot learning and
(Cleland, pg. 75 lines 22-24, “While certain of the present embodiments focus on one-shot learning, the network can also be configured for few-shot learning, in which it gradually adapts to the underlying statistics of training samples.”)
Cleland teaches (ii) operate at the edge of a network in energy and resource-constrained environments.
Examiner interprets the limitation as reciting a tinyML device serving as an edge device of a network, in light of the specification (“[0051] In the exemplary embodiment, GT system 114 a-114 c may be tinyML systems, or networks, that implement machine learning processes. In some embodiments, a tinyML system may include a device that provides low latency, low power consumption, low bandwidth, and privacy. Additionally, a tinyML device, sometimes called an always on device, may be placed on the edge of a network.”)
(Cleland, pg. 78 line 25-pg. 79 line 2, “Neuromorphic systems are custom integrated circuits that model biological neural computations, typically with orders of magnitude greater speed and energy efficiency than general-purpose computers. These systems enable the deployment of neural algorithms in edge devices [operate at the edge of a network], such as chemosensory signal analyzers, in which real-time operation, low power consumption [in energy], environmental robustness, and compact size [and resource-constrained environments] are important operational metrics.”)
Chatterjee is considered to be analogous to the claimed invention because they are in the same field of growth transform networks. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Chakrabartty-2019 to incorporate the teachings of Chatterjee in order to provide a regularization function to select the trajectory with an optimal active-power dissipation profile (Chatterjee, “Illustration showing that operating in the complex domain allows different possible learning trajectories from an initial state to the final steady state. Regularization with respect to the phase factor could then be used to select the trajectory with an optimal active-power dissipation profile and results in limit cycle oscillations in steady state. The circles indicate the constant magnitude loci.”
[Image: media_image5.png]
)
Cleland is considered to be analogous to the claimed invention because they are in the same field of spiking neural networks. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Chakrabartty-2019 and Chatterjee to incorporate the teachings of Cleland in order to provide few-shot learning that learns robust representations even from corrupted samples and to provide neuromorphic systems with greater speed and energy efficiency than general-purpose computers. (Cleland, pg. 75 lines 24-25, “In this configuration, the network learns robust representations even when the training samples themselves are corrupted by impulse noise.”) (Cleland, pg. 78 line 25-pg. 79 line 2, “Neuromorphic systems are custom integrated circuits that model biological neural computations, typically with orders of magnitude greater speed and energy efficiency than general-purpose computers.”) (Cleland, pg. 17 lines 12-14, “When implemented on appropriate hardware, SNNs are extremely energy efficient and uniquely scalable to very large problems; consequently, marrying the efficiency of neuromorphic processors with the algorithmic power of deep learning is an important industry goal.”)
Regarding claim 5,
Chakrabartty-2019, Chatterjee, and Cleland teach The BPL computing device of claim 1,
Chakrabartty-2019 teaches wherein, to enforce sparsity, the at least one processor is configured to: minimize network-level spiking activity while producing classification accuracy comparable to standard approaches on the one or more training datasets.
(Chakrabartty-2019, Table 2, “[0125] The first term in the cost function minimizes a kernel distance between the class labels and the probability variables p_i+, p_i−, and the second term minimizes [minimize network-level spiking activity while producing classification accuracy comparable to standard approaches on the one or more training datasets; wherein Table 2 shows the classification accuracies compared to standard approaches, i.e., a benchmark] a cumulative potential function Ω(.) corresponding to each neuron. The kernel or the interconnection matrix Q is a positive definite matrix such that each of its elements is written as an inner-product in a high-dimensional space as Q_ij = Ψ(x_i)·Ψ(x_j) where x_i ∈ R^D correspond to the input data vector and Ψ(.) represents a high-dimensional mapping function.”)
Regarding claim 8,
Chakrabartty-2019, Chatterjee, and Cleland teach The BPL computing device of claim 1,
Chakrabartty-2019 teaches wherein the spike-response model is built using machine learning, artificial intelligence, or a combination thereof.
(Chakrabartty-2019, “[0067] In various aspects, a growth transform neuron model is mutually coupled with a network objective function of a machine learning model [wherein the spike-response model is built using machine learning; the machine learning model is the spiking machine learning model (SVM)], as illustrated in FIG. 1C.”)
Regarding claim 9,
Chakrabartty-2019, Chatterjee, and Cleland teach The BPL computing device of claim 1,
Chakrabartty-2019 teaches wherein the spike-response model is built using supervised learning, unsupervised learning, or both.
(Chakrabartty-2019, “[0113] The stimuli used were binary labels assigned to the neurons that determine the network configuration for a given classification problem, and the strength of the stimulus is higher for neurons closer to the classification hyperplane (i.e., for the ‘support vectors’) [wherein the spike-response model is built using supervised learning; wherein a support vector machine is a supervised ML algorithm]. As the margin of separation z from the hyperplane decreased, the spiking rate for a support vector increases and it starts spiking earlier in the convergence process.”)
Regarding claim 10,
Chakrabartty-2019, Chatterjee, and Cleland teach The BPL computing device of claim 9,
Chakrabartty-2019 teaches wherein minimizing a training error is equivalent to minimizing overall spiking activity across the GT neural network.
(Chakrabartty-2019, “[0101] In various aspects, the growth transform neural network is incorporated into the design of a ΣΔ support vector machine (SVM). In this aspect, the potential function is given by Φ(p_ik) = |p_ik − ½|, as shown in FIG. 4A. The gradient of the function Φ(.) in this aspect has a discontinuity at p_ik = ½, k = (1, 2), ∀i. The corresponding primal loss-function corresponding to a binary classification task is obtained using the geometric approach described by Eqn. (24) and illustrated in FIG. 4A. The primal loss-function [equivalent to minimizing overall spiking activity across the GT neural network] exhibits a piece-wise linear response where the slope of the loss-function changes at classification margins (or errors [minimizing a training error]) that are symmetric about the separating hyperplane. For the piece-wise linear potential function, the response of the ith neuron is given by
S_ik = Φ′(p_ik) = sgn(p_ik − 0.5) (29)
and represents a binary output that switches between two values +1 and −1.”)
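The equivalence argued for this claim can be illustrated with a short sketch (examiner’s own, under the Eqn. (29) convention quoted above): counting sign flips of S_ik = sgn(p_ik − 0.5) along a convergence trajectory treats each flip as one spike, so neurons with large margins (few training errors) settle quickly and spike rarely.

import numpy as np

def count_spikes(p_trajectory):
    # Each switch of the binary response S_ik = sgn(p_ik - 0.5) is one spike.
    s = np.sign(np.asarray(p_trajectory) - 0.5)
    return int(np.count_nonzero(np.diff(s)))

# A neuron far from the boundary settles quickly (no spikes), while a
# support vector keeps oscillating around p = 0.5 (many spikes).
settled = 0.5 + 0.4 * np.exp(-np.arange(50.0))
oscillating = 0.5 + 0.05 * np.sin(np.arange(50.0))
print(count_spikes(settled), count_spikes(oscillating))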
Regarding claim 18,
Chakrabartty-2019, Chatterjee, and Cleland teach The BPL computing device of claim 1,
Cleland teaches wherein a number of shots of the few-shot learning is between 1 and 10.
(Cleland, pg. 91 lines 22-24, “one-shot, two-shot, three-shot, up through ten-shot [wherein a number of shots of the few-shot learning is between 1 and 10] in order to measure the utility of additional training. Test data (across all trained odorants and all concentrations in the dataset) were classified with 100.0% accuracy in all cases.”)
Regarding claim 19,
Chakrabartty-2019, Chatterjee, and Cleland teach The BPL computing device of claim 1,
Cleland teaches wherein the GT neural network is configured to be embodied within a tinyML device operated at the edge of the network.
(Cleland, pg. 78 line 25-pg. 79 line 2, “Neuromorphic systems are custom integrated circuits that model biological neural computations, typically with orders of magnitude greater speed and energy efficiency than general-purpose computers. These systems enable the deployment of neural algorithms in edge devices [wherein the GT neural network is configured to be embodied within a tinyML device operated at the edge of the network], such as chemosensory signal analyzers, in which real-time operation, low power consumption [in energy], environmental robustness, and compact size [and resource-constrained] are important operational metrics.”)
Regarding claim 20,
Chakrabartty-2019, Chatterjee, and Cleland teach The BPL computing device of claim 19,
Cleland teaches wherein the tinyML device is configured to perform learning
(Cleland, “The theoretical promise of SNNs has motivated the development of multiple neuromorphic hardware platforms, including the academic platforms SpiNNaker and BrainScaleS, as well as IBM’s TrueNorth platform and, most recently, the Loihi platform of Intel Labs, which embeds rapid, iterative plasticity (“self-learning”) [wherein the tinyML device is configured to perform learning] in a compact, scalable chip. These hardware platforms are considered illustrative examples of what are more generally referred to herein as “processing devices” or “processing platforms.” A premise of these hardware projects is that the availability of these platforms would spur the development of practical, useful neuromorphic algorithms by highlighting their particular strengths in energy efficiency and scalability.”)
Cleland teaches and inference at the edge of the network.
(Cleland, “Completing one inference cycle [and inference] (sniff; 5 gamma cycles; 200 timesteps) of the 72-core network required 2.75 ms and consumed 0.43 mJ, of which 0.12 mJ is dynamic energy. It should be noted that the time required to solution was not significantly affected by the scale of the problem (FIG. 10(f)), owing to the Loihi architecture’s fine-grained parallelism. This scalability highlights a key advantage of neuromorphic hardware for application to computational neuroscience and machine olfaction. Energy consumption also scaled only modestly as network size increased (FIG. 10(g)), owing to the colocalization of memory and compute and the use of sparse (spiking) communication, which minimize the movement of data. Using multichip Loihi systems [at the edge of the network], illustrative embodiments are readily scalable to hundreds of columns and hundreds of thousands of interneurons, and can integrate circuit models of the glomerular layer and the piriform cortex with the current EPL network of the MOB.”)
Claim(s) 2-4 and 6 are rejected under 35 U.S.C. 103 as being unpatentable over Chakrabartty-2019 in view of Chatterjee and Cleland, and further in view of O. Chatterjee and S. Chakrabartty, “Decentralized Global Optimization Based on a Growth Transform Dynamical System Model” (“Chakrabartty-2018”).
Regarding claim 2,
Chakrabartty-2019, Chatterjee, and Cleland teach The BPL computing device of claim 1,
Chakrabartty-2018 teaches wherein at least one of the at least one or more training datasets has one or more unique challenges including sensor drift, stimulus concentrations, or both.
(Chakrabartty-2018, Section III B., “While the result in Table I has been reported for a continuous functional, the model can be discretized (with respect to x ) and applied for solving various decentralized WTA problems as well [one or more unique challenges]. WTA is an extreme form of a nonlinear inhibition strategy, which facilitates the selection of the maximum out of a series of entries in a list [37], [38], and is a more powerful implementation of a competitive stage compared with other competing techniques, such as gate thresholding or sigmoidal thresholding [39]. WTA finds widespread use in an analog circuit design, artificial neural networks, neuromorphic circuit design, as well as in the implementation of the rectified linear unit for the max-pooling phase of different deep learning networks like restricted Boltzmann machines [40]. The dynamical system model in (2) can be discretized to formulate a decentralized WTA algorithm simply by evaluating the driver functional h(x,t) only at specific discrete points in x which correspond to the location of N discrete agents A1−AN having access to localized values q1−qN (which, for example, can represent the concentration levels of specific chemicals [stimulus concentrations; wherein said data is provided by the datasets]), and the objective is to find the agent, which possesses the maximum value of q (as a precursor of some other important steps), only by exchanging local information with the substrate and having no mutual interaction during the process. Fig. 8(a) shows an illustrative example of the setup, consisting of seven agents A1 –A7 associated with values q1 –q7 . The proposed optimization algorithm thus arrives at a global decision simply by computing the dynamically emergent behavior in the local components of any decentralized system.”)
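The decentralized WTA behavior described in the quotation can be sketched as follows (examiner’s illustration; the exponential driver and β value are assumed choices, not Chakrabartty-2018’s exact functional): a normalized mass held by N agents evolves multiplicatively under the conservation constraint until it concentrates on the agent with the maximum local value.

import numpy as np

def decentralized_wta(q, steps=200, beta=5.0):
    # Agents A_1..A_N hold local values q_1..q_N (e.g., chemical
    # concentration levels); f is the normalized mass over agents.
    f = np.full(len(q), 1.0 / len(q))
    for _ in range(steps):
        g = np.exp(beta * np.asarray(q))  # growth factor, increasing in q
        f = f * g
        f /= f.sum()                      # conservation (normalization)
    return int(np.argmax(f))              # index of the winning agent

print(decentralized_wta([0.2, 0.9, 0.4, 0.7]))  # -> 1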
Chakrabartty-2018 is considered to be analogous to the claimed invention because they are in the same field of growth transform models. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Chakrabartty-2019 and Chatterjee and Cleland to incorporate the teachings of Chakrabartty-2018 in order to provide a dynamical system that exploits constraints and solves nonconvex and discrete global optimization problems in neural networks (Chakrabartty-2018, Abstract, “Conservation principles, such as conservation of charge, energy, or mass, provide a natural way to couple and constrain spatially separated variables. In this paper, we propose a dynamical system model that exploits these constraints for solving nonconvex and discrete global optimization problems. Unlike the traditional simulated annealing or quantum annealing-based global optimization techniques, the proposed method optimizes a target objective function by continuously evolving a driver functional over a conservation manifold, using a generalized variant of growth transformations. As a result, the driver functional asymptotically converges toward a Dirac-delta function that is centered at the global optimum of the target objective function. In this paper, we provide an outline of the proof of convergence for the dynamical system model and investigate different properties of the model using a benchmark nonlinear optimization problem. Also, we demonstrate how a discrete variant of the proposed dynamical system can be used for implementing decentralized optimization algorithms, where an ensemble of spatially separated entities (for example, biological cells or simple computational units) can collectively implement specific functions, such as winner-take-all and ranking, by exchanging signals only with its immediate substrate or environment. The proposed dynamical system model could potentially be used to implement continuous-time optimizers, annealers, and neural networks.”)
Regarding claim 3,
Chakrabartty-2019, Chatterjee, and Cleland teach The BPL computing device of claim 1,
Chakrabartty-2018 teaches wherein the spike-response model further includes a learning framework comprising: spike responses generated
(Chakrabartty-2018, Section II., “Consider a functional optimization problem of the following form:
[Equation image: media_image6.png (optimization problem (5))]
where q: R^M ↦ R is any arbitrary function having multiple local minima, but a single global minimum at x∗ (i.e., q(x∗) < q(x), ∀x ∈ D), and H: R^M_+ ↦ R is a convex functional with respect to f(x). Typically, we choose H{q(x), f(x)} as some distance measure between the functions q(x) and f(x), e.g., H{q(x), f(x)} = ∫_{x∈D} {q(x) − f(x)}² dx, ∀x ∈ D. Note that (5) involves optimizing a convex cost functional H over a convex domain and thus has a unique solution. The Lagrangian L1 [spike responses generated] of (5) is given by
[Equation image: media_image7.png (Lagrangian L1 of (5))]
”)
Chakrabartty-2018 teaches as a result of a constraint violation;
(Chakrabartty-2018, Section II., “Taking derivative with respect to f(x) on both sides and making use of the fact that H is convex [implying that its derivative M{q,f,β} is monotonic with respect to f(x) ], we have
[Equation image: media_image8.png]
where [Φ{⋅,⋅}]₊ = max(0, Φ{⋅,⋅}) denotes the hinge function. Φ{⋅,⋅} is monotonically decreasing in q, implying Φ{q(x∗), β} > Φ{q(x), β} for q(x) > q(x∗), ∀x ≠ x∗. Thus, from the constraint equation [as a result of a constraint violation] ∫_x f(x) dx = ν, we have
[Equation image: media_image9.png]
”)
Chakrabartty-2018 teaches one or more optimal parameters for a certain task learned using neurally relevant local learning rules;
(Chakrabartty-2018, Fig. 1, “Illustration of different annealing principles for optimizing a target objective function q(x) with multiple local minima [using neurally relevant local learning rules] but only one global minimum x∗. (a) Process of surmounting the energy barriers in simulated annealing. (b) Process of quantum tunneling through the barriers in quantum annealing. (c) Proposed approach where a driver function h(x,t) evolves under the influence of q(x) [one or more optimal parameters for a certain task learned; wherein the optimal parameters are the parameters of q(x)] to a Dirac-delta function centered at the global minimum x∗.”)
[Image: media_image10.png (Fig. 1)]
However, Chakrabartty-2018 does not explicitly teach network optimization to encode a solution with as few spikes as possible; and a framework that is operable to incorporate additional structural and connectivity constraints on the GT neural network.
Chakrabartty-2019 teaches network optimization to encode a solution with as few spikes as possible; and a framework that is operable to incorporate additional structural and connectivity constraints on the GT neural network.
(Chakrabartty-2019, “[0084]
This formulation is now consistent with the multi-class probability regression framework which was used for deriving different variants of SVMs. Introducing variables yjk, Eqn. (10) is expressed as:
Ψ^−1(p_ik) = Σ_j Q_ij (p_jk + y_jk) (12)
where bik satisfies the relation:
b_ik = −Σ_j Q_ij y_jk, k = 1, 2 (13)
under the assumption that Q−1 exists. Eqn. (12) along with the constraints [a framework that is operable to incorporate additional structural and connectivity constraints on the GT neural network] given by Eqns. (5) and (6) are viewed as a first-order condition for the following minimization problem:
[Equation image: media_image11.png (Eqn. (15))]
where
Φ(p_ik) = −∫ Ψ^−1(p_ik) dp_ik. (16)
[0085] Φ(.) is referred to as the potential function. If it is assumed that the matrix Q is positive-definite, the first part of the optimization function in Eqn. (15) is equivalent to minimizing a quadratic distance between the responses p_ik and the variables y_ik. The second part of the optimization function is equivalent to minimizing a cumulative potential function Φ(.) corresponding to each neuron [network optimization to encode a solution with as few spikes as possible].”)
Regarding claim 4,
Chakrabartty-2019, Chatterjee, Cleland, and Chakrabartty-2018 teach The BPL computing device of claim 3,
Chakrabartty-2018 teaches wherein the spike responses are Lagrangian parameters.
(Chakrabartty-2018, Section II., “Consider a functional optimization problem of the following form:
[Equation image: media_image6.png (optimization problem (5))]
where q: R^M ↦ R is any arbitrary function having multiple local minima, but a single global minimum at x∗ (i.e., q(x∗) < q(x), ∀x ∈ D), and H: R^M_+ ↦ R is a convex functional with respect to f(x). Typically, we choose H{q(x), f(x)} as some distance measure between the functions q(x) and f(x), e.g., H{q(x), f(x)} = ∫_{x∈D} {q(x) − f(x)}² dx, ∀x ∈ D. Note that (5) involves optimizing a convex cost functional H over a convex domain and thus has a unique solution. The Lagrangian L1 [Lagrangian parameters] of (5) is given by
[Equation image: media_image7.png (Lagrangian L1 of (5))]
”)
Regarding claim 6,
Chakrabartty-2019, Chatterjee, and Cleland teach The BPL computing device of claim 1,
Chakrabartty-2018 teaches wherein the GT neural network comprises a neuromorphic tinyML system constrained in one or more of energy, resources and network structure.
(Chakrabartty-2018, Section I., “Naturally occurring systems generally obey two physical principles: 1) under the influence of an external field, a system converges to a set of equilibrium states (referred to as eigenstates) that correspond to the lowest energy configuration [neuromorphic tinyML system constrained in one or more of energy, resources and network structure; wherein the system is constrained in energy] [1] and 2) the dynamics of the system evolve in a manner such that some physical quantities (for example, energy, charge, mass, or momentum) are conserved [2].”)
Claim(s) 7 is rejected under 35 U.S.C. 103 as being unpatentable over Chakrabartty-2019 in view of Chatterjee and Cleland, and further in view of Andrey Ziyatdinov and Alexandre Perera, “Synthetic benchmarks for machine olfaction: Classification, segmentation and sensor damage” (“Ziyatdinov”).
Regarding claim 7,
Chakrabartty-2019, Chatterjee, and Cleland teach The BPL computing device of claim 1,
Ziyatdinov teaches wherein at least one of the at least one or more training datasets is a publicly available machine olfaction dataset having one or more unique challenges.
(Ziyatdinov, Section 1.1, “Ten scenarios for machine olfaction [having one or more unique challenges] – classification, quantification, segmentation, habituation, event detection, novelty detection, drift compensation I, drift compensation II, sensor replacement I and sensor replacement II – were designed and formalized in the framework of the data simulation tool [3, Supporting Information, File S1]. For three of these scenarios – classification, segmentation, and sensor damage (adopted from sensor replacement scenario) – synthetic benchmark data sets at different difficulty levels were generated [a publicly available machine olfaction dataset].”)
Ziyatdinov is considered to be analogous to the claimed invention because they are in the same field of machine olfaction. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Chakrabartty-2019 and Chatterjee and Cleland to incorporate the teachings of Ziyatdinov in order to provide a publicly available machine olfaction dataset that meets researchers’ needs for data to use in testing and comparing algorithms (Ziyatdinov, Abstract, “The design of the signal and data processing algorithms requires a validation stage and some data relevant for a validation procedure. While the practice to share public data sets and make use of them is a recent and still on-going activity in the community, the synthetic benchmarks presented here are an option for the researches, who need data for testing and comparing the algorithms under development. The collection of synthetic benchmark data sets were generated for classification, segmentation and sensor damage scenarios, each defined at 5 difficulty levels.”)
Claim(s) 11 is rejected under 35 U.S.C. 103 as being unpatentable over Chakrabartty-2019 in view of Chatterjee and Cleland, and further in view of S. Farshchi, A. Pesterev, W.-L. Ho and J. W. Judy, “Acquiring High-Rate Neural Spike Data with Hardware-Constrained Embedded Sensors” (“Farshchi”).
Regarding claim 11,
Chakrabartty-2019, Chatterjee, and Cleland teach The BPL computing device of claim 1,
Chakrabartty-2019 teaches further comprising one or more [miniaturized] sensors and devices.
(Chakrabartty-2019, “[0004] In one aspect, a growth transform neural network [growth transform (GT) neural network] system is provided that includes a computing device. The computing device includes at least one processor and a memory storing a plurality of modules. Each module includes instructions executable on the at least one processor [device having a memory and a processor]”)
(Chakrabartty-2019, “[0136] Computing device 302 also includes at least one media output component 308 for presenting information to a user 310….
[0137] In some embodiments, client computing device 302 [this is the same as computing device 302] includes an input device 312 for receiving input from user 310. Input device 312 may include, for example, a keyboard, a pointing device, a mouse, a stylus, a touch sensitive panel (e.g., a touch pad or a touch screen), a camera, a gyroscope, an accelerometer, a position detector, and/or an audio input device [wherein the GT neural network includes one or more [miniaturized] sensors and devices].” Wherein an accelerometer is a sensor that measures acceleration and motion.)
However, Chakrabartty-2019 does not explicitly teach miniaturized.
In light of the specification, (“[0004] Deployment of miniaturized and battery-powered sensors and devices has become ubiquitous and computation is increasingly moving from the cloud to the source of data collection. With it, there is a growing demand for specialized algorithms, hardware and software, collectively termed as tinyML systems. TinyML systems typically can perform learning and inference at the edge in energy and resource- constrained environments. Prior efforts at reducing energy requirements of classic machine learning algorithms include network architecture search, model compression through energy-aware pruning and quantization, model partitioning, among others.”),
Examiner is interpreting miniaturized sensors and devices to be any sensor and device that is energy and resource-constrained.
Farshchi teaches miniaturized sensors and devices
(Farshchi, Section II B. TinyOS and the Mica-Based Sensor Network, “A basic wireless neural recording, archiving, and hosting system based on embedded sensors has been demonstrated; however, continuous signal transmission limits battery life and spike recording to a single channel [5]. A similar vital sign monitoring system that uses TelosB motes has also been demonstrated in [19]; however, the fastest signal it acquires is a single channel of ECG. The work in this paper has been directed toward investigating computationally-efficient software filters and compression algorithms to enable chronic, multi-channel wireless biosignal recording with embedded sensors [sensors and devices] that are hardware and bandwidth constrained [miniaturized].”)
Farshchi is considered to be analogous to the claimed invention because Farshchi is reasonably pertinent to the problem the inventor faced (recording signals representing human functions on resource-constrained hardware). See MPEP 2141.01(a)(i). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Chakrabartty-2019 and Chatterjee and Cleland to incorporate the teachings of Farshchi in order to provide embedded sensors that record data for training a neural network in a way that alleviates network traffic and increases battery life (Farshchi, Abstract, “In an effort to enable embedded sensors that are hardware and bandwidth constrained to acquire high-frequency neural signals, signal-filtering and signal-compression algorithms have been implemented and tested on a commercial-off-the-shelf embedded-system platform. The sensor modules have been programmed to acquire, filter, and transmit raw biological signals at a rate of 32 kbps. Furthermore, on-board signal processing enables one channel sampled at a rate of 4 kS/s at 12-bit resolution to be compressed via ADPCM and transmitted in real time. In addition, the sensors can be configured to only transmit individual time-referenced "spike" waveforms, or only the spike parameters for alleviating network traffic and increasing battery life.”)
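The bandwidth-saving strategy quoted from Farshchi (transmitting only spike parameters rather than the raw stream) can be sketched as follows; the MAD-based threshold rule is the examiner’s illustrative choice, not Farshchi’s published detector.

import numpy as np

def extract_spike_events(signal, fs=4000, k=3.5):
    # Detect threshold crossings and keep only (timestamp, peak amplitude)
    # per spike, instead of streaming the raw 4 kS/s waveform.
    x = np.asarray(signal, dtype=float)
    sigma = np.median(np.abs(x - np.median(x))) / 0.6745  # robust noise estimate
    above = np.abs(x) > k * sigma
    onsets = np.flatnonzero(np.diff(above.astype(int)) == 1) + 1
    return [(t / fs, float(x[t])) for t in onsets]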
Claim(s) 12-14 are rejected under 35 U.S.C. 103 as being unpatentable over Chakrabartty-2019 in view of Chatterjee and further in view of Cleland.
Regarding claim 12,
Chakrabartty-2019 teaches A neuromorphic tinyML system, comprising: at least one [tinyML] device having a memory and a processor;
(Chakrabartty-2019, “[0004] In one aspect, a growth transform neural network [growth transform (GT) neural network] system is provided that includes a computing device. The computing device includes at least one processor and a memory storing a plurality of modules. Each module includes instructions executable on the at least one processor [device having a memory and a processor]”)
Chakrabartty-2019 teaches and a growth transform (GT) neural network including a plurality of ON-OFF neuron pairs using weight adaptation
(Chakrabartty-2019, “[0007] …The growth transform neural network module defines a plurality of mirrored neuron pairs [including a plurality of ON-OFF neuron pairs; see fig. 6A for on/off] that include a plurality of first components and a plurality of second components that are interconnected according to an interconnection matrix. The growth transform module updates each first component of each mirrored neuron pair of a plurality of mirrored neuron pairs according to a growth transform neuron model. The network convergence module converges [using weight adaptation; wherein convergence is updating the weights] the plurality of mirrored neuron pairs to a steady state condition by solving a system objective function subject to at least one normalization constraint. The first component and the second component of each mirrored neuron pair in the steady state condition may each produce a neuron response that includes a steady state value or a limit cycle with ΣΔ modulation according to a user-defined potential function Φ(pik) given by:
W1ε1 + |p_ik − (½ − ε1)| for 0 ≤ p_ik < ½ − ε1,
W1|p_ik − ½| for ½ − ε1 ≤ p_ik ≤ ½,
W2|p_ik − ½| for ½ < p_ik < ½ + ε2, and
W2ε2 + |p_ik − (½ + ε2)| for ½ + ε2 < p_ik ≤ 1,
in which p_ik is the response of the ith neuron of the plurality of mirrored neuron pairs, k is 1 or 2, W1 > 1, W2 > 1, ε1 > 0, and ε2 > 0….
[0022] FIG. 6A is a schematic illustration showing the relationship of the quantized response S_ik = sgn(p_ik − 1/M) to the ΣΔ limit cycles.
[Image: media_image1.png (FIG. 6A)]
”)
Chakrabartty-2019 teaches to enforce sparsity constraints,
In light of the specification, (“[0070] As shown in FIG. 4A, energy-efficiency in energy-based neuromorphic machine learning, there is a loss function for training and an additional loss for enforcing sparsity. The embodiments set forth herein make the loss for training and loss for enforcing sparsity equal, as shown in FIG. 4B”), Examiner interprets the loss function to enforce sparsity constraints on overall network spiking activity.
(Chakrabartty-2019, “[0073] In one aspect, the growth transform neural network design and analysis includes estimating an equivalent dual optimization function based on the mapping given by Eqn. (1). Each neuron implements a continuous mapping based on a polynomial growth transform update that also dynamically optimizes the cost function [enforce sparsity constraints on overall network spiking activity; wherein the cost function corresponds to primal loss-functions]. Because the growth transform mapping is designed to evolve over a constrained manifold, the neuronal responses and the network are stable. The switching, spiking and bursting dynamics of the neurons emerge by choosing different types of potential functions and hyper-parameters in the dual cost function. The use of this approach is suitable for use in the design of SVMs that exhibit ΣΔ modulation type limit-cycles, spiking behavior and bursting responses.”)
Chakrabartty-2019 teaches the GT neural network configured to: simultaneously learn optimal parameters for a task and minimize spiking activity across the GT neural network;
(Chakrabartty-2019, “[0073] In one aspect, the growth transform neural network design and analysis includes estimating an equivalent dual optimization function [simultaneously learn optimal parameters for a task] based on the mapping given by Eqn. (1). Each neuron implements a continuous mapping based on a polynomial growth transform update that also dynamically optimizes the cost function. Because the growth transform mapping is designed to evolve over a constrained manifold, the neuronal responses and the network are stable. The switching, spiking [spiking activity across the GT neural network] and bursting dynamics of the neurons emerge by choosing different types of potential functions and hyper-parameters in the dual cost function. The use of this approach is suitable for use in the design of SVMs that exhibit ΣΔ modulation type limit-cycles, spiking behavior and bursting responses.”)
(Chakrabartty-2019, Table 2, “[0125] The first term in the cost function minimizes a kernel distance between the class labels and the probability variables p_i+, p_i−, and the second term minimizes [minimize] a cumulative potential function Ω(.) corresponding to each neuron. The kernel or the interconnection matrix Q is a positive definite matrix such that each of its elements is written as an inner-product in a high-dimensional space as Q_ij = Ψ(x_i)·Ψ(x_j) where x_i ∈ R^D correspond to the input data vector and Ψ(.) represents a high-dimensional mapping function.”)
Chakrabartty-2019 teaches generate a training dataset
(Chakrabartty-2019, “[0075] A continuous-time variant of the growth transform neuron model is described herein and a network of growth transform neurons is used to implement a spiking SVM. The growth transform neuron in the SVM network may learn to encode (rate and time-to-spike response) its output according to an equivalent margin of classification and the neurons corresponding to regions near the classification boundary may learn to exhibit noise-shaping dynamics similar to what has been reported in biological networks. In one aspect, the model of the growth transform neuron is summarized along with its dynamical properties. In another aspect, an SVM formulation is mapped onto a growth transform neural network and different spiking dynamics are demonstrated based on synthetic [a training dataset;] and benchmark datasets.”)
Further, Chakrabartty-2019 teaches the synthetic dataset to be given by a synthetic classification task
(Chakrabartty-2019, “[0111] FIGS. 9B and 9C show the classification contours for a two-dimensional linear and non-linear synthetic [generate] classification task, respectively. The introduction of the non-linearity in the transition region does not affect the classification boundaries or performance. The resulting spike-trains generated by different neurons (located at different classification margins) are shown in FIG. 10 for the system illustrated in FIG. 9B.”)
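For illustration, a two-dimensional synthetic classification task of the kind described in the quoted paragraph can be generated as below; the Gaussian sampling and boundary rules are the examiner’s assumed stand-ins, not the reference’s actual data generator.

import numpy as np

def synthetic_2d_task(n=200, nonlinear=False, seed=0):
    # Linear variant: labels from a separating hyperplane (cf. FIG. 9B).
    # Non-linear variant: labels from a circular boundary (cf. FIG. 9C).
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 2))
    if nonlinear:
        y = np.sign(X[:, 0] ** 2 + X[:, 1] ** 2 - 1.0)
    else:
        y = np.sign(X @ np.array([1.0, -1.0]))
    return X, y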
Chakrabartty-2019 teaches based on the learned optimal parameters and spiking activity data on the GT neural network;
(Chakrabartty-2019, “[0070] In one aspect, the disclosed neural network system is incorporated into a spiking support vector machine (SVM) that includes a network of growth transform neurons, as described herein below. Each neuron in the SVM network learns to encode output parameters [based on the learned optimal parameters] such as spike rate and time-to-spike responses according to an equivalent margin of classification and those neurons corresponding to regions near the classification boundary learn to exhibit noise-shaping dynamics similar to behaviors observed in biological networks. As a result, the disclosed spiking support vector machine (SVM) enables large-scale population encoding, for examples for a case when the spiking SVM learns to solve two benchmark classification tasks, resulting in classification performance similar to that of the state-of-the-art SVM implementations.”)
(Chakrabartty-2019, “[0114] FIGS. 12A and 12B show that the firing rate (number of spikes per 1000 iterations) monotonically decreases and the time (iteration count) to the first spike monotonically increases with the margin of separation, respectively. Without being limited to any particular theory, the nearer a neuron is to the classification hyperplane, the more its relative contribution in determining the weight vectors and the faster it reaches the transition region around p=½, leading to the spiking response. The spikes thus represent a manifestation of the process of convergence, and are directly related to the learning behavior of the dynamical system through the support vectors (i.e., the vectors that are the most useful in learning the classification boundary) [spiking activity data on the GT neural network].”)
Chakrabartty-2019 teaches design another GT neural network [for at least one other tinyML device] based on the training dataset
(Chakrabartty-2019, “[0075] A continuous-time variant of the growth transform neuron model is described herein and a network of growth transform neurons is used to implement a spiking SVM. The growth transform neuron in the SVM network may learn to encode (rate and time-to-spike response) its output according to an equivalent margin of classification and the neurons corresponding to regions near the classification boundary may learn to exhibit noise-shaping dynamics similar to what has been reported in biological networks. In one aspect, the model of the growth transform neuron is summarized along with its dynamical properties. In another aspect, an SVM formulation is mapped onto a growth transform neural network and different spiking dynamics are demonstrated [design another GT neural network] based on synthetic [based on the training dataset] and benchmark datasets.”)
However, Chakrabartty-2019 does not explicitly teach for at least one tinyML device having a memory and a processor and comprising at least one other tinyML device… wherein the sparsity constraints are configured to function as a regularizer such that the GT neural network is configured to (i) learn via few-shot learning and (ii) operate at the edge of a network in energy and resource-constrained environments.
Chatterjee teaches wherein the sparsity constraints are configured to function as a regularizer
(Chatterjee, Section III., “In this formulation, the active-power dissipation D in (10) acts as a regularization function with β ≥ 0 being a hyperparameter…. Example 1: Consider a single-variable quadratic optimization problem of the form H1(x) = x^2, subject to the constraint
|x| ≤ 1, x ∈ R. Substituting x = |V|^2 − |I|^2, the problem can be mapped (please see Appendix B for more details) into the form equivalent to (10) as
[Equation image: media_image2.png]
Fig. 4(a)–(c) plots L1 for different values of β [wherein the sparsity constraints are configured to function as a regularizer; wherein L1 optimization acts as a regularization function with β ≥ 0 being a hyperparameter and thus enables the network to perform (i) and (ii). Examiner refers to Chatterjee, Section V for support in combining the cost function of Chakrabartty-2019 with the regularization function of Chatterjee]. As shown in Fig. 4(a) and as expected for β = 0, the cost function has several minima (or attractors), whereas for β > 0, the minima corresponds to φ = ±π/2, for which the active-power dissipation is zero. Fig. 4(b) and (c) shows that controlling β will control the optimization landscape (without changing the location of the attractors) and will determine the attractor trajectory. This feature has been exploited in Sections IV and V to optimize the active-power dissipation profile during the learning phase.”
[Images: media_image3.png, media_image4.png (Fig. 4 plots)]
)
However, Chatterjee does not explicitly teach for at least one tinyML device having a memory and a processor… for at least one other tinyML device… such that the GT neural network is configured to (i) learn via few-shot learning and (ii) operate at the edge of a network in energy and resource-constrained environments.
Cleland teaches for at least one tinyML device having a memory and a processor… for at least one other tinyML device
Examiner’s note: Examiner interprets TinyML systems (and devices) to be energy and resource-constrained.
(Cleland, pg. 78 line 25-pg. 79 line 2, “Neuromorphic systems are custom integrated circuits that model biological neural computations, typically with orders of magnitude greater speed and energy efficiency than general-purpose computers. These systems enable the deployment of neural algorithms in edge devices [for at least one tinyML device having a memory and a processor… for at least one other tinyML device], such as chemosensory signal analyzers, in which real-time operation, low power consumption [energy-constrained], environmental robustness, and compact size [and resource-constrained] are important operational metrics.”)
Cleland teaches such that the GT neural network is configured to (i) learn via few-shot learning and
(Cleland, pg. 75 lines 22-24, “While certain of the present embodiments focus on one-shot learning, the network can also be configured for few-shot learning, in which it gradually adapts to the underlying statistics of training samples.”)
Cleland teaches (ii) operate at the edge of a network in energy and resource-constrained environments.
Examiner interprets the limitation as reciting a tinyML device serving as an edge device of a network, in light of the specification (“[0051] In the exemplary embodiment, GT system 114 a-114 c may be tinyML systems, or networks, that implement machine learning processes. In some embodiments, a tinyML system may include a device that provides low latency, low power consumption, low bandwidth, and privacy. Additionally, a tinyML device, sometimes called an always on device, may be placed on the edge of a network.”)
(Cleland, pg. 78 line 25-pg. 79 line 2, “Neuromorphic systems are custom integrated circuits that model biological neural computations, typically with orders of magnitude greater speed and energy efficiency than general-purpose computers. These systems enable the deployment of neural algorithms in edge devices [operate at the edge of a network], such as chemosensory signal analyzers, in which real-time operation, low power consumption [in energy], environmental robustness, and compact size [and resource-constrained environments] are important operational metrics.”)
Chatterjee is considered to be analogous art to the claimed invention because both are in the same field of growth transform networks. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Chakrabartty-2019 to incorporate the teachings of Chatterjee in order to provide a regularization function to select the trajectory with an optimal active-power dissipation profile (Chatterjee, “Illustration showing that operating in the complex domain allows different possible learning trajectories from an initial state to the final steady state. Regularization with respect to the phase factor could then be used to select the trajectory with an optimal active-power dissipation profile and results in limit cycle oscillations in steady state. The circles indicate the constant magnitude loci.”
[Chatterjee, illustration of learning trajectories in the complex domain, reproduced in greyscale]
)
Cleland is considered to be analogous art to the claimed invention because both are in the same field of spiking neural networks. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Chakrabartty-2019 and Chatterjee to incorporate the teachings of Cleland in order to provide few-shot learning that learns robust representations even from corrupted samples, and to provide neuromorphic systems with greater speed and energy efficiency than general-purpose computers. (Cleland, pg. 75 lines 24-25, “In this configuration, the network learns robust representations even when the training samples themselves are corrupted by impulse noise.”) (Cleland, pg. 78 line 25-pg. 79 line 2, “Neuromorphic systems are custom integrated circuits that model biological neural computations, typically with orders of magnitude greater speed and energy efficiency than general-purpose computers.”) (Cleland, pg. 17 lines 12-14, “When implemented on appropriate hardware, SNNs are extremely energy efficient and uniquely scalable to very large problems; consequently, marrying the efficiency of neuromorphic processors with the algorithmic power of deep learning is an important industry goal.”)
In regards to claim 13,
The combination of Chakrabartty-2019, Chatterjee, and Cleland teaches The neuromorphic tinyML system of claim 12,
Chakrabartty-2019 teaches wherein the GT neural network is further configured to: store the training dataset on a database communicatively-coupled to the GT neural network.
(Chakrabartty-2019, “[0142] Processor 404 may also be operatively coupled to a storage device 410. Storage device 410 is any computer-operated hardware suitable for storing and/or retrieving data.”)
In regards to claim 14,
The combination of Chakrabartty-2019, Chatterjee, and Cleland teaches The neuromorphic tinyML system of claim 13,
Chakrabartty-2019 teaches wherein the training dataset is used to develop a training data model for designing new tinyML systems, wherein the training data model is created using the training dataset and one or more additional datasets.
(Chakrabartty-2019, “[0075] A continuous-time variant of the growth transform neuron model is described herein and a network of growth transform neurons is used to implement a spiking SVM. The growth transform neuron in the SVM network may learn to encode (rate and time-to-spike response) its output according to an equivalent margin of classification and the neurons corresponding to regions near the classification boundary may learn to exhibit noise-shaping dynamics similar to what has been reported in biological networks. In one aspect, the model of the growth transform neuron is summarized along with its dynamical properties. In another aspect, an SVM formulation is mapped onto a growth transform neural network and different spiking dynamics are demonstrated [training data model is created] based on synthetic [using the training dataset] and benchmark datasets [one or more additional datasets].”)
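Examiner's note: for illustration, a Baum–Eagon-style growth transform update of the general form used in growth transform networks may be sketched as follows (a hypothetical sketch under the assumption of minimizing a cost H over the probability simplex; the mirrored-pair formulation of Chakrabartty-2019 differs in detail):

    import numpy as np

    # Illustrative growth transform step (assumption; not reproduced from
    # Chakrabartty-2019). lam must be large enough that (lam - grad_H)
    # stays elementwise positive, which keeps the update multiplicative
    # and keeps p on the probability simplex.
    def growth_transform_step(p, grad_H, lam):
        w = p * (lam - grad_H)
        return w / w.sum()  # renormalize; no backpropagated error is needed

For example, with H(p) = Σ p_i² (so grad_H = 2p), repeated application drives p toward the uniform distribution, the minimizer of H on the simplex.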
Claim(s) 15-17 are rejected under 35 U.S.C. 103 as being unpatentable over Chakrabartty-2019 in view of Chatterjee and Cleland in further view of Ziyatdinov.
In regards to claim 15,
The combination of Chakrabartty-2019, Chatterjee, and Cleland teaches The neuromorphic tinyML system of claim 14,
Ziyatdinov teaches wherein the one or more additional datasets is a publicly-available dataset.
(Ziyatdinov, Section 1.1, “Ten scenarios for machine olfaction – classification, quantification, segmentation, habituation, event detection, novelty detection, drift compensation I, drift compensation II, sensor replacement I and sensor replacement II – were designed and formalized in the framework of the data simulation tool [3, Supporting Information, File S1]. For three of these scenarios – classification, segmentation, and sensor damage (adopted from sensor replacement scenario) – synthetic benchmark data sets at different difficulty levels were generated [a publicly available dataset].”)
Ziyatdinov is considered to be analogous art to the claimed invention because both are in the same field of machine olfaction. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Chakrabartty-2019, Chatterjee, and Cleland to incorporate the teachings of Ziyatdinov in order to provide a publicly-available machine olfaction dataset that meets researchers' need for data with which to test and compare algorithms (Ziyatdinov, Abstract, “The design of the signal and data processing algorithms requires a validation stage and some data relevant for a validation procedure. While the practice to share public data sets and make use of them is a recent and still on-going activity in the community, the synthetic benchmarks presented here are an option for the researches, who need data for testing and comparing the algorithms under development. The collection of synthetic benchmark data sets were generated for classification, segmentation and sensor damage scenarios, each defined at 5 difficulty levels.”)
In regards to claim 16,
The combination of Chakrabartty-2019, Chatterjee, Cleland, and Ziyatdinov teaches The neuromorphic tinyML system of claim 15,
Ziyatdinov teaches wherein the publicly-available dataset is a machine olfaction dataset.
(Ziyatdinov, Section 1.1, “Ten scenarios for machine olfaction – classification, quantification, segmentation, habituation, event detection, novelty detection, drift compensation I, drift compensation II, sensor replacement I and sensor replacement II – were designed and formalized in the framework of the data simulation tool [3, Supporting Information, File S1]. For three of these scenarios – classification, segmentation, and sensor damage (adopted from sensor replacement scenario) – synthetic benchmark data sets at different difficulty levels were generated [a publicly available machine olfaction dataset].”)
Claim 17 is rejected under 35 U.S.C. 103 on the same rationale as claim 16, as the two claims are substantially similar (claim 16 depends from claim 14, and developing a training data model is substantially similar to updating it, since both use the same training dataset and no additional step is recited for the updating).
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JASMINE THAI whose telephone number is (703)756-5904. The examiner can normally be reached M-F 8-4.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michael Huntley can be reached at (303) 297-4307. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/J.T.T./Examiner, Art Unit 2129
/MICHAEL J HUNTLEY/Supervisory Patent Examiner, Art Unit 2129