DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
The instant Office action is responsive to communications: amendment/argument filed 9/15/2025.
Claims 1-20 are pending. Claim 1 is independent.
Claim Rejections - 35 USC § 112(b)
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 4, 8, 13-15 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 4 (and 8 based on dependency).
Claim 4 recites the limitation "mispredicted ones" in “configured to update class hypervectors of the HDC model for mispredicted ones of the class hypervectors.” It is unclear what “ones” refers to in claim 4. For example, “ones” could refer to the class hypervectors, to elements within the hypervectors, or to a type of binary classification, i.e., [0, 1]. The specification does state that the projection matrix which encodes the hypervectors may be a binary/bipolar matrix (0031). The lack of antecedent basis for “mispredicted” or “ones” creates further latent ambiguity. Therefore, “ones” fails to particularly point out and distinctly claim the subject matter.
Based on the context of claim 4, and in light of the broadest reasonable interpretation, for the purposes of compact prosecution, Examiner shall interpret claim 4 as “configured to update class hypervectors of the HDC model based on mispredicted class hypervectors.” Claim 8 is rejected based on its dependency from claim 4.
Claim 13 (and 14 and 15 based on dependency)
The term “directly” in claim 13 is a relative term which renders the claim indefinite. The term “directly” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. “Directly” is a term of degree, and what may be “directly” to one person skilled in the art may not be “directly” to another person skilled in the art. The term “directly” modifies “operates over data” to suggest superimposing the data or some type of vertical integration; however, the specification does not discuss such types of vertical integration. Applicant appears to be employing the term “directly” to mean “is operating on the data encoded by the VAE module.”
For the purposes of compact prosecution, and in light of the broadest reasonable interpretation, Examiner shall interpret claim 13 as “wherein the HDC module is configured to instantiate a hyperdimensional classification that performs operations on data encoded by the VAE module.” Claims 14 and 15 are rejected based on their dependency to claim 13.
Claims 14 and 15
In addition to being rejected based on their dependency to claim 13, both claims 14 and 15 are rejected for an additional reason under 35 U.S.C. 112(b).
The term “achieves” in claims 14 and 15 is a relative term which renders the claims indefinite. The term “achieves” is not defined by the claims, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. “Achieves” is a term of degree, and an accomplishment to one person skilled in the art may not be an accomplishment to another person skilled in the art. In the context of machine learning, specifically in the context of classification problems to which this application is directed, the term “achieves” is often used with respect to accuracy, e.g., achieves over 90% accuracy, or with respect to memory/energy efficiency, e.g., achieves convergence in 39% less training time. Alternatively, the term “achieves” refers to an improvement over the prior art, such as “achieves” greater accuracy than AlexNet [a conventional neural network]. Applicant appears to be employing the term “achieves” to mean “can perform” rather than “to accomplish.”
For the purposes of compact prosecution, and in light of the broadest reasonable interpretation, Examiner shall interpret claims 14 and 15 as “wherein the hyperdimensional classification can perform single-pass learning” and “wherein the hyperdimensional classification can perform iterative learning,” respectively.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1, 2, 3, 9, 11, 13, 15 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Rolfe, et al. (US 20180101784 A1) (hereinafter referred to as “Rolfe”) in view of Bandaragoda, et al., (30 Oct. 2019) “Trajectory clustering of road traffic in urban environments using incremental machine learning in combination with hyperdimensional computing,” DOI: 10.1109/ITSC.2019.8917320 (hereinafter referred to as “Bandaragoda”) and Rosing, et al. (US 20220019441 A1) (hereinafter referred to as “Rosing”).
Regarding claim 1, Rolfe recites “A hyperdimensional learning framework comprising:”
“a variational encoder (VAE) module” (Rolfe at 0108: A discrete variational auto-encoder (DVAE) is a hierarchical probabilistic model consisting of an RBM, followed by multiple layers of continuous latent variables, allowing the binary variables to be marginalized out, and the gradient to backpropagate smoothly through the auto-encoding component of the ELBO. See also 0102. See also 0055: In some implementations system memory 108 may store processor- or computer-readable calculation instructions and/or data to perform pre-processing, co-processing, and post-processing to analog computer 104. As described above, system memory 108 may store a VAE instructions module that includes processor- or computer-readable instructions to perform VAE.)
“configured to generate variational autoencoding and to generate an unsupervised network that receives a data input and learns to predict the same data in an output layer; and” (Rolfe at 0149: In summary, as described in more detail above, the discrete VAE method extends the encoder and the prior with a transformation to a continuous, auxiliary latent representation, and correspondingly makes the decoder a function of the same continuous representation. See also Rolfe at 0116: Therefore, to use discrete latent representations in the VAE framework, the method described herein for unsupervised learning transforms the distributions to a continuous latent space within which the probability packets move smoothly. The encoder q(z|x, ϕ) and prior distribution p(z|θ) are extended by a transformation to a continuous, auxiliary latent representation ζ, and the decoder is correspondingly transformed to be a function of the continuous representation. By extending the encoder and the prior distribution in the same way, the remaining KL-divergence (referred to above) is unaffected.)
However, Rolfe does not recite “a hyperdimensional computing (HDC) learning module coupled to the unsupervised network through a data bus, wherein the HDC module is configured to receive data from the VAE module and update an HDC model of the HDC learning module.”
On the other hand, Bandaragoda recites “a hyperdimensional computing (HDC) learning module” (Bandaragoda at 1666, cl. 2: B. Unsupervised incremental machine learning using the IKASL algorithm … As shown in Figure 2, structurally, the IKASL algorithm represents a layer network structure, build based on the number of periods of incrementally learning.)
“wherein the HDC module is configured to receive data from the VAE module and update an HDC model of the HDC learning module.” (Bandaragoda at 1666, cl. 2: As shown in Figure 2, structurally, the IKASL algorithm represents a layer network structure, build based on the number of periods of incrementally learning. The layers are virtual and generated as required by the sequential incremental learning process. Each layer consists of two sub-layers, learning layer, LEi and generalization layer, GEi. The learning layer encompasses the GSOM (dynamic topology preserving feature map) functionality which organizes the input HD feature vectors from TRt=Δt×(i+1)t=Δt×i into trajectory clusters. The generalization layer GEi encodes a generalized representation of the immediate learning layer LEi, which is the base layer for the next learning layer LEi+1.) [Rolfe recites encoding data via a VAE module.]
A person skilled in the art, before the effective filing date of the present application, would be motivated to modify Rolfe with Bandaragoda to recite “a hyperdimensional computing (HDC) learning module” and “wherein the HDC module is configured to receive data from the VAE module and update an HDC model of the HDC learning module” with the motivation being “(Bandaragoda at 1665, cl. 2) developed based on hyperdimensional computing [28] which uses a suite of bio-inspired methods to represent/embed a set of patterns in Vector Symbolic Architectures (VSA). The resulting HD vectors use distributed representations where information is distributed across HD vector positions, i.e., HD vectors are interpretable only in entirety, any subspace is not interpretable. Hyperdimensional computing has been amply demonstrated and applied in industrial systems [29], [30]. An architecture for memory-recall of sensor stimuli, through the use of VSA has been proposed in [31]” and “(Bandaragoda at 1666, cl. 2) his feature transformation approach creates a single HD vector for a given trajectory which is a bundled HD vectors of the n-gram sequences in the trajectory. The key rationale behind this approach is that if two trajectories consists of common sub-sequences (at different positions), there would be a similarity between their HD vectors as both of them as created by bundling the HD vectors of that common subsequence.”
However, neither Rolfe nor Bandaragoda recites “coupled to the unsupervised network through a data bus.” On the other hand, Rosing recites “coupled to the unsupervised network through a data bus” (Rosing at 0240: Our experimental infrastructure assumes that HPU interfaces with CPU using PCIe;) [PCIe is a type of data bus.]
A person skilled in the art, before the effective filing date of the present application, would be motivated to modify Rolfe, Bandaragoda with Rosing to recite “coupled to the unsupervised network through a data bus” with the motivation being “(0240) The issue of the bandwidth is negligible in the disclosed systems since the interface only delivers instructions and 32-bit numbers while all hypervectors are stored in the HPU and HDC memory side” and “(0262) Processing in-memory (PIM) is a promising solution to accelerate HD computations running for memory-centric applications by enabling parallelism. PIM performs some or all of a set of computation tasks (e.g., bit-wise or search computations) inside the memory without using any processing cores. Thus application performance may be accelerated significantly by avoiding the memory access bottleneck.”
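Examiner's illustrative note (not drawn from any applied reference): the following minimal Python sketch reflects only the Examiner's understanding of the claimed arrangement, in which a VAE-type encoder produces a latent vector that is passed to an HDC module for hyperdimensional encoding and class-hypervector comparison; all names, dimensions, and values are hypothetical, and the data bus of claim 1 is abstracted as an ordinary call boundary.

import numpy as np

rng = np.random.default_rng(0)

def vae_encode(x, W_enc):
    # Stand-in for the VAE encoder: maps an input vector to a latent feature vector.
    return np.tanh(W_enc @ x)

class HDCModule:
    def __init__(self, latent_dim, hd_dim, n_classes):
        # Random bipolar projection, analogous to the binary/bipolar projection
        # matrix discussed in the specification (0031).
        self.proj = rng.choice([-1.0, 1.0], size=(hd_dim, latent_dim))
        self.classes = np.zeros((n_classes, hd_dim))

    def encode(self, latent):
        return np.sign(self.proj @ latent)

    def predict(self, latent):
        # Dot-product similarity of the query hypervector against each class hypervector.
        return int(np.argmax(self.classes @ self.encode(latent)))

    def update(self, latent, label):
        # Accumulate the encoded sample into its class hypervector (HDC model update).
        self.classes[label] += self.encode(latent)

# Toy use of the hypothetical pipeline.
W_enc = rng.normal(size=(16, 64))
hdc = HDCModule(latent_dim=16, hd_dim=1000, n_classes=3)
x, y = rng.normal(size=64), 1
hdc.update(vae_encode(x, W_enc), y)
print(hdc.predict(vae_encode(x, W_enc)))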
Regarding claim 2, Rolfe in view of Bandaragoda and Rosing recite “The hyperdimensional learning framework of claim 1” and Rolfe further recites “wherein the VAE module has an input configured to receive unlabeled data” (Rolfe at 0014: A method for unsupervised learning over an input space including discrete or continuous variables, and at least a subset of a training dataset of samples of the respective variables, to attempt to identify a value of at least one parameter that increases a log-likelihood of at least the subset of the training dataset with respect to a model,.. See also Fig. 5, 0155.) [The use of unsupervised learning over an input space to derive a model from example inputs such as a training set to maximize the log-likelihood of an observed dataset is an exemplar where the structure of the data itself is learned without explicit labels, i.e., receives unlabeled data. This is depicted as ‘x’ in Fig 5, which is included in training data, as explained in 0155.]
And Bandaragoda recites “and the HDC learning model is configured to update the HDC model based on the unlabeled data.” (Bandaragoda at 1665, cl. 1: Advancing the case for unsupervised machine learning that successfully address the aforementioned challenges of unlabelled datasets, sub-sequences in the trajectories and timesensitivity, in this paper, we propose a trajectory clustering approach to automatically segment and detect traffic behaviours and incrementally learn the time-sensitive nature of these behaviours. We have develop a feature transformation technique based on hyperdimensional computing to represent variable-length trajectories of commuter trips into feature vectors with encoded sub-sequence information. We apply an incremental learning technique, the Incremental Knowledge Acquiring Self-Learning (IKASL) algorithm to incrementally learn trajectory clusters as well as their incremental changes over time…. See also: It [the IKASL algorithm] addresses the primary challenges of learning from unlabelled datasets, impact of sub-sequences in traffic trajectories and time-sensitivity of road traffic.) The motivation rationale of claim 1 is similarly applicable to claim 2.
Regarding claim 3, Rolfe in view of Bandaragoda and Rosing recite “The hyperdimensional learning framework of claim 2” and Rolfe further recites “wherein the unsupervised network is an encoder neural network and the output layer comprises a decoder neural network with latent space between the encoder neural network and the decoder neural network.” (Rolfe at 0102: Since the approximating posterior distribution q(z|x, ϕ) maps each input to a distribution over the latent space, it is called the “encoder”. Correspondingly, since the conditional likelihood distribution p(x|z, ϕ) maps each configuration of the latent variables to a distribution over the input space, it is called the “decoder”. See also Rolfe at 0108: A discrete variational auto-encoder (DVAE) is a hierarchical probabilistic model consisting of an RBM, followed by multiple layers of continuous latent variables, allowing the binary variables to be marginalized out, and the gradient to backpropagate smoothly through the auto-encoding component of the ELBO.)
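Examiner's illustrative note: the encoder/latent-space/decoder structure quoted above may be visualized by the following minimal numpy sketch; the weights are random placeholders, no training loop is shown, and the names are hypothetical, so this is a structural illustration only rather than a reproduction of Rolfe's method.

import numpy as np

rng = np.random.default_rng(1)
D, H, Z = 64, 32, 8  # input, hidden, and latent sizes (hypothetical)
W1 = rng.normal(size=(H, D))
W_mu = rng.normal(size=(Z, H))
W_logvar = rng.normal(size=(Z, H))
W_dec = rng.normal(size=(D, Z))

def encode(x):
    # Encoder neural network: parameters of the approximating posterior q(z|x).
    h = np.tanh(W1 @ x)
    return W_mu @ h, W_logvar @ h

def sample_latent(mu, logvar):
    # Latent space between the encoder and the decoder (reparameterized sample).
    return mu + np.exp(0.5 * logvar) * rng.normal(size=mu.shape)

def decode(z):
    # Decoder neural network: reconstructs the input at the output layer.
    return W_dec @ z

x = rng.normal(size=D)
x_hat = decode(sample_latent(*encode(x)))  # the output layer predicts the same data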
Regarding claim 4, Rolfe in view of Bandaragoda and Rosing recite “The hyperdimensional learning framework of claim 1” and Bandaragoda recites “wherein the HDC learning module is further configured to update class hypervectors of the HDC” (Bandaragoda at 1666, cl. 2: The learning layer encompasses the GSOM (dynamic topology preserving feature map) functionality which organizes the input HD feature vectors from TRt=Δt×(i+1)t=Δt×i into trajectory clusters. The generalization layer GEi encodes a generalized representation of the immediate learning layer LEi, which is the base layer for the next learning layer LEi+1. Due to space limitations, the interested reader is referred to [36] and [34] for further details on the workings of the IKASL algorithm. See also 1668 at Fig 4:
[image of Bandaragoda Figure 4 omitted]
) [The generalized representations of trajectory clusters, which are hypervectors, are analogous to class hypervectors because both represent a bundle of hypervectors; in the case of the trajectory clusters, they are bundled representations of the HD vectors of the n-gram sequences.]
Rosing recites “model for mispredicted ones of the class hypervectors.” (Rosing at 0214: FIG. 16a shows the results of the regression inference for a synthetic function with the initial regressor model. The results show that it follows the trend of the target function, while underfitting for extreme cases. The main reason of the underfitting is that the randomly generated hypervectors are not perfectly orthogonal. To compensate for this error, we run a retraining procedure for several epochs. In the retraining procedure, we update the regressor model with the observed error for each sample as follows:
[equation image from Rosing omitted]
) [M refers to a hypervector (see 0211) that is updated based on observed errors, i.e., mispredictions.]
A person skilled in the art, before the effective filing date of the present application, would be motivated to modify Rolfe and Bandaragoda with Rosing to recite “model for mispredicted ones of the class hypervectors” with the motivation being “(0215) FIG. 16b shows the results after 2 retraining epochs. The results show that the model better fits to the dataset.”
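Examiner's illustrative note: as interpreted above, updating class hypervectors based on mispredicted class hypervectors can be sketched as below, where a mispredicted query hypervector is subtracted from the wrongly matched class and added to the correct class; the function names are hypothetical and the sketch is not taken from any applied reference.

import numpy as np

def update_on_misprediction(classes, query_hv, label):
    # classes: (n_classes, hd_dim) array of class hypervectors; query_hv: (hd_dim,)
    predicted = int(np.argmax(classes @ query_hv))  # most similar class
    if predicted != label:
        classes[predicted] -= query_hv  # discard from the mispredicted class
        classes[label] += query_hv      # reinforce the correct class
    return predicted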
Regarding claim 5, Rolfe in view of Bandaragoda and Rosing recite “The hyperdimensional learning framework of claim 3” and Rosing further recites “wherein the HDC learning module is configured with a loss function that adaptively updates the hypervectors based on a data label.” (Rosing at 0214: FIG. 16a shows the results of the regression inference for a synthetic function with the initial regressor model. The results show that it follows the trend of the target function, while underfitting for extreme cases. The main reason of the underfitting is that the randomly generated hypervectors are not perfectly orthogonal. To compensate for this error, we run a retraining procedure for several epochs. In the retraining procedure, we update the regressor model with the observed error for each sample as follows:
[equation image from Rosing omitted]
) [M refers to a hypervector (see 0211) that is updated based on observed errors, i.e., mispredictions of labelled data; the updates are based on the term (1 - f(Xi)), which represents the error in the prediction, i.e., a loss function.] The motivation rationale employed in claim 4 is similarly applicable to claim 5.
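Examiner's illustrative note: an adaptive, label-driven update of the kind discussed above, keyed to an error term such as (1 - f(Xi)), might be sketched as follows; the learning-rate parameter and function names are hypothetical and the sketch is not a reproduction of Rosing's equation.

import numpy as np

def adaptive_update(model_hv, sample_hv, target, predict_fn, lr=0.1):
    # Scale the update by the observed error between the data label (target) and
    # the current prediction, so well-predicted samples contribute little and
    # mispredicted samples contribute more (an adaptive, loss-driven update).
    error = target - predict_fn(model_hv, sample_hv)
    model_hv += lr * error * sample_hv
    return model_hv

# Example predictor: normalized dot product of the model and sample hypervectors.
predict = lambda M, h: float(M @ h) / h.size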
Regarding claim 7, Rolfe in view of Bandaragoda and Rosing recite “The hyperdimensional learning framework of claim 5” and Rolfe recites “wherein the loss function is a logarithmic type loss function.” (Rolfe at 0003: Machine learning is related to optimization. Some problems can be expressed in terms of minimizing a loss function on a training set, where the loss function describes the disparity between the predictions of the model being trained and observable data. See also Rolfe at 0127: The KL-divergence portion of the loss function is as follows:
[equation image from Rolfe omitted]
See also Rolfe at 0087.) [Rolfe describes maximizing the log-likelihood, which is equivalent to minimizing the negative log-likelihood, by employing a loss function that includes a KL-divergence term which explicitly utilizes the logarithm, i.e., a logarithmic type of loss function.] The motivation rationale employed in claim 4 is similarly applicable to claim 7.
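Examiner's illustrative note: written out in standard machine-learning notation (not copied from Rolfe's equations), a negative log-likelihood objective with a KL-divergence term has the general form below, both terms of which are logarithmic:

\mathcal{L}(\phi,\theta) = -\,\mathbb{E}_{q(z\mid x,\phi)}\!\left[\log p(x\mid z,\theta)\right] + \mathrm{KL}\!\left(q(z\mid x,\phi)\,\|\,p(z\mid\theta)\right), \qquad \mathrm{KL}(q\,\|\,p) = \mathbb{E}_{q}\!\left[\log q - \log p\right]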
Regarding claim 8, Rolfe in view of Bandaragoda and Rosing recites “The hyperdimensional learning framework of claim 4” and Rosing further recites “wherein the HDC learning module is configured to employ a loss function to minimize a number of iterations needed to update the class hypervectors of the HDC model.” (Rosing at 0214: FIG. 16a shows the results of the regression inference for a synthetic function with the initial regressor model. The results show that it follows the trend of the target function, while underfitting for extreme cases. The main reason of the underfitting is that the randomly generated hypervectors are not perfectly orthogonal. To compensate for this error, we run a retraining procedure for several epochs. In the retraining procedure, we update the regressor model with the observed error for each sample as follows:
[equation image from Rosing omitted]
See also 0215: FIG. 16b shows the results after 2 retraining epochs. The results show that the model better fits to the dataset.) [The term retraining is functionally equivalent to the term iteration in the present context. To compensate for an error, i.e., an error that is to be reduced (which the term (1 - f(Xi)) measures), the retraining procedure is run a minimal number of epochs, such as 2. Examiner also notes that, broadly, all loss functions quantify an error or loss between a model’s current output and desired output, which guides an optimization algorithm to adjust the model parameters to minimize this error, i.e., in a minimal number of iterations.] The motivation rationale employed in claim 4 is similarly applicable to claim 8.
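Examiner's illustrative note: a loss-guided retraining loop that stops after a small number of epochs once the measured error no longer improves can be sketched as follows; all names are hypothetical and the sketch is not taken from Rosing.

def retrain(model, samples, labels, update_fn, error_fn, max_epochs=10, tol=1e-3):
    # Run retraining epochs only while the measured loss keeps improving,
    # so the number of iterations needed to update the model stays small.
    prev_error = float("inf")
    for epoch in range(max_epochs):
        for x, y in zip(samples, labels):
            update_fn(model, x, y)
        error = error_fn(model, samples, labels)
        if prev_error - error < tol:
            break
        prev_error = error
    return model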
Regarding claim 9, Rolfe in view of Bandaragoda and Rosing recite “The hyperdimensional learning framework of claim 1” and Rolfe further recites “wherein the VAE module is implemented in a field programmable gate array (FPGA).” (Rolfe at 0139: At 430, the system generates or causes generation of samples from the approximating posterior over ζ, given the full distribution over z. Typically, this is performed by a non-quantum processor, and uses the inverse of the CDF Fi(x) described above. The non-quantum processor can, for example, take the form of one or more of one or more digital microprocessors, digital signal processors, graphical processing units, central processing units, digital application specific integrated circuits, digital field programmable gate arrays, digital microcontrollers, and/or any associated memories, registers or other nontransitory computer- or processor-readable media, communicatively coupled to the non-quantum processor.)
Regarding claim 10, Rolfe in view of Bandaragoda and Rosing recite “The hyperdimensional learning framework of claim 9” and Rosing further recites “wherein the HDC module is implemented in the FPGA.” (Rosing at 0127: The bit-level operations involved in the disclosed techniques and dimension-wise parallelism of the computation makes FPGA a promising platform to accelerate privacy-aware HD computing. See also Rosing at 1-IV.3. FPGA Implementation 0135: We implemented the HD inference using the proposed encoding with the optimization detailed in Section 1-III-D. We implemented a pipelined architecture with building blocks shown in FIG. 7a as in the inference we only used binary (bipolar) quantization. See also 0136.) A person skilled in the art would be motivated to modify Rolfe and Bandaragoda with Rosing to recite “wherein the HDC module is implemented in the FPGA” with the motivation being “(0135) Thanks to the massive bit-level parallelism of FPGA with relatively low power consumption (˜7 W obtained via Xilinx Power Estimator” and “(0136) Finally, we implemented the disclosed encoding on an FPGA platform which achieved 4.1×energy efficiency compared to existing binary techniques.” See also Rosing at Table 1-I.1
Regarding claim 11, Rolfe in view of Bandaragoda and Rosing recite “The hyperdimensional learning framework of claim 1” and Rolfe further recites “wherein the VAE module is implemented within a central processing unit (CPU).” (Rolfe at 0139: At 430, the system generates or causes generation of samples from the approximating posterior over ζ, given the full distribution over z. Typically, this is performed by a non-quantum processor, and uses the inverse of the CDF F.sub.i(x) described above. The non-quantum processor can, for example, take the form of one or more of one or more digital microprocessors, digital signal processors, graphical processing units, central processing units, digital application specific integrated circuits, digital field programmable gate arrays, digital microcontrollers, and/or any associated memories, registers or other nontransitory computer- or processor-readable media, communicatively coupled to the non-quantum processor.)
Regarding claim 13, Rolfe in view of Bandaragoda and Rosing recite “The hyperdimensional learning framework of claim 1” and Bandaragoda recites “wherein the HDC module is configured to instantiate a hyperdimensional classification that directly operates over data” (Bandaragoda at 1666, cl. 2: The fixed-width HD vectors are presented to the IKASL algorithm which dynamically learns a topology preserving two-dimensional feature map consisting of segments of common trajectory patterns. See also: The learning layer encompasses the GSOM (dynamic topology preserving feature map) functionality which organizes the input HD feature vectors from TRt=Δt×(i+1)t=Δt×i into trajectory clusters. The generalization layer GEi encodes a generalized representation of the immediate learning layer LEi, which is the base layer for the next learning layer LEi+1. Due to space limitations, the interested reader is referred to [36] and [34] for further details on the workings of the IKASL algorithm.) [The IKASL algorithm operates on the HD vectors that are the output of the hyperdimensional encoding process.] The motivation rationale employed in claim 1 is similarly applicable to claim 13.
and Rolfe recites “encoded by the VAE module.” (Rolfe at 0138: In response to determining the stopping criterion has not been reached, the system fetches a mini-batch of the training data set at 420. At 425, the system propagates the training data set through the encoder to compute the full approximating posterior over discrete space z. See also Rolfe at 0102: Since the approximating posterior distribution q(z|x, ϕ) maps each input to a distribution over the latent space, it is called the “encoder”.)
Regarding claim 14, Rolfe in view of Bandaragoda and Rosing recite “The hyperdimensional learning framework of claim 13,” and Rosing further recites “wherein the hyperdimensional classification achieves single-pass learning.” (Rosing at 0243: We observe that the HDC-based techniques can learn suitable models with much less epochs than DNN. For example, only with 1 epoch (no retraining) also known as single-pass learning, the HDC techniques achieve high accuracy. It also converges quickly only with several epochs.) A person skilled in the art would be motivated to modify with Rosing to recite “wherein the hyperdimensional classification achieves single-pass learning” with the motivation being “(0243) To summarize, the HDC technique performs the regression and classification tasks with accuracy differences of 0.39% and 0.94% on average. FIG. 23a shows how the HDC RL technique solves the CARTPOLE problem, achieving higher scores over trials. The results show that the disclosed technique successfully solves the problem, exceeding the threshold score (195) after 80 epochs. FIG. 23b show the accuracy changes over training epochs, where the initial training/each retraining during the boosting is counted as a single epoch. We observe that the HDC-based techniques can learn suitable models with much less epochs than DNN. For example, only with 1 epoch (no retraining) also known as single-pass learning, the HDC techniques achieve high accuracy. It also converges quickly only with several epochs.”
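Examiner's illustrative note: single-pass learning as quoted from Rosing (1 epoch, no retraining) corresponds to a single accumulation pass over the training data, sketched below with hypothetical names; the sketch is not taken from any applied reference.

import numpy as np

def single_pass_train(encoded_samples, labels, n_classes, hd_dim):
    # One pass over the data: each encoded hypervector is added to its class
    # hypervector exactly once; no retraining epochs are performed.
    classes = np.zeros((n_classes, hd_dim))
    for hv, y in zip(encoded_samples, labels):
        classes[y] += hv
    return classes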
Regarding claim 15, Rolfe in view of Bandaragoda and Rosing recite “The hyperdimensional learning framework of claim 13,” and Bandaragoda further recites “wherein the hyperdimensional classification achieves iterative learning.” (Bandaragoda at 1666, cl. 2: As shown in Figure 2, structurally, the IKASL algorithm represents a layer network structure, build based on the number of periods of incrementally learning. The layers are virtual and generated as required by the sequential incremental learning process.) The motivation rationale outlined in claim 1 is similarly applicable to claim 15.
Regarding claim 17, Rolfe in view of Bandaragoda and Rosing recite “The hyperdimensional learning framework of claim 1,” and Bandaragoda further recites “wherein the VAE module is configured to generate a holographic distribution of the data.” (Bandaragoda at 1665, cl. 2: The resulting HD vectors use distributed representations where information is distributed across HD vector positions, i.e., HD vectors are interpretable only in entirety, any subspace is not interpretable.) The motivation rationale outlined in claim 1 is similarly applicable to claim 17.
Regarding claim 18, Rolfe in view of Bandaragoda and Rosing recite “The hyperdimensional learning framework of claim 1,” and Bandaragoda further recites “comprises a training module that is configured to linearly add hypervectors associated with a class into a single hypervector that represents the class as a class hypervector.” (bundling: denoted with ⊕ and implemented via elementwise addition. It combines several HD vectors into a single HD vector e.g., a = Hl1 ⊕Hl2. In contrast to the binding and shifting operations, the resultant HD vector a is similar to all bundled HD vectors, i.e., the cosine similarity between a and any bundled vector is greater than 0. See also:
[image from Bandaragoda omitted]
) [Elementwise addition via ⊕ to combine multiple hypervectors into a single hypervector is an example of a linear addition process.]
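Examiner's illustrative note: the bundling operation ⊕ quoted above, i.e., elementwise addition of several hypervectors into a single hypervector that remains similar to each constituent, can be sketched as follows; the cosine-similarity check mirrors the property described in the quoted passage, and the dimensions and names are hypothetical.

import numpy as np

rng = np.random.default_rng(2)

def bundle(hypervectors):
    # Elementwise (linear) addition of several hypervectors into one class hypervector.
    return np.sum(hypervectors, axis=0)

def cosine(a, b):
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

hvs = rng.choice([-1.0, 1.0], size=(5, 10000))  # five random bipolar hypervectors
a = bundle(hvs)
assert all(cosine(a, h) > 0 for h in hvs)  # the bundled vector is similar to each constituent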
Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Rolfe in view of Bandaragoda, Rosing and Park, et al. (US 20200242774 A1) (hereinafter referred to as “Park”).
Regarding claim 6, Rolfe in view of Bandaragoda and Rosing recite “The hyperdimensional learning framework of claim 5”; however, neither Rolfe, Bandaragoda nor Rosing recites “wherein the loss function is a hinge type loss function.” On the other hand, Park recites “wherein the loss function is a hinge type loss function.” (Park at 0035: Using a random vector at the input of the generator network can enable an example architecture to provide a straightforward way to produce multi-modal results in semantic image synthesis. Namely, one can attach an image encoder network e 406 that processes a real image 402 into a random vector or other latent representation 408, which can be then fed to the generator 410. The encoder 406 and the generator 410 form a variational auto-encoder in which the encoder network attempts to capture the style of the image, while the generator combines the encoded style and the segmentation map information via SPADE to reconstruct the original image. See also Park at 0038: A learning objective function can be used, such as may include a Hinge loss term.)
A person skilled in the art, before the effective filing date of the present application, would be motivated to modify Rolfe, Bandaragoda and Rosing with Park to recite “wherein the loss function is a hinge type loss function” with the motivation being “(0038) When training an example framework with an image encoder for multimodal synthesis and style-guided image synthesis, a divergence loss term [i.e., the hinge loss term] can be included that utilizes a standard Gaussian distribution and the variational distribution q is fully determined by a mean vector and a variance vector. A re-parameterization can be performed for back-propagating the gradient from the generator 410 to the image encoder 406… The network can be trained using, for example, hundreds of thousands of images of objects of the relevant labels or object types. The network can then generate photorealistic images conforming to that segmentation mask.”
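Examiner's illustrative note: for completeness, the standard form of a hinge-type loss for a label y and a prediction score f(x), in general machine-learning notation and not copied from Park, is:

\ell_{\mathrm{hinge}}\big(y, f(x)\big) = \max\big(0,\; 1 - y\,f(x)\big), \qquad y \in \{-1, +1\}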
Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Rolfe in view of Bandaragoda, Rosing and Imani, et al. (US 20200410404 A1) (hereinafter referred to as “Imani”).
Regarding claim 12, Rolfe in view of Bandaragoda and Rosing recite “The hyperdimensional learning framework of claim 11”; however, neither Rolfe, Bandaragoda nor Rosing addresses “wherein the HDC module is implemented within the CPU.” On the other hand, Imani recites “wherein the HDC module is implemented within the CPU.” (Imani at 0078: Both the SecureHD framework and homomorphic library were run on ARM Cortex 53 and Intel i7 processors.)
A person skilled in the art, before the effective filing date of the present application, would be motivated to modify Rolfe, Bandaragoda and Rosing with Imani to recite “wherein the HDC module is implemented within the CPU” with the motivation being “(0078) Evaluation shows that SecureHD achieved on average 133× and 14.7× (145.6× and 6.8×) speedup for the encoding and decoding, respectively, as compared to the homomorphic technique running on the ARM architecture (Intel i7). The encoding of SecureHD running on embedded devices (ARM) was still 8.1× faster than the homomorphic encryption running on the high-performance client (Intel i7).”2
Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over Rolfe in view of Bandaragoda, Rosing and in further view of Salamat, et al. (US 20210334703 A1) (hereinafter referred to as “Salamat”).
Regarding claim 16, Rolfe in view of Bandaragoda and Rosing recite “The hyperdimensional learning framework of claim 1”; however, neither Rolfe, Bandaragoda nor Rosing recites “wherein the VAE module is configured to remain static while the HDC learning module updates the HDC model after a first prediction.” On the other hand, Salamat recites “wherein the VAE module is configured to remain static while the HDC learning module updates the HDC model after a first prediction.” (Salamat at 0043: The model generator also initializes the BRAMs with the base hypervectors. For this end, F5-HD exploits a fixed, predetermined hypervector as the seed vector, and generates the remaining t.sub.iv−1 hypervectors according to the procedure explained above. In the cases the user already has a trained model (i.e., base and class hypervectors), F5-HD allows direct initializing of these hypervectors. See also Salamat at 0046: We denote this single-epoch learning as model initialization. During the subsequent optional epochs (referred to as retraining), which either can be specified by the user or F5-HD itself continues until the accuracy improvement diminishes, under the management of the scheduler, F5-HD enhances the model by discarding the attributes of the mispredicted query hypervector H, from the mispredicted class hypervector C’H, and adding it to the correct class hypervector CH.) [The base hypervectors are initialized from a fixed, predetermined seed and remain static, while the training process updates the class hypervectors using the query hypervectors formed with the base hypervectors.]
A person skilled in the art, before the effective filing date of the present application, would be motivated to modify Rolfe, Bandaragoda and Rosing with Salamat to recite “wherein the VAE module is configured to remain static while the HDC learning module updates the HDC model after a first prediction” with the motivation being “(0043) After the design analyzer specified the parameters of the template architecture, F5-HD's model generator…. For this end, F5-HD exploits a fixed, predetermined hypervector as the seed vector…”, i.e., a fixed, predetermined seed allows reproducibility and a consistent starting point; additionally, as this paragraph is in the context of a power budget (0042), it also helps maintain a consistent power budget.
Claims 19 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Rolfe in view of Bandaragoda, Rosing and in further view of Khaleghi, et al. (US 20210326756 A1) (hereinafter referred to as “Khaleghi”).
Regarding claim 19, Rolfe in view of Bandaragoda and Rosing recite “The hyperdimensional learning framework of claim 18”; however, neither Rolfe, Bandaragoda nor Rosing recites “further configured to perform dot product between a new training data point with a class hypervector that has a same label as the new training data point.”
On the other hand, Khaleghi recites “further configured to perform dot product between a new training data point with a class hypervector that has a same label as the new training data point.” (Khaleghi at 0050: While looking for the similarity (dot product) of the Q with class hypervector {C1, . . . ,CN}, Q is common among all the class hypervectors. Therefore, regardless of the elements of Q, the dimensions where all classes have similar values have low impact on differentiating the classes. In order to enable dimension-wise sparsity in HD computing, our framework measures the changes in the class elements in each dimension. The following equation shows the variation in the jth dimension of the class hypervectors:
[equation image from Khaleghi omitted]
See also 0048, 0049, 0051-54)
A person skilled in the art, before the effective filing date of the present application, would be motivated to modify Rolfe, Bandaragoda and Rosing with Khaleghi to recite “further configured to perform dot product between a new training data point with a class hypervector that has a same label as the new training data point” with the motivation being “(Khaleghi at 0041) For a class hypervectors with binarized values, Hamming distance is an inexpensive and suitable similarity metric, while class hypervectors with non-binarized elements require to use Cosine similarity” and “(0049) The goal of HD computing at inference is to find a class hypervector which has the highest Cosine similarity to a query hypervectors.” (See also 0048 and 0050.)
Regarding claim 20, Rolfe in view of Bandaragoda, Rosing and Khaleghi recite “The hyperdimensional learning framework of claim 19” and Khaleghi further recites “wherein the HDC learning module is configured to update the HDC model based on the dot product.” (Khaleghi at 0056: HD looks at the similarity of each input hypervector to all stored class hypervectors; (i) if a query hypervector, Q, is correctly classified by the current model, our design does not change the model. (ii) While if it is wrongly matched with the i.sup.th class hypervector (C) when it actually belongs to jth class (C), our retraining procedure subtracts the query hypervector from the ith class and adds it to jth class hypervector: See also 0048-0054) [The similarity is evaluated by the cosine similarity, which utilizes a dot product as discussed for claim 19; the model is updated if a misclassification occurs based on the similarity, i.e., based on the dot product.] The motivation rationale employed in claim 19 is similarly applicable to claim 20; additional motivation rationale includes “(0056) In order to compensate for the quality loss due to model sparsity, we adjust the model based on the new constraints. Model adjustment is similar to training procedure and its goal is to modify the sparse model in order to provide higher accuracy over training data.”
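Examiner's illustrative note: the dot product against the class hypervector sharing the new sample's label, followed by an update only when the model mispredicts (consistent with the passage quoted from Khaleghi at 0056), can be sketched as follows; the names are hypothetical and the sketch is not a reproduction of Khaleghi's implementation.

import numpy as np

def train_step(classes, query_hv, label):
    # Dot products between the new training point (query hypervector) and every
    # class hypervector, including the class hypervector that has the same label.
    scores = classes @ query_hv
    predicted = int(np.argmax(scores))
    # The HDC model is updated based on the dot product: a wrongly matched query
    # is subtracted from the mispredicted class and added to the correct class.
    if predicted != label:
        classes[predicted] -= query_hv
        classes[label] += query_hv
    return scores[label], predicted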
Examiner’s Note
Examiner cites particular columns, paragraphs, figures and line numbers in the references as applied to the claims below for the convenience of the applicant. Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claim, other passages and figures may apply as well. It is respectfully requested that, in preparing responses, the applicant fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the examiner. Secondary references may also overlap in reciting the claimed invention. The entire reference is considered to provide disclosure relating to the claimed invention.
Response to Arguments
Applicant's arguments filed 9/15/2025 have been fully considered but they are not persuasive.
Applicant argues on page 9 of the response that the references applied to the instant rejection operate in distinctly different technical domains, and therefore deal with problems unrelated to each other.
The examiner respectfully disagrees. As Applicant notes, Rolfe deals with hybrid quantum-classical machine-learning systems that train a discrete VAE using quantum computers, and Bandaragoda deals with applying HDC to traffic data for clustering variable-length trajectories into fixed-length high-dimensional vectors. The examiner sees no issue in applying Bandaragoda’s traffic profiling and incremental clustering to Rolfe’s hybrid machine-learning system. Both references are in the same field of endeavor inasmuch as both references deal with machine-learning methods, and the examiner cannot find any negative teachings in the applied references that would preclude the combination, as instantly presented.
Applicant argues on page 10 of the response that the specific architecture described in claim 1 is not present in any of the applied references.
The examiner respectfully disagrees. It is noted that the combined teachings of the three references (Rolfe, Bandaragoda, and Rosing) are used to teach and/or suggest Applicant’s claimed invention.
Applicant argues on page 10 of the response that there is no motivation or teaching present in either reference to combine their technologies as claimed.
The examiner respectfully disagrees. The instant rejection recites Bandaragoda in pertinent part “hyperdimensional computing has been amply demonstrated and applied in industrial systems” (instant Office action, page 6 at bottom, to page 7). The skilled artisan is cognizant that quantum machine-learning techniques are typically used in high-complexity fields, such as logistics, climate modeling, and cybersecurity. There is no reason why traffic analysis would not be a candidate for these techniques, especially when said analysis encompasses large geographical areas.
Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WILLIAM BASHORE whose telephone number is (571)272-4088. The examiner can normally be reached Mon-Thursday 9am - 5pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, William Bashore can be reached at (571) 272-4088. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/WILLIAM L BASHORE/ Supervisory Patent Examiner, Art Unit 2174
1 For the purpose of compact prosecution, Imani (US 20200410404 A1) at 0078 also provides additional motivation rationale with regard to claim 10, specifically “The SecureHD efficiency was also compared on the FPGA implementation. It was observed that the encoding and decoding of SecureHD achieved 626.2× and 389.4× (35.5× and 20.4×) faster execution as compared to the SecureHD execution on the ARM (Intel i7). For example, the proposed FPGA implementation was able to encode 2,600 data points and decode 1,335 for the MNIST images in a second.”
2 See also Montagna, et al. (24 Apr 2018), “Pulp-HD: Accelerating Brain-Inspired High-Dimensional Computing on a Parallel Ultra-Low Power Platform,” arXiv:1804.09123v1 at pg. 4, cl. 2: “As shown, the HD classifier achieves ≈2× faster execution and lower power at iso-accuracy compared to the SVM on the ARM Cortex M4. This is due to the fact that HD classifier mostly uses basic componentwise operations on the hypervectors.”