Prosecution Insights
Last updated: April 19, 2026
Application No. 18/066,484

SYSTEMS AND METHODS FOR TENSORIZING CONVOLUTIONAL NEURAL NETWORKS

Final Rejection (§102, §103, §112)

Filed: Dec 15, 2022
Examiner: AGRAWAL, SHISHIR
Art Unit: 2123
Tech Center: 2100 (Computer Architecture & Software)
Assignee: Multiverse Computing SL
OA Round: 2 (Final)

Grant Probability: 0% (At Risk)
Predicted OA Rounds: 3-4
Predicted Time to Grant: 3y 3m
Grant Probability with Interview: 0%

Examiner Intelligence

Career Allow Rate: 0% (0 granted / 13 resolved; -55.0% vs TC avg)
Interview Lift: +0.0% (minimal lift among resolved cases with interview)
Typical Timeline: 3y 3m avg prosecution; 31 applications currently pending
Career History: 44 total applications across all art units

Statute-Specific Performance

§101: 26.9% (-13.1% vs TC avg)
§103: 37.6% (-2.4% vs TC avg)
§102: 5.6% (-34.4% vs TC avg)
§112: 29.9% (-10.1% vs TC avg)

Tech Center averages are estimates. Based on career data from 13 resolved cases.

Office Action

Rejections: §102, §103, §112
DETAILED ACTION

Status of Claims

This Office action is responsive to communications filed on 2026-01-09. Claim(s) 8-9 and 18-19 is/are cancelled. Claim(s) 1-7, 10-17, and 20 is/are pending and are examined herein. Claim(s) 1-7, 10-17, and 20 is/are rejected under 35 USC 112(b). Claim(s) 11-17 and 20 is/are rejected under 35 USC 102. Claim(s) 1-7 and 10 is/are rejected under 35 USC 103.

Notice of Pre-AIA or AIA Status

The present application, filed on or after 2013-03-16, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments

Regarding rejections under 35 USC 112(b), the applicant’s remarks have been fully considered. While the amendments clarify some of the indefinite claim elements indicated in the previous Office action, they do not substantively address all of the issues described previously. Moreover, they introduce a number of new instances of indefinite language. Issues in the pending claims are described below. The rejections under 35 USC 101 are withdrawn upon further consideration of the claims as a whole.

Regarding the rejections under 35 USC 102/103, the applicant’s arguments have been fully considered but they are not persuasive. The applicant asserts that “Su does not disclose a ‘feature extraction network’ at all” [remarks, page 17]. The examiner respectfully disagrees. Any initial portion of the neural networks disclosed in Su (e.g., the first layer) can be mapped to the “feature extraction network” of the claim, and its output is then the “plurality of extracted features” of the claim. The applicant asserts that “Su does not actually disclose inputting the plurality of extracted features into a tensor contraction layer” [remarks, page 18]. The examiner respectfully disagrees. As noted in the previous Office action, Su discloses that, in each layer of a TNN, “each mode i of the input tensor U’ contracts with the corresponding kernel K_i” [Su, section 3.1 paragraph beginning “One-layer of TNN”].
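As background for the contraction language quoted above, the mode-wise contraction Su describes can be sketched in a few lines of NumPy; the shapes and names below are hypothetical, chosen only for illustration and not drawn from the record:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3-mode input tensor U and a kernel K0 for mode 0.
U = rng.random((4, 5, 6))   # mode sizes I_0=4, I_1=5, I_2=6
K0 = rng.random((4, 7))     # contracts mode 0 (size 4) to an output axis of size 7

# "Each mode i of the input tensor contracts with the corresponding
# kernel K_i": here, mode 0 of U is summed against the first axis of K0.
V = np.tensordot(U, K0, axes=([0], [0]))

print(V.shape)  # the contracted mode is replaced by K0's output axis: (5, 6, 7)
```

In a TNN layer this contraction is applied per mode with one kernel per mode; the sketch shows a single mode for brevity.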
Since the output of an initial portion of the neural network maps to the “plurality of extracted features” of the claim, the layer immediately following this initial portion does in fact receive the “plurality of extracted features” as input. Since it also performs a tensor contraction, it can be mapped to the “tensor contraction layer” of the claim.

The applicant asserts that “Su does not disclose ‘a tensorized regression layer’” and “Su also does not even disclose a ‘classification layer’” [remarks, page 18]. The examiner respectfully disagrees. The examiner notes that the specification supports interpreting the claim so that the “classification layer” and the “tensorized regression layer” refer to the same layer in the neural network (cf. “a classification (regression) layer 760” [specification, 0090]). Moreover, as explained in the previous Office action, the final layer of the neural network falls under the broadest reasonable interpretation of both a “tensorized regression layer” and a “classification layer” of the claim. More precisely, Su discloses an application of the neural network to image classification datasets (such as MNIST) [Su, section 6 first paragraph]. This means that the final layer falls under the broadest reasonable interpretation of the “classification layer” of the claim (since its output is a classification). Moreover, the final layer involves tensors, so it also falls under the broadest reasonable interpretation of the “tensorized regression layer” of the claim. The examiner maintains that Su still substantively discloses all elements of the pending independent claim. The complete prior art mapping, updated in view of the amended claims, is given below.

Claim Rejections - 35 USC 112(b)

The following is a quotation of 35 USC 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 USC 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claim(s) 1-7, 10-17, and 20 is/are rejected under 35 USC 112(b) or 35 USC 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 USC 112, the applicant), regards as the invention.

Claims 1 and 11 are indefinite for at least the following reasons:

The claims were amended to recite “classifying an input image using improving a convolutional neural network” [emphasis added], but this is ungrammatical. The examiner suggests “classifying an input image by improving a convolutional neural network” for grammaticality. For the purpose of compact prosecution, the claim is interpreted broadly as encompassing at least this interpretation.

The claims were amended to recite “at least one weight tensor having N parameters for tensorizing the convolutional neural network” [emphasis added]. However, nowhere does the specification indicate that the “N parameters of the weight tensor” are “for tensorizing the convolutional neural network” (cf. [specification, 0007, 0017, 0082, 0086, etc]). Parameters of weight tensors are the weights used for processing data that passes through the neural network, not for “tensorizing” the neural network. Moreover, it is not entirely clear what it means to “tensorize” a convolutional neural network. The word “tensorize” would be understood by a person of ordinary skill in the art to refer to a process by which something (e.g., a vector or a matrix) is regarded as a tensor, and the specification provides no alternative special definition for the term. However, a convolutional neural network is a type of machine learning model.
It makes use of tensors, but it is not clear to the examiner how the model itself could be regarded as a tensor. In view of the specification, the examiner suggests removing the underlined phrase, i.e., amending to “at least one weight tensor having N parameters”.

The claims were amended to recite “each of the at least one weight tensor corresponding to a weight tensor of a convolutional layer of the convolutional neural network” [emphasis added], but it is not clear what it means for a weight tensor to correspond to a weight tensor, and it is further not clear how the “a weight tensor” relates to the “at least one weight tensor” introduced previously. In view of the specification, the examiner suggests removing the underlined phrase, i.e., amending to “each of the at least one weight tensor corresponding to a convolutional layer of the convolutional neural network”. For the purpose of compact prosecution, the claim is interpreted broadly as encompassing at least this interpretation.

The claims were amended to recite “process the input image by a feature extraction network of the convolutional neural network comprising the at least one factorized weight tensor to obtain a plurality of extracted features” [emphasis added], but the specification and the originally filed claims indicate that the factorized weight tensors are part of the “improved convolutional neural network” of the claim (not of the original “convolutional neural network” before it has been “improved”; cf. “supply[ing] the factorized weight tensor to a classification layer of the convolutional neural network, thereby generating an improved convolutional neural network” [specification, 0007, 0017]). This results in a conflict between the claims and the specification, and MPEP 2173.03 indicates that a claim may be “indefinite when a conflict or inconsistency between the claimed subject matter and the specification disclosure renders the scope of the claim uncertain”.
In view of the invention as described in the specification and the originally filed claims, the examiner suggests removing the underlined phrase, i.e., amending to “process the input image by a feature extraction network of the convolutional neural network to obtain a plurality of extracted features”.

The claims recite “and supply the factorized weight tensor contract the tensor contraction layer with a weight tensor of the classification layer to obtain a tensorized regression layer”, but this limitation is ungrammatical and includes numerous ambiguities (including the ambiguities described in the previous Office action in the 112(b) rejections of claims 1, 9, 11, and 19). First, since there are potentially multiple factorized weight tensors, the phrase “the factorized weight tensor” has potentially multiple antecedents and it is therefore indefinite. Second, the claim language as presently recited leaves unspecified what this indefinite claim element is being “suppl[ied]” to. Third, the limitation recites “the classification layer” but this lacks antecedent basis. Fourth, it is not clear what it means to contract a “tensor contraction layer with a weight tensor”, since contraction is a mathematical operation defined between two tensors, not between a layer and a tensor (the specification also describes contraction between two tensors, not between a layer and a tensor; cf. [specification, 0091]). Fifth, the use of “a weight tensor” results in an ambiguity of antecedent since the claim already previously introduces a weight tensor; in fact, in view of the specification [specification, 0091-0092], the “weight tensor” recited in this limitation appears to the examiner not to be one of the “at least one weight tensor” recited previously, as the terminology would suggest at first glance; rather, it appears to refer to an input tensor of the classification layer (which, in the specification, is called a “low-rank weight convolution tensor 745” [specification, 0091] as well as a “convolution tensor 745” [specification, 0091]).
In view of these numerous issues of indefiniteness (and in keeping with the suggestions made in the previous Office action), the examiner suggests: “supply one of the at least one factorized weight tensor to a classification layer of the convolutional neural network; and contract the one of the at least one factorized weight tensor with a convolution tensor of the classification layer to obtain a tensorized regression layer”. For the purpose of compact prosecution, the claim is interpreted broadly as encompassing at least this interpretation.

Dependent claims 2-7, 10, 12-17, and 20 inherit these rejections.

Claim Rejections - 35 USC 102

The following is a quotation of the appropriate paragraphs of 35 USC 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.
Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 USC 102(b)(2)(C) for any potential 35 USC 102(a)(2) prior art against the later invention.

Claim(s) 11-17 and 20 is/are rejected under 35 USC 102(a)(1) as being anticipated by Jiahao SU et al. (Tensorial Neural Networks: Generalization of Neural Networks and Application to Model Compression, published 2018-12-08; hereafter, “Su”).

Claim 11

Su discloses:

A method for classifying an input image using improving a convolutional neural network, the method comprising: ([Su, abstract, sections 1, 3, and 6]: Su discloses “tensorial neural networks (TNNs)” and a method of “[m]apping a neural network to TNNs with the same expressive power [which] results in a TNN of fewer parameters” [Su, abstract]. Su indicates that “[g]iven a pre-trained NN g^q in G^q, compressing it to a TNN with p parameters results in [a TNN] h^p that is closest to g^q in H^p” [Su, section 1 paragraph beginning “TNNs can be used”], where H^p (resp. G^q) denote the sets of functions that can be represented by TNNs (resp. NNs) with at most p (resp. q) trainable parameters [Su, section 1 paragraph beginning “Figure 1 illustrates”; see also, figure 1, section 4 first paragraph, etc]. The examiner notes that Su explicitly discusses g^q being a convolutional neural network [Su, section 3; see also, figure 4]. The pre-trained NN g^q maps to the “convolutional neural network” of the claim, and the process of compressing it to obtain h^p is a process of “improving a convolutional neural network” as recited by the claim. Su further discloses evaluating their compression method on image classification datasets (MNIST, CIFAR-10, and ImageNet) [Su, section 6 first paragraph] (cf. the conclusion of a previous Office action).
In other words, an image from these datasets maps to the “input image” of the claim.)

receiving the input image; ([Su, section 6]: As noted above, an image from one of the image datasets disclosed in Su [Su, section 6 first paragraph] maps to the “input image” of the claim.)

receiving at least one weight tensor having N parameters for tensorizing the convolutional neural network, each of the at least one weight tensor corresponding to a weight tensor of a convolutional layer of the convolutional neural network; ([Su, sections 1 and 3-4]: As noted above, the input to the compression algorithm is a convolutional neural network g^q [Su, section 1]. Su indicates that a “convolutional layer in traditional neural networks is parameterized by a 4-order kernel K in R^{H × W × S × T}, where H, W are height/width of the filters, and S, T are the numbers of input/output channels” [Su, section 3.1 paragraph beginning “One-layer of CNN”]. The kernel in the lth layer of g^q is denoted K^{(l)} in [Su, section 4.2 paragraph beginning “Generalized Tensor Decomposition”]. In other words, the kernels {K^{(l)}} of g^q map to the “at least one weight tensor” of the claim, and the number of parameters in one of the kernels (i.e., the product HWST in the case of a kernel in R^{H × W × S × T}) maps to the number N of parameters of the claim.)

factorizing the at least one weight tensor to obtain a corresponding at least one factorized weight tensor, each of the at least one factorized weight tensor having M parameters, wherein M < N; ([Su, section 4.2]: Su discloses performing a tensor decomposition, i.e., “find[ing] K_i^{(l)^*} such that their composition is [close] to the uncompressed kernel K^{(l)}” [Su, section 4.2 paragraph beginning “Generalized Tensor Decomposition”]. Finding this decomposition maps to the “factorizing” step of the claim, and the number of parameters in the decomposition maps to the number M of parameters of the claim.
As noted above, the decomposition has fewer parameters (cf. “fewer parameters” [Su, abstract], “p << q” [Su, section 4 first paragraph], etc), so it is in fact true that “M < N” as required by the claim. See [Su, appendix E] for further details about tensor decomposition.)

processing the input image by a feature extraction network of the convolutional neural network comprising the at least one factorized weight tensor to obtain a plurality of extracted features; ([Su, figure 4]: Any initial part of the neural network (e.g., the first layer) can be mapped to the “feature extraction network” of the claim, and the output of that part of the neural network maps to the “plurality of extracted features” of the claim.)

inputting the plurality of extracted features into a tensor contraction layer of the convolutional neural network; ([Su, section 3.2]: Su discloses that, in each layer, “each mode i of the input tensor U’ contracts with the corresponding kernel K_i” [Su, section 3.1 paragraph beginning “One-layer of TNN”]. The layer immediately following the “feature extraction network” as mapped above thus maps to the “tensor contraction layer” of the claim (e.g., if the first layer is taken to be the “feature extraction network” of the claim, then the second layer can be taken to be the “tensor contraction layer” of the claim).)

and supplying the factorized weight tensor ([Su, section 4]: As noted above, Su discloses using tensor decompositions K_i^{(l)^*} of the kernels K^{(l)} of g^q to compress g^q and obtain h^p [Su, section 4]. The use of the tensor decompositions to obtain h^p maps to the “supplying” step of the claim. The applicant is advised to consult the 112(b) rejections for the interpretation of the numerous indefinite claim elements recited in this limitation and the subsequent one.)
contracting the tensor contraction layer with a weight tensor of the classification layer to obtain a tensorized regression layer, ([Su, sections 3.2 and 6]: As noted above, Su discloses that, in each layer of a TNN, “each mode i of the input tensor U’ contracts with the corresponding kernel K_i” [Su, section 3.1 paragraph beginning “One-layer of TNN”]. Moreover, again as noted above, Su explicitly indicates experimenting on image classification datasets [Su, section 6 first paragraph]. This means that the final layer of the TNN h^p can be mapped to the “tensorized regression layer” and the “classification layer” of the claim, and the input tensor of that layer to the indefinite “weight tensor” of this limitation. The tensor decomposition {K_i^{(l)^*}}_i appearing in that layer is then the “[one of the] factorized weight tensor[s]” of the claim.)

the tensorized regression layer classifying the input image, ([Su, sections 4 and 6]: As noted above, Su explicitly indicates experimenting on image classification datasets [Su, section 6 first paragraph]. This means that the final layer of the TNN (i.e., the “tensorized regression layer” of the claim) does in fact “classify[…] the input image” as required by the claim.)

thereby generating an improved convolutional neural network. ([Su, section 4]: As noted above, Su discloses using tensor decompositions K_i^{(l)^*} of the kernels K^{(l)} of g^q to compress g^q and obtain h^p [Su, section 4]. Then h^p maps to the “improved convolutional neural network” of the claim.)

Claim 12

Su discloses:

[The method of claim 11, further comprising:] determining a rank of each weight tensor of the at least one weight tensor; ([Su, section 3.1]: As noted above, Su notes that a “convolutional layer in traditional neural networks is parameterized by a 4-order kernel” [Su, section 3.1 paragraph beginning “One-layer of CNN”].
It also discusses more general tensors of arbitrary order (e.g., “[g]iven an m-order tensor T in R^{I_0 × I_1 × … × I_{m-1}}” [Su, appendix E paragraph beginning “Tucker decomposition”]). The order/rank of the kernel maps to the “rank of the at least one weight tensor” of the claim.)

and decomposing the at least one weight tensor into a core tensor and a number R of factor matrices, where R corresponds to the rank of the weight tensor. ([Su, appendix E]: Su discloses the use of “Tucker decomposition” which, “[g]iven an m-order tensor T in R^{I_0 × I_1 × … × I_{m-1}}… factors it into m factor matrices {M^{(l)}}_{l = 0}^{m-1}, where M^{(l)} in R^{R_l × I_l}… and an additional m-order core tensor C in R^{R_0 × R_1 × … × R_{m-1}}” [Su, appendix E paragraph beginning “Tucker decomposition”]. In other words, C maps to the “core tensor” of the claim, and the matrices M^{(l)} to the “factor matrices” of the claim. The number m of factor matrices is equal to the order/rank of the input tensor T. The examiner remarks that the applicant’s specification also indicates that one of the two factorization methods explicitly envisioned by the applicant is precisely the Tucker decomposition [specification, 0071].)

Claim 13

Su discloses:

[The method of claim 12, further comprising:] providing a number R of factorization ranks χ_i for i = 1 … R, where R corresponds to the rank of the weight tensor such that each χ_i is upper-bounded by a size of a corresponding dimension D_i. ([Su, appendix E]: Su discloses that the “Tucker ranks R_l’s are required to be smaller or equal than the dimensions at their corresponding modes, i.e., R_l ≤ I_l” [Su, appendix E paragraph beginning “Tucker decomposition”]. The Tucker ranks {R_l}_{l = 0}^{m-1} map to the “factorization ranks χ_i for i = 1 … R” of the claim, and the dimension I_l of the corresponding mode of the tensor T maps to the “corresponding dimension D_i” of the claim.
The fact that R_l ≤ I_l means that “each χ_i is upper-bounded by” D_i as required by the claim.)

Claim 14

Su discloses:

[The method of claim 13, wherein] the factor matrices and the core tensor have (D_1 × χ_1 + D_2 × χ_2 + … + D_R × χ_R) + (χ_1 × χ_2 × … × χ_R) trainable parameters. ([Su, appendix E]: Su discloses that the Tucker decomposition has “prod_{l = 0}^{m-1} R_l + sum_{l = 0}^{m-1} I_l R_l entries” [Su, appendix E paragraph beginning “Tucker decomposition”]. As noted under the parent claims, m in Su corresponds to R in the claim, R_i in Su corresponds to χ_i in the claim, and I_i in Su corresponds to D_i in the claim. With these correspondences, the expression for the number of entries in the Tucker decomposition as given in Su is exactly equal to the expression for the number of trainable parameters given in the claim.)

Claim 15

Su discloses:

[The method of claim 14,] wherein the rank of the weight tensor R = 4 and the dimensions D_i are T, W, H, and C, where T is a number of output channels, W is a width of features in the classification layer, H is a height of features in the classification layer, and C is a number of input channels. ([Su, section 3.1]: As noted above, Su indicates that a “convolutional layer in traditional neural networks is parameterized by a 4-order kernel K in R^{H × W × S × T}, where H, W are height/width of the filters, and S, T are the numbers of input/output channels” [Su, section 3.1 paragraph beginning “One-layer of CNN”]. In other words, the variables H, W, S, and T as described in Su match exactly the identically-named variables of the claim.)

Claim 16

Su discloses:

[The method of claim 11, further comprising:] determining a decomposition rank R of each weight tensor of the at least one weight tensor; and factorizing the weight tensor as a sum of a number R of tensor products.
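The Tucker parameter count recited in claim 14 above (factor matrices plus core tensor) can be checked numerically against the full-tensor count; the dimensions and ranks below are hypothetical, chosen only to illustrate the arithmetic:

```python
import numpy as np

# Hypothetical 4-order kernel dimensions D_i and Tucker ranks chi_i,
# with each rank bounded by its dimension (R_l <= I_l in Su's notation).
D = [8, 3, 3, 16]     # e.g., T, W, H, C
chi = [4, 2, 2, 6]    # factorization ranks, chi_i <= D_i
assert all(c <= d for c, d in zip(chi, D))

# Claim 14's count: (D_1*chi_1 + ... + D_R*chi_R) + (chi_1*...*chi_R),
# matching Su's "prod R_l + sum I_l R_l entries" for the Tucker form.
factor_params = sum(d * c for d, c in zip(D, chi))   # factor matrices
core_params = int(np.prod(chi))                      # core tensor
M = factor_params + core_params

N = int(np.prod(D))  # entries in the uncompressed kernel
print(M, N)          # M = 236 < N = 1152 for these hypothetical sizes
```

With ranks strictly below the dimensions, the factorized form carries fewer trainable parameters than the dense kernel, which is the M < N relationship the claims rely on.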
([Su, appendix E]: Su discloses “CANDECOMP/PARAFAC (CP) decomposition” [Su, appendix E first paragraph] which, “given an m-order tensor T in R^{I_0 × I_1 × … × I_{m-1}}, … factorizes it into m factor matrices {M^{(l)}}_{l = 0}^{m-1}, where M^{(l)} in R^{R × I_l}… where R is called the canonical rank of the CP decomposition” [Su, appendix E paragraph beginning “CP decomposition”]. In other words, the canonical rank R of the CP decomposition maps to the “decomposition rank R” of the claim. The decomposition [Su, appendix E paragraph beginning “CP decomposition” equation (28b)] expresses T as a “sum of a number R of tensor products” as required by the claim. The examiner remarks that the applicant’s specification also indicates that one of the two methods of performing factorization explicitly envisioned by the applicant is precisely CP decomposition [specification, 0071].)

Claim 17

Su discloses:

[The method of claim 15, wherein] the sum of the number R of tensor products is equal to sum_{r = 1}^R u_r^{(1)} ∘ u_r^{(2)} ∘ … ∘ u_r^{(N)}, where r is a summation index from 1 to R and each of u_r^{(1)}, u_r^{(2)}, …, u_r^{(N)} is a one-dimensional vector. ([Su, appendix E]: The limitation merely recites CP decomposition in an alternative notation. For example, Su cites Kolda (as recorded in the conclusion of a previous Office action) as a reference regarding CP decomposition, and Kolda explicitly indicates that the CP decomposition can also be expressed as a sum of R outer products of vectors [Kolda, equation (3.1)], as described in this claim.)

Claim 20

Su discloses:

[The method of claim 11, further comprising:] producing a class of an input using the improved convolutional neural network.
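Claim 17's notation above (a sum of R outer products of one-dimensional vectors) is the standard vector form of CP decomposition; a minimal NumPy illustration with hypothetical sizes (not drawn from the record) showing its equivalence to the factor-matrix form quoted from Su:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical CP model of canonical rank R = 2 for a 3-order tensor:
# T = sum_{r=1}^{R} u_r^(1) o u_r^(2) o u_r^(3)  (o = outer product).
R = 2
U1 = rng.random((4, R))  # columns are the vectors u_r^(1)
U2 = rng.random((5, R))  # columns are the vectors u_r^(2)
U3 = rng.random((6, R))  # columns are the vectors u_r^(3)

# Sum of R outer products of one-dimensional vectors (claim 17's form).
T = sum(np.einsum('i,j,k->ijk', U1[:, r], U2[:, r], U3[:, r])
        for r in range(R))

# Factor-matrix form of the same CP decomposition (Su's appendix E form).
T2 = np.einsum('ir,jr,kr->ijk', U1, U2, U3)
assert np.allclose(T, T2)
print(T.shape)  # (4, 5, 6)
```

The two constructions agree entry for entry, which is the sense in which the claim "merely recites CP decomposition in an alternative notation."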
([Su, section 6]: As noted under the parent claim, Su explicitly indicates experimenting on classification tasks using the classification datasets MNIST, CIFAR-10, and ImageNet [Su, section 6 first paragraph; see also, Wikipedia as cited in the conclusion of this Office action], and h^p maps to the “improved convolutional neural network” of the claim. Any input provided to h^p maps to the “input” of the claim, and the resulting output maps to the “class of [the] input” of the claim.)

Claim Rejections - 35 USC 103

The following is a quotation of 35 USC 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1-7 and 10 is/are rejected under 35 USC 103 as being unpatentable over Su in view of Anamitra CHOUDHURY et al. (US20200410336A1, published 2020-12-31; hereafter, “Choudhury”).

Claim 1

Su discloses:

A system for classifying an input image using improving a convolutional neural network, ([Su, abstract, sections 1 and 3]: Su discloses “tensorial neural networks (TNNs)” and a method of “[m]apping a neural network to TNNs with the same expressive power [which] results in a TNN of fewer parameters” [Su, abstract]. Su indicates that “[g]iven a pre-trained NN g^q in G^q, compressing it to a TNN with p parameters results in [a TNN] h^p that is closest to g^q in H^p” [Su, section 1 paragraph beginning “TNNs can be used”], where H^p (resp. G^q) denote the sets of functions that can be represented by TNNs (resp.
NNs) with at most p (resp. q) trainable parameters [Su, section 1 paragraph beginning “Figure 1 illustrates”; see also, figure 1, section 4 first paragraph, etc]. The examiner notes that Su explicitly discusses g^q being a convolutional neural network [Su, section 3; see also, figure 4]. The pre-trained NN g^q maps to the “convolutional neural network” of the claim, and the process of compressing it to obtain h^p is a process of “improving a convolutional neural network” as recited by the claim.)

receive the input image; ([Su, section 6]: As noted above, an image from one of the image datasets disclosed in Su [Su, section 6 first paragraph] maps to the “input image” of the claim.)

receive at least one weight tensor having N parameters for tensorizing the convolutional neural network, each of the at least one weight tensor corresponding to a weight tensor of a convolutional layer of the convolutional neural network; ([Su, sections 1 and 3-4]: As noted above, the input to the compression algorithm is a convolutional neural network g^q [Su, section 1]. Su indicates that a “convolutional layer in traditional neural networks is parameterized by a 4-order kernel K in R^{H × W × S × T}, where H, W are height/width of the filters, and S, T are the numbers of input/output channels” [Su, section 3.1 paragraph beginning “One-layer of CNN”]. The kernel in the lth layer of g^q is denoted K^{(l)} in [Su, section 4.2 paragraph beginning “Generalized Tensor Decomposition”]. In other words, the kernels {K^{(l)}} of g^q map to the “at least one weight tensor” of the claim, and the number of parameters in one of the kernels (i.e., the product HWST in the case of a kernel in R^{H × W × S × T}) maps to the number N of parameters of the claim.)
factorize the at least one weight tensor to obtain a corresponding at least one factorized weight tensor, each of the at least one factorized weight tensor having M parameters, wherein M < N; ([Su, section 4.2]: Su discloses performing a tensor decomposition, i.e., “find[ing] K_i^{(l)^*} such that their composition is [close] to the uncompressed kernel K^{(l)}” [Su, section 4.2 paragraph beginning “Generalized Tensor Decomposition”]. Finding this decomposition maps to the “factorizing” step of the claim, and the number of parameters in the decomposition maps to the number M of parameters of the claim. As noted above, the decomposition has fewer parameters (cf. “fewer parameters” [Su, abstract], “p << q” [Su, section 4 first paragraph], etc), so it is in fact true that “M < N” as required by the claim. See [Su, appendix E] for further details about tensor decomposition.)

process the input image by a feature extraction network of the convolutional neural network comprising the at least one factorized weight tensor to obtain a plurality of extracted features; ([Su, figure 4]: Any initial part of the neural network (e.g., the first layer) can be mapped to the “feature extraction network” of the claim, and the output of that part of the neural network maps to the “plurality of extracted features” of the claim.)

input the plurality of extracted features into a tensor contraction layer of the convolutional neural network; ([Su, section 3.2]: Su discloses that, in each layer, “each mode i of the input tensor U’ contracts with the corresponding kernel K_i” [Su, section 3.1 paragraph beginning “One-layer of TNN”]. The layer immediately following the “feature extraction network” as mapped above thus maps to the “tensor contraction layer” of the claim (e.g., if the first layer is taken to be the “feature extraction network” of the claim, then the second layer can be taken to be the “tensor contraction layer” of the claim).)
and supply the factorized weight tensor ([Su, section 4]: As noted above, Su discloses using tensor decompositions K_i^{(l)^*} of the kernels K^{(l)} of g^q to compress g^q and obtain h^p [Su, section 4]. The use of the tensor decompositions to obtain h^p maps to the “supplying” step of the claim. The applicant is advised to consult the 112(b) rejections for the interpretation of the numerous indefinite claim elements recited in this limitation and the subsequent one.)

contract the tensor contraction layer with a weight tensor of the classification layer to obtain a tensorized regression layer, ([Su, sections 3.2 and 6]: As noted above, Su discloses that, in each layer of a TNN, “each mode i of the input tensor U’ contracts with the corresponding kernel K_i” [Su, section 3.1 paragraph beginning “One-layer of TNN”]. Moreover, again as noted above, Su explicitly indicates experimenting on image classification datasets [Su, section 6 first paragraph]. This means that the final layer of the TNN h^p can be mapped to the “tensorized regression layer” and the “classification layer” of the claim, and the input tensor of that layer to the indefinite “weight tensor” of this limitation. The tensor decomposition {K_i^{(l)^*}}_i appearing in that layer is then the “[one of the] factorized weight tensor[s]” of the claim.)

the tensorized regression layer classifying the input image, ([Su, sections 4 and 6]: As noted above, Su explicitly indicates experimenting on image classification datasets [Su, section 6 first paragraph]. This means that the final layer of the TNN (i.e., the “tensorized regression layer” of the claim) does in fact “classify[…] the input image” as required by the claim.)

thereby generating an improved convolutional neural network. ([Su, section 4]: As noted above, Su discloses using tensor decompositions K_i^{(l)^*} of the kernels K^{(l)} of g^q to compress g^q and obtain h^p [Su, section 4].
Then h^p maps to the “improved convolutional neural network” of the claim.)

While it is clear from context that the method of Su is intended to be implemented on a computer having a processor, it may be argued that Su does not distinctly disclose: the system comprising at least one processor configured to:

Choudhury is in the field of machine learning. It discloses compressing neural network models [Choudhury, 0019] using Tucker decompositions [Choudhury, 0020]. It also discloses: the system comprising at least one processor configured to: ([Choudhury, 0005]: Choudhury discloses that the methods disclosed therein “can be implemented in the form of a system including a memory and at least one processor that is coupled to the memory” [Choudhury, 0005].)

Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art to implement the tensorial neural networks of Su on a computer as described in Choudhury because performing calculations using a computer would be more efficient than performing them manually.

Claims 2-7 and 10 inherit limitations from claim 1 and recite additional limitations which are substantially similar to those recited by claims 12-17 and 20, respectively, so they are rejected by the same rationale.

Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.
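The Tucker-style compression attributed to Choudhury can likewise be sketched by parameter counting (the shapes and ranks below are hypothetical, not from Choudhury): a Tucker decomposition replaces a full kernel with a small core tensor plus one factor matrix per mode.

```python
from functools import reduce
from operator import mul

def tucker_params(shape, ranks):
    """Parameter count of a Tucker decomposition:
    core tensor (product of ranks) + one (d_i x r_i) factor per mode."""
    core = reduce(mul, ranks, 1)
    factors = sum(d * r for d, r in zip(shape, ranks))
    return core + factors

shape = (64, 64, 3, 3)                    # illustrative conv kernel
ranks = (8, 8, 3, 3)                      # illustrative Tucker ranks
full = reduce(mul, shape, 1)              # 36864
compressed = tucker_params(shape, ranks)  # 576 + 1042 = 1618
assert compressed < full                  # roughly 22x fewer parameters
```

The choice of per-mode ranks determines the compression ratio, which is the design knob such methods expose.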
In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Shishir AGRAWAL whose telephone number is +1 703-756-1183. The examiner can normally be reached Monday through Thursday, 08:30-14:30 Pacific Time.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alexey SHMATOV, can be reached at +1 571-270-3428. The fax phone number for the organization where this application or proceeding is assigned is +1 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at +1 866-217-9197 (toll-free).
If you would like assistance from a USPTO Customer Service Representative, call +1 800-786-9199 (IN USA OR CANADA) or +1 571-272-1000.

/S.A./Examiner, Art Unit 2123
/ALEXEY SHMATOV/Supervisory Patent Examiner, Art Unit 2123

Prosecution Timeline

Dec 15, 2022
Application Filed
Oct 07, 2025
Non-Final Rejection — §102, §103, §112
Jan 09, 2026
Response Filed
Mar 03, 2026
Final Rejection — §102, §103, §112 (current)

Prosecution Projections

3-4
Expected OA Rounds
0%
Grant Probability
0%
With Interview (+0.0%)
3y 3m
Median Time to Grant
Moderate
PTA Risk
Based on 13 resolved cases by this examiner. Grant probability derived from career allow rate.
