DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Regarding the 35 USC 103 rejections of claims 1 and 3-8, Applicant's arguments filed 01/13/2026 have been fully considered but they are not persuasive.
Examiner response: Examiner respectfully disagrees. Huang discloses that a weight may be assigned an importance based on input data (para [0035] “Each input may have an associated weight (w), which may be assigned based on the importance of the input relative to other inputs.”). Huang further teaches that the values of a weight may be assigned and updated through training and backwards propagation (para [0048] “A learning or training process may assign appropriate weights for these connections. In some implementations, the initial values of the weights may be randomly assigned. For every input in a training dataset, the output of the artificial neural network may be observed and compared with the expected output, and the error between the expected output and the observed output may be propagated back to the previous layer. The weights may be adjusted accordingly based on the error. This process is repeated until the output error is below a predetermined threshold.”). Huang further teaches that a filter is comprised of weights that may be learned through training (para [0057] “In practice, a CNN may learn the weights of the filters on its own during the training process based on some user specified parameters (which may be referred to as hyperparameters), such as the number of filters, the filter size, the architecture of the network, etc.”). The weights of the filters are assigned based on the input data and correspond to channels of the input data (para [0073] “More specifically, as shown in FIG. 5, for a 3-D input 520-1, . . . , or 520-N and a 3-D filter 510-1, . . . , or 510-M, the C 2-D filters (each with dimensions R×S) in 3-D filter 510-m may correspond to the C channels of 2-D input feature maps (each with dimensions H×W) in the 3-D input, and the convolution operation between each 2-D filter of the C 2-D filters and the corresponding channel of the C channels of 2-D input feature maps may be performed.”). Arguments are not persuasive.
Applicant’s arguments, see pages 8-9, filed 01/13/2026, with respect to claim 21 have been fully considered and are persuasive. The 35 USC 103 rejection of claims 21-25 has been withdrawn.
Applicant’s arguments, see pages 6-7 of remarks, filed 01/13/2026, with respect to the 35 USC 101 rejection have been fully considered and are persuasive. The 35 USC 101 rejection of claims 1, 3-8, and 21-25 has been withdrawn.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1 and 7 are rejected under 35 U.S.C. 103 as being unpatentable over Huang et al. (US-20200410337-A1) in view of Martin et al. (US-20190087718-A1) and Yeo (US-20210256258-A1).
Regarding Claim 1,
Huang (US 20200410337 A1) teaches a data block processing method, wherein the method is applied to hardware that processes floating-point data or hardware that processes calculations and storage using fixed-point data, and the method comprises:
obtaining, by an element-wise layer of a neural network model (para [0065] “For example, as shown in FIG. 3A, a convolution layer (e.g., first convolution layer 215 or second convolution layer 235) or a processing node of the convolution layer may receive pixel values for a region 322 (including R×S pixels) of input pixel array 320, perform element-wise multiplications between corresponding elements in filter 310 and region 322, and sum the products of the element-wise multiplications to generate a convolution output value 332.” The convolutional layer corresponds to the element-wise layer.), n data blocks inputted by a previous level network layer of the element-wise layer (para [0064] “Input pixel array 320 may include an input image, a channel of an input image, or a feature map generated by another convolution layer or pooling layer.” The input into the current convolutional layer (i.e., element-wise layer) is received from a previous layer.), wherein the neural network model is used for image recognition and speech recognition (para [0035] “Artificial neural networks (also referred to as “neural networks”) have been used in machine learning research and industrial applications and have achieved many breakthrough results in, for example, image recognition, speech recognition, computer vision, text processing, and the like.”), and n being a positive integer (para [0072] “FIG. 5 illustrates an example of a model 500 for a convolution layer of a convolutional neural network used in, for example, image processing. As illustrated in the example, there may be multiple (e.g., N) 3-D inputs 520-1, . . . , and 520-N to the convolution layer.” The N 3-D inputs correspond to the n data blocks.);
obtaining, by the element-wise layer, compensation factors corresponding to channels of each of the n data blocks (para [0073] “More specifically, as shown in FIG. 5, for a 3-D input 520-1, . . . , or 520-N and a 3-D filter 510-1, . . . , or 510-M, the C 2-D filters (each with dimensions R×S) in 3-D filter 510-m may correspond to the C channels of 2-D input feature maps (each with dimensions H×W) in the 3-D input, and the convolution operation between each 2-D filter of the C 2-D filters and the corresponding channel of the C channels of 2-D input feature maps may be performed.” Filter (i.e., compensation factors).) from input data of the element-wise layer (para [0107] “During the convolution operation, a weight in each 2-D filter (with dimensions R×S) of the four 2-D filters in each of the four 3-D filters (with dimensions C×R×S) may be pre-loaded into PE array 910.”);
multiplying, by the element-wise layer, data on the channels of each of the n data blocks by the compensation factors corresponding to the channels respectively to obtain n compensated data blocks (para [0072] “For example, 3-D filter 510-1 may be applied to 3-D input 520-N to generate an output feature map 530-N−1, and 3-D filter 510-M may be applied to 3-D input 520-N to generate an output feature map 530-N-M. Thus, there are N 3-D inputs and N 3-D outputs, where each 3-D output includes M output feature maps.” The N 3-D outputs correspond to the compensated data blocks. Para [0073]-[0075] describe multiplying the 3-D inputs by the filters to produce the 3-D outputs.); and
performing, by the element-wise layer, an element-wise operation on the n compensated data blocks to obtain an element-wise operation result and outputting the element-wise operation result (para [0059] “As shown in FIG. 2, an additional non-linear operation using an activation function (e.g., ReLU) may be used after every convolution operation. ReLU is an element-wise operation that replaces all negative pixel values in the feature map by zero.” And para [0071] “A non-linear activation function (e.g., ReLU, sigmoid, tan h, etc.) may then be applied to output matrix 430 to generate a matrix 440 as shown in FIG. 4D.” Matrix 430 is the output of the convolution multiplication operation.), in a case that n is an integer greater than 1 (para [0072] “As illustrated in the example, there may be multiple (e.g., N) 3-D inputs 520-1, . . . , and 520-N to the convolution layer.” multiple is greater than 1.).
wherein the obtaining the compensation factors corresponding to the channels of each of the n data blocks from the input data of the element-wise layer comprises:
obtaining the compensation factors corresponding to channels of a target data block from the input data of the element-wise layer, wherein the target data block is any one of the n data blocks (para [0073] “More specifically, as shown in FIG. 5, for a 3-D input 520-1, . . . , or 520-N and a 3-D filter 510-1, . . . , or 510-M, the C 2-D filters (each with dimensions R×S) in 3-D filter 510-m may correspond to the C channels of 2-D input feature maps (each with dimensions H×W) in the 3-D input, and the convolution operation between each 2-D filter of the C 2-D filters and the corresponding channel of the C channels of 2-D input feature maps may be performed.” 3-D filter corresponding to 3D-input (i.e., target block).).
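The operations quoted from Huang para [0065] and [0059] (element-wise multiplication of a region by a filter, summation of the products, and an element-wise ReLU) can be sketched as follows. This is a minimal illustration only; the function names and sample values are hypothetical and do not appear in Huang.

```python
def conv_output_value(region, filt):
    """Multiply a 2-D region by a same-shaped 2-D filter element by element
    and sum the products into a single convolution output value."""
    return sum(
        r * f
        for region_row, filter_row in zip(region, filt)
        for r, f in zip(region_row, filter_row)
    )


def relu(feature_map):
    """Element-wise operation: replace every negative value with zero."""
    return [[max(0, v) for v in row] for row in feature_map]


# Hypothetical 2x2 region and filter (illustrative values only).
region = [[1, 2], [3, 4]]
filt = [[1, 0], [0, -1]]

value = conv_output_value(region, filt)  # 1*1 + 2*0 + 3*0 + 4*(-1)
activated = relu([[value, 5]])
```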
Huang does not explicitly disclose
wherein all data in the n data blocks is fixed-point data, the element-wise layer processes the fixed-point data
wherein the compensation factors are used for scaling data ranges of data on channels of each of the n data blocks, and compensating for a data range difference of the data on the channels of each of the n data blocks;
Huang teaches that a convolutional layer (i.e., elementwise layer) can perform element-wise operations.
Martin further teaches that the convolutional layer can perform operations on fixed-point data.
Martin (US 20190087718 A1) teaches obtaining, by an element-wise layer of a neural network model, n data blocks inputted by a previous level network layer of the element-wise layer, wherein all data in the n data blocks is fixed-point data (para [0098] “The convolution engine 602 receives fixed point data from the input module 202 (e.g. input buffer) and fixed point weights from the coefficient buffer 616 and performs a convolution of the fixed point data and the fixed point weights to produce fixed point output data which is provided to the accumulation buffer 604.”), the element-wise layer processes the fixed-point data and n is a positive integer (para [0027] “As can be seen in FIG. 1, the format of data used in a DNN may be formed of a plurality of planes. The input data may be arranged as P planes of data, where each plane has a dimension x×y. A DNN may comprise one or more convolution layers each of which has associated therewith a plurality of filters formed by a plurality of weights w.sub.0 . . . w.sub.n. The filters (w.sub.0 . . . w.sub.n) each have a dimension m×n×P and are applied to the input data according to a convolution operation across several steps in direction s and t, as illustrated in FIG. 1.” ‘P’ represents the number of data blocks. para [0059] “As a result, it may be more efficient to represent the weights using a fixed point format. In some cases, a fixed point format defined by an exponent and an integer bit-length may be used. In these cases, the exponent and/or integer bit-length used to represent the weights may vary by layer. For example, the weights of a first convolution layer may be in a fixed point format defined by exponent A and integer bit-length B and the weights of a second convolution layer may be in a fixed point format defined by exponent C and integer bit-length D.”);
Huang and Martin are analogous because they are directed to convolutional neural networks.
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the CNN of Huang with the fixed-point data format of Martin.
Using fixed-point data instead of floating-point data may reduce the power consumption and computational complexity of the neural network (Martin para [0089]).
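Martin para [0059] refers to a fixed point format defined by an exponent and an integer bit-length. One way such a format could be realized is sketched below, assuming a signed integer of the stated bit-length whose represented value is the integer times 2 raised to the exponent. The function names, the saturation behavior, and the sample values are assumptions for illustration and are not taken from Martin.

```python
def to_fixed_point(x, exponent, bits):
    """Quantize x to a signed `bits`-bit integer q such that x ~= q * 2**exponent.
    Values outside the representable range saturate (an assumed policy)."""
    q = round(x / (2.0 ** exponent))
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return max(lo, min(hi, q))


def from_fixed_point(q, exponent):
    """Recover the real value represented by the fixed-point integer q."""
    return q * (2.0 ** exponent)


# Hypothetical format: exponent -4 (steps of 1/16), 8-bit signed integer.
q = to_fixed_point(0.75, exponent=-4, bits=8)
restored = from_fixed_point(q, exponent=-4)
saturated = to_fixed_point(100.0, exponent=-4, bits=8)
```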
Yeo (US 20210256258 A1) teaches
wherein the compensation factors are used for scaling data ranges of data on channels of each of the n data blocks, and compensating for a data range difference of the data on the channels of each of the n data blocks (para [0038] “The weight applying unit 130 may generate a weight filter by converting a size of a saliency map into a size of a first convolution layer (a convolution layer to which a weight is to be applied) included in the feature extracting model 145 and may apply a weight to the feature extracting model 145 by performing element-wise multiplication of the first convolution layer and the weight filter for each channel. …Next, the feature extracting model 145 may scale a value of each pixel in the resized saliency map. Here, scaling means a standardization operation of multiplying a value by an integer (magnification) to change the value so that a range of the value falls within a predetermined limit. For example, the weight applying unit 130 may scale values of the weight filter to values between 0 and 1 to generate a weight filter having a size of m×n that is equal to a size (m×n) of the first convolution layer.”);
Huang and Yeo are analogous because they are directed to convolutional neural networks.
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the CNN of Huang with the scaling method of Yeo.
Doing so would allow for activating relevant features in the feature map more strongly (Yeo para [0038]).
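The scaling and per-channel element-wise multiplication quoted from Yeo para [0038] can be sketched as follows. Yeo states only that the weight filter values may be scaled to values between 0 and 1; the min-max formula used here is an assumed way of achieving that range, and all names and sample values are hypothetical.

```python
def scale_to_unit_range(weight_filter):
    """Scale a 2-D weight filter so that its values fall between 0 and 1.
    Min-max scaling is an assumption; Yeo does not specify the formula."""
    flat = [v for row in weight_filter for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        return [[0.0 for _ in row] for row in weight_filter]
    return [[(v - lo) / span for v in row] for row in weight_filter]


def apply_per_channel(feature_maps, weight_filter):
    """Element-wise multiply every channel (a 2-D map) by the same-sized filter."""
    return [
        [[x * w for x, w in zip(x_row, w_row)]
         for x_row, w_row in zip(channel, weight_filter)]
        for channel in feature_maps
    ]


# Hypothetical resized saliency map and a two-channel 2x2 layer.
saliency = [[0, 5], [10, 20]]
scaled = scale_to_unit_range(saliency)
weighted = apply_per_channel([[[2, 4], [6, 8]], [[1, 1], [1, 1]]], scaled)
```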
Regarding Claim 7,
Huang, Martin, and Yeo teach the method according to claim 1. Huang further teaches after the multiplying, by the element-wise layer, the data on the channels of each of the n data blocks by the compensation factors corresponding to the channels respectively to obtain the n compensated data blocks, further comprising:
outputting the n compensated data blocks by the element-wise layer in a case that n is equal to 1 (para [0072] “For example, 3-D filter 510-1 may be applied to 3-D input 520-N to generate an output feature map 530-N−1, and 3-D filter 510-M may be applied to 3-D input 520-N to generate an output feature map 530-N-M. Thus, there are N 3-D inputs and N 3-D outputs, where each 3-D output includes M output feature maps.” And para [0076] “FIG. 6 illustrates an example of a convolution operation involving one batch (N=1) of C channels (C=3) of input data 620 and M sets (M=2) of C filters (C=3). The example shown in FIG. 6 may be a specific example of model 500 described with respect to FIG. 5, where the number of batches N is one.”).
Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Huang/Martin/Yeo, as applied above, and further in view of Shibata et al. (US-20210110236-A1).
Regarding Claim 3,
Huang, Martin, and Yeo teach the method according to claim 1.
Huang further teaches
wherein the multiplying the data on the channels of each of the n data blocks by the compensation factors corresponding to channels respectively to obtain n compensated data blocks comprises:
multiplying the data on the channels of each of the n data blocks by the compensation factors corresponding to the channels respectively (para [0072] “For example, 3-D filter 510-1 may be applied to 3-D input 520-N to generate an output feature map 530-N−1, and 3-D filter 510-M may be applied to 3-D input 520-N to generate an output feature map 530-N-M. Thus, there are N 3-D inputs and N 3-D outputs, where each 3-D output includes M output feature maps.” N 3-D outputs (i.e., compensated data blocks). para [0073]-[0075]),
Huang, Martin, and Yeo do not explicitly disclose
and rounding multiplication results to obtain the n compensated data blocks.
However, Shibata (US 20210110236 A1) teaches
and rounding multiplication results to obtain the n compensated data blocks (para [0177] “The result of the convolutional operation may be a floating point (real number) by the above addition process, however the real number is rounded by logarithmic quantization by the quantization part 44 and converted into an integer.”).
Huang, Martin, Yeo, and Shibata are analogous because they are directed towards the same field of endeavor of convolutional neural networks.
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the convolutional neural network of Huang, Martin, and Yeo with the rounding operation of Shibata.
Doing so would allow for converting the convolutional result into a format that is compatible with the rest of the layers so that the convolutional operation circuit can be shared (Shibata para [0177]).
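The limitation of rounding multiplication results to obtain compensated data blocks can be sketched as follows. This is an illustration under stated assumptions: plain round-half-up rounding is used in place of the logarithmic quantization of Shibata para [0177], and the function name, block layout, and sample values are hypothetical.

```python
import math


def compensate_and_round(block, factors):
    """Multiply each channel of a block by its factor and round to an integer.
    block: list of channels, each a list of integers.
    factors: one multiplier per channel.
    Uses floor(x + 0.5), i.e. ties round toward positive infinity."""
    return [
        [math.floor(v * f + 0.5) for v in channel]
        for channel, f in zip(block, factors)
    ]


# Hypothetical two-channel block with one factor per channel.
rounded = compensate_and_round([[3, 5], [7, 9]], [0.5, 1.25])
```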
Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Huang/Martin/Yeo, as applied above, and further in view of Lee et al. (US-20190042948-A1) and Nurvitadhi et al. (US-20190205746-A1).
Regarding Claim 4,
Huang, Martin, and Yeo teach the method according to claim 1.
Huang, Martin, and Yeo do not explicitly disclose
wherein in a case that n is an integer greater than 1, the n data blocks have different data accuracy, all data in the n compensated data blocks is fixed-point data, and the n compensated data blocks have the same data accuracy.
However, Lee teaches
wherein in a case that n is an integer greater than 1, all data in the n compensated data blocks is fixed-point data (fig. 1; para [0063] “The feature maps FM1 and FM2 may be high-dimensional matrices of two or more dimensions, and each may include activation parameters. When the feature maps FM1 and FM2 are for example, three-dimensional feature maps, each of the feature maps FM1 and FM2 a width W (or a number of columns), a height H (or a number of rows), and a depth D. The depth D may be correspond to a number of channels.” Each dimension of the FM2 is a compensated data block. There are two or more dimensions therefore n is greater than 1.), and the n compensated data blocks have the same data accuracy (para [0099] “FIG. 7 illustrates an example of a layer-wise quantization in which parameters of all channels of a layer of a neural network are quantized to the same fixed-point expression.” Same fixed-point expression (i.e., accuracy). para [0069]-[0072] This section describes what channels are.).
Huang, Martin, Yeo, and Lee are analogous because they are directed towards the same field of endeavor of convolutional neural networks.
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the convolutional neural network of Huang with the quantized fixed-precision of Lee.
Doing so would allow for retraining the fixed-point neural network to have a better accuracy compared to a floating-point network (Lee para [0166]).
Huang, Martin, Yeo, and Lee do not explicitly disclose
the n data blocks have different data accuracy,
However, Nurvitadhi (US 20190205746 A1) teaches
the n data blocks have different data accuracy (para [0254] “For fixed point representation 2820, the variables 1 and 2 each have a sign, a shared range (shared exponent), and different precision (mantissa). A NN 2850 has different ranges. For dynamic fixed point or block floating point representation 2860, the input (I) with variables 1 and 2 each have a sign, a shared range (shared exponent), and different precision (mantissa).” Different precision (i.e., different accuracy).),
Huang, Martin, Yeo, and Nurvitadhi are analogous because they are directed towards the same field of endeavor of convolutional neural networks.
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the convolutional neural network of Huang, Martin, and Yeo with the mixed precision of Nurvitadhi.
Doing so would allow for efficient support for dynamic adjustments in precision (Nurvitadhi para [0253]).
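The claim 4 limitation, in which data blocks of different accuracy yield compensated data blocks of the same accuracy, can be illustrated with the sketch below. It assumes that accuracy means the number of fractional bits of a fixed-point value and that each compensation factor is a power of two; neither assumption is drawn from the claim language or from the cited references, and all names and values are hypothetical.

```python
def align_blocks(blocks, frac_bits):
    """Bring fixed-point blocks with different fractional bit counts to a
    common format by multiplying each block by a power-of-two factor.
    blocks: list of integer lists; frac_bits: fractional bits per block.
    Returns the aligned blocks and the shared fractional bit count."""
    target = max(frac_bits)
    aligned = [
        [v * (1 << (target - fb)) for v in block]
        for block, fb in zip(blocks, frac_bits)
    ]
    return aligned, target


# Block 0 holds 0.75 and 1.25 with 2 fractional bits;
# block 1 holds 1.0 and 0.5 with 4 fractional bits.
aligned, target = align_blocks([[3, 5], [16, 8]], [2, 4])
summed = [a + b for a, b in zip(*aligned)]
real_values = [s / (1 << target) for s in summed]
```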
Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Huang/Martin/Yeo, as applied above, and further in view of Gural et al. (US-11580191-B1).
Regarding Claim 5,
Huang, Martin, and Yeo teach the method according to claim 1.
Huang, Martin, and Yeo do not explicitly disclose
wherein the performing the element-wise operation on the n compensated data blocks to obtain the element-wise operation result comprises:
adding up or multiplying the n compensated data blocks to obtain the element-wise operation result; or
adding up or multiplying the n compensated data blocks to obtain a first operation result, and adding the first operation result with a bias factor to obtain the element-wise operation result.
However, Gural (US 11580191 B1) teaches
wherein the performing the element-wise operation on the n compensated data blocks to obtain the element-wise operation result comprises:
adding up or multiplying the n compensated data blocks to obtain the element-wise operation result (col. 5 lines 55-67; “In other words, each filter kernel may be thought of as a 3-dimensional (3D) kernel filter 225 of dimensions r×s×C, for r, s, and C all positive integers greater than zero, and each input matrix may be thought of as a 3D input block of matrices (“matrix block”) 226 of dimensions μ.sub.2×μ.sub.1×C. Such 3D filter kernel 225 may be element-wise multiplied with multipliers 205 and then summed with adders 228 such that a 3D input matrix block 226 of pixel input layer 210 generates an output matrix 223 in a 3D output matrix block 227 of dimension n×m×K, namely an output matrix 223 in an output channel of K output channels 221.” The input matrix is transformed by equation (7) according to figure 3-1 to 3-4.); or
adding up or multiplying the n compensated data blocks to obtain a first operation result, and adding the first operation result with a bias factor to obtain the element-wise operation result.
Huang, Martin, Yeo and Gural are analogous because they are directed towards the same field of endeavor of convolutional neural networks.
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the convolutional neural network of Huang, Martin, and Yeo with the element-wise operation of Gural.
Doing so would allow for reducing the number of multiplication operations performed in a convolution resulting in fewer multiplications, fewer resources, less power consumption, and reduced latency (Gural col. 3 lines 29-35).
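The two alternatives recited in claim 5 (adding up or multiplying the n compensated data blocks, optionally followed by adding a bias factor) can be sketched as follows. The function name, the flat-list block layout, and the sample values are hypothetical and are not taken from Gural.

```python
from functools import reduce
import operator


def elementwise(blocks, op="add", bias=0):
    """Combine n equally sized blocks position by position.
    op: "add" sums the blocks, "mul" multiplies them.
    bias: added to every combined element (0 leaves the result unchanged)."""
    if op not in ("add", "mul"):
        raise ValueError("op must be 'add' or 'mul'")
    fn = operator.add if op == "add" else operator.mul
    return [reduce(fn, values) + bias for values in zip(*blocks)]


# Hypothetical compensated data blocks (n = 2).
blocks = [[1, 2, 3], [4, 5, 6]]
added = elementwise(blocks, op="add")
multiplied_with_bias = elementwise(blocks, op="mul", bias=1)
```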
Claims 6 and 8 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Huang/Martin/Yeo, as applied above, and further in view of Jung et al. (US-20190347550-A1).
Regarding Claim 6,
Huang, Martin, and Yeo teach the method according to claim 1.
Huang, Martin, and Yeo do not explicitly disclose
wherein the outputting the element-wise operation result comprises:
quantizing the element-wise operation result to obtain first output data, a quantity of bits occupied by the first output data being a preset quantity of bits; and
outputting the first output data to a next network layer of the element-wise layer.
However, Jung (US 20190347550 A1) teaches
wherein the outputting the element-wise operation result comprises:
quantizing the element-wise operation result to obtain first output data, a quantity of bits occupied by the first output data being a preset quantity of bits (para [0086] “The data processing apparatus 100 may have a representation bit number for each layer as a parameter. The representation bit number is used to quantize a plurality of data distributed within a predetermined range. Here, a parameter indicating a predetermined range of data is referred to as a quantization parameter.” And para [0088] “The data processing apparatus 100 includes a representation bit number corresponding to an activation map output from each layer and an activation quantization parameter indicating a range of an activation map to be quantized.” And para [0089]); and
outputting the first output data to a next network layer of the element-wise layer (para [0100] “The processor 201 inputs the quantized first activation map into a second layer which is a subsequent layer. For example, the second layer may be a convolutional layer.”).
Huang, Martin, Yeo, and Jung are analogous because they are directed towards the same field of endeavor of convolutional neural networks.
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the convolutional neural network of Huang, Martin, and Yeo with the quantization of Jung.
Doing so would allow for improving the processing rate of the neural network while maintaining accuracy (Jung para [0074]).
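Quantizing a result to a preset quantity of bits over a predetermined range, as discussed with respect to Jung para [0086] and [0088], can be sketched as follows. Uniform quantization with clipping is assumed here; Jung's actual quantization scheme may differ, and the function name, parameters, and sample values are hypothetical.

```python
def quantize(values, lo, hi, bits):
    """Map each value to an integer code in [0, 2**bits - 1].
    lo, hi: the predetermined range (playing the role of a quantization parameter).
    bits: the preset number of bits for each output code.
    Values outside [lo, hi] are clipped to the nearest bound."""
    levels = (1 << bits) - 1
    step = (hi - lo) / levels
    codes = []
    for v in values:
        clipped = min(max(v, lo), hi)
        codes.append(int((clipped - lo) / step + 0.5))
    return codes


# Hypothetical activation values quantized to 2 bits over the range [0, 3].
codes = quantize([-1.0, 0.4, 1.6, 5.0], lo=0.0, hi=3.0, bits=2)
```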
Regarding Claim 8,
Huang, Martin, and Yeo teach the method according to claim 7.
Huang, Martin, and Yeo do not explicitly disclose
wherein the outputting the n compensated data blocks comprises:
quantizing data in the n compensated data blocks to obtain second output data, a quantity of bits occupied by the second output data being a preset quantity of bits; and
outputting the second output data to a next network layer of the element-wise layer.
However, Jung (US 20190347550 A1) teaches
wherein the outputting the n compensated data blocks comprises:
quantizing data in the n compensated data blocks to obtain second output data, a quantity of bits occupied by the second output data being a preset quantity of bits (para [0086] “The data processing apparatus 100 may have a representation bit number for each layer as a parameter. The representation bit number is used to quantize a plurality of data distributed within a predetermined range. Here, a parameter indicating a predetermined range of data is referred to as a quantization parameter.” And para [0088] “The data processing apparatus 100 includes a representation bit number corresponding to an activation map output from each layer and an activation quantization parameter indicating a range of an activation map to be quantized.” And para [0089]); and
outputting the second output data to a next network layer of the element-wise layer (para [0100] “The processor 201 inputs the quantized first activation map into a second layer which is a subsequent layer. For example, the second layer may be a convolutional layer.”).
Huang, Martin, Yeo, and Jung are analogous because they are directed towards the same field of endeavor of convolutional neural networks.
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the convolutional neural network of Huang, Martin, and Yeo with the quantization of Jung.
Doing so would allow for improving the processing rate of the neural network while maintaining accuracy (Jung para [0074]).
Allowable Subject Matter
Claims 21-25 are allowed.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HENRY K NGUYEN whose telephone number is (571)272-0217. The examiner can normally be reached Mon - Fri 7:00am-4:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li B Zhen, can be reached at (571)272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/HENRY NGUYEN/Examiner, Art Unit 2121